From 1a4bf6ec9fe5469bf4b69037f0ef5ac0b598db04 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 13:06:41 +0200 Subject: [PATCH 01/19] initial doc layout --- doc/command-line-flags.md | 0 doc/migrating-with-sbr.md | 0 doc/swapping-tables.md | 0 doc/testing-on-replica.md | 0 doc/triggerless-design.md | 0 5 files changed, 0 insertions(+), 0 deletions(-) create mode 100644 doc/command-line-flags.md create mode 100644 doc/migrating-with-sbr.md create mode 100644 doc/swapping-tables.md create mode 100644 doc/testing-on-replica.md create mode 100644 doc/triggerless-design.md diff --git a/doc/command-line-flags.md b/doc/command-line-flags.md new file mode 100644 index 0000000..e69de29 diff --git a/doc/migrating-with-sbr.md b/doc/migrating-with-sbr.md new file mode 100644 index 0000000..e69de29 diff --git a/doc/swapping-tables.md b/doc/swapping-tables.md new file mode 100644 index 0000000..e69de29 diff --git a/doc/testing-on-replica.md b/doc/testing-on-replica.md new file mode 100644 index 0000000..e69de29 diff --git a/doc/triggerless-design.md b/doc/triggerless-design.md new file mode 100644 index 0000000..e69de29 From a863ea6b292d156bcef2b4f5d6a9f64ae11bfc26 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 13:08:46 +0200 Subject: [PATCH 02/19] adding templates --- .github/CONTRIBUTING.md | 0 .github/ISSUE_TEMPLATE.md | 0 .github/PULL_REQUEST_TEMPLATE.md | 0 3 files changed, 0 insertions(+), 0 deletions(-) create mode 100644 .github/CONTRIBUTING.md create mode 100644 .github/ISSUE_TEMPLATE.md create mode 100644 .github/PULL_REQUEST_TEMPLATE.md diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md new file mode 100644 index 0000000..e69de29 diff --git a/.github/ISSUE_TEMPLATE.md b/.github/ISSUE_TEMPLATE.md new file mode 100644 index 0000000..e69de29 diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 0000000..e69de29 From 79f31631eb5691b6231fa97efc80aae15e4c8574 Mon Sep 17 
00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 13:10:50 +0200 Subject: [PATCH 03/19] initial CONTRIBUTING.md --- .github/CONTRIBUTING.md | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index e69de29..e681e5a 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -0,0 +1,26 @@ +## Contributing + +Hi there! We're thrilled that you'd like to contribute to this project. Your help is essential for keeping it great. + +This project adheres to the [Open Code of Conduct](http://todogroup.org/opencodeofconduct/#gh-ost/opensource@github.com). By participating, you are expected to uphold this code. + +## Submitting a pull request + +0. [Fork](https://github.com/github/gh-ost/fork) and clone the repository +0. Create a new branch: `git checkout -b my-branch-name` +0. Make your change, add tests, and make sure the tests still pass +0. Push to your fork and [submit a pull request](https://github.com/github/gh-ost/compare) +0. Pat yourself on the back and wait for your pull request to be reviewed and merged. + +Here are a few things you can do that will increase the likelihood of your pull request being accepted: + +- Follow the [style guide](https://golang.org/doc/effective_go.html#formatting). +- Write tests. +- Keep your change as focused as possible. If there are multiple changes you would like to make that are not dependent upon each other, consider submitting them as separate pull requests. +- Write a [good commit message](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html). 
+ +## Resources + +- [Contributing to Open Source on GitHub](https://guides.github.com/activities/contributing-to-open-source/) +- [Using Pull Requests](https://help.github.com/articles/using-pull-requests/) +- [GitHub Help](https://help.github.com) From d5f583d6c9de13ead4001b6343d43fd9a52cc5fc Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 13:17:45 +0200 Subject: [PATCH 04/19] more doc template --- doc/understanding-output.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 doc/understanding-output.md diff --git a/doc/understanding-output.md b/doc/understanding-output.md new file mode 100644 index 0000000..e69de29 From 02ddf76da00f0e21da2d7c752d53930be62f001d Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 16:07:45 +0200 Subject: [PATCH 05/19] adding documentation --- doc/migrating-with-sbr.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/doc/migrating-with-sbr.md b/doc/migrating-with-sbr.md index e69de29..565f437 100644 --- a/doc/migrating-with-sbr.md +++ b/doc/migrating-with-sbr.md @@ -0,0 +1,19 @@ +# Migrating with Statement Based Replication + +Even though `gh-ost` relies on Row Based Replication (RBR), it does not mean you can't keep your Statement Based Replication (SBR). + +`gh-ost` is happy to, and actually prefers and suggests so, connect to a replica. On this replica, it is happy to: +- issue the heavyweight `INFORMATION_SCHEMA` queries that make a table structure analysis +- issue a `select count(*) from mydb.mytable`, should `--exact-rowcount` be provided +- connect itself as a fake replica to get the binary log stream + +All of the above can be executed on the master, but we're more comfortable that they execute on a replica. + +Please note the third item: `gh-ost` connects as a fake replica and pulls the binary logs. This is how `gh-ost` finds the table's changelog: it looks up entries in the binary log. 
+ +The magic is that your master can still produce SBR, but if you have a replica with `log-slave-updates`, you can also configure it to have `binlog_format='ROW'`. Such a replica accepts SBR statements from its master, and produces RBR statements onto its binary logs. + +`gh-ost` is happy to modify the `binlog_format` on the replica for you: +- If you supply `--switch-to-rbr`, `gh-ost` will convert the binlog format for you, and restart replication to make sure this takes effect. +- If your replica is an intermediate master, i.e. further serves as a master to other replicas, `gh-ost` will not convert the `binlog_format`. +- In any case, `gh-ost` **will not** convert back to `STATEMENT` (SBR). This is because you may be running multiple migrations concurrently. Being able to run concurrent migrations is one of the design goals of this tool. It's your own responsibility to switch back to SBR once all pending migrations are complete. From 5180206bc63ffa620c31c55c157b9d4c349585c7 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 16:10:02 +0200 Subject: [PATCH 06/19] adding documentation --- doc/migrating-with-sbr.md | 5 +++++ doc/testing-on-replica.md | 1 + 2 files changed, 6 insertions(+) diff --git a/doc/migrating-with-sbr.md b/doc/migrating-with-sbr.md index 565f437..3656b25 100644 --- a/doc/migrating-with-sbr.md +++ b/doc/migrating-with-sbr.md @@ -17,3 +17,8 @@ The magic is that your master can still produce SBR, but if you have a replica w - If you supply `--switch-to-rbr`, `gh-ost` will convert the binlog format for you, and restart replication to make sure this takes effect. - If your replica is an intermediate master, i.e. further serves as a master to other replicas, `gh-ost` will not convert the `binlog_format`. - In any case, `gh-ost` **will not** convert back to `STATEMENT` (SBR). This is because you may be running multiple migrations concurrently. Being able to run concurrent migrations is one of the design goals of this tool. 
It's your own responsibility to switch back to SBR once all pending migrations are complete. + +### Summary + +- If you're already using RBR, all is well for you +- If not, convert one of your replicas to `binlog_format='ROW'`, or let `gh-ost` do this for you. diff --git a/doc/testing-on-replica.md b/doc/testing-on-replica.md index e69de29..ff0d4e4 100644 --- a/doc/testing-on-replica.md +++ b/doc/testing-on-replica.md @@ -0,0 +1 @@ +# Testing on replica From 1d287a841765de63d84ab920a24415c82e6cd00e Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 16:22:28 +0200 Subject: [PATCH 07/19] adding documentation --- doc/testing-on-replica.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/doc/testing-on-replica.md b/doc/testing-on-replica.md index ff0d4e4..017ee55 100644 --- a/doc/testing-on-replica.md +++ b/doc/testing-on-replica.md @@ -1 +1,8 @@ # Testing on replica + +`gh-ost`'s design allows for trusted and reliable tests of the migration without compromising production data integrity. + +Test on replica if you: +- Are unsure of `gh-ost`, have not gained confidence into its workings +- Just want to experiment with a real migration without affecting production (maybe measure migration time?) +- Wish to observe data change impact From 7463079e0da64e1dd3ddac5c10329d1e644e81f0 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 16:26:47 +0200 Subject: [PATCH 08/19] adding documentation --- doc/testing-on-replica.md | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/doc/testing-on-replica.md b/doc/testing-on-replica.md index 017ee55..60136b2 100644 --- a/doc/testing-on-replica.md +++ b/doc/testing-on-replica.md @@ -6,3 +6,33 @@ Test on replica if you: - Are unsure of `gh-ost`, have not gained confidence into its workings - Just want to experiment with a real migration without affecting production (maybe measure migration time?) 
- Wish to observe data change impact + +### What testing on replica means + +Apply `--test-on-replica --host=`. +- `gh-ost` would connect to the indicated server +- Will verify this is indeed a replica and not a master +- Will perform _everything_ on this replica. Other than checking who the master is, it will otherwise not touch it. + - All `INFORMATION_SCHEMA` and `SELECT` queries run on the replica + - Ghost table is created on the replica + - Rows are copied onto the ghost table on the replica + - Binlog events are read from the replica and applied to ghost table on the replica + - So... everything + +`gh-ost` will sync the ghost table with the original table. +- When it is satisfied, it will issue a `STOP SLAVE IO_THREAD`, effectively stopping replication +- Will finalize last few statements +- Will terminate. No table swap takes place. No table is dropped. + +You are now left with the original table **and** the ghost table. They _should_ be identical. + +You now have the time to verify the tool works correctly. You may checksum the entire table data if you like. +- e.g. +`mysql -e 'select * from mydb.mytable' | md5sum` +`mysql -e 'select * from mydb._mytable_gst' | md5sum` + +### Cleanup + +It's your job to: +- Drop the ghost table (at your leisure; be aware that a `DROP` can be a lengthy operation) +- Start replication back (via `START SLAVE`) From a9d4c11aa194e1783583284f4eb230b48ae53788 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 16:34:52 +0200 Subject: [PATCH 09/19] adding documentation --- doc/testing-on-replica.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/doc/testing-on-replica.md b/doc/testing-on-replica.md index 60136b2..daaffaa 100644 --- a/doc/testing-on-replica.md +++ b/doc/testing-on-replica.md @@ -7,7 +7,10 @@ Test on replica if you: - Just want to experiment with a real migration without affecting production (maybe measure migration time?) 
- Wish to observe data change impact -### What testing on replica means +## What testing on replica means + +`gh-ost` will make all changes +## Issuing a test drive Apply `--test-on-replica --host=`. - `gh-ost` would connect to the indicated server @@ -28,8 +31,11 @@ You are now left with the original table **and** the ghost table. They _should_ You now have the time to verify the tool works correctly. You may checksum the entire table data if you like. - e.g. -`mysql -e 'select * from mydb.mytable' | md5sum` -`mysql -e 'select * from mydb._mytable_gst' | md5sum` + `mysql -e 'select * from mydb.mytable order by id' | md5sum` + `mysql -e 'select * from mydb._mytable_gst order by id' | md5sum` +- or of course only select the shared columns before/after the migration +- We use the trivial `engine=innodb` for `alter` when testing. This way the resulting ghost table is identical in structure to the original table (including indexes) and we expect data to be completely identical. We use `md5sum` on the entire dataset to confirm the test result. + ### Cleanup From 8102c90ee235f2871329329e817e43bdf576279e Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 16:35:52 +0200 Subject: [PATCH 10/19] adding documentation --- doc/testing-on-replica.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/testing-on-replica.md b/doc/testing-on-replica.md index daaffaa..1495184 100644 --- a/doc/testing-on-replica.md +++ b/doc/testing-on-replica.md @@ -9,7 +9,8 @@ Test on replica if you: ## What testing on replica means -`gh-ost` will make all changes +TL;DR `gh-ost` will make all changes on a replica and leave both original and ghost tables for you to compare. + ## Issuing a test drive Apply `--test-on-replica --host=`. @@ -36,7 +37,6 @@ You now have the time to verify the tool works correctly. 
You may checksum the e - or of course only select the shared columns before/after the migration - We use the trivial `engine=innodb` for `alter` when testing. This way the resulting ghost table is identical in structure to the original table (including indexes) and we expect data to be completely identical. We use `md5sum` on the entire dataset to confirm the test result. - ### Cleanup It's your job to: From 8430dbe878bebb0925bcea70613f19e5b6acc767 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 17:00:16 +0200 Subject: [PATCH 11/19] adding documentation --- doc/understanding-output.md | 138 ++++++++++++++++++++++++++++++++++++ 1 file changed, 138 insertions(+) diff --git a/doc/understanding-output.md b/doc/understanding-output.md index e69de29..ce7afa3 100644 --- a/doc/understanding-output.md +++ b/doc/understanding-output.md @@ -0,0 +1,138 @@ +# Understanding gh-ost output + +`gh-ost` attempts to be verbose to the point where you really know what it's doing, without completely spamming you. +You can control output levels: +- `--verbose`: common use. Useful output, not tons of it +- `--debug`: everything. Tons of output. + +Initial output lines may look like this: +``` +2016-05-19 17:57:04 INFO starting gh-ost 0.7.14 +2016-05-19 17:57:04 INFO Migrating `mydb`.`mytable` +2016-05-19 17:57:04 INFO connection validated on 127.0.0.1:3306 +2016-05-19 17:57:04 INFO User has ALL privileges +2016-05-19 17:57:04 INFO binary logs validated on 127.0.0.1:3306 +2016-05-19 17:57:04 INFO Restarting replication on 127.0.0.1:3306 to make sure binlog settings apply to replication thread +2016-05-19 17:57:04 INFO Table found. Engine=InnoDB +2016-05-19 17:57:05 INFO As instructed, I'm issuing a SELECT COUNT(*) on the table. This may take a while +2016-05-19 17:57:11 INFO Exact number of rows via COUNT: 4466810 +2016-05-19 17:57:11 INFO --test-on-replica given. 
Will not execute on master the.master:3306 but rather on replica 127.0.0.1:3306 itself +2016-05-19 17:57:11 INFO Master found to be 127.0.0.1:3306 +2016-05-19 17:57:11 INFO connection validated on 127.0.0.1:3306 +2016-05-19 17:57:11 INFO Registering replica at 127.0.0.1:3306 +2016-05-19 17:57:11 INFO Connecting binlog streamer at mysql-bin.002587:348694066 +2016-05-19 17:57:11 INFO connection validated on 127.0.0.1:3306 +2016-05-19 17:57:11 INFO rotate to next log name: mysql-bin.002587 +2016-05-19 17:57:11 INFO connection validated on 127.0.0.1:3306 +2016-05-19 17:57:11 INFO Droppping table `mydb`.`_mytable_gst` +2016-05-19 17:57:11 INFO Table dropped +2016-05-19 17:57:11 INFO Droppping table `mydb`.`_mytable_old` +2016-05-19 17:57:11 INFO Table dropped +2016-05-19 17:57:11 INFO Creating ghost table `mydb`.`_mytable_gst` +2016-05-19 17:57:11 INFO Ghost table created +2016-05-19 17:57:11 INFO Altering ghost table `mydb`.`_mytable_gst` +2016-05-19 17:57:11 INFO Ghost table altered +2016-05-19 17:57:11 INFO Droppping table `mydb`.`_mytable_osc` +2016-05-19 17:57:11 INFO Table dropped +2016-05-19 17:57:11 INFO Creating changelog table `mydb`.`_mytable_osc` +2016-05-19 17:57:11 INFO Changelog table created +2016-05-19 17:57:11 INFO Chosen shared unique key is PRIMARY +2016-05-19 17:57:11 INFO Shared columns are id,name,ref,col4,col5,col6 +``` +Those are relatively self explanatory. Mostly they indicate that all goes well. + +You will be mostly interested in following up on the migration and understanding whether it goes well. 
Once migration actually begins, you will see output as follows: + +``` +Copy: 0/4466810 0.0%; Applied: 0; Backlog: 0/100; Elapsed: 0s(copy), 6s(total); streamer: mysql-bin.002587:348727198; ETA: N/A +Copy: 0/4466810 0.0%; Applied: 0; Backlog: 100/100; Elapsed: 1s(copy), 7s(total); streamer: mysql-bin.002587:349815124; ETA: throttled, replica-lag=83.000000s +Copy: 0/4466810 0.0%; Applied: 0; Backlog: 100/100; Elapsed: 2s(copy), 8s(total); streamer: mysql-bin.002587:349815124; ETA: throttled, replica-lag=79.000000s +Copy: 0/4466810 0.0%; Applied: 0; Backlog: 100/100; Elapsed: 3s(copy), 9s(total); streamer: mysql-bin.002587:349815124; ETA: throttled, replica-lag=74.000000s +Copy: 0/4466810 0.0%; Applied: 0; Backlog: 100/100; Elapsed: 4s(copy), 10s(total); streamer: mysql-bin.002587:349815124; ETA: throttled, replica-lag=69.000000s +Copy: 0/4466810 0.0%; Applied: 0; Backlog: 100/100; Elapsed: 5s(copy), 11s(total); streamer: mysql-bin.002587:349815124; ETA: throttled, replica-lag=65.000000s +... +``` +In the above we're mostly interested to see that `ETA: throttled, replica-lag=65.000000s`. + +- Migration is throttled, i.e. `gh-ost` finds that the server is too busy, or replication is too far behind, and so it ceases (or does not start) the data copy operation. +- It also provides a reason for the throttling. In our case it seems replication is too far behind. `gh-ost` waits until replication lag is smaller than `--max-lag-millis`. + +However another thing catches the eye: `Backlog: 0/100` transitions into `Backlog: 100/100` + +- `Backlog` is the binlog events queue. A queue of events read from the binary log which are relevant for the migration. The queue gets emptied as events are applied onto the ghost table. Typically we want to see that queue empty or almost empty. However, due to the fact we're throttled it makes perfect sense that the queue is full: throttling means we do not apply events onto the ghost table, hence we do not purge the queue. + +``` +... 
+Copy: 0/4466810 0.0%; Applied: 0; Backlog: 100/100; Elapsed: 16s(copy), 22s(total); streamer: mysql-bin.002587:349815124; ETA: throttled, replica-lag=8.000000s +Copy: 0/4466810 0.0%; Applied: 0; Backlog: 100/100; Elapsed: 17s(copy), 23s(total); streamer: mysql-bin.002587:349815124; ETA: throttled, replica-lag=2.000000s +Copy: 0/4466885 0.0%; Applied: 1492; Backlog: 100/100; Elapsed: 18s(copy), 24s(total); streamer: mysql-bin.002587:358722182; ETA: N/A +Copy: 0/4466942 0.0%; Applied: 2966; Backlog: 100/100; Elapsed: 19s(copy), 25s(total); streamer: mysql-bin.002587:367190999; ETA: N/A +Copy: 0/4466993 0.0%; Applied: 4462; Backlog: 1/100; Elapsed: 20s(copy), 26s(total); streamer: mysql-bin.002587:376732190; ETA: N/A +Copy: 12500/4466994 0.3%; Applied: 4496; Backlog: 2/100; Elapsed: 21s(copy), 27s(total); streamer: mysql-bin.002587:381475469; ETA: N/A +Copy: 25000/4466997 0.6%; Applied: 4535; Backlog: 6/100; Elapsed: 22s(copy), 28s(total); streamer: mysql-bin.002587:386747649; ETA: N/A +Copy: 40000/4467001 0.9%; Applied: 4582; Backlog: 3/100; Elapsed: 23s(copy), 29s(total); streamer: mysql-bin.002587:393017028; ETA: N/A +``` + +In the above, `gh-ost` found replication to be caught up and began operation. We note: +- `Backlog` goes down to `1` or `2` or otherwise smaller numbers. This means we are good with processing the binlog events and applying them onto the ghost table. +- `Applied` is the incrementing number of events we have applied from the binary log onto the ghost table, since the migration began. +- `Copy`: at the beginning the tool estimated `4466810` rows already existing in the table. Initially `0` of them are copied, hence `0/4466810`. But as `gh-ost` makes progress, this number grows: + - `12500/4466994 0.3%` + - `25000/4466997 0.6%` + - `40000/4467001 0.9%` + - You can also observe that the number of rows changes. 
This is implied by the flag `--exact-rowcount`, where we try and keep an updated count of the rows we are going to process throughout the migration, even as new rows are added and old rows deleted. This is not an exact number, but turns out to be a pretty good estimate. +- `Elapsed: 23s(copy), 29s(total)`: `total` stands for the total time since `gh-ost` was executed. `copy` stands for the time elapsed since `gh-ost` finished making preparations and was good to go with copy. +``` +Copy: 40000/4467001 0.9%; Applied: 4582; Backlog: 3/100; Elapsed: 23s(copy), 29s(total); streamer: mysql-bin.002587:393017028; ETA: N/A +Copy: 50000/4467001 1.1%; Applied: 4620; Backlog: 6/100; Elapsed: 24s(copy), 30s(total); streamer: mysql-bin.002587:396414283; ETA: 35m20s +Copy: 62500/4467002 1.4%; Applied: 4671; Backlog: 3/100; Elapsed: 25s(copy), 31s(total); streamer: mysql-bin.002587:402582372; ETA: 29m21s +Copy: 75000/4467003 1.7%; Applied: 4703; Backlog: 3/100; Elapsed: 26s(copy), 32s(total); streamer: mysql-bin.002587:407864888; ETA: 25m22s +Copy: 87500/4467004 2.0%; Applied: 4751; Backlog: 6/100; Elapsed: 27s(copy), 33s(total); streamer: mysql-bin.002587:413142992; ETA: 22m31s +Copy: 100000/4467004 2.2%; Applied: 4795; Backlog: 6/100; Elapsed: 28s(copy), 34s(total); streamer: mysql-bin.002587:418380729; ETA: 20m22s +Copy: 112500/4467005 2.5%; Applied: 4835; Backlog: 1/100; Elapsed: 29s(copy), 35s(total); streamer: mysql-bin.002587:423592450; ETA: 18m42s +``` + +``` +Copy: 602500/4467053 13.5%; Applied: 6770; Backlog: 0/100; Elapsed: 1m14s(copy), 1m20s(total); streamer: mysql-bin.002587:630949369; ETA: 7m54s +Copy: 655000/4467060 14.7%; Applied: 6985; Backlog: 6/100; Elapsed: 1m19s(copy), 1m25s(total); streamer: mysql-bin.002587:652696032; ETA: 7m39s +Copy: 707500/4467066 15.8%; Applied: 7207; Backlog: 0/100; Elapsed: 1m24s(copy), 1m30s(total); streamer: mysql-bin.002587:674577141; ETA: 7m26s +Copy: 760000/4467075 17.0%; Applied: 7400; Backlog: 4/100; Elapsed: 1m29s(copy), 
1m35s(total); streamer: mysql-bin.002587:696383305; ETA: 7m14s +Copy: 812500/4467083 18.2%; Applied: 7614; Backlog: 4/100; Elapsed: 1m34s(copy), 1m40s(total); streamer: mysql-bin.002587:718075114; ETA: 7m2s +Copy: 867500/4467089 19.4%; Applied: 7836; Backlog: 3/100; Elapsed: 1m39s(copy), 1m45s(total); streamer: mysql-bin.002587:740812984; ETA: 6m50s +``` + +``` +Copy: 1975000/4466798 44.2%; Applied: 12919; Backlog: 2/100; Elapsed: 3m24s(copy), 3m30s(total); streamer: mysql-bin.002588:119901391; ETA: 4m17s +Copy: 2285000/4466855 51.2%; Applied: 14234; Backlog: 13/100; Elapsed: 3m54s(copy), 4m0s(total); streamer: mysql-bin.002588:243346615; ETA: 3m43s +``` + +``` +Copy: 4320000/4467220 96.7%; Applied: 22716; Backlog: 4/100; Elapsed: 7m4s(copy), 7m10s(total); streamer: mysql-bin.002588:1034380840; ETA: 14s +Copy: 4332500/4467220 97.0%; Applied: 22760; Backlog: 0/100; Elapsed: 7m5s(copy), 7m11s(total); streamer: mysql-bin.002588:1038716298; ETA: 13s +Copy: 4342500/4467221 97.2%; Applied: 22800; Backlog: 8/100; Elapsed: 7m6s(copy), 7m12s(total); streamer: mysql-bin.002588:1043083347; ETA: 12s +Copy: 4352500/4467222 97.4%; Applied: 22841; Backlog: 3/100; Elapsed: 7m7s(copy), 7m13s(total); streamer: mysql-bin.002588:1046885242; ETA: 11s +Copy: 4365000/4467224 97.7%; Applied: 22878; Backlog: 3/100; Elapsed: 7m8s(copy), 7m14s(total); streamer: mysql-bin.002588:1051545604; ETA: 10s +Copy: 4377500/4467224 98.0%; Applied: 22915; Backlog: 0/100; Elapsed: 7m9s(copy), 7m15s(total); streamer: mysql-bin.002588:1055784141; ETA: 8s +Copy: 4387500/4467225 98.2%; Applied: 22949; Backlog: 6/100; Elapsed: 7m10s(copy), 7m16s(total); streamer: mysql-bin.002588:1060089849; ETA: 7s +Copy: 4397500/4467226 98.4%; Applied: 22996; Backlog: 8/100; Elapsed: 7m11s(copy), 7m17s(total); streamer: mysql-bin.002588:1063945589; ETA: 6s +Copy: 4410000/4467227 98.7%; Applied: 23045; Backlog: 5/100; Elapsed: 7m12s(copy), 7m18s(total); streamer: mysql-bin.002588:1068763841; ETA: 5s +Copy: 4420000/4467229 
98.9%; Applied: 23086; Backlog: 5/100; Elapsed: 7m13s(copy), 7m19s(total); streamer: mysql-bin.002588:1072751966; ETA: 4s +2016-05-19 18:04:25 INFO rotate to next log name: mysql-bin.002589 +2016-05-19 18:04:25 INFO rotate to next log name: mysql-bin.002589 +Copy: 4430000/4467231 99.2%; Applied: 23124; Backlog: 3/100; Elapsed: 7m14s(copy), 7m20s(total); streamer: mysql-bin.002589:2944139; ETA: 3s +Copy: 4442500/4467231 99.4%; Applied: 23181; Backlog: 2/100; Elapsed: 7m15s(copy), 7m21s(total); streamer: mysql-bin.002589:8042490; ETA: 2s +Copy: 4452500/4467232 99.7%; Applied: 23235; Backlog: 5/100; Elapsed: 7m16s(copy), 7m22s(total); streamer: mysql-bin.002589:12084190; ETA: 1s +Copy: 4462500/4467235 99.9%; Applied: 23295; Backlog: 8/100; Elapsed: 7m17s(copy), 7m23s(total); streamer: mysql-bin.002589:16174016; ETA: 0s +2016-05-19 18:04:29 INFO Row copy complete +Copy: 4466492/4467235 100.0%; Applied: 23309; Backlog: 0/100; Elapsed: 7m17s(copy), 7m24s(total); streamer: mysql-bin.002589:17255091; ETA: 0s +2016-05-19 18:04:29 INFO Stopping replication +2016-05-19 18:04:29 INFO Replication stopped +2016-05-19 18:04:29 INFO Verifying SQL thread is running +2016-05-19 18:04:29 INFO SQL thread started +2016-05-19 18:04:29 INFO Replication IO thread at mysql-bin.001801:719204179. 
SQL thread is at mysql-bin.001801:719204179 +2016-05-19 18:04:29 INFO Writing changelog state: AllEventsUpToLockProcessed +2016-05-19 18:04:29 INFO Waiting for events up to lock +Copy: 4466492/4467235 100.0%; Applied: 23309; Backlog: 1/100; Elapsed: 7m18s(copy), 7m24s(total); streamer: mysql-bin.002589:17702369; ETA: 0s +2016-05-19 18:04:30 INFO Done waiting for events up to lock +Copy: 4466492/4467235 100.0%; Applied: 23309; Backlog: 0/100; Elapsed: 7m18s(copy), 7m25s(total); streamer: mysql-bin.002589:17703056; ETA: 0s +``` From 5a5f43d15bebecefe4d760158a12062f8abbd676 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 20 May 2016 17:08:31 +0200 Subject: [PATCH 12/19] adding documentation --- doc/understanding-output.md | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/doc/understanding-output.md b/doc/understanding-output.md index ce7afa3..b4e0584 100644 --- a/doc/understanding-output.md +++ b/doc/understanding-output.md @@ -82,8 +82,12 @@ In the above, `gh-ost` found replication to be caught up and began operation. We - `40000/4467001 0.9%` - You can also observe that the number of rows changes. This is implied by the flag `--exact-rowcount`, where we try and keep an updated count of the rows we are going to process throughout the migration, even as new rows are added and old rows deleted. This is not an exact number, but turns out to be a pretty good estimate. - `Elapsed: 23s(copy), 29s(total)`: `total` stands for the total time since `gh-ost` was executed. `copy` stands for the time elapsed since `gh-ost` finished making preparations and was good to go with copy. +- `streamer: mysql-bin.002587:393017028` tells us which binary log entry `gh-ost` is processing at this time. +- `ETA`: Estimated Time of Arrival, is still `N/A` since `gh-ost` has not collected enough data to make an estimate. 
+ +Some time later, we will have: + ``` -Copy: 40000/4467001 0.9%; Applied: 4582; Backlog: 3/100; Elapsed: 23s(copy), 29s(total); streamer: mysql-bin.002587:393017028; ETA: N/A Copy: 50000/4467001 1.1%; Applied: 4620; Backlog: 6/100; Elapsed: 24s(copy), 30s(total); streamer: mysql-bin.002587:396414283; ETA: 35m20s Copy: 62500/4467002 1.4%; Applied: 4671; Backlog: 3/100; Elapsed: 25s(copy), 31s(total); streamer: mysql-bin.002587:402582372; ETA: 29m21s Copy: 75000/4467003 1.7%; Applied: 4703; Backlog: 3/100; Elapsed: 26s(copy), 32s(total); streamer: mysql-bin.002587:407864888; ETA: 25m22s @@ -91,29 +95,23 @@ Copy: 87500/4467004 2.0%; Applied: 4751; Backlog: 6/100; Elapsed: 27s(copy), 33s Copy: 100000/4467004 2.2%; Applied: 4795; Backlog: 6/100; Elapsed: 28s(copy), 34s(total); streamer: mysql-bin.002587:418380729; ETA: 20m22s Copy: 112500/4467005 2.5%; Applied: 4835; Backlog: 1/100; Elapsed: 29s(copy), 35s(total); streamer: mysql-bin.002587:423592450; ETA: 18m42s ``` +And `gh-ost` progressively provides an ETA. + +Status frequency: +- In the first `60` seconds `gh-ost` emits a status entry every `1` second. +- Then, up till `3` minutes into operation, status shows every `5` seconds. 
+- It then drops down to once per `30` seconds +- But goes into once-per-`5`-seconds again when it estimates < `3` minutes ETA +- And once per `1` second when it estimates < `1` minute ETA ``` Copy: 602500/4467053 13.5%; Applied: 6770; Backlog: 0/100; Elapsed: 1m14s(copy), 1m20s(total); streamer: mysql-bin.002587:630949369; ETA: 7m54s Copy: 655000/4467060 14.7%; Applied: 6985; Backlog: 6/100; Elapsed: 1m19s(copy), 1m25s(total); streamer: mysql-bin.002587:652696032; ETA: 7m39s Copy: 707500/4467066 15.8%; Applied: 7207; Backlog: 0/100; Elapsed: 1m24s(copy), 1m30s(total); streamer: mysql-bin.002587:674577141; ETA: 7m26s -Copy: 760000/4467075 17.0%; Applied: 7400; Backlog: 4/100; Elapsed: 1m29s(copy), 1m35s(total); streamer: mysql-bin.002587:696383305; ETA: 7m14s -Copy: 812500/4467083 18.2%; Applied: 7614; Backlog: 4/100; Elapsed: 1m34s(copy), 1m40s(total); streamer: mysql-bin.002587:718075114; ETA: 7m2s -Copy: 867500/4467089 19.4%; Applied: 7836; Backlog: 3/100; Elapsed: 1m39s(copy), 1m45s(total); streamer: mysql-bin.002587:740812984; ETA: 6m50s -``` - -``` +... 
Copy: 1975000/4466798 44.2%; Applied: 12919; Backlog: 2/100; Elapsed: 3m24s(copy), 3m30s(total); streamer: mysql-bin.002588:119901391; ETA: 4m17s Copy: 2285000/4466855 51.2%; Applied: 14234; Backlog: 13/100; Elapsed: 3m54s(copy), 4m0s(total); streamer: mysql-bin.002588:243346615; ETA: 3m43s -``` - -``` -Copy: 4320000/4467220 96.7%; Applied: 22716; Backlog: 4/100; Elapsed: 7m4s(copy), 7m10s(total); streamer: mysql-bin.002588:1034380840; ETA: 14s -Copy: 4332500/4467220 97.0%; Applied: 22760; Backlog: 0/100; Elapsed: 7m5s(copy), 7m11s(total); streamer: mysql-bin.002588:1038716298; ETA: 13s -Copy: 4342500/4467221 97.2%; Applied: 22800; Backlog: 8/100; Elapsed: 7m6s(copy), 7m12s(total); streamer: mysql-bin.002588:1043083347; ETA: 12s -Copy: 4352500/4467222 97.4%; Applied: 22841; Backlog: 3/100; Elapsed: 7m7s(copy), 7m13s(total); streamer: mysql-bin.002588:1046885242; ETA: 11s -Copy: 4365000/4467224 97.7%; Applied: 22878; Backlog: 3/100; Elapsed: 7m8s(copy), 7m14s(total); streamer: mysql-bin.002588:1051545604; ETA: 10s -Copy: 4377500/4467224 98.0%; Applied: 22915; Backlog: 0/100; Elapsed: 7m9s(copy), 7m15s(total); streamer: mysql-bin.002588:1055784141; ETA: 8s -Copy: 4387500/4467225 98.2%; Applied: 22949; Backlog: 6/100; Elapsed: 7m10s(copy), 7m16s(total); streamer: mysql-bin.002588:1060089849; ETA: 7s +... 
 Copy: 4397500/4467226 98.4%; Applied: 22996; Backlog: 8/100; Elapsed: 7m11s(copy), 7m17s(total); streamer: mysql-bin.002588:1063945589; ETA: 6s
 Copy: 4410000/4467227 98.7%; Applied: 23045; Backlog: 5/100; Elapsed: 7m12s(copy), 7m18s(total); streamer: mysql-bin.002588:1068763841; ETA: 5s
 Copy: 4420000/4467229 98.9%; Applied: 23086; Backlog: 5/100; Elapsed: 7m13s(copy), 7m19s(total); streamer: mysql-bin.002588:1072751966; ETA: 4s
@@ -136,3 +134,5 @@ Copy: 4466492/4467235 100.0%; Applied: 23309; Backlog: 1/100; Elapsed: 7m18s(cop
 2016-05-19 18:04:30 INFO Done waiting for events up to lock
 Copy: 4466492/4467235 100.0%; Applied: 23309; Backlog: 0/100; Elapsed: 7m18s(copy), 7m25s(total); streamer: mysql-bin.002589:17703056; ETA: 0s
 ```
+Up to this point, the migration took `7m25s`, applied `23309` events from the binary log, and copied `4466492` rows onto the ghost table.
+

From 0050665393bf43d6cab5fb2e4ad2513e8e01d11f Mon Sep 17 00:00:00 2001
From: Shlomi Noach
Date: Mon, 23 May 2016 11:35:04 +0200
Subject: [PATCH 13/19] adding documentation

---
 doc/testing-on-replica.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/doc/testing-on-replica.md b/doc/testing-on-replica.md
index 1495184..ea3bbfd 100644
--- a/doc/testing-on-replica.md
+++ b/doc/testing-on-replica.md
@@ -42,3 +42,18 @@ You now have the time to verify the tool works correctly. You may checksum the e
 It's your job to:
 - Drop the ghost table (at your leisure, you should be aware that a `DROP` can be a lengthy operation)
 - Start replication back (via `START SLAVE`)
+
+### Examples
+
+Simple:
+```shell
+$ gh-osc --host=myhost.com --conf=/etc/gh-ost.cnf --database=test --table=sample_table --alter="engine=innodb" --chunk-size=2000 --max-load=Threads_connected=20 --initially-drop-ghost-table --initially-drop-old-table --test-on-replica --verbose --execute
+```
+
+Elaborate:
+```shell
+$ gh-osc --host=myhost.com --conf=/etc/gh-ost.cnf --database=test --table=sample_table --alter="engine=innodb" --chunk-size=2000 --max-load=Threads_connected=20 --switch-to-rbr --initially-drop-ghost-table --initially-drop-old-table --test-on-replica --postpone-swap-tables-flag-file=/tmp/ghost-postpone.flag --exact-rowcount --allow-nullable-unique-key --verbose --execute
+```
+- `--exact-rowcount`: count the exact number of rows (makes ETA estimation very good). This comes at the cost of issuing a `SELECT COUNT(*)` on your table. We use this lovingly.
+- `--switch-to-rbr`: automatically switch to `RBR` if the replica is configured as `SBR`. See also: [migrating with SBR](migrating-with-sbr.md)
+- `--allow-nullable-unique-key`: allow iterating on a `UNIQUE KEY` that has `NULL`able columns (at your own risk)

From aae0f5cee48929887f432c04524730c7a165b47d Mon Sep 17 00:00:00 2001
From: Shlomi Noach
Date: Mon, 23 May 2016 11:59:42 +0200
Subject: [PATCH 14/19] adding documentation

---
 doc/swapping-tables.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/doc/swapping-tables.md b/doc/swapping-tables.md
index e69de29..ab798c6 100644
--- a/doc/swapping-tables.md
+++ b/doc/swapping-tables.md
@@ -0,0 +1,14 @@
+# Swapping the tables
+
+The table-swap is the final major step of the migration: it's the moment where your original table is pushed aside, and the ghost table (the one we secretly altered and operated on throughout the process) takes its place.
+
+MySQL poses some limitations on how the table swap can take place. While it supports an atomic swap, it does not allow for a swap under controlled lock.
+
+The [Facebook OSC](https://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932/) tool documents this nicely. Look for **"Cut-over phase"**.
+
+`gh-ost` supports various types of table-swap / cut-over options:
+
+- `--quick-and-bumpy-swap-tables` - this method is similar to the one taken by the Facebook OSC. It's non-blocking but also non-atomic. The original table is first renamed and pushed aside, then the ghost table is renamed to take its place. In between the two renames there's a brief period of time where your table just does not exist, and queries will fail.
+- Voluntary lock based solution (default at this time): as depicted in [Solving the Facebook-OSC non-atomic table swap problem](http://code.openark.org/blog/mysql/solving-the-facebook-osc-non-atomic-table-swap-problem), this solution uses voluntary MySQL locks, and makes for a blocking swap, where your queries do not fail, but block until the operation is complete. This effect is desired. There is danger in this solution, since connection failure of the two sessions involved in creating the lock would result in a premature swap of the tables, and hence in potentially corrupted data.
+- We are working at this time on a blocking, safe, atomic solution, using wait conditions and User Defined Functions, which will need to be dynamically loaded onto your MySQL server.
+- With [`--test-on-replica`](testing-on-replica.md) there is no table swap.
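
The non-atomic gap described for `--quick-and-bumpy-swap-tables` can be illustrated with a toy model. This is only a sketch: it treats the server's table namespace as a set of names, it issues no SQL, and it does not reflect `gh-ost` internals. The table names `sample_table_ghost` and `sample_table_old` are hypothetical.

```python
def two_step_swap(tables, original, ghost, old):
    """Yield the visible table names after each of the two renames.

    Toy model only: `tables` stands in for the server's table namespace.
    """
    visible = set(tables)
    # Rename 1: push the original table aside.
    visible.remove(original)
    visible.add(old)
    yield frozenset(visible)  # the original name does not exist at this point
    # Rename 2: the ghost table takes the original's name.
    visible.remove(ghost)
    visible.add(original)
    yield frozenset(visible)


# Hypothetical table names, for illustration only.
states = list(two_step_swap(
    {"sample_table", "sample_table_ghost"},
    original="sample_table",
    ghost="sample_table_ghost",
    old="sample_table_old",
))
for step, names in enumerate(states, start=1):
    print(step, "sample_table" in names)
```

After the first rename the name `sample_table` is absent from the namespace; that is the window in which queries against the table would fail. A blocking swap avoids it by making queries wait instead.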
From 493b8512acc95b74da52f942cf913ef95b97f734 Mon Sep 17 00:00:00 2001
From: Shlomi Noach
Date: Mon, 23 May 2016 12:13:54 +0200
Subject: [PATCH 15/19] adding documentation

---
 doc/command-line-flags.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/doc/command-line-flags.md b/doc/command-line-flags.md
index e69de29..c039d83 100644
--- a/doc/command-line-flags.md
+++ b/doc/command-line-flags.md
@@ -0,0 +1,16 @@
+# Command line flags
+
+A more in-depth discussion of various `gh-ost` command line flags: implementation, implication, use cases.
+
+##### exact-rowcount
+
+A `gh-ost` execution needs to copy whatever rows you have in your existing table onto the ghost table. This can be, and often is, a large number. Exactly what is that number?
+`gh-ost` initially estimates the number of rows in your table by issuing an `explain select * from your_table`. This will use statistics on your table and return with a rough estimate. How rough? It might go as low as half or as high as double the actual number of rows in your table. This is the same method as used in [`pt-online-schema-change`](https://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html).
+
+`gh-ost` also supports the `--exact-rowcount` flag. When this flag is given, two things happen:
+- An initial, authoritative `select count(*) from your_table`.
+  This query may take a long time to complete, but is performed before we begin the massive operations.
+- A continuous update to the estimate as we make progress applying events.
+  We heuristically update the number of rows based on the queries we process from the binlogs.
+
+While the ongoing estimated number of rows is still heuristic, it's almost exact, such that the reported [ETA](understanding-output.md) or percentage progress is typically accurate to the second throughout a multiple-hour operation.
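
To make the link between the row count and the reported ETA concrete, here is a minimal sketch. It rests on two assumptions that the docs above do not state: that the heuristic counts an insert as +1 and a delete as -1, and that the ETA is a linear extrapolation of the copy rate so far. All function names are hypothetical and none of this is taken from the `gh-ost` source. The input numbers come from the sample progress line `Copy: 1975000/4466798 44.2%; ... Elapsed: 3m24s(copy) ... ETA: 4m17s` quoted earlier in this series.

```python
def adjust_estimate(rows_estimate, event_type):
    """Adjust a running row estimate for one binlog event (assumed rule)."""
    delta = {"insert": 1, "delete": -1}.get(event_type, 0)
    return rows_estimate + delta


def eta_seconds(rows_copied, rows_estimate, elapsed_copy_seconds):
    """Linear extrapolation: remaining rows divided by the copy rate so far."""
    if rows_copied <= 0:
        return None  # no copy rate yet, so no ETA
    remaining = max(rows_estimate - rows_copied, 0)
    return elapsed_copy_seconds * remaining / rows_copied


def format_duration(seconds):
    """Render whole seconds as e.g. '4m17s' (hypothetical helper)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs}s"


estimate = 4466798
for event in ("insert", "insert", "delete", "update"):
    estimate = adjust_estimate(estimate, event)
print(estimate)  # 4466799

# 3m24s of copy time, 1975000 of 4466798 rows copied.
print(format_duration(eta_seconds(1975000, 4466798, 3 * 60 + 24)))  # 4m17s
```

With these inputs the extrapolation happens to agree with the `ETA: 4m17s` shown in that progress line, which suggests, but does not confirm, that the tool computes something similar. It also shows why an estimate that is off by a factor of two would throw the ETA off badly.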

From 026cd122baa66eb1f457bc40955ea0c1c5324bc0 Mon Sep 17 00:00:00 2001
From: Shlomi Noach
Date: Mon, 23 May 2016 12:32:43 +0200
Subject: [PATCH 16/19] adding documentation

---
 README.md | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 2fb5fc3..7034496 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,19 @@
-# gh-ost
-GitHub's Online Schema Change for MySQL
+# gh-ost: GitHub's online schema migration for MySQL
+
+`gh-ost` allows for online schema migrations in MySQL
+
+
+## What's in a name?
+
+Originally this was named `gh-osc`: GitHub Online Schema Change, in the likes of [Facebook online schema change](https://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932/) and [pt-online-schema-change](https://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html).
+
+But then a rare genetic mutation happened, and the `s` transformed into `t`. And that sent us down the path of trying to figure out a new acronym. Right now, `gh-ost` (pronounce: _Ghost_), stands for:
+- GitHub Online Schema Translator/Transformer/Transfigurator
+
+## Authors
+
+`gh-ost` was designed, authored, reviewed and tested by the database infrastructure team at GitHub:
+- @jonahberquist
+- @ggunson
+- @tomkrouper
+- @shlomi-noach

From 059a59939ec2ee3079d681939ab6909967a84a7f Mon Sep 17 00:00:00 2001
From: Shlomi Noach
Date: Mon, 23 May 2016 12:33:28 +0200
Subject: [PATCH 17/19] adding documentation

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 7034496..66bc12c 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,5 @@
-# gh-ost: GitHub's online schema migration for MySQL
+# gh-ost
+#### GitHub's online schema migration for MySQL
 
 `gh-ost` allows for online schema migrations in MySQL
 

From b4d5115187627970a3f65d0dd8477ba2d5a53306 Mon Sep 17 00:00:00 2001
From: Shlomi Noach
Date: Wed, 25 May 2016 12:31:33 +0200
Subject: [PATCH 18/19] adding documentation

---
 README.md | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 66bc12c..7d92d73 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,18 @@
 # gh-ost
+
 #### GitHub's online schema migration for MySQL
 
-`gh-ost` allows for online schema migrations in MySQL
+`gh-ost` allows for online schema migrations in MySQL which are:
+- Triggerless
+- Testable
+- Pausable
+- Operations-friendly
 
+## How?
+
+WORK IN PROGRESS
+
+Please meanwhile refer to the [docs](doc) for more information.
 
 ## What's in a name?
 
@@ -14,7 +24,7 @@ But then a rare genetic mutation happened, and the `s` transformed into `t`. And
 ## Authors
 
 `gh-ost` was designed, authored, reviewed and tested by the database infrastructure team at GitHub:
-- @jonahberquist
-- @ggunson
-- @tomkrouper
-- @shlomi-noach
+- [@jonahberquist](https://github.com/jonahberquist)
+- [@ggunson](https://github.com/ggunson)
+- [@tomkrouper](https://github.com/tomkrouper)
+- [@shlomi-noach](https://github.com/shlomi-noach)

From ed81a42e863de1a3d8b7e0a6abef468c3c424957 Mon Sep 17 00:00:00 2001
From: Shlomi Noach
Date: Wed, 25 May 2016 12:34:37 +0200
Subject: [PATCH 19/19] adding documentation

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 7d92d73..fe7efe6 100644
--- a/README.md
+++ b/README.md
@@ -23,7 +23,7 @@ But then a rare genetic mutation happened, and the `s` transformed into `t`. And
 
 ## Authors
 
-`gh-ost` was designed, authored, reviewed and tested by the database infrastructure team at GitHub:
+`gh-ost` is designed, authored, reviewed and tested by the database infrastructure team at GitHub:
 - [@jonahberquist](https://github.com/jonahberquist)
 - [@ggunson](https://github.com/ggunson)
 - [@tomkrouper](https://github.com/tomkrouper)