From d5161c6a8920ebcd542e9f0cf84da6855477f0d3 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Thu, 1 Sep 2016 12:46:54 +0200 Subject: [PATCH 1/7] updating documentation following recent developments describing `--concurrent-rowcount` --- doc/cheatsheet.md | 3 +++ doc/command-line-flags.md | 5 +++++ doc/testing-on-replica.md | 2 +- 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/doc/cheatsheet.md b/doc/cheatsheet.md index 77b2809..21bd2eb 100644 --- a/doc/cheatsheet.md +++ b/doc/cheatsheet.md @@ -37,6 +37,7 @@ gh-ost \ --allow-master-master \ --cut-over=default \ --exact-rowcount \ +--concurrent-rowcount \ --default-retries=120 \ --panic-flag-file=/tmp/ghost.panic.flag \ --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \ @@ -72,6 +73,7 @@ gh-ost \ --allow-master-master \ --cut-over=default \ --exact-rowcount \ +--concurrent-rowcount \ --default-retries=120 \ --panic-flag-file=/tmp/ghost.panic.flag \ --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \ @@ -105,6 +107,7 @@ gh-ost \ --chunk-size=2500 \ --cut-over=default \ --exact-rowcount \ + --concurrent-rowcount \ --serve-socket-file=/tmp/gh-ost.test.sock \ --panic-flag-file=/tmp/gh-ost.panic.flag \ --execute diff --git a/doc/command-line-flags.md b/doc/command-line-flags.md index 3ede1ff..83b0da0 100644 --- a/doc/command-line-flags.md +++ b/doc/command-line-flags.md @@ -32,6 +32,10 @@ user=gromit password=123456 ``` +### concurrent-rowcount + +See `exact-rowcount` + ### cut-over Optional. Default is `safe`. See more discussion in [cut-over](cut-over.md) @@ -44,6 +48,7 @@ A `gh-ost` execution need to copy whatever rows you have in your existing table `gh-ost` also supports the `--exact-rowcount` flag. When this flag is given, two things happen: - An initial, authoritative `select count(*) from your_table`. This query may take a long time to complete, but is performed before we begin the massive operations. + When `--concurrent-rowcount` is also specified, this runs in parallel with the row copy. 
- A continuous update to the estimate as we make progress applying events. We heuristically update the number of rows based on the queries we process from the binlogs. diff --git a/doc/testing-on-replica.md b/doc/testing-on-replica.md index cfc3c24..cd3cfd1 100644 --- a/doc/testing-on-replica.md +++ b/doc/testing-on-replica.md @@ -54,7 +54,7 @@ $ gh-osc --host=myhost.com --conf=/etc/gh-ost.cnf --database=test --table=sample Elaborate: ```shell -$ gh-osc --host=myhost.com --conf=/etc/gh-ost.cnf --database=test --table=sample_table --alter="engine=innodb" --chunk-size=2000 --max-load=Threads_connected=20 --switch-to-rbr --initially-drop-ghost-table --initially-drop-old-table --test-on-replica --postpone-cut-over-flag-file=/tmp/ghost-postpone.flag --exact-rowcount --allow-nullable-unique-key --verbose --execute +$ gh-osc --host=myhost.com --conf=/etc/gh-ost.cnf --database=test --table=sample_table --alter="engine=innodb" --chunk-size=2000 --max-load=Threads_connected=20 --switch-to-rbr --initially-drop-ghost-table --initially-drop-old-table --test-on-replica --postpone-cut-over-flag-file=/tmp/ghost-postpone.flag --exact-rowcount --concurrent-rowcount --allow-nullable-unique-key --verbose --execute ``` - Count exact number of rows (makes ETA estimation very good). This goes at the expense of paying the time for issuing a `SELECT COUNT(*)` on your table. We use this lovingly. - Automatically switch to `RBR` if replica is configured as `SBR`. 
See also: [migrating with SBR](migrating-with-sbr.md) From 5773fd22aeb9240d79de63a4e4664762db8745e7 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Thu, 1 Sep 2016 13:12:24 +0200 Subject: [PATCH 2/7] more comments on cut-over --- doc/cut-over.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/doc/cut-over.md b/doc/cut-over.md index b1aaf6d..ada5100 100644 --- a/doc/cut-over.md +++ b/doc/cut-over.md @@ -15,3 +15,7 @@ This solution either: Also note: - With `--migrate-on-replica` the cut-over is executed in exactly the same way as on master. - With `--test-on-replica` the replication is first stopped; then the cut-over is executed just as on master, but then reverted (tables rename forth then back again). + +Internals of the atomic cut-over are discussed in [Issue #82](https://github.com/github/gh-ost/issues/82). + +At this time the command-line argument `--cut-over` is supported, and defaults to the atomic cut-over algorithm described above. Also supported is `--cut-over=two-step`, which uses the FB non-atomic algorithm. We recommend using the default cut-over that has been battle tested in our production environments. 
From 9c927799392a60c4cde22fb16bf88988d0bc639c Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Thu, 1 Sep 2016 13:13:04 +0200 Subject: [PATCH 3/7] begin documentation on sub-second replication lag throttling --- doc/cheatsheet.md | 2 +- doc/command-line-flags.md | 13 +++++++++++++ doc/perks.md | 4 ++++ doc/subsecond-lag.md | 3 +++ 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 doc/subsecond-lag.md diff --git a/doc/cheatsheet.md b/doc/cheatsheet.md index 21bd2eb..d51dafc 100644 --- a/doc/cheatsheet.md +++ b/doc/cheatsheet.md @@ -104,7 +104,7 @@ gh-ost \ --initially-drop-old-table \ --max-load=Threads_running=30 \ --switch-to-rbr \ - --chunk-size=2500 \ + --chunk-size=500 \ --cut-over=default \ --exact-rowcount \ --concurrent-rowcount \ diff --git a/doc/command-line-flags.md b/doc/command-line-flags.md index 83b0da0..cd6a14f 100644 --- a/doc/command-line-flags.md +++ b/doc/command-line-flags.md @@ -68,6 +68,19 @@ We think `gh-ost` should not take chances or make assumptions about the user's t See #initially-drop-ghost-table +### max-lag-millis + +On a replication topology, this is perhaps the most important migration throttling factor: the maximum lag allowed for migration to work. If lag exceeds this value, migration throttles. + +When using [Connect to replica, migrate on master](cheatsheet.md), this lag is primarily tested on the very replica `gh-ost` operates on. Lag is measured by checking the heartbeat events injected by `gh-ost` itself on the utility changelog table. That is, to measure this replica's lag, `gh-ost` doesn't need to issue `show slave status` nor have any external heartbeat mechanism. + +When `--throttle-control-replicas` is provided, throttling also considers lag on specified hosts. Measuring lag on these hosts works as follows: + +- If `--replication-lag-query` is provided, use the query, trust its result to indicate lag seconds (fraction, i.e. 
float, allowed) +- Otherwise, issue `show slave status` and read `Seconds_behind_master` (`1sec` granularity) + +See also: [Sub-second replication lag throttling](subsecond-lag.md) + ### migrate-on-replica Typically `gh-ost` is used to migrate tables on a master. If you wish to only perform the migration in full on a replica, connect `gh-ost` to said replica and pass `--migrate-on-replica`. `gh-ost` will briefly connect to the master but other issue no changes on the master. Migration will be fully executed on the replica, while making sure to maintain a small replication lag. diff --git a/doc/perks.md b/doc/perks.md index a80c193..bb8c710 100644 --- a/doc/perks.md +++ b/doc/perks.md @@ -58,3 +58,7 @@ You begin a migration, and the ETA is for it to complete at 04:00am. Not a good Today, DBAs are coordinating the migration start time such that it completes in a convenient hour. `gh-ost` offers an alternative: postpone the final cut-over phase till you're ready. Execute `gh-ost` with `--postpone-cut-over-flag-file=/path/to/flag.file`. As long as this file exists, `gh-ost` will not take the final cut-over step. It will complete the row copy, and continue to synchronize the tables by continuously applying changes made on the original table onto the ghost table. It can do so on and on and on. When you're finally ready, remove the file and cut-over will take place. + +### Sub-second lag throttling + +With sub-second replication lag measurements, `gh-ost` is able to keep a fleet of replicas well below `1sec` lag throughout the migration. We encourage you to issue sub-second heartbeats. Read more on [sub-second replication lag throttling](subsecond-lag.md) diff --git a/doc/subsecond-lag.md b/doc/subsecond-lag.md new file mode 100644 index 0000000..e9dacb7 --- /dev/null +++ b/doc/subsecond-lag.md @@ -0,0 +1,3 @@ +# Sub-second replication lag throttling + +`gh-ost` is able to utilize sub-second replication lag measurements. 
We strongly suggest From 25400cdf96cdedf58c2e8a0259d61bf729f4c4c3 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Thu, 1 Sep 2016 13:20:39 +0200 Subject: [PATCH 4/7] clarified throttling logic; indicating sub-second lag --- doc/throttle.md | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/doc/throttle.md b/doc/throttle.md index e02d207..8c28f01 100644 --- a/doc/throttle.md +++ b/doc/throttle.md @@ -28,11 +28,15 @@ Otherwise you may specify your own list of replica servers you wish it to observ - `--max-lag-millis`: maximum allowed lag; any controlled replica lagging more than this value will cause throttling to kick in. When all control replicas have smaller lag than indicated, operation resumes. -- `--replication-lag-query`: `gh-ost` will, by default, issue a `show slave status` query to find replication lag. However, this is a notoriously flaky value. If you're using your own `heartbeat` mechanism, e.g. via [`pt-heartbeat`](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html), you may provide your own custom query to return a single `int` value indicating replication lag. +- `--replication-lag-query`: `gh-ost` will, by default, issue a `show slave status` query to find replication lag. However, this is a notoriously flaky value. If you're using your own `heartbeat` mechanism, e.g. via [`pt-heartbeat`](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html), you may provide your own custom query to return a single decimal (floating point) value indicating replication lag. - Example: `--replication-lag-query="SELECT ROUND(NOW() - MAX(UNIX_TIMESTAMP(ts))) AS lag FROM mydb.heartbeat"` + Example: `--replication-lag-query="SELECT UNIX_TIMESTAMP() - MAX(UNIX_TIMESTAMP(ts)) AS lag FROM mydb.heartbeat"` -Note that you may dynamically change the `throttle-control-replicas` list via [interactive commands](interactive-commands.md) + We encourage you to use [sub-second replication lag throttling](subsecond-lag.md). 
Your query may then look like: + + `--replication-lag-query="SELECT UNIX_TIMESTAMP(NOW(6)) - MAX(UNIX_TIMESTAMP(ts)) AS lag FROM mydb.heartbeat"` + +Note that you may dynamically change both `replication-lag-query` and the `throttle-control-replicas` list via [interactive commands](interactive-commands.md) #### Status thresholds @@ -76,9 +80,9 @@ In addition to the above, you are able to take control and throttle the operatio Any single factor in the above that suggests the migration should throttle - causes throttling. That is, once some component decides to throttle, you cannot override it; you cannot force continued execution of the migration. -`gh-ost` will first check the low hanging fruits: user commanded; throttling files. It will then proceed to check replication lag, then status thesholds, and lastly it will check the throttle-query. +`gh-ost` collects different throttle-related metrics at different times, independently. It asynchronously reads the collected metrics and checks if they satisfy conditions/thresholds. -The first check to suggest throttling stops the search; the status message will note the reason for throttling as the first satisfied check. +The first check to suggest throttling stops further checks; the status message will note the reason for throttling as the first satisfied check. ### Throttle status From 34a7306f4bf4741fad9bd02cbae8592b587d5d1e Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Thu, 1 Sep 2016 13:44:30 +0200 Subject: [PATCH 5/7] elaborate sub-second lag throttling --- doc/subsecond-lag.md | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/doc/subsecond-lag.md b/doc/subsecond-lag.md index e9dacb7..bb00435 100644 --- a/doc/subsecond-lag.md +++ b/doc/subsecond-lag.md @@ -1,3 +1,31 @@ # Sub-second replication lag throttling -`gh-ost` is able to utilize sub-second replication lag measurements. We strongly suggest +`gh-ost` is able to utilize sub-second replication lag measurements. 
+ +At GitHub, small replication lag is crucial, and we like to keep it below `1s` at all times. If you have a similar concern, we strongly urge you to implement sub-second lag throttling. + +`gh-ost` will do sub-second throttling when `--max-lag-millis` is smaller than `1000`, i.e. smaller than `1sec`. +Replication lag is measured on: + +- The "inspected" server (the server `gh-ost` connects to; replica is desired but not mandatory) +- The `throttle-control-replicas` list + +For the inspected server, `gh-ost` uses an internal heartbeat mechanism. It injects heartbeat events onto the utility changelog table, then reads those events in the binary log, and compares times. This measurement is by default adn by definition sub-second enabled. + +You can explicitly define how frequently will `gh-ost` inject heartbeat events, via `heartbeat-interval-millis`. You should set `heartbeat-interval-millis <= max-lag-millis`. It still works if not, but loses granularity and effect. + +On the `throttle-control-replicas`, `gh-ost` only issues SQL queries, and does not attempt to read the binary log stream. Perhaps thsoe other replicas don't have binary logs in the first place. + +The standard way of getting replication lag on a replica is to issue `SHOW SLAVE STATUS`, then reading `Seconds_behind_master` value. But that value has a `1sec` granularity. + +To be able to throttle on your production replica fleet when replication lag exceeds a sub-second threshold, you must provide a `replication-lag-query` that returns a sub-second resolution lag. + +As a common example, many use [pt-heartbeat](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html) to inject heartbeat events on the master. You would issue something like: + + /usr/bin/pt-heartbeat -- -D your_schema --create-table --update --replace --interval=0.1 --daemonize --pid ... + +Note `--interval=0.1` to indicate `10` heartbeats per second. 
+ +You would then provide `--replication-lag-query="select unix_timestamp(now(6)) - unix_timestamp(ts) as ghost_lag_check from your_schema.heartbeat order by ts desc limit 1"` + +Our production migrations use sub-second lag throttling and are able to keep our entire fleet of replicas well below `1sec` lag. From ad3d1b23846fe340fbbdf6cd83ce2f65179f37bf Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Thu, 1 Sep 2016 13:45:37 +0200 Subject: [PATCH 6/7] beautify --- doc/subsecond-lag.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/doc/subsecond-lag.md b/doc/subsecond-lag.md index bb00435..0309c76 100644 --- a/doc/subsecond-lag.md +++ b/doc/subsecond-lag.md @@ -26,6 +26,8 @@ As a common example, many use [pt-heartbeat](https://www.percona.com/doc/percona Note `--interval=0.1` to indicate `10` heartbeats per second. -You would then provide `--replication-lag-query="select unix_timestamp(now(6)) - unix_timestamp(ts) as ghost_lag_check from your_schema.heartbeat order by ts desc limit 1"` +You would then provide + + gh-ost ... --replication-lag-query="select unix_timestamp(now(6)) - unix_timestamp(ts) as ghost_lag_check from your_schema.heartbeat order by ts desc limit 1" Our production migrations use sub-second lag throttling and are able to keep our entire fleet of replicas well below `1sec` lag. From 736c8a042b86da13692155d0e0e2fefd800096ee Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 2 Sep 2016 08:54:21 +0200 Subject: [PATCH 7/7] typos --- doc/subsecond-lag.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/subsecond-lag.md b/doc/subsecond-lag.md index 0309c76..00f67cb 100644 --- a/doc/subsecond-lag.md +++ b/doc/subsecond-lag.md @@ -10,11 +10,11 @@ Replication lag is measured on: - The "inspected" server (the server `gh-ost` connects to; replica is desired but not mandatory) - The `throttle-control-replicas` list -For the inspected server, `gh-ost` uses an internal heartbeat mechanism. 
It injects heartbeat events onto the utility changelog table, then reads those events in the binary log, and compares times. This measurement is by default adn by definition sub-second enabled. +For the inspected server, `gh-ost` uses an internal heartbeat mechanism. It injects heartbeat events onto the utility changelog table, then reads those events in the binary log, and compares times. This measurement is by default and by definition sub-second enabled. You can explicitly define how frequently will `gh-ost` inject heartbeat events, via `heartbeat-interval-millis`. You should set `heartbeat-interval-millis <= max-lag-millis`. It still works if not, but loses granularity and effect. -On the `throttle-control-replicas`, `gh-ost` only issues SQL queries, and does not attempt to read the binary log stream. Perhaps thsoe other replicas don't have binary logs in the first place. +On the `throttle-control-replicas`, `gh-ost` only issues SQL queries, and does not attempt to read the binary log stream. Perhaps those other replicas don't have binary logs in the first place. The standard way of getting replication lag on a replica is to issue `SHOW SLAVE STATUS`, then reading `Seconds_behind_master` value. But that value has a `1sec` granularity.
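
As a rough illustration of why `1sec` granularity defeats a sub-second `--max-lag-millis`, here is a minimal Python sketch. It is not `gh-ost`'s implementation (which is written in Go). It assumes lag readings arrive as fractional seconds, as a sub-second `replication-lag-query` would return them. The function `should_throttle`, the dictionary `lag_seconds_by_host` and the host names are hypothetical, and the coarse measurement is modelled by truncating to whole seconds, which is also an assumption.

```python
from typing import Dict, Optional


def should_throttle(lag_seconds_by_host: Dict[str, float],
                    max_lag_millis: int) -> Optional[str]:
    """Return the first host whose lag exceeds max_lag_millis, else None.

    Hypothetical helper: lag values are fractional seconds, as a sub-second
    replication-lag-query would report them.
    """
    for host, lag_seconds in lag_seconds_by_host.items():
        if lag_seconds * 1000.0 > max_lag_millis:
            return host
    return None


# Sub-second readings: replica-2 lags 0.35s, above a 300ms threshold.
readings = {"replica-1": 0.12, "replica-2": 0.35}
print(should_throttle(readings, max_lag_millis=300))

# Whole-second readings (assumed truncation): the same lag reads as 0,
# so a sub-second threshold is never tripped.
coarse = {host: float(int(lag)) for host, lag in readings.items()}
print(should_throttle(coarse, max_lag_millis=300))
```

The first call names the lagging replica; the second returns `None`, because the whole-second readings hide the lag.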