From 25400cdf96cdedf58c2e8a0259d61bf729f4c4c3 Mon Sep 17 00:00:00 2001
From: Shlomi Noach
Date: Thu, 1 Sep 2016 13:20:39 +0200
Subject: [PATCH] clarified throttling logic; indicating sub-second lag

---
 doc/throttle.md | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/doc/throttle.md b/doc/throttle.md
index e02d207..8c28f01 100644
--- a/doc/throttle.md
+++ b/doc/throttle.md
@@ -28,11 +28,15 @@ Otherwise you may specify your own list of replica servers you wish it to observ
 
 - `--max-lag-millis`: maximum allowed lag; any controlled replica lagging more than this value will cause throttling to kick in. When all control replicas have smaller lag than indicated, operation resumes.
 
-- `--replication-lag-query`: `gh-ost` will, by default, issue a `show slave status` query to find replication lag. However, this is a notoriously flaky value. If you're using your own `heartbeat` mechanism, e.g. via [`pt-heartbeat`](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html), you may provide your own custom query to return a single `int` value indicating replication lag.
+- `--replication-lag-query`: `gh-ost` will, by default, issue a `show slave status` query to find replication lag. However, this is a notoriously flaky value. If you're using your own `heartbeat` mechanism, e.g. via [`pt-heartbeat`](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html), you may provide your own custom query to return a single decimal (floating point) value indicating replication lag.
 
-  Example: `--replication-lag-query="SELECT ROUND(NOW() - MAX(UNIX_TIMESTAMP(ts))) AS lag FROM mydb.heartbeat"`
+  Example: `--replication-lag-query="SELECT UNIX_TIMESTAMP() - MAX(UNIX_TIMESTAMP(ts)) AS lag FROM mydb.heartbeat"`
 
-Note that you may dynamically change the `throttle-control-replicas` list via [interactive commands](interactive-commands.md)
+  We encourage you to use [sub-second replication lag throttling](subsecond-lag.md). Your query may then look like:
+
+  `--replication-lag-query="SELECT UNIX_TIMESTAMP(6) - MAX(UNIX_TIMESTAMP(ts)) AS lag FROM mydb.heartbeat"`
+
+Note that you may dynamically change both `replication-lag-query` and the `throttle-control-replicas` list via [interactive commands](interactive-commands.md)
 
 #### Status thresholds
 
@@ -76,9 +80,9 @@ In addition to the above, you are able to take control and throttle the operatio
 
 Any single factor in the above that suggests the migration should throttle - causes throttling. That is, once some component decides to throttle, you cannot override it; you cannot force continued execution of the migration.
 
-`gh-ost` will first check the low hanging fruits: user commanded; throttling files. It will then proceed to check replication lag, then status thesholds, and lastly it will check the throttle-query.
+`gh-ost` collects different throttle-related metrics at different times, independently. It asynchronously reads the collected metrics and checks if they satisfy conditions/threasholds.
 
-The first check to suggest throttling stops the search; the status message will note the reason for throttling as the first satisfied check.
+The first check to suggest throttling stops the check; the status message will note the reason for throttling as the first satisfied check.
 
 ### Throttle status
 
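
Reviewer note: the second hunk describes metrics being collected independently and then read asynchronously, with the first satisfied check reported as the throttle reason. The following is a minimal Python sketch of that idea, not `gh-ost`'s actual implementation. The names `ThrottleMetrics` and `throttle_reason`, the set of metrics, the order of the checks, and the comparison operators are all assumptions made for illustration. Lag is held as a floating point number of seconds, matching the decimal value the lag query is now expected to return.

```python
import threading


class ThrottleMetrics:
    """Latest values written by independent collectors (hypothetical names)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.user_commanded = False
        self.replication_lag_seconds = 0.0  # decimal value, e.g. 0.25 for 250ms
        self.threads_running = 0

    def update(self, **values):
        # Each collector calls this on its own schedule, with only its own metric.
        with self._lock:
            for name, value in values.items():
                setattr(self, name, value)

    def snapshot(self):
        # The checker reads whatever was collected last; it never runs a query itself.
        with self._lock:
            return (self.user_commanded,
                    self.replication_lag_seconds,
                    self.threads_running)


def throttle_reason(metrics, max_lag_millis, max_threads_running):
    """Return the reason of the first satisfied check, or None if nothing throttles."""
    user_commanded, lag_seconds, threads_running = metrics.snapshot()
    # The order below is an assumption for this sketch.
    checks = [
        (user_commanded, "commanded by user"),
        (lag_seconds * 1000 > max_lag_millis, "lag=%.3fs" % lag_seconds),
        (threads_running >= max_threads_running,
         "threads_running=%d" % threads_running),
    ]
    for should_throttle, reason in checks:
        if should_throttle:
            return reason  # the first check to suggest throttling stops the check
    return None


metrics = ThrottleMetrics()
metrics.update(replication_lag_seconds=0.25)
print(throttle_reason(metrics, max_lag_millis=1500, max_threads_running=30))
```

Because the checker only reads a snapshot, a slow lag query on one replica does not delay the evaluation of the other conditions, which appears to be the motivation for the reworded paragraph.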