clarified throttling logic; indicating sub-second lag

This commit is contained in:
Shlomi Noach 2016-09-01 13:20:39 +02:00
parent 9c92779939
commit 25400cdf96

View File

@ -28,11 +28,15 @@ Otherwise you may specify your own list of replica servers you wish it to observ
- `--max-lag-millis`: maximum allowed lag; any controlled replica lagging more than this value will cause throttling to kick in. When all control replicas have smaller lag than indicated, operation resumes.
- `--replication-lag-query`: `gh-ost` will, by default, issue a `show slave status` query to find replication lag. However, this is a notoriously flaky value. If you're using your own `heartbeat` mechanism, e.g. via [`pt-heartbeat`](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html), you may provide your own custom query to return a single `int` value indicating replication lag.
- `--replication-lag-query`: `gh-ost` will, by default, issue a `show slave status` query to find replication lag. However, this is a notoriously flaky value. If you're using your own `heartbeat` mechanism, e.g. via [`pt-heartbeat`](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html), you may provide your own custom query to return a single decimal (floating point) value indicating replication lag.
Example: `--replication-lag-query="SELECT ROUND(NOW() - MAX(UNIX_TIMESTAMP(ts))) AS lag FROM mydb.heartbeat"`
Example: `--replication-lag-query="SELECT UNIX_TIMESTAMP() - MAX(UNIX_TIMESTAMP(ts)) AS lag FROM mydb.heartbeat"`
Note that you may dynamically change the `throttle-control-replicas` list via [interactive commands](interactive-commands.md)
We encourage you to use [sub-second replication lag throttling](subsecond-lag.md). Your query may then look like:
`--replication-lag-query="SELECT UNIX_TIMESTAMP(6) - MAX(UNIX_TIMESTAMP(ts)) AS lag FROM mydb.heartbeat"`
Note that you may dynamically change both `replication-lag-query` and the `throttle-control-replicas` list via [interactive commands](interactive-commands.md)
#### Status thresholds
@ -76,9 +80,9 @@ In addition to the above, you are able to take control and throttle the operatio
Any single factor in the above that suggests the migration should throttle - causes throttling. That is, once some component decides to throttle, you cannot override it; you cannot force continued execution of the migration.
`gh-ost` will first check the low hanging fruits: user commanded; throttling files. It will then proceed to check replication lag, then status thesholds, and lastly it will check the throttle-query.
`gh-ost` collects different throttle-related metrics at different times, independently. It asynchronously reads the collected metrics and checks if they satisfy conditions/threasholds.
The first check to suggest throttling stops the search; the status message will note the reason for throttling as the first satisfied check.
The first check to suggest throttling stops the check; the status message will note the reason for throttling as the first satisfied check.
### Throttle status