throttle-query, unpostpone

This commit is contained in:
Shlomi Noach 2016-07-01 13:19:02 +02:00
parent b7def18b20
commit 16de269de4
3 changed files with 12 additions and 5 deletions

View File

@ -25,6 +25,7 @@ replication lag on to determine throttling
- `throttle-control-replicas`: change list of throttle-control replicas, these are replicas `gh-ost` will cehck
- `throttle`: force migration suspend
- `no-throttle`: cancel forced suspension (though other throttling reasons may still apply)
- `unpostpone`: at a time where `gh-ost` is postponing the [cut-over](cut-over.md) phase, instruct `gh-ost` to stop postponing and proceed immediately to cut-over.
- `panic`: immediately panic and abort operation
### Examples

View File

@ -32,6 +32,8 @@ Otherwise you may specify your own list of replica servers you wish it to observ
Example: `--replication-lag-query="SELECT ROUND(NOW() - MAX(UNIX_TIMESTAMP(ts))) AS lag FROM mydb.heartbeat"`
Note that you may dynamically change the `throttle-control-replicas` list via [interactive commands](interactive-commands.md)
#### Status thresholds
- `--max-load`: list of metrics and threshold values; topping the threshold of any will cause throttler to kick in.
@ -42,6 +44,12 @@ Otherwise you may specify your own list of replica servers you wish it to observ
Metrics must be valid, numeric [statis variables](http://dev.mysql.com/doc/refman/5.6/en/server-status-variables.html)
#### Throttle query
- When provided, the `--throttle-query` is expected to return a scalar integer. A return value `> 0` implies `gh-ost` should throttle. A return value `<= 0` implied `gh-ost` is free to proceed (pending other throttling factors).
An example query could be: `--throttle-query="select hour(now()) between 8 and 17"` which implies throttling auto-starts `8:00am` and migration auto-resumes at `18:00pm`.
#### Manual control
In addition to the above, you are able to take control and throttle the operation any time you like.
@ -68,7 +76,7 @@ In addition to the above, you are able to take control and throttle the operatio
Any single factor in the above that suggests the migration should throttle - causes throttling. That is, once some component decides to throttle, you cannot override it; you cannot force continued execution of the migration.
`gh-ost` will first check the low hanging fruits: user commanded; throttling files. It will then proceed to check replication lag, and lastly it will check for status thresholds.
`gh-ost` will first check the low hanging fruits: user commanded; throttling files. It will then proceed to check replication lag, then status thesholds, and lastly it will check the throttle-query.
The first check to suggest throttling stops the search; the status message will note the reason for throttling as the first satisfied check.

View File

@ -7,7 +7,7 @@ Existing MySQL schema migration tools:
- [LHM](https://github.com/soundcloud/lhm)
- [oak-online-alter-table](https://github.com/shlomi-noach/openarkkit)
are all using [triggers](http://dev.mysql.com/doc/refman/5.6/en/triggers.html) to propagate live changes on your table onto a ghost/shadow table that is slowly being synchronized. The tools not all work the same: while most use a synchronous approach (all changes applied on the ghost table), the Facebook tool uses an asynchronous approach (changes are appended to a changelog table, later reviewed and appleid on ghost table).
are all using [triggers](http://dev.mysql.com/doc/refman/5.6/en/triggers.html) to propagate live changes on your table onto a ghost/shadow table that is slowly being synchronized. The tools not all work the same: while most use a synchronous approach (all changes applied on the ghost table), the Facebook tool uses an asynchronous approach (changes are appended to a changelog table, later reviewed and applied on ghost table).
Use of triggers simplifies a lot of the flow in doing a live table migration, but also poses some limitations or difficulties. Here are reasons why we choose to [design a triggerless solution](triggerless-design.md) to schema migrations.
@ -27,12 +27,10 @@ We know this to be a visible overhead on very busy or very large tables.
### Triggers, locks
When a table with trigger is concurrently being written two, the triggers, being in same transaction space as the incoming queries, are also executed concurrently. While concurrent queries compete for resources via locks (e.g. the `auto_increment` value), the triggers need to _simultaneously_ compete for their own locks (e.g., likewise on the `auto_increment` value on the ghost table, in a synchronous solution). The competitions are un coordinated.
When a table with triggers is concurrently being written to, the triggers, being in same transaction space as the incoming queries, are also executed concurrently. While concurrent queries compete for resources via locks (e.g. the `auto_increment` value), the triggers need to _simultaneously_ compete for their own locks (e.g., likewise on the `auto_increment` value on the ghost table, in a synchronous solution). These competitions are non-coordinated.
We have evidenced near or complete lock downs in production, to the effect of rendering the table or the entire database inaccessible due to lock contention.
Some lock contention can be reduced; in the case of `auto_increment`, an `innodb_autoinc_lock_mode=2` on Row Based Replication is helpful. But this is but a single example of a locking issue, and even so we have seen lockdowns on busy tables.
### Trigger based migration, no pause
The triggers are used to either apply or record ongoing changes to your original table.