Resurrection documentation

Shlomi Noach 2016-12-25 11:46:14 +02:00
parent 0e8e5de7aa
commit af74e8c6cd
4 changed files with 55 additions and 6 deletions


@@ -30,6 +30,7 @@ In addition, it offers many [operational perks](doc/perks.md) that make it safer
- Auditing: you may query `gh-ost` for status. `gh-ost` listens on unix socket or TCP.
- Control over cut-over phase: `gh-ost` can be instructed to postpone what is probably the most critical step: the swap of tables, until such time that you're comfortably available. No need to worry about ETA being outside office hours.
- External [hooks](doc/hooks.md) can couple `gh-ost` with your particular environment.
- [Resurrection](doc/resurrect.md) can resume a failed migration, proceeding from last known good position.
Please refer to the [docs](doc) for more information. No, really, read the [docs](doc).
@@ -76,19 +77,17 @@ But then a rare genetic mutation happened, and the `c` transformed into `t`. And
## Community
- `gh-ost` is released at a stable state, but with mileage to go. We are [open to pull requests](https://github.com/github/gh-ost/blob/master/.github/CONTRIBUTING.md). Please first discuss your intentions via [Issues](https://github.com/github/gh-ost/issues).
+ `gh-ost` is released at a stable state, and still with mileage to go. We are [open to pull requests](https://github.com/github/gh-ost/blob/master/.github/CONTRIBUTING.md). Please first discuss your intentions via [Issues](https://github.com/github/gh-ost/issues).
We develop `gh-ost` at GitHub and for the community. We may have different priorities than others. From time to time we may suggest a contribution that is not on our immediate roadmap but which may appeal to others.
## Download/binaries/source
- `gh-ost` is now GA and stable.
- `gh-ost` is available in binary format for Linux and Mac OS/X
+ `gh-ost` is GA and stable, available in binary format for Linux and Mac OS/X
[Download latest release here](https://github.com/github/gh-ost/releases/latest)
- `gh-ost` is a Go project; it is built with Go 1.5 with "experimental vendor". Soon to migrate to Go 1.6. See and use [build file](https://github.com/github/gh-ost/blob/master/build.sh) for compiling it on your own.
+ `gh-ost` is a Go project; it is built with Go 1.7. See and use [build file](https://github.com/github/gh-ost/blob/master/build.sh) for compiling it on your own.
Generally speaking, `master` branch is stable, but only [releases](https://github.com/github/gh-ost/releases) are to be used in production.


@@ -1 +1 @@
- 1.0.32
+ 1.1.0


@@ -111,6 +111,14 @@ See also: [Sub-second replication lag throttling](subsecond-lag.md)
Typically `gh-ost` is used to migrate tables on a master. If you wish to only perform the migration in full on a replica, connect `gh-ost` to said replica and pass `--migrate-on-replica`. `gh-ost` will briefly connect to the master but will otherwise issue no changes on the master. Migration will be fully executed on the replica, while making sure to maintain a small replication lag.
### resurrect
It is possible to resurrect/resume a failed migration. Such a migration is one that started as a valid execution but bailed out partway through the migration process. A migration may bail out on meeting `--critical-load`, or perhaps because a user `kill -9`'d it.
Use `--resurrect` with the exact same other flags (same `--database`, `--table`, `--alter`) to resume a failed migration.
Read more in the [resurrection docs](resurrect.md).
### skip-foreign-key-checks
By default `gh-ost` verifies no foreign keys exist on the migrated table. On servers with a large number of tables this check can take a long time. If you're absolutely certain no foreign keys exist (the table does not reference other tables nor is referenced by other tables) and wish to save the check time, provide `--skip-foreign-key-checks`.

42
doc/resurrect.md Normal file

@@ -0,0 +1,42 @@
# Resurrection
`gh-ost` supports resurrection of a failed migration, continuing the migration from the last known good position and potentially saving hours of clock-time.
A migration may fail as follows:
- On meeting with `--critical-load`
- On successively meeting with a specific error (e.g. recurring locks)
- Being `kill -9`'d by a user
- MySQL crash
- Server crash
- Robots taking over the world and other reasons.
### --resurrect
One may resurrect such a migration by running the exact same command, adding the `--resurrect` flag.
The terms for resurrection are:
- Exact same database/table/alter
- Previous migration ran for at least one minute
- Previous migration had begun row-copy and event handling (after `1` minute of execution you may expect this to be the case)
### How does it work?
`gh-ost` dumps its migration status (context) once per minute, onto the _changelog table_. The changelog table is used for internal bookkeeping, and manages heartbeat and internal message passing.
When `--resurrect` is provided, `gh-ost` attempts to find such a status dump in the changelog table. Most interestingly, this status includes:
- Last handled binlog event coordinates (any event up to that point has been applied to _ghost_ table)
- Last copied chunk range
- Other useful information
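
To make the shape of such a status dump concrete, here is a minimal Python sketch. It is not `gh-ost`'s code (which is written in Go): the `ContextDump` class, its field names, the use of JSON, and the sample values are all assumptions made for illustration. The text above only states that binlog coordinates and the last copied chunk range are among the dumped values, and chunk bounds are shown here as single integers for simplicity.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ContextDump:
    """Hypothetical subset of the migration status written once per minute."""
    binlog_file: str   # file part of the last handled binlog coordinates
    binlog_pos: int    # position part of the last handled binlog coordinates
    chunk_min: int     # lower bound of the last copied chunk range
    chunk_max: int     # upper bound of the last copied chunk range


def dump_context(ctx: ContextDump) -> str:
    """Serialize the context so it could be stored as a changelog row."""
    return json.dumps(asdict(ctx))


def load_context(payload: str) -> ContextDump:
    """Rebuild the context from a stored dump, as a resurrecting run would."""
    return ContextDump(**json.loads(payload))


# Made-up sample values; a real dump would carry the live coordinates.
saved = dump_context(ContextDump("mysql-bin.000042", 1337, 5001, 6000))
restored = load_context(saved)
```

The point of the sketch is only that a resurrecting process needs nothing beyond the stored values to know where to reconnect the streamer and where to continue row-copy.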
Resurrection reconnects the streamer at the last handled binlog coordinates, and skips row-copy ahead to proceed from the last copied chunk range.
Note that it is not important to resume from the _exact same_ coordinates and chunk as last applied; the context dump only runs once per minute, so resurrection may re-apply up to a minute's worth of binary logs and re-iterate up to a minute's worth of copied chunks.
Row-based replication has the property of being idempotent for DML events. There is no damage in reapplying contiguous binlog events starting at some point in the past.
Chunk re-iteration likewise poses no integrity concern, and there is no harm in re-copying the same range of rows.
The only concern is to never skip binlog events, and never skip a row range. By virtue of only dumping events and ranges that have been applied, and by virtue of only processing binlog events and chunks moving forward, `gh-ost` keeps integrity intact.
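
The idempotency argument can be checked with a small simulation. The following Python sketch is a toy model under stated assumptions, not `gh-ost`'s implementation: tables are plain dictionaries keyed by primary key, each event carries a full row image, and the names `apply_event` and `replay` are hypothetical. It shows that resuming from a stale dump (re-applying events already applied) gives the same result as an uninterrupted run, while resuming from a position beyond what was applied does not.

```python
def apply_event(table, event):
    """Apply one row-based DML event, keyed by primary key."""
    kind, key, row = event
    if kind == "delete":
        table.pop(key, None)   # deleting an already-missing row is a no-op
    else:
        table[key] = row       # full row image overwrites, like REPLACE INTO


def replay(table, events, start):
    """Apply every event from index `start` onward, in order."""
    for event in events[start:]:
        apply_event(table, event)
    return table


events = [
    ("write", 1, "a"),
    ("write", 2, "b"),
    ("write", 1, "a2"),
    ("delete", 2, None),
    ("write", 3, "c"),
    ("write", 2, "b2"),
]

# Reference: a migration that never failed.
expected = replay({}, events, 0)

# Failed run: 4 events were applied, but the last dump recorded only 2.
ghost = replay({}, events[:4], 0)
resumed = replay(ghost, events, 2)   # re-applies events 2 and 3, then continues

# Counter-example: resuming beyond what was applied skips event 4.
ghost_skipping = replay({}, events[:4], 0)
skipped = replay(ghost_skipping, events, 5)

print(resumed == expected)   # True
print(skipped == expected)   # False
```

Resuming too early only costs repeated work, whereas resuming too late loses data, which is why dumping only positions that have already been applied is the safe side to err on.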