From a6030898131f697bfcf0ecc91ac1bffb14e3cce0 Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Fri, 24 Mar 2017 13:42:09 +1100 Subject: [PATCH 01/35] AWS RDS! --- doc/rds.md | 1 + 1 file changed, 1 insertion(+) create mode 100644 doc/rds.md diff --git a/doc/rds.md b/doc/rds.md new file mode 100644 index 0000000..9a1256d --- /dev/null +++ b/doc/rds.md @@ -0,0 +1 @@ +# Amazon RDS \ No newline at end of file From 8a854b8e6f03bdb8665e5db4119da9b3e06fc695 Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Fri, 24 Mar 2017 13:42:33 +1100 Subject: [PATCH 02/35] Add known limitations of AWS RDS --- doc/rds.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/doc/rds.md b/doc/rds.md index 9a1256d..cf85f49 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -1 +1,7 @@ -# Amazon RDS \ No newline at end of file +# Amazon RDS# Amazon RDS + +## Limitations + +- No `SUPER` privileges. +- `gh-ost` runs should be setup use [`--assume-rbr`][assume_rbr_docs] and use `binlog_format=ROW`. +- Aurora does not allow editing of the `read_only` parameter. While it is defined as `{TrueIfReplica}`, the parameter is non-modifiable field. From 0b93e3697ea3a906e015670a9e5433d06f13c226 Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Fri, 24 Mar 2017 13:44:15 +1100 Subject: [PATCH 03/35] Add notes for aurora replication Due to the way aurora is architected, replication means something slightly different to the traditional sense. This includes a work around to use test/migrate on replica instead of only master migrations. --- doc/rds.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/doc/rds.md b/doc/rds.md index cf85f49..3864ea9 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -5,3 +5,14 @@ - No `SUPER` privileges. - `gh-ost` runs should be setup use [`--assume-rbr`][assume_rbr_docs] and use `binlog_format=ROW`. - Aurora does not allow editing of the `read_only` parameter. While it is defined as `{TrueIfReplica}`, the parameter is non-modifiable field. 
+ +## Aurora + +#### Replication + +In Aurora replication, you have separate reader and writer endpoints however because the cluster shares the underlying storage layer, `gh-ost` will detect it is running on the master. This becomes an issue when you wish to use [migrate/test on replica][migrate_test_on_replica_docs] because you won't be able to use a single cluster in the same way you would with MySQL RDS. + +To work around this, you can follow along the [AWS replication between clusters documentation][aws_replication_docs] for Aurora with one small caveat. For the "Create a Snapshot of Your Replication Master" step, the binlog position is not available in the AWS console. You will need to issue the SQL query `SHOW SLAVE STATUS` or `aws rds describe-events` API call to get the correct position. + +[migrate_test_on_replica_docs]: https://github.com/github/gh-ost/blob/master/doc/cheatsheet.md#c-migratetest-on-replica +[aws_replication_docs]: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.Replication.MySQLReplication.html \ No newline at end of file From 99f7b8d8c75c1a80666592a20e680778eeb23bdf Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Fri, 24 Mar 2017 13:44:55 +1100 Subject: [PATCH 04/35] Add link to `pt-table-checksum` patch --- doc/rds.md | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/doc/rds.md b/doc/rds.md index 3864ea9..9490a7e 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -14,5 +14,16 @@ In Aurora replication, you have separate reader and writer endpoints however bec To work around this, you can follow along the [AWS replication between clusters documentation][aws_replication_docs] for Aurora with one small caveat. For the "Create a Snapshot of Your Replication Master" step, the binlog position is not available in the AWS console. You will need to issue the SQL query `SHOW SLAVE STATUS` or `aws rds describe-events` API call to get the correct position. 
+#### Percona Toolkit + +If you use `pt-table-checksum` as a part of your data integrity checks, you might want to check out [this patch][percona_toolkit_patch] which will enable you to run `pt-table-checksum` with the `--no-binlog-format-check` flag and prevent errors like the following: + +``` +03-24T12:51:06 Failed to /*!50108 SET @@binlog_format := 'STATEMENT'*/: DBD::mysql::db do failed: Access denied; you need (at least one of) the SUPER privilege(s) for this operation [for Statement "/*!50108 SET @@binlog_format := 'STATEMENT'*/"] at pt-table-checksum line 9292. + +This tool requires binlog_format=STATEMENT, but the current binlog_format is set to ROW and an error occurred while attempting to change it. If running MySQL 5.1.29 or newer, setting binlog_format requires the SUPER privilege. You will need to manually set binlog_format to 'STATEMENT' before running this tool. +``` + [migrate_test_on_replica_docs]: https://github.com/github/gh-ost/blob/master/doc/cheatsheet.md#c-migratetest-on-replica -[aws_replication_docs]: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.Replication.MySQLReplication.html \ No newline at end of file +[aws_replication_docs]: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.Replication.MySQLReplication.html +[percona_toolkit_patch]: https://github.com/jacobbednarz/percona-toolkit/commit/0271ba6a094da446a5e5bb8d99b5c26f1777f2b9 \ No newline at end of file From fb3993e56031d9f895e931f827b8a1a4cc05cfcd Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Fri, 24 Mar 2017 13:45:01 +1100 Subject: [PATCH 05/35] Fix line endings --- doc/rds.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/doc/rds.md b/doc/rds.md index 9490a7e..baec8fb 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -1,8 +1,8 @@ -# Amazon RDS# Amazon RDS +# Amazon RDS ## Limitations -- No `SUPER` privileges. +- No `SUPER` privileges. 
- `gh-ost` runs should be setup use [`--assume-rbr`][assume_rbr_docs] and use `binlog_format=ROW`. - Aurora does not allow editing of the `read_only` parameter. While it is defined as `{TrueIfReplica}`, the parameter is non-modifiable field. @@ -10,7 +10,7 @@ #### Replication -In Aurora replication, you have separate reader and writer endpoints however because the cluster shares the underlying storage layer, `gh-ost` will detect it is running on the master. This becomes an issue when you wish to use [migrate/test on replica][migrate_test_on_replica_docs] because you won't be able to use a single cluster in the same way you would with MySQL RDS. +In Aurora replication, you have separate reader and writer endpoints however because the cluster shares the underlying storage layer, `gh-ost` will detect it is running on the master. This becomes an issue when you wish to use [migrate/test on replica][migrate_test_on_replica_docs] because you won't be able to use a single cluster in the same way you would with MySQL RDS. To work around this, you can follow along the [AWS replication between clusters documentation][aws_replication_docs] for Aurora with one small caveat. For the "Create a Snapshot of Your Replication Master" step, the binlog position is not available in the AWS console. You will need to issue the SQL query `SHOW SLAVE STATUS` or `aws rds describe-events` API call to get the correct position. @@ -24,6 +24,7 @@ If you use `pt-table-checksum` as a part of your data integrity checks, you migh This tool requires binlog_format=STATEMENT, but the current binlog_format is set to ROW and an error occurred while attempting to change it. If running MySQL 5.1.29 or newer, setting binlog_format requires the SUPER privilege. You will need to manually set binlog_format to 'STATEMENT' before running this tool. 
``` +[assume_rbr_docs]: https://github.com/github/gh-ost/blob/master/doc/command-line-flags.md#assume-rbr [migrate_test_on_replica_docs]: https://github.com/github/gh-ost/blob/master/doc/cheatsheet.md#c-migratetest-on-replica [aws_replication_docs]: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.Replication.MySQLReplication.html -[percona_toolkit_patch]: https://github.com/jacobbednarz/percona-toolkit/commit/0271ba6a094da446a5e5bb8d99b5c26f1777f2b9 \ No newline at end of file +[percona_toolkit_patch]: https://github.com/jacobbednarz/percona-toolkit/commit/0271ba6a094da446a5e5bb8d99b5c26f1777f2b9 From bf864e0e0c534494c4ff99de0abb917fd9dec9bf Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Fri, 24 Mar 2017 14:00:26 +1100 Subject: [PATCH 06/35] Add note on binlogs requiring backup > 1d --- doc/rds.md | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/rds.md b/doc/rds.md index baec8fb..928fb83 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -5,6 +5,7 @@ - No `SUPER` privileges. - `gh-ost` runs should be setup use [`--assume-rbr`][assume_rbr_docs] and use `binlog_format=ROW`. - Aurora does not allow editing of the `read_only` parameter. While it is defined as `{TrueIfReplica}`, the parameter is non-modifiable field. +- In order to have binlogs enabled, the backup window must be set to greater than 1 day. 
## Aurora From 23ce390d6973e470800c391f0e136914d13283ba Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 26 Mar 2017 13:10:34 +0300 Subject: [PATCH 07/35] supporting throttle-http --- go/base/context.go | 27 ++++++++++++++++++++++++--- go/cmd/gh-ost/main.go | 2 ++ go/logic/migrator.go | 5 +++-- go/logic/server.go | 11 +++++++++++ go/logic/throttler.go | 33 +++++++++++++++++++++++++++++++++ 5 files changed, 73 insertions(+), 5 deletions(-) diff --git a/go/base/context.go b/go/base/context.go index d6cd6ce..5ab6d2a 100644 --- a/go/base/context.go +++ b/go/base/context.go @@ -44,6 +44,10 @@ const ( UserCommandThrottleReasonHint = "UserCommandThrottleReasonHint" ) +const ( + HTTPStatusOK = 200 +) + var ( envVariableRegexp = regexp.MustCompile("[$][{](.*)[}]") ) @@ -99,6 +103,7 @@ type MigrationContext struct { ThrottleFlagFile string ThrottleAdditionalFlagFile string throttleQuery string + throttleHTTP string ThrottleCommandedByUser int64 maxLoad LoadMap criticalLoad LoadMap @@ -148,6 +153,7 @@ type MigrationContext struct { pointOfInterestTime time.Time pointOfInterestTimeMutex *sync.Mutex CurrentLag int64 + ThrottleHTTPStatusCode int64 controlReplicasLagResult mysql.ReplicationLagResult TotalRowsCopied int64 TotalDMLEventsApplied int64 @@ -157,6 +163,7 @@ type MigrationContext struct { throttleReasonHint ThrottleReasonHint throttleGeneralCheckResult ThrottleCheckResult throttleMutex *sync.Mutex + throttleHTTPMutex *sync.Mutex IsPostponingCutOver int64 CountingRowsFlag int64 AllEventsUpToLockProcessedInjectedFlag int64 @@ -215,6 +222,7 @@ func newMigrationContext() *MigrationContext { maxLoad: NewLoadMap(), criticalLoad: NewLoadMap(), throttleMutex: &sync.Mutex{}, + throttleHTTPMutex: &sync.Mutex{}, throttleControlReplicaKeys: mysql.NewInstanceKeyMap(), configMutex: &sync.Mutex{}, pointOfInterestTimeMutex: &sync.Mutex{}, @@ -472,12 +480,10 @@ func (this *MigrationContext) IsThrottled() (bool, string, ThrottleReasonHint) { } func (this *MigrationContext) 
GetThrottleQuery() string { - var query string - this.throttleMutex.Lock() defer this.throttleMutex.Unlock() - query = this.throttleQuery + var query = this.throttleQuery return query } @@ -488,6 +494,21 @@ func (this *MigrationContext) SetThrottleQuery(newQuery string) { this.throttleQuery = newQuery } +func (this *MigrationContext) GetThrottleHTTP() string { + this.throttleHTTPMutex.Lock() + defer this.throttleHTTPMutex.Unlock() + + var throttleHTTP = this.throttleHTTP + return throttleHTTP +} + +func (this *MigrationContext) SetThrottleHTTP(throttleHTTP string) { + this.throttleHTTPMutex.Lock() + defer this.throttleHTTPMutex.Unlock() + + this.throttleHTTP = throttleHTTP +} + func (this *MigrationContext) GetMaxLoad() LoadMap { this.throttleMutex.Lock() defer this.throttleMutex.Unlock() diff --git a/go/cmd/gh-ost/main.go b/go/cmd/gh-ost/main.go index 238dc81..f27e12b 100644 --- a/go/cmd/gh-ost/main.go +++ b/go/cmd/gh-ost/main.go @@ -93,6 +93,7 @@ func main() { replicationLagQuery := flag.String("replication-lag-query", "", "Deprecated. gh-ost uses an internal, subsecond resolution query") throttleControlReplicas := flag.String("throttle-control-replicas", "", "List of replicas on which to check for lag; comma delimited. Example: myhost1.com:3306,myhost2.com,myhost3.com:3307") throttleQuery := flag.String("throttle-query", "", "when given, issued (every second) to check if operation should throttle. Expecting to return zero for no-throttle, >0 for throttle. Query is issued on the migrated server. 
Make sure this query is lightweight") + throttleHTTP := flag.String("throttle-http", "", "when given, gh-ost checks given URL via HEAD request; any response code other than 200 (OK) causes throttling; make sure it has low latency response") heartbeatIntervalMillis := flag.Int64("heartbeat-interval-millis", 100, "how frequently would gh-ost inject a heartbeat value") flag.StringVar(&migrationContext.ThrottleFlagFile, "throttle-flag-file", "", "operation pauses when this file exists; hint: use a file that is specific to the table being altered") flag.StringVar(&migrationContext.ThrottleAdditionalFlagFile, "throttle-additional-flag-file", "/tmp/gh-ost.throttle", "operation pauses when this file exists; hint: keep default, use for throttling multiple gh-ost operations") @@ -228,6 +229,7 @@ func main() { migrationContext.SetDMLBatchSize(*dmlBatchSize) migrationContext.SetMaxLagMillisecondsThrottleThreshold(*maxLagMillis) migrationContext.SetThrottleQuery(*throttleQuery) + migrationContext.SetThrottleHTTP(*throttleHTTP) migrationContext.SetDefaultNumRetries(*defaultRetries) migrationContext.ApplyCredentials() if err := migrationContext.SetCutOverLockTimeoutSeconds(*cutOverLockTimeoutSeconds); err != nil { diff --git a/go/logic/migrator.go b/go/logic/migrator.go index 549dd1d..59432a3 100644 --- a/go/logic/migrator.go +++ b/go/logic/migrator.go @@ -96,7 +96,7 @@ func NewMigrator() *Migrator { migrationContext: base.GetMigrationContext(), parser: sql.NewParser(), ghostTableMigrated: make(chan bool), - firstThrottlingCollected: make(chan bool, 1), + firstThrottlingCollected: make(chan bool, 3), rowCopyComplete: make(chan bool), allEventsUpToLockProcessed: make(chan string), @@ -977,7 +977,8 @@ func (this *Migrator) initiateThrottler() error { go this.throttler.initiateThrottlerCollection(this.firstThrottlingCollected) log.Infof("Waiting for first throttle metrics to be collected") <-this.firstThrottlingCollected // replication lag - <-this.firstThrottlingCollected // other 
metrics + <-this.firstThrottlingCollected // HTTP status + <-this.firstThrottlingCollected // other, general metrics log.Infof("First throttle metrics collected") go this.throttler.initiateThrottlerChecks() diff --git a/go/logic/server.go b/go/logic/server.go index b1246c2..3faeb98 100644 --- a/go/logic/server.go +++ b/go/logic/server.go @@ -146,6 +146,7 @@ max-lag-millis= # Set a new replication lag threshold replication-lag-query= # Set a new query that determines replication lag (no quotes) max-load= # Set a new set of max-load thresholds throttle-query= # Set a new throttle-query (no quotes) +throttle-http= # Set a new throttle URL throttle-control-replicas= # Set a new comma delimited list of throttle control replicas throttle # Force throttling no-throttle # End forced throttling (other throttling may still apply) @@ -236,6 +237,16 @@ help # This message fmt.Fprintf(writer, throttleHint) return ForcePrintStatusAndHintRule, nil } + case "throttle-http": + { + if argIsQuestion { + fmt.Fprintf(writer, "%+v\n", this.migrationContext.GetThrottleHTTP()) + return NoPrintStatusRule, nil + } + this.migrationContext.SetThrottleHTTP(arg) + fmt.Fprintf(writer, throttleHint) + return ForcePrintStatusAndHintRule, nil + } case "throttle-control-replicas": { if argIsQuestion { diff --git a/go/logic/throttler.go b/go/logic/throttler.go index 33d3f79..49ad83a 100644 --- a/go/logic/throttler.go +++ b/go/logic/throttler.go @@ -7,6 +7,7 @@ package logic import ( "fmt" + "net/http" "sync/atomic" "time" @@ -41,6 +42,11 @@ func (this *Throttler) shouldThrottle() (result bool, reason string, reasonHint if generalCheckResult.ShouldThrottle { return generalCheckResult.ShouldThrottle, generalCheckResult.Reason, generalCheckResult.ReasonHint } + // HTTP throttle + statusCode := atomic.LoadInt64(&this.migrationContext.ThrottleHTTPStatusCode) + if statusCode != 0 && statusCode != http.StatusOK { + return true, fmt.Sprintf("http=%d", statusCode), base.NoThrottleReasonHint + } // Replication 
lag throttle maxLagMillisecondsThrottleThreshold := atomic.LoadInt64(&this.migrationContext.MaxLagMillisecondsThrottleThreshold) lag := atomic.LoadInt64(&this.migrationContext.CurrentLag) @@ -213,6 +219,32 @@ func (this *Throttler) criticalLoadIsMet() (met bool, variableName string, value return false, variableName, value, threshold, nil } +// collectThrottleHTTPStatus polls the throttle-http URL and stores the latest response status code +func (this *Throttler) collectThrottleHTTPStatus(firstThrottlingCollected chan<- bool) { + collectFunc := func() (sleep bool, err error) { + url := this.migrationContext.GetThrottleHTTP() + if url == "" { + return true, nil + } + resp, err := http.Get(url) + if err != nil { + return false, err + } + atomic.StoreInt64(&this.migrationContext.ThrottleHTTPStatusCode, int64(resp.StatusCode)) + return false, nil + } + + collectFunc() + firstThrottlingCollected <- true + + ticker := time.Tick(100 * time.Millisecond) + for range ticker { + if sleep, _ := collectFunc(); sleep { + time.Sleep(1 * time.Second) + } + } +} + // collectGeneralThrottleMetrics reads the once-per-sec metrics, and stores them onto this.migrationContext func (this *Throttler) collectGeneralThrottleMetrics() error { @@ -290,6 +322,7 @@ func (this *Throttler) collectGeneralThrottleMetrics() error { func (this *Throttler) initiateThrottlerCollection(firstThrottlingCollected chan<- bool) { go this.collectReplicationLag(firstThrottlingCollected) go this.collectControlReplicasLag() + go this.collectThrottleHTTPStatus(firstThrottlingCollected) go func() { this.collectGeneralThrottleMetrics() From c413d508cc37e5af519fe35f22e311096ad05333 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 26 Mar 2017 13:12:56 +0300 Subject: [PATCH 08/35] HEAD instead of GET --- go/logic/throttler.go | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/go/logic/throttler.go b/go/logic/throttler.go index 49ad83a..1c2c62a 100644 --- a/go/logic/throttler.go +++ b/go/logic/throttler.go @@ -226,7 +226,7 @@ func 
(this *Throttler) collectThrottleHTTPStatus(firstThrottlingCollected chan<- if url == "" { return true, nil } - resp, err := http.Get(url) + resp, err := http.Head(url) if err != nil { return false, err } From 8d4d9cbaeca86f01f0e575d7a0e37e9593c17366 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 26 Mar 2017 15:14:36 +0300 Subject: [PATCH 09/35] throttle-http documentation --- doc/command-line-flags.md | 8 ++++++++ doc/interactive-commands.md | 1 + 2 files changed, 9 insertions(+) diff --git a/doc/command-line-flags.md b/doc/command-line-flags.md index d266f2c..5f92cc0 100644 --- a/doc/command-line-flags.md +++ b/doc/command-line-flags.md @@ -131,6 +131,14 @@ See `approve-renamed-columns` Issue the migration on a replica; do not modify data on master. Useful for validating, testing and benchmarking. See [testing-on-replica](testing-on-replica.md) +### throttle-control-replicas + +Provide a comma delimited list of replicas; `gh-ost` will throttle when any of the given replicas lag beyond `--max-lag-millis`. The list can be queried and updated dynamically via [interactive commands](interactive-commands.md). + +### throttle-http + +Provide an HTTP endpoint; `gh-ost` will issue `HEAD` requests on the given URL and throttle whenever the response status code is not `200`. The URL can be queried and updated dynamically via [interactive commands](interactive-commands.md). An empty URL disables the HTTP check. + ### timestamp-old-table Makes the _old_ table include a timestamp value. The _old_ table is what the original table is renamed to at the end of a successful migration. For example, if the table is `gh_ost_test`, then the _old_ table would normally be `_gh_ost_test_del`. With `--timestamp-old-table` it would be, for example, `_gh_ost_test_20170221103147_del`. 
diff --git a/doc/interactive-commands.md b/doc/interactive-commands.md index c6398c5..c0389e1 100644 --- a/doc/interactive-commands.md +++ b/doc/interactive-commands.md @@ -31,6 +31,7 @@ Both interfaces may serve at the same time. Both respond to simple text command, - `nice-ratio=0.5` will cause `gh-ost` to sleep for `50ms` immediately following. - `nice-ratio=1` will cause `gh-ost` to sleep for `100ms`, effectively doubling runtime - value of `2` will effectively triple the runtime; etc. +- `throttle-http`: change throttle HTTP endpoint - `throttle-query`: change throttle query - `throttle-control-replicas='replica1,replica2'`: change list of throttle-control replicas, these are replicas `gh-ost` will check. This takes a comma separated list of replica's to check and replaces the previous list. - `throttle`: force migration suspend From 7a3912da80d396e77e301b9a69e181b362269df2 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Mon, 27 Mar 2017 08:33:06 +0300 Subject: [PATCH 10/35] fixing binlog syncer double-close() --- go/binlog/gomysql_reader.go | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/go/binlog/gomysql_reader.go b/go/binlog/gomysql_reader.go index 445617a..9feca87 100644 --- a/go/binlog/gomysql_reader.go +++ b/go/binlog/gomysql_reader.go @@ -160,6 +160,10 @@ func (this *GoMySQLReader) StreamEvents(canStopStreaming func() bool, entriesCha } func (this *GoMySQLReader) Close() error { - this.binlogSyncer.Close() + // Historically there was a: + // this.binlogSyncer.Close() + // here. A new go-mysql version closes the binlog syncer connection independently. + // I will go against the sacred rules of comments and just leave this here. + // This is the year 2017. Let's see what year these comments get deleted. 
return nil } From 1aaf47ec6a09f2f653a0c08c0a3ec6779125df33 Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Tue, 28 Mar 2017 12:02:39 +1100 Subject: [PATCH 11/35] Add note that `gh-ost` _does_ work with RDS --- doc/rds.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/doc/rds.md b/doc/rds.md index 928fb83..130a7ed 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -1,3 +1,5 @@ +`gh-ost` has been updated to work with Amazon RDS; however, because GitHub does not use AWS for its databases, this documentation is community driven. If you find a bug, please [open an issue][new_issue]! + # Amazon RDS ## Limitations @@ -25,6 +27,7 @@ If you use `pt-table-checksum` as a part of your data integrity checks, you migh This tool requires binlog_format=STATEMENT, but the current binlog_format is set to ROW and an error occurred while attempting to change it. If running MySQL 5.1.29 or newer, setting binlog_format requires the SUPER privilege. You will need to manually set binlog_format to 'STATEMENT' before running this tool. ``` +[new_issue]: https://github.com/github/gh-ost/issues/new [assume_rbr_docs]: https://github.com/github/gh-ost/blob/master/doc/command-line-flags.md#assume-rbr [migrate_test_on_replica_docs]: https://github.com/github/gh-ost/blob/master/doc/cheatsheet.md#c-migratetest-on-replica [aws_replication_docs]: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.Replication.MySQLReplication.html From 55d8b3a188015254514dc6b84b0316d7a39efb38 Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Tue, 28 Mar 2017 17:03:32 +1100 Subject: [PATCH 12/35] Add preflight checklist for aurora docs --- doc/rds.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/doc/rds.md b/doc/rds.md index 130a7ed..3aee02b 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -7,7 +7,6 @@ - No `SUPER` privileges. - `gh-ost` runs should be setup use [`--assume-rbr`][assume_rbr_docs] and use `binlog_format=ROW`. 
- Aurora does not allow editing of the `read_only` parameter. While it is defined as `{TrueIfReplica}`, the parameter is non-modifiable field. -- In order to have binlogs enabled, the backup window must be set to greater than 1 day. ## Aurora @@ -27,6 +26,15 @@ If you use `pt-table-checksum` as a part of your data integrity checks, you migh This tool requires binlog_format=STATEMENT, but the current binlog_format is set to ROW and an error occurred while attempting to change it. If running MySQL 5.1.29 or newer, setting binlog_format requires the SUPER privilege. You will need to manually set binlog_format to 'STATEMENT' before running this tool. ``` +#### Preflight checklist + +Before trying to run any `gh-ost` migrations you will want to confirm the following: + +- [ ] You have a secondary cluster available that will act as a replica. Rule of thumb here has been a 1 instance per cluster to mimic MySQL-style replication as opposed to Aurora style. +- [ ] The database instance parameters and database cluster parameters are consistent between your master and replicas +- [ ] Executing `SHOW SLAVE STATUS\G` on your replica cluster displays the correct master host, binlog position, etc. 
+- [ ] Database backup retention is greater than 1 day to enable binlogs + [new_issue]: https://github.com/github/gh-ost/issues/new [assume_rbr_docs]: https://github.com/github/gh-ost/blob/master/doc/command-line-flags.md#assume-rbr [migrate_test_on_replica_docs]: https://github.com/github/gh-ost/blob/master/doc/cheatsheet.md#c-migratetest-on-replica From 3b5132bf2cd6b345989a492bf1525d4fef928d44 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 2 Apr 2017 10:43:45 +0300 Subject: [PATCH 13/35] supporting TravisCI --- .travis.yml | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/.travis.yml b/.travis.yml index 01078c0..079e425 100644 --- a/.travis.yml +++ b/.travis.yml @@ -1,7 +1,20 @@ +# http://docs.travis-ci.com/user/languages/go/ language: go -go: - - 1.6 - - tip +go: 1.8 -script: ./test.sh +os: + - linux + +env: +- MYSQL_USER=root + +before_install: + - mysql -e 'CREATE DATABASE IF NOT EXISTS test;' + +install: true + +script: script/cibuild + +notifications: + email: false From 0d799b67ea692bf05a6ed5910e29ac6a0a045108 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 2 Apr 2017 10:46:36 +0300 Subject: [PATCH 14/35] Readme badges --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index f312d2f..04c032a 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,7 @@ # gh-ost +[![build status](https://travis-ci.org/github/gh-ost.svg)](https://travis-ci.org/github/gh-ost) [![downloads](https://img.shields.io/github/downloads/github/gh-ost/total.svg)](https://github.com/github/gh-ost/releases) [![release](https://img.shields.io/github/release/github/gh-ost.svg)](https://github.com/github/gh-ost/releases) + #### GitHub's online schema migration for MySQL `gh-ost` is a triggerless online schema migration solution for MySQL. It is testable and provides pausability, dynamic control/reconfiguration, auditing, and many operational perks. 
From dfb9f888af3b8a2009cc77f16bbfae14fca48750 Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Mon, 3 Apr 2017 15:52:57 +1000 Subject: [PATCH 15/35] Add note for using hooks for stopping and starting RDS replication --- doc/rds.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/doc/rds.md b/doc/rds.md index 3aee02b..540284e 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -32,11 +32,14 @@ Before trying to run any `gh-ost` migrations you will want to confirm the follow - [ ] You have a secondary cluster available that will act as a replica. Rule of thumb here has been a 1 instance per cluster to mimic MySQL-style replication as opposed to Aurora style. - [ ] The database instance parameters and database cluster parameters are consistent between your master and replicas -- [ ] Executing `SHOW SLAVE STATUS\G` on your replica cluster displays the correct master host, binlog position, etc. +- [ ] Executing `SHOW SLAVE STATUS\G` on your replica cluster displays the correct master host, binlog position, etc. - [ ] Database backup retention is greater than 1 day to enable binlogs +- [ ] You have setup [`hooks`][ghost_hooks] to issue RDS procedures for stopping and starting replication. 
(see [github/gh-ost#163][ghost_rds_issue_tracking] for examples) [new_issue]: https://github.com/github/gh-ost/issues/new [assume_rbr_docs]: https://github.com/github/gh-ost/blob/master/doc/command-line-flags.md#assume-rbr [migrate_test_on_replica_docs]: https://github.com/github/gh-ost/blob/master/doc/cheatsheet.md#c-migratetest-on-replica [aws_replication_docs]: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.Replication.MySQLReplication.html [percona_toolkit_patch]: https://github.com/jacobbednarz/percona-toolkit/commit/0271ba6a094da446a5e5bb8d99b5c26f1777f2b9 +[ghost_hooks]: https://envato.slack.com/archives/C08KD4AQJ/p1491197511461443 +[ghost_rds_issue_tracking]: https://github.com/github/gh-ost/issues/163 From 2f48e89ea0c6cf856d7c25c57e311d021b4ecbf2 Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Mon, 3 Apr 2017 15:59:11 +1000 Subject: [PATCH 16/35] This isn't a ghost hook URL, use the _real_ one --- doc/rds.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/rds.md b/doc/rds.md index 540284e..889d480 100644 --- a/doc/rds.md +++ b/doc/rds.md @@ -41,5 +41,5 @@ Before trying to run any `gh-ost` migrations you will want to confirm the follow [migrate_test_on_replica_docs]: https://github.com/github/gh-ost/blob/master/doc/cheatsheet.md#c-migratetest-on-replica [aws_replication_docs]: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.Replication.MySQLReplication.html [percona_toolkit_patch]: https://github.com/jacobbednarz/percona-toolkit/commit/0271ba6a094da446a5e5bb8d99b5c26f1777f2b9 -[ghost_hooks]: https://envato.slack.com/archives/C08KD4AQJ/p1491197511461443 +[ghost_hooks]: https://github.com/github/gh-ost/blob/master/doc/hooks.md [ghost_rds_issue_tracking]: https://github.com/github/gh-ost/issues/163 From ebd4af1328197bc4490b4b101a5fd24a45c60565 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Mon, 3 Apr 2017 12:57:57 +0300 Subject: [PATCH 17/35] release 1.0.36 --- RELEASE_VERSION | 2 +- 1 
file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE_VERSION b/RELEASE_VERSION index 28dff43..2e9116b 100644 --- a/RELEASE_VERSION +++ b/RELEASE_VERSION @@ -1 +1 @@ -1.0.35 +1.0.36 From 098994452881f88811c3b3a1629d3b91ec3ab414 Mon Sep 17 00:00:00 2001 From: Jacob Bednarz Date: Mon, 10 Apr 2017 13:58:35 +1000 Subject: [PATCH 18/35] Add RDS link in README usage --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 04c032a..afc4cf6 100644 --- a/README.md +++ b/README.md @@ -64,6 +64,7 @@ Also see: - [what if?](doc/what-if.md) - [the fine print](doc/the-fine-print.md) - [Community questions](https://github.com/github/gh-ost/issues?q=label%3Aquestion) +- [Using `gh-ost` on AWS RDS](doc/rds.md) ## What's in a name? From 3092e9c5c0f15578e62abc1049f42e2c8145589c Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Thu, 13 Apr 2017 08:27:42 +0300 Subject: [PATCH 19/35] error and nil checks on socket connection --- go/logic/server.go | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/go/logic/server.go b/go/logic/server.go index 3faeb98..12c26b8 100644 --- a/go/logic/server.go +++ b/go/logic/server.go @@ -98,8 +98,13 @@ func (this *Server) Serve() (err error) { } func (this *Server) handleConnection(conn net.Conn) (err error) { - defer conn.Close() + if conn != nil { + defer conn.Close() + } command, _, err := bufio.NewReader(conn).ReadLine() + if err != nil { + return err + } return this.onServerCommand(string(command), bufio.NewWriter(conn)) } From acd78b392f44f166ddb9b6c36ca4d099ea8b94fc Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 23 Apr 2017 08:00:28 +0300 Subject: [PATCH 20/35] adding tests for dropping-then-adding a column --- localtests/drop-null-add-not-null/create.sql | 30 ++++++++++++++++++++ localtests/drop-null-add-not-null/extra_args | 1 + 2 files changed, 31 insertions(+) create mode 100644 localtests/drop-null-add-not-null/create.sql create mode 100644 
localtests/drop-null-add-not-null/extra_args diff --git a/localtests/drop-null-add-not-null/create.sql b/localtests/drop-null-add-not-null/create.sql new file mode 100644 index 0000000..cf54559 --- /dev/null +++ b/localtests/drop-null-add-not-null/create.sql @@ -0,0 +1,30 @@ +drop table if exists gh_ost_test; +create table gh_ost_test ( + id int auto_increment, + c1 int null, + c2 int not null, + primary key (id) +) auto_increment=1; + +insert into gh_ost_test values (null, null, 17); +insert into gh_ost_test values (null, null, 19); + +drop event if exists gh_ost_test; +delimiter ;; +create event gh_ost_test + on schedule every 1 second + starts current_timestamp + ends current_timestamp + interval 60 second + on completion not preserve + enable + do +begin + insert ignore into gh_ost_test values (101, 11, 23); + insert ignore into gh_ost_test values (102, 13, 23); + insert into gh_ost_test values (null, 17, 23); + insert into gh_ost_test values (null, null, 29); + set @last_insert_id := last_insert_id(); + -- update gh_ost_test set c2=c2+@last_insert_id where id=@last_insert_id order by id desc limit 1; + delete from gh_ost_test where id=1; + delete from gh_ost_test where c1=13; -- id=2 +end ;; diff --git a/localtests/drop-null-add-not-null/extra_args b/localtests/drop-null-add-not-null/extra_args new file mode 100644 index 0000000..948e978 --- /dev/null +++ b/localtests/drop-null-add-not-null/extra_args @@ -0,0 +1 @@ +--alter="drop column c1, add column c1 int not null after id" From 85c498511e4132c61b9188e650ce43b45b33f1e5 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 23 Apr 2017 08:23:56 +0300 Subject: [PATCH 21/35] parser recognizes DROP COLUMN tokens --- go/sql/parser.go | 30 +++++++++++++++++++++++++----- go/sql/parser_test.go | 39 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+), 5 deletions(-) diff --git a/go/sql/parser.go b/go/sql/parser.go index b81b31c..7875fcc 100644 --- a/go/sql/parser.go +++ b/go/sql/parser.go @@ 
-14,15 +14,18 @@ import ( var ( sanitizeQuotesRegexp = regexp.MustCompile("('[^']*')") renameColumnRegexp = regexp.MustCompile(`(?i)\bchange\s+(column\s+|)([\S]+)\s+([\S]+)\s+`) + dropColumnRegexp = regexp.MustCompile(`(?i)\bdrop\s+(column\s+|)([\S]+)$`) ) type Parser struct { columnRenameMap map[string]string + droppedColumns map[string]bool } func NewParser() *Parser { return &Parser{ columnRenameMap: make(map[string]string), + droppedColumns: make(map[string]bool), } } @@ -59,10 +62,9 @@ func (this *Parser) sanitizeQuotesFromAlterStatement(alterStatement string) (str return strippedStatement } -func (this *Parser) ParseAlterStatement(alterStatement string) (err error) { - alterTokens, _ := this.tokenizeAlterStatement(alterStatement) - for _, alterToken := range alterTokens { - alterToken = this.sanitizeQuotesFromAlterStatement(alterToken) +func (this *Parser) parseAlterToken(alterToken string) (err error) { + { + // rename allStringSubmatch := renameColumnRegexp.FindAllStringSubmatch(alterToken, -1) for _, submatch := range allStringSubmatch { if unquoted, err := strconv.Unquote(submatch[2]); err == nil { @@ -71,10 +73,28 @@ func (this *Parser) ParseAlterStatement(alterStatement string) (err error) { if unquoted, err := strconv.Unquote(submatch[3]); err == nil { submatch[3] = unquoted } - this.columnRenameMap[submatch[2]] = submatch[3] } } + { + // drop + allStringSubmatch := dropColumnRegexp.FindAllStringSubmatch(alterToken, -1) + for _, submatch := range allStringSubmatch { + if unquoted, err := strconv.Unquote(submatch[2]); err == nil { + submatch[2] = unquoted + } + this.droppedColumns[submatch[2]] = true + } + } + return nil +} + +func (this *Parser) ParseAlterStatement(alterStatement string) (err error) { + alterTokens, _ := this.tokenizeAlterStatement(alterStatement) + for _, alterToken := range alterTokens { + alterToken = this.sanitizeQuotesFromAlterStatement(alterToken) + this.parseAlterToken(alterToken) + } return nil } diff --git 
a/go/sql/parser_test.go b/go/sql/parser_test.go index 8039f5f..3e1d845 100644 --- a/go/sql/parser_test.go +++ b/go/sql/parser_test.go @@ -120,3 +120,42 @@ func TestSanitizeQuotesFromAlterStatement(t *testing.T) { test.S(t).ExpectEquals(strippedStatement, "change column i int ''") } } + +func TestParseAlterStatementDroppedColumns(t *testing.T) { + + { + parser := NewParser() + statement := "drop column b" + err := parser.ParseAlterStatement(statement) + test.S(t).ExpectNil(err) + test.S(t).ExpectEquals(len(parser.droppedColumns), 1) + test.S(t).ExpectTrue(parser.droppedColumns["b"]) + } + { + parser := NewParser() + statement := "drop column b, drop key c_idx, drop column `d`" + err := parser.ParseAlterStatement(statement) + test.S(t).ExpectNil(err) + test.S(t).ExpectEquals(len(parser.droppedColumns), 2) + test.S(t).ExpectTrue(parser.droppedColumns["b"]) + test.S(t).ExpectTrue(parser.droppedColumns["d"]) + } + { + parser := NewParser() + statement := "drop column b, drop key c_idx, drop column `d`, drop `e`, drop primary key, drop foreign key fk_1" + err := parser.ParseAlterStatement(statement) + test.S(t).ExpectNil(err) + test.S(t).ExpectEquals(len(parser.droppedColumns), 3) + test.S(t).ExpectTrue(parser.droppedColumns["b"]) + test.S(t).ExpectTrue(parser.droppedColumns["d"]) + test.S(t).ExpectTrue(parser.droppedColumns["e"]) + } + { + parser := NewParser() + statement := "drop column b, drop bad statement, add column i int" + err := parser.ParseAlterStatement(statement) + test.S(t).ExpectNil(err) + test.S(t).ExpectEquals(len(parser.droppedColumns), 1) + test.S(t).ExpectTrue(parser.droppedColumns["b"]) + } +} From b0469b95b58983d444b7a9966844417e6a6e1199 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 23 Apr 2017 08:37:48 +0300 Subject: [PATCH 22/35] further fixes to the test case --- localtests/drop-null-add-not-null/extra_args | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/localtests/drop-null-add-not-null/extra_args 
b/localtests/drop-null-add-not-null/extra_args index 948e978..8219c7d 100644 --- a/localtests/drop-null-add-not-null/extra_args +++ b/localtests/drop-null-add-not-null/extra_args @@ -1 +1 @@ ---alter="drop column c1, add column c1 int not null after id" +--alter="drop column c1, add column c1 int not null default 47" From 8a0f1413eb06d2d08c4c2f55fa302a1c06c05ee7 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 23 Apr 2017 08:38:35 +0300 Subject: [PATCH 23/35] dropped columns are not 'shared' and no data copy attempted for such columns --- go/base/context.go | 1 + go/logic/inspect.go | 7 +++++++ go/logic/migrator.go | 1 + go/sql/parser.go | 4 ++++ 4 files changed, 13 insertions(+) diff --git a/go/base/context.go b/go/base/context.go index 5ab6d2a..f5a7bca 100644 --- a/go/base/context.go +++ b/go/base/context.go @@ -181,6 +181,7 @@ type MigrationContext struct { UniqueKey *sql.UniqueKey SharedColumns *sql.ColumnList ColumnRenameMap map[string]string + DroppedColumnsMap map[string]bool MappedSharedColumns *sql.ColumnList MigrationRangeMinValues *sql.ColumnValues MigrationRangeMaxValues *sql.ColumnValues diff --git a/go/logic/inspect.go b/go/logic/inspect.go index 1d30cb5..181ed0b 100644 --- a/go/logic/inspect.go +++ b/go/logic/inspect.go @@ -662,7 +662,14 @@ func (this *Inspector) getSharedColumns(originalColumns, ghostColumns *sql.Colum } sharedColumnNames := []string{} for _, originalColumn := range originalColumns.Names() { + isSharedColumn := false if columnsInGhost[originalColumn] || columnsInGhost[columnRenameMap[originalColumn]] { + isSharedColumn = true + } + if this.migrationContext.DroppedColumnsMap[originalColumn] { + isSharedColumn = false + } + if isSharedColumn { sharedColumnNames = append(sharedColumnNames, originalColumn) } } diff --git a/go/logic/migrator.go b/go/logic/migrator.go index 59432a3..092039e 100644 --- a/go/logic/migrator.go +++ b/go/logic/migrator.go @@ -248,6 +248,7 @@ func (this *Migrator) validateStatement() (err error) { } 
log.Infof("Alter statement has column(s) renamed. gh-ost finds the following renames: %v; --approve-renamed-columns is given and so migration proceeds.", this.parser.GetNonTrivialRenames()) } + this.migrationContext.DroppedColumnsMap = this.parser.DroppedColumnsMap() return nil } diff --git a/go/sql/parser.go b/go/sql/parser.go index 7875fcc..7114e10 100644 --- a/go/sql/parser.go +++ b/go/sql/parser.go @@ -111,3 +111,7 @@ func (this *Parser) GetNonTrivialRenames() map[string]string { func (this *Parser) HasNonTrivialRenames() bool { return len(this.GetNonTrivialRenames()) > 0 } + +func (this *Parser) DroppedColumnsMap() map[string]bool { + return this.droppedColumns +} From 7f62efba910b082331a2915da4aafc67737c0aad Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Sun, 23 Apr 2017 08:48:06 +0300 Subject: [PATCH 24/35] tests to only check non dropped columns --- localtests/drop-null-add-not-null/ghost_columns | 1 + localtests/drop-null-add-not-null/orig_columns | 1 + 2 files changed, 2 insertions(+) create mode 100644 localtests/drop-null-add-not-null/ghost_columns create mode 100644 localtests/drop-null-add-not-null/orig_columns diff --git a/localtests/drop-null-add-not-null/ghost_columns b/localtests/drop-null-add-not-null/ghost_columns new file mode 100644 index 0000000..16f9ec0 --- /dev/null +++ b/localtests/drop-null-add-not-null/ghost_columns @@ -0,0 +1 @@ +c2 diff --git a/localtests/drop-null-add-not-null/orig_columns b/localtests/drop-null-add-not-null/orig_columns new file mode 100644 index 0000000..16f9ec0 --- /dev/null +++ b/localtests/drop-null-add-not-null/orig_columns @@ -0,0 +1 @@ +c2 From 79cee99f57f5f341bf6ff141b3c162113747fb04 Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Fri, 28 Apr 2017 15:50:51 -0700 Subject: [PATCH 25/35] support for 'coordinates' command --- doc/interactive-commands.md | 1 + go/base/context.go | 15 +++++++++++++++ go/logic/migrator.go | 7 +++++++ go/logic/server.go | 9 +++++++++ 4 files changed, 32 insertions(+) diff --git 
a/doc/interactive-commands.md b/doc/interactive-commands.md index c0389e1..9e94373 100644 --- a/doc/interactive-commands.md +++ b/doc/interactive-commands.md @@ -17,6 +17,7 @@ Both interfaces may serve at the same time. Both respond to simple text command, - `help`: shows a brief list of available commands - `status`: returns a detailed status summary of migration progress and configuration - `sup`: returns a brief status summary of migration progress +- `coordinates`: returns recent (though not exactly up to date) binary log coordinates of the inspected server - `chunk-size=`: modify the `chunk-size`; applies on next running copy-iteration - `max-lag-millis=`: modify the maximum replication lag threshold (milliseconds, minimum value is `100`, i.e. `0.1` second) - `max-load=`: modify the `max-load` config; applies on next running copy-iteration diff --git a/go/base/context.go b/go/base/context.go index 5ab6d2a..9d941c0 100644 --- a/go/base/context.go +++ b/go/base/context.go @@ -188,6 +188,8 @@ type MigrationContext struct { MigrationIterationRangeMinValues *sql.ColumnValues MigrationIterationRangeMaxValues *sql.ColumnValues + recentBinlogCoordinates mysql.BinlogCoordinates + CanStopStreaming func() bool } @@ -543,6 +545,19 @@ func (this *MigrationContext) SetNiceRatio(newRatio float64) { this.niceRatio = newRatio } +func (this *MigrationContext) GetRecentBinlogCoordinates() mysql.BinlogCoordinates { + this.throttleMutex.Lock() + defer this.throttleMutex.Unlock() + + return this.recentBinlogCoordinates +} + +func (this *MigrationContext) SetRecentBinlogCoordinates(coordinates mysql.BinlogCoordinates) { + this.throttleMutex.Lock() + defer this.throttleMutex.Unlock() + this.recentBinlogCoordinates = coordinates +} + // ReadMaxLoad parses the `--max-load` flag, which is in multiple key-value format, // such as: 'Threads_running=100,Threads_connected=500' // It only applies changes in case there's no parsing error. 
diff --git a/go/logic/migrator.go b/go/logic/migrator.go index 59432a3..3ee94cb 100644 --- a/go/logic/migrator.go +++ b/go/logic/migrator.go @@ -952,6 +952,13 @@ func (this *Migrator) initiateStreaming() error { } log.Debugf("Done streaming") }() + + go func() { + ticker := time.Tick(1 * time.Second) + for range ticker { + this.migrationContext.SetRecentBinlogCoordinates(*this.eventsStreamer.GetCurrentBinlogCoordinates()) + } + }() return nil } diff --git a/go/logic/server.go b/go/logic/server.go index 12c26b8..95fd898 100644 --- a/go/logic/server.go +++ b/go/logic/server.go @@ -144,6 +144,7 @@ func (this *Server) applyServerCommand(command string, writer *bufio.Writer) (pr fmt.Fprintln(writer, `available commands: status # Print a detailed status message sup # Print a short status message +coordinates # Print the currently inspected coordinates chunk-size= # Set a new chunk-size nice-ratio= # Set a new nice-ratio, immediate sleep after each row-copy operation, float (examples: 0 is agrressive, 0.7 adds 70% runtime, 1.0 doubles runtime, 2.0 triples runtime, ...) critical-load= # Set a new set of max-load thresholds @@ -165,6 +166,14 @@ help # This message return ForcePrintStatusOnlyRule, nil case "info", "status": return ForcePrintStatusAndHintRule, nil + case "coordinates": + { + if argIsQuestion || arg == "" { + fmt.Fprintf(writer, "%+v\n", this.migrationContext.GetRecentBinlogCoordinates()) + return NoPrintStatusRule, nil + } + return NoPrintStatusRule, fmt.Errorf("coordinates are read-only") + } case "chunk-size": { if argIsQuestion { From e017ec18e466dffb40b30c19c99b418e6345a033 Mon Sep 17 00:00:00 2001 From: Jess Breckenridge Date: Wed, 3 May 2017 14:25:31 -0600 Subject: [PATCH 26/35] - Initial documentation on contributing to gh-ost. - It is a bit sparse currently, but will give beginners an idea how on to setup the environment and run tests. - A good starting point for further PR's. 
--- doc/codingghost.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) create mode 100644 doc/codingghost.md diff --git a/doc/codingghost.md b/doc/codingghost.md new file mode 100644 index 0000000..9828dac --- /dev/null +++ b/doc/codingghost.md @@ -0,0 +1,23 @@ +# Getting started with gh-ost development. + +## Overview + +Getting started with gh-ost development is simple! + +- First clone the repository. +- From inside of the repository run `script/cibuild` +- This will bootstrap the environment if needed, format the code, build the code, and then run the unit test. + +## CI build workflow + +`script/cibuild` performs the following actions: + +- It runs `script/bootstrap` +- `script/bootstrap` run `script/ensure-go-installed` +- `script/ensure-go-installed` installs go locally if (go is not installed) || (go is not version 1.7). It also will not install go if it is already installed locally. +- `script/build` builds the binary and places in in `bin/` + +## Notes: + +Currently, `script/ensure-go-installed` will install `go` for Mac OS X and Linux. We welcome PR's to add other platforms. + From 95c9547d541e5926894418771cd4f3479b68e38e Mon Sep 17 00:00:00 2001 From: Jess Breckenridge Date: Wed, 3 May 2017 14:31:45 -0600 Subject: [PATCH 27/35] - Initial documentation on getting started developing for gh-ost. --- doc/{codingghost.md => coding-ghost.md} | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) rename doc/{codingghost.md => coding-ghost.md} (70%) diff --git a/doc/codingghost.md b/doc/coding-ghost.md similarity index 70% rename from doc/codingghost.md rename to doc/coding-ghost.md index 9828dac..0023b9f 100644 --- a/doc/codingghost.md +++ b/doc/coding-ghost.md @@ -13,11 +13,10 @@ Getting started with gh-ost development is simple! 
`script/cibuild` performs the following actions: - It runs `script/bootstrap` -- `script/bootstrap` run `script/ensure-go-installed` -- `script/ensure-go-installed` installs go locally if (go is not installed) || (go is not version 1.7). It also will not install go if it is already installed locally. -- `script/build` builds the binary and places in in `bin/` +- `script/bootstrap` runs `script/ensure-go-installed` +- `script/ensure-go-installed` installs go locally if (go is not installed) || (go is not version 1.7). It will not install go if it is already installed locally and at the correct version. +- `script/build` builds the `gh-ost` binary and places in in `bin/` ## Notes: Currently, `script/ensure-go-installed` will install `go` for Mac OS X and Linux. We welcome PR's to add other platforms. - From 13491a0d0bf1f26b73c8cdcff73902b3024d42d9 Mon Sep 17 00:00:00 2001 From: Jess Breckenridge Date: Wed, 3 May 2017 14:35:31 -0600 Subject: [PATCH 28/35] - Adding link to `coding-ghost.md` documentation. --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index afc4cf6..086fa06 100644 --- a/README.md +++ b/README.md @@ -84,6 +84,8 @@ But then a rare genetic mutation happened, and the `c` transformed into `t`. And We develop `gh-ost` at GitHub and for the community. We may have different priorities than others. From time to time we may suggest a contribution that is not on our immediate roadmap but which may appeal to others. +Please see [Coding gh-ost](https://github.com/github/gh-ost/blob/develdocs/doc/command-line-flags.md) for a guide to getting started developing with gh-ost. + ## Download/binaries/source `gh-ost` is now GA and stable. 
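Note on PATCH 21/35 through 24/35 (dropped columns): the sketch below shows how the `dropColumnRegexp` pattern added to `go/sql/parser.go` picks column names out of an `ALTER` statement. The regular expression and the `strconv.Unquote` step are copied from the patch. The `droppedColumns` helper and the naive comma split are assumptions made for illustration only; gh-ost uses its own `tokenizeAlterStatement`, which is not part of this series and may treat quoting differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Same pattern as dropColumnRegexp in PATCH 21/35: matches "drop [column] <name>"
// only when <name> is the last word of the token.
var dropColumnRegexp = regexp.MustCompile(`(?i)\bdrop\s+(column\s+|)([\S]+)$`)

// droppedColumns is a hypothetical helper (not a gh-ost function) that returns
// the set of column names dropped by an ALTER statement.
func droppedColumns(alterStatement string) map[string]bool {
	dropped := make(map[string]bool)
	// Assumption: a plain comma split stands in for gh-ost's real tokenizer.
	for _, token := range strings.Split(alterStatement, ",") {
		token = strings.TrimSpace(token)
		for _, submatch := range dropColumnRegexp.FindAllStringSubmatch(token, -1) {
			name := submatch[2]
			// Backtick-quoted names such as `d` unquote to d; bare names are kept as-is.
			if unquoted, err := strconv.Unquote(name); err == nil {
				name = unquoted
			}
			dropped[name] = true
		}
	}
	return dropped
}

func main() {
	statement := "drop column b, drop key c_idx, drop column `d`, drop `e`, drop primary key"
	fmt.Println(droppedColumns(statement))
}
```

Multi-word clauses such as `drop key c_idx` or `drop primary key` do not match, because the pattern requires a single non-space word between `drop` (optionally followed by `column`) and the end of the token. This mirrors the cases covered by `TestParseAlterStatementDroppedColumns`.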
From 7df2e0d433d5943a0e56a9574e6791adb6dd8319 Mon Sep 17 00:00:00 2001 From: Jess Breckenridge Date: Thu, 4 May 2017 12:51:00 -0600 Subject: [PATCH 29/35] - Updating PR template to reflect current build workflow --- .github/PULL_REQUEST_TEMPLATE.md | 4 +--- doc/coding-ghost.md | 9 +++------ 2 files changed, 4 insertions(+), 9 deletions(-) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 4dc48fd..8301e70 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -16,6 +16,4 @@ This PR [briefly explain what is does] > In case this PR introduced Go code changes: - [ ] contributed code is using same conventions as original code -- [ ] code is formatted via `gofmt` (please avoid `goimports`) -- [ ] code is built via `./build.sh` -- [ ] code is tested via `./test.sh` +- [ ] `script/cibuild` returns with no formatting errors, build errors or unit test errors. diff --git a/doc/coding-ghost.md b/doc/coding-ghost.md index 0023b9f..ee26f0c 100644 --- a/doc/coding-ghost.md +++ b/doc/coding-ghost.md @@ -4,18 +4,15 @@ Getting started with gh-ost development is simple! -- First clone the repository. +- First obtain the repository with `git clone` or `go get`. - From inside of the repository run `script/cibuild` - This will bootstrap the environment if needed, format the code, build the code, and then run the unit test. ## CI build workflow -`script/cibuild` performs the following actions: +`script/cibuild` performs the following actions will bootstrap the environment to build `gh-ost` correctly, build, perform syntax checks and run unit tests. -- It runs `script/bootstrap` -- `script/bootstrap` runs `script/ensure-go-installed` -- `script/ensure-go-installed` installs go locally if (go is not installed) || (go is not version 1.7). It will not install go if it is already installed locally and at the correct version. 
-- `script/build` builds the `gh-ost` binary and places in in `bin/` +If additional steps are needed, please add them into this workflow so that the workflow remains simple. ## Notes: From b79185ab6471a13f11e87540e0580c6da27af897 Mon Sep 17 00:00:00 2001 From: Jess Breckenridge Date: Thu, 4 May 2017 12:58:22 -0600 Subject: [PATCH 30/35] - Bad copy and paste. Coding gh-ost in README.md actually pointed to command-line-flags.md. --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 086fa06..8de361b 100644 --- a/README.md +++ b/README.md @@ -84,7 +84,7 @@ But then a rare genetic mutation happened, and the `c` transformed into `t`. And We develop `gh-ost` at GitHub and for the community. We may have different priorities than others. From time to time we may suggest a contribution that is not on our immediate roadmap but which may appeal to others. -Please see [Coding gh-ost](https://github.com/github/gh-ost/blob/develdocs/doc/command-line-flags.md) for a guide to getting started developing with gh-ost. +Please see [Coding gh-ost](https://github.com/github/gh-ost/blob/develdocs/doc/coding-ghost.md) for a guide to getting started developing with gh-ost. 
## Download/binaries/source From 3955a6d67f80404d1f2611b82023ba4daeda423c Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Wed, 24 May 2017 08:32:13 +0300 Subject: [PATCH 31/35] hibernate on critical-load --- go/base/context.go | 2 ++ go/cmd/gh-ost/main.go | 1 + go/logic/applier.go | 3 +++ go/logic/throttler.go | 30 ++++++++++++++++++++++++++++++ 4 files changed, 36 insertions(+) diff --git a/go/base/context.go b/go/base/context.go index c300df1..357afab 100644 --- a/go/base/context.go +++ b/go/base/context.go @@ -105,9 +105,11 @@ type MigrationContext struct { throttleQuery string throttleHTTP string ThrottleCommandedByUser int64 + HibernateUntil int64 maxLoad LoadMap criticalLoad LoadMap CriticalLoadIntervalMilliseconds int64 + CriticalLoadHibernateSeconds int64 PostponeCutOverFlagFile string CutOverLockTimeoutSeconds int64 ForceNamedCutOverCommand bool diff --git a/go/cmd/gh-ost/main.go b/go/cmd/gh-ost/main.go index f27e12b..a4f4f3e 100644 --- a/go/cmd/gh-ost/main.go +++ b/go/cmd/gh-ost/main.go @@ -112,6 +112,7 @@ func main() { maxLoad := flag.String("max-load", "", "Comma delimited status-name=threshold. e.g: 'Threads_running=100,Threads_connected=500'. When status exceeds threshold, app throttles writes") criticalLoad := flag.String("critical-load", "", "Comma delimited status-name=threshold, same format as --max-load. When status exceeds threshold, app panics and quits") flag.Int64Var(&migrationContext.CriticalLoadIntervalMilliseconds, "critical-load-interval-millis", 0, "When 0, migration immediately bails out upon meeting critical-load. When non-zero, a second check is done after given interval, and migration only bails out if 2nd check still meets critical load") + flag.Int64Var(&migrationContext.CriticalLoadHibernateSeconds, "critical-load-hibernate-seconds", 0, "When nonzero, critical-load does not panic and bail out; instead, gh-ost goes into hibernate for the specified duration. 
It will not read/write anything to from/to any server") quiet := flag.Bool("quiet", false, "quiet") verbose := flag.Bool("verbose", false, "verbose") debug := flag.Bool("debug", false, "debug mode (very verbose)") diff --git a/go/logic/applier.go b/go/logic/applier.go index 4e3f783..b167de8 100644 --- a/go/logic/applier.go +++ b/go/logic/applier.go @@ -293,6 +293,9 @@ func (this *Applier) WriteChangelogState(value string) (string, error) { func (this *Applier) InitiateHeartbeat() { var numSuccessiveFailures int64 injectHeartbeat := func() error { + if atomic.LoadInt64(&this.migrationContext.HibernateUntil) > 0 { + return nil + } if _, err := this.WriteChangelog("heartbeat", time.Now().Format(time.RFC3339Nano)); err != nil { numSuccessiveFailures++ if numSuccessiveFailures > this.migrationContext.MaxRetries() { diff --git a/go/logic/throttler.go b/go/logic/throttler.go index 1c2c62a..808fab4 100644 --- a/go/logic/throttler.go +++ b/go/logic/throttler.go @@ -38,6 +38,10 @@ func NewThrottler(applier *Applier, inspector *Inspector) *Throttler { // It merely observes the metrics collected by other components, it does not issue // its own metric collection. 
func (this *Throttler) shouldThrottle() (result bool, reason string, reasonHint base.ThrottleReasonHint) { + if hibernateUntil := atomic.LoadInt64(&this.migrationContext.HibernateUntil); hibernateUntil > 0 { + hibernateUntilTime := time.Unix(0, hibernateUntil) + return true, fmt.Sprintf("critical-load-hibernate until %+v", hibernateUntilTime), base.NoThrottleReasonHint + } generalCheckResult := this.migrationContext.GetThrottleGeneralCheckResult() if generalCheckResult.ShouldThrottle { return generalCheckResult.ShouldThrottle, generalCheckResult.Reason, generalCheckResult.ReasonHint @@ -96,6 +100,9 @@ func (this *Throttler) collectReplicationLag(firstThrottlingCollected chan<- boo if atomic.LoadInt64(&this.migrationContext.CleanupImminentFlag) > 0 { return nil } + if atomic.LoadInt64(&this.migrationContext.HibernateUntil) > 0 { + return nil + } if this.migrationContext.TestOnReplica || this.migrationContext.MigrateOnReplica { // when running on replica, the heartbeat injection is also done on the replica. 
@@ -128,6 +135,10 @@ func (this *Throttler) collectReplicationLag(firstThrottlingCollected chan<- boo // collectControlReplicasLag polls all the control replicas to get maximum lag value func (this *Throttler) collectControlReplicasLag() { + if atomic.LoadInt64(&this.migrationContext.HibernateUntil) > 0 { + return + } + replicationLagQuery := fmt.Sprintf(` select value from %s.%s where hint = 'heartbeat' and id <= 255 `, @@ -222,6 +233,9 @@ func (this *Throttler) criticalLoadIsMet() (met bool, variableName string, value // collectReplicationLag reads the latest changelog heartbeat value func (this *Throttler) collectThrottleHTTPStatus(firstThrottlingCollected chan<- bool) { collectFunc := func() (sleep bool, err error) { + if atomic.LoadInt64(&this.migrationContext.HibernateUntil) > 0 { + return true, nil + } url := this.migrationContext.GetThrottleHTTP() if url == "" { return true, nil @@ -247,6 +261,9 @@ func (this *Throttler) collectThrottleHTTPStatus(firstThrottlingCollected chan<- // collectGeneralThrottleMetrics reads the once-per-sec metrics, and stores them onto this.migrationContext func (this *Throttler) collectGeneralThrottleMetrics() error { + if atomic.LoadInt64(&this.migrationContext.HibernateUntil) > 0 { + return nil + } setThrottle := func(throttle bool, reason string, reasonHint base.ThrottleReasonHint) error { this.migrationContext.SetThrottleGeneralCheckResult(base.NewThrottleCheckResult(throttle, reason, reasonHint)) @@ -264,6 +281,19 @@ func (this *Throttler) collectGeneralThrottleMetrics() error { if err != nil { return setThrottle(true, fmt.Sprintf("%s %s", variableName, err), base.NoThrottleReasonHint) } + + if this.migrationContext.CriticalLoadHibernateSeconds > 0 { + hibernateDuration := time.Duration(this.migrationContext.CriticalLoadHibernateSeconds) * time.Second + hibernateUntilTime := time.Now().Add(hibernateDuration) + atomic.StoreInt64(&this.migrationContext.HibernateUntil, hibernateUntilTime.UnixNano()) + log.Errorf("critical-load 
met. Will hibernate for the duration of %+v, until %+v", hibernateDuration, hibernateUntilTime) + go func() { + time.Sleep(hibernateDuration) + atomic.StoreInt64(&this.migrationContext.HibernateUntil, 0) + }() + return nil + } + if criticalLoadMet && this.migrationContext.CriticalLoadIntervalMilliseconds == 0 { this.migrationContext.PanicAbort <- fmt.Errorf("critical-load met: %s=%d, >=%d", variableName, value, threshold) } From ad47f7c1477780020631627514d3ef168e8f71ff Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Wed, 24 May 2017 10:42:47 +0300 Subject: [PATCH 32/35] throttling just prior to leaving hibernation, so as to allow re-throttle checks to apply --- go/base/context.go | 5 +++-- go/logic/throttler.go | 1 + 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/go/base/context.go b/go/base/context.go index 357afab..d82b22e 100644 --- a/go/base/context.go +++ b/go/base/context.go @@ -40,8 +40,9 @@ const ( type ThrottleReasonHint string const ( - NoThrottleReasonHint ThrottleReasonHint = "NoThrottleReasonHint" - UserCommandThrottleReasonHint = "UserCommandThrottleReasonHint" + NoThrottleReasonHint ThrottleReasonHint = "NoThrottleReasonHint" + UserCommandThrottleReasonHint = "UserCommandThrottleReasonHint" + LeavingHibernationThrottleReasonHint = "LeavingHibernationThrottleReasonHint" ) const ( diff --git a/go/logic/throttler.go b/go/logic/throttler.go index 808fab4..8f21c0d 100644 --- a/go/logic/throttler.go +++ b/go/logic/throttler.go @@ -289,6 +289,7 @@ func (this *Throttler) collectGeneralThrottleMetrics() error { log.Errorf("critical-load met. 
Will hibernate for the duration of %+v, until %+v", hibernateDuration, hibernateUntilTime) go func() { time.Sleep(hibernateDuration) + this.migrationContext.SetThrottleGeneralCheckResult(base.NewThrottleCheckResult(true, "leaving hibernation", base.LeavingHibernationThrottleReasonHint)) atomic.StoreInt64(&this.migrationContext.HibernateUntil, 0) }() return nil From 8da0f60582770b13e024efa65ec4f823c1e595ca Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Wed, 24 May 2017 10:53:00 +0300 Subject: [PATCH 33/35] fixed critical-load check for hibernation --- go/logic/throttler.go | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/go/logic/throttler.go b/go/logic/throttler.go index 8f21c0d..ecc1f2b 100644 --- a/go/logic/throttler.go +++ b/go/logic/throttler.go @@ -282,11 +282,11 @@ func (this *Throttler) collectGeneralThrottleMetrics() error { return setThrottle(true, fmt.Sprintf("%s %s", variableName, err), base.NoThrottleReasonHint) } - if this.migrationContext.CriticalLoadHibernateSeconds > 0 { + if criticalLoadMet && this.migrationContext.CriticalLoadHibernateSeconds > 0 { hibernateDuration := time.Duration(this.migrationContext.CriticalLoadHibernateSeconds) * time.Second hibernateUntilTime := time.Now().Add(hibernateDuration) atomic.StoreInt64(&this.migrationContext.HibernateUntil, hibernateUntilTime.UnixNano()) - log.Errorf("critical-load met. Will hibernate for the duration of %+v, until %+v", hibernateDuration, hibernateUntilTime) + log.Errorf("critical-load met: %s=%d, >=%d. 
Will hibernate for the duration of %+v, until %+v", variableName, value, threshold, hibernateDuration, hibernateUntilTime) go func() { time.Sleep(hibernateDuration) this.migrationContext.SetThrottleGeneralCheckResult(base.NewThrottleCheckResult(true, "leaving hibernation", base.LeavingHibernationThrottleReasonHint)) From 83c2c7dc230e6019697835b4251433fb06fd3571 Mon Sep 17 00:00:00 2001 From: Greg Roodt Date: Mon, 5 Jun 2017 15:59:24 +1000 Subject: [PATCH 34/35] Update requirements-and-limitations.md --- doc/requirements-and-limitations.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/requirements-and-limitations.md b/doc/requirements-and-limitations.md index 7d246a8..c961706 100644 --- a/doc/requirements-and-limitations.md +++ b/doc/requirements-and-limitations.md @@ -40,8 +40,8 @@ The `SUPER` privilege is required for `STOP SLAVE`, `START SLAVE` operations. Th - It is not allowed to migrate a table where another table exists with same name and different upper/lower case. - For example, you may not migrate `MyTable` if another table called `MYtable` exists in the same schema. -- Amazon RDS and Google Cloud SQL are currently not supported - - We began working towards removing this limitation. See tracking issue: https://github.com/github/gh-ost/issues/163 +- Amazon RDS works, but has it's own [limitations](rds.md). +- Google Cloud SQL is currently not supported - Multisource is not supported when migrating via replica. 
It _should_ work (but never tested) when connecting directly to master (`--allow-on-master`) From d153402438cd76bce16815e5a5f324260fe36e3b Mon Sep 17 00:00:00 2001 From: Shlomi Noach Date: Wed, 21 Jun 2017 08:44:26 +0300 Subject: [PATCH 35/35] Doc update for building from source --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 8de361b..67d83de 100644 --- a/README.md +++ b/README.md @@ -94,7 +94,9 @@ Please see [Coding gh-ost](https://github.com/github/gh-ost/blob/develdocs/doc/c [Download latest release here](https://github.com/github/gh-ost/releases/latest) -`gh-ost` is a Go project; it is built with Go 1.5 with "experimental vendor". Soon to migrate to Go 1.6. See and use [build file](https://github.com/github/gh-ost/blob/master/build.sh) for compiling it on your own. +`gh-ost` is a Go project; it is built with Go `1.8` (though `1.7` should work as well). To build on your own, use either: +- [script/build](https://github.com/github/gh-ost/blob/master/script/build) - this is the same build script used by CI hence the authoritative; artifact is `./bin/gh-ost` binary. +- [build.sh](https://github.com/github/gh-ost/blob/master/build.sh) for building `tar.gz` artifacts in `/tmp/gh-ost` Generally speaking, `master` branch is stable, but only [releases](https://github.com/github/gh-ost/releases) are to be used in production.
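Note on PATCH 31/35 through 33/35 (critical-load hibernation): the sketch below isolates the gating idea those patches introduce. A single `int64` holds the wake-up deadline as Unix nanoseconds, every collector checks it atomically and backs off while it is non-zero, and a goroutine clears it once the hibernation period ends. The `throttler` type, its method names, and the millisecond argument are hypothetical simplifications; gh-ost keeps the value in `MigrationContext.HibernateUntil`, takes whole seconds from `--critical-load-hibernate-seconds`, and (per PATCH 32/35) also sets a "leaving hibernation" throttle result before clearing the deadline, which is omitted here.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// throttler is a hypothetical, stripped-down stand-in for gh-ost's Throttler.
type throttler struct {
	hibernateUntil int64 // UnixNano deadline; 0 means "not hibernating"
}

// onCriticalLoad records the wake-up deadline and returns a channel that is
// closed once hibernation is over. Milliseconds are an assumption of this
// sketch; the patch uses seconds.
func (th *throttler) onCriticalLoad(hibernateMillis int64) <-chan struct{} {
	duration := time.Duration(hibernateMillis) * time.Millisecond
	atomic.StoreInt64(&th.hibernateUntil, time.Now().Add(duration).UnixNano())
	woke := make(chan struct{})
	go func() {
		time.Sleep(duration)
		atomic.StoreInt64(&th.hibernateUntil, 0)
		close(woke)
	}()
	return woke
}

// shouldThrottle mirrors the early return added to Throttler.shouldThrottle.
func (th *throttler) shouldThrottle() (bool, string) {
	if until := atomic.LoadInt64(&th.hibernateUntil); until > 0 {
		return true, fmt.Sprintf("critical-load-hibernate until %v", time.Unix(0, until))
	}
	return false, ""
}

func main() {
	th := &throttler{}
	woke := th.onCriticalLoad(50)

	throttle, reason := th.shouldThrottle()
	fmt.Println(throttle, reason) // throttled while hibernating

	<-woke
	throttle, _ = th.shouldThrottle()
	fmt.Println(throttle) // no longer throttled by hibernation
}
```

Keeping the deadline in one atomically accessed integer lets the heartbeat writer and the lag, HTTP, and general-metrics collectors all check the same flag without taking a lock, which appears to be why each of them gained the same `atomic.LoadInt64(...) > 0` guard in PATCH 31/35.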