gh-ost

Author	SHA1	Message	Date
Shlomi Noach	b00cae11fa	retry cut-over	2016-11-17 17:10:17 +01:00
Shlomi Noach	7fa5e405d4	avoid writing heartbeat when throttle commanded by user. When throttling on user command there really is no need for injecting heartbeat. The user commanded, therefore gh-ost complies and trusts the reasoning for throttling. What this will allow is complete quiet time. This, in turn, will allow such features as relocating via orchestrator/pseudo-gtid at time of throttling	2016-10-27 14:51:38 +02:00
Shlomi Noach	ac6159791d	merged master, resolved conflicts	2016-10-26 09:57:59 +02:00
Shlomi Noach	bf92eec214	validating table structure on applier and migrator - reading column list on applier - comparing original table on applier and migrator, expecting exact column list - or else bailing out	2016-10-20 11:29:30 +02:00
Shlomi Noach	25166e33c7	solving the enum-as-part-of-pk bug	2016-10-19 15:22:29 +02:00
Shlomi Noach	f9c15127cd	simplified applier read of timezone	2016-10-14 12:56:43 +02:00
Shlomi Noach	ac304def4d	applier always uses UTC	2016-10-13 13:08:02 +02:00
Shlomi Noach	dbf50afbc7	reading time_zone settings for Inspector and Applier separately. --time-zone overrides both of them, if given	2016-10-11 16:00:26 +02:00
Shlomi Noach	6750959e1a	configurable time-zone, native time parsing	2016-10-10 11:39:57 +02:00
Shlomi Noach	0c35a811f7	setting time_zone='+00:00' on rowcopy	2016-10-08 11:06:27 +02:00
Shlomi Noach	791d963ea0	Character set recognition and manipulation - Identifying textual characters sets; converting into specific type when applying dml events - Refactored `ColumnsList`: introducing `Column` type - Refactored `unsigned` handling, as part of `Column` - `Column` type supports `convertArg()`: converting value of argument according to column data type - DB URI attempts `utf8mb4,utf8,latin1` charsets in that order (first one to be recognized wins) - Local tests filter by pattern - Local tests append table schema on failure - Local tests do not have postpone flag file - Added character set local tests: `utf8`, `utf8mb4`, `latin1`	2016-09-07 14:24:11 +02:00
Shlomi Noach	2afb86b9e4	support for millisecond throttling - `--max-lag-millis` is at least `100ms` - `--heartbeat-interval-millis` introduced; defaults `500ms`, can range `100ms` - `1s` - Control replicas lag calculated asynchronously to throttle test - aggressive when `max-lag-millis < 1000` and when `replication-lag-query` is given	2016-08-30 09:41:59 +02:00
Shlomi Noach	c7edd1ef84	Merge branch 'master' into concurrent-rowcount	2016-08-25 09:44:12 +02:00
Shlomi Noach	1773f338c2	keeping track of delta rows on concurrent count(*); this means we re-apply delta onto new estimate	2016-08-24 12:16:34 +02:00
Shlomi Noach	4c972184a8	Merge branch 'master' into sql-mode-strict	2016-08-24 09:34:00 +02:00
Shlomi Noach	56fd82a824	Merge pull request #174 from Wattpad/test-on-replica-manual-replication-control outstanding. Thank you!	2016-08-24 09:12:21 +02:00
Paulo Bittencourt	6b21ade6d0	Check for --test-on-replica-skip-replica-stop in cutOver method	2016-08-23 18:34:10 -04:00
Shlomi Noach	8b76d0e75b	DML write sets sql_mode to STRICT_ALL_TABLES	2016-08-23 11:58:52 +02:00
Shlomi Noach	b63cc3e75e	fix INSERT DML handling on renamed column	2016-08-22 16:00:15 +02:00
Shlomi Noach	9cf4819a98	Merge branch 'master' into fix-rename. Wish to incorporate important time_zone fix	2016-08-22 11:54:52 +02:00
Shlomi Noach	1376f0af23	fixed UPDATE dml on renamed column	2016-08-22 08:49:27 +02:00
Paulo Bittencourt	2e43718ef3	Add --test-on-replica-skip-replica-stop flag	2016-08-19 17:34:08 -04:00
Shlomi Noach	6d80340e4f	setting time_zone on DML apply	2016-08-19 09:06:00 +02:00
Paulo Bittencourt	a62f9e0754	Add --test-on-replica-manual-replication-control flag This will wait indefinitely for the replication status to change. This allows us to run test schema changes in RDS without needing custom RDS commands in gh-ost.	2016-08-18 11:53:25 -04:00
Shlomi Noach	75e0d12302	simplified error logic; fixed incorrect RowsEstimate handling on error	2016-08-18 13:38:23 +02:00
Shlomi Noach	74593ec010	DML write wrapped in transaction - solving the golang problem: 'sql: converting Exec argument #2's type: uint64 values with high bit set are not supported'	2016-08-18 13:31:53 +02:00
Shlomi Noach	3a0ee9b4a5	clarified commented transactional apply	2016-08-17 10:50:41 +02:00
Shlomi Noach	4c8edf6372	elaborate error message on applying event data: printing out the error, query and args	2016-08-17 06:50:40 +02:00
Shlomi Noach	596dce5993	elaborate output on error in apply dml	2016-08-15 15:23:30 +02:00
Damian Gryski	e02a49449e	all: use time.Since() instead of time.Now().Sub Patch created with: gofmt -w -r 'time.Now().Sub(a) -> time.Since(a)' .	2016-08-02 08:38:56 -04:00
Shlomi Noach	ef59a866d8	Removed legacy 'safe cut-over'. Now that we have the atomic cut-over, the former is redundant	2016-07-16 08:12:19 -06:00
Shlomi Noach	8217536898	supporting --cut-over-lock-timeout-seconds	2016-07-08 10:14:58 +02:00
Shlomi Noach	0191b2897d	an atomic cut-over implementation, as per issue #82	2016-06-27 11:08:06 +02:00
Shlomi Noach	96e8419a35	Solved cut-over stall; change of table names - Cutover would stall after `lock tables` wait-timeout due to waiting on a channel that would never be written to. This has been identified, reproduced, fixed, confirmed. - Change of table names. Here's the story: - Because we're testing this even while `pt-online-schema-change` is being used in production, the `_tbl_old` naming convention makes for a collision. - "old" table name is now `_tbl_del`, "del" standing for "delete" - ghost table name is now `_tbl_gho` - when issuing `--test-on-replica`, we keep the ghost table around, and we're also briefly renaming original table to "old". Well, this collides with a potentially existing "old" table on master (one that hasn't been dropped yet). `--test-on-replica` uses `_tbl_ght` (ghost-test) - similar problem with `--execute-on-replica`, and in this case the table doesn't stick around; calling it `_tbl_ghr` (ghost-replica) - changelog table is now `_tbl_ghc` (ghost-changelog) - To clarify, I don't want to go down the path of creating "old" tables with 2 or 3 or 4 or 5 or infinite leading underscores. I think this is very confusing and actually not operations friendly. It's OK that the migration will fail saying "hey, you ALREADY have an old table here, why don't you take care of it first", rather than create _yet_another_ `____tbl_old` table. We're always confused on which table it actually is that gets migrated, which is safe to `drop`, etc. - just after rowcopy completing, just before cutover, during cutover: marking as point in time _of interest_ so as to increase logging frequency.	2016-06-21 12:56:01 +02:00
Shlomi Noach	62b8a897e3	Retries, better visibility, documentation - Rowcopy time is bounded by copy end-time - Retries are configurable via `--default-retries` (default: `60`) - `migrator` notes the hostname - `applier` and `inspector` note `impliedKey` (`@@hostname` and `@@port`) - Added lots of code comments - Adding documentation for "triggerless design"	2016-06-19 17:55:37 +02:00
Shlomi Noach	23cb8ea7e9	Throttling & critical load - Added `--throttle-query` param (when returns > 0, throttling applies) - Added `--critical-load`, similar to `--max-load` but implies panic and quit - Recoded -load as `LoadMap` - More info on -load throttle/panic - `printStatus()` now gets printing heuristic. Always shows up on interactive `"status"` - Fixed `change column` (aka rename) handling with quotes - Removed legacy `mysqlbinlog` parser code - Added tests	2016-06-18 21:12:07 +02:00
Shlomi Noach	836d0fe119	Supporting column rename - Parsing `alter` statement to catch `change old_name new_name ...` statements - Auto deducing renamed columns - When suspecting renamed columns, requesting explicit `--approve-renamed-columns` or `--skip-renamed-columns` - updated tests	2016-06-17 08:03:18 +02:00
Shlomi Noach	96bc3804eb	test-on-replica stops replication completely	2016-06-14 12:50:07 +02:00
Shlomi Noach	97adbf1ff8	- `--cut-over` no longer mandatory; default to `safe` - Removed `CutOverVoluntaryLock` and associated code - Removed `CutOverUdfWait` - `RenameTablesRollback()` first attempts an atomic swap	2016-06-14 09:01:06 +02:00
Shlomi Noach	cb1c61ac47	- `--cut-over` no longer mandatory; default to `safe` - Removed `CutOverVoluntaryLock` and associated code - Removed `CutOverUdfWait` - `RenameTablesRollback()` first attempts an atomic swap	2016-06-14 09:00:56 +02:00
Shlomi Noach	8292f5608f	Safe cut-over - Supporting multi-step, safe cut-over phase, where queries are blocked throughout the phase, and worst case scenario is table outage (no data corruption) - Self-rolls back in case of failure (restores original table)	2016-06-14 08:35:07 +02:00
Shlomi Noach	b8c7e046a1	test-on-replica to invoke cut-over swap	2016-06-10 11:15:11 +02:00
Shlomi Noach	fc00cb2289	adding interactive user commands	2016-06-07 11:59:17 +02:00
Shlomi Noach	5375aa4f69	- Removed use of `master_pos_wait()`. It was unnecessary in the first place and introduced new problems. - Supporting `--allow-nullable-unique-key` - Tool will bail out if chosen key has nullable columns and the above is not provided - Fixed `OriginalBinlogRowImage` comparison (lower/upper case issue) - Introduced reasonable streamer reconnect sleep time	2016-05-20 12:52:14 +02:00
Shlomi Noach	9b54d0208f	- Handling gomysql.replication connection timeouts: reconnecting on last known position - `printStatus()` takes ETA into account - More info around `master_pos_wait()`	2016-05-19 15:11:36 +02:00
Shlomi Noach	ec34a5ef75	master_pos_wait is now OK to return NULL. We only care if it returns with -1	2016-05-18 15:08:47 +02:00
Shlomi Noach	9f56a84b57	Fixing single-row table migration - `BuildUniqueKeyRangeEndPreparedQuery` supports `includeRangeStartValues` argument - `applier` sends `this.migrationContext.GetIteration() == 0` as argument	2016-05-18 14:53:09 +02:00
Shlomi Noach	065d9c40ec	some messages are now Info instead of Debug	2016-05-17 11:57:43 +02:00
Shlomi Noach	9d055dbda7	renaming to gh-ost	2016-05-16 11:09:17 +02:00
Shlomi Noach	1e10f1f29e	Solved various race conditions: - Operation would terminate after events lock noticed but before applying all events: race condition where the event would be captured asynchronously. The event is now handled sequentially with the DML events, hence now safe. - Multiple rowcopy operations would still write to `rowCopyComplete` channel. This is still the case, but now we only wait for the first and then just flush (read and discard) any others, to avoid blocking - Events DML listener is only added after table creation: the problem was that with very busy tables, the events func buffer would fill up, and the "tables-created" event would be blocked. - `waitForEventsUpToLock()` unifies the waiting on all variants of complete-migration - With `--test-on-replica`, now stopping replication "nicely", using `master_pos_wait()` - With `--test-on-replica`, not throttling on replication after replication is stopped (duh) - More debug output	2016-05-16 11:03:15 +02:00

66 Commits