addresses #290
Note: there is currently no issue backfilling the ghost table when the
character set changes, likely because it's an insert-into-select-from and
it all occurs within MySQL.
However, when applying DML events (UPDATE, DELETE, etc) the values are
sprintf'd into a prepared statement and due to the possibility of
migrating text column data containing invalid characters in the
destination charset, a conversion step is often necessary.
For example, when migrating a table/column from latin1 to utf8mb4, the
latin1 column may contain bytes that are not valid single-byte utf8
characters. Bytes in the \x80-\xFF range are the most common case. When
written to a utf8mb4 column without conversion, they fail because a lone
byte in that range is not a valid utf8 sequence.
Converting these texts/characters to the destination charset using
convert(? using {charset}) will convert appropriately and the
update/replace will succeed.
I only point out the "Note:" above because there are two tests added
for this: latin1text-to-utf8mb4 and latin1text-to-ut8mb4-insert.
The former is a test that fails prior to this commit. The latter is a
test that succeeds prior to this commit. Both are affected by the code
in this commit.
convert text to original charset, then destination
converting text first to the original charset and then to the
destination charset produces the most consistent results, as inserting
the binary into a utf8-charset column may encounter an error if there is
no prior context of latin1 encoding.
mysql> select hex(convert(char(189) using utf8mb4));
+---------------------------------------+
| hex(convert(char(189) using utf8mb4)) |
+---------------------------------------+
| |
+---------------------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> select hex(convert(convert(char(189) using latin1) using utf8mb4));
+-------------------------------------------------------------+
| hex(convert(convert(char(189) using latin1) using utf8mb4)) |
+-------------------------------------------------------------+
| C2BD |
+-------------------------------------------------------------+
1 row in set (0.00 sec)
as seen in this failure on 5.5.62
Error 1300: Invalid utf8mb4 character string: 'BD'; query=
replace /* gh-ost `test`.`_gh_ost_test_gho` */ into
`test`.`_gh_ost_test_gho`
(`id`, `t`)
values
(?, convert(? using utf8mb4))
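
The two-step conversion described above could be generated along these lines. This is only a sketch: `convertPlaceholder` is a hypothetical name, not gh-ost's actual function, and it assumes both charset names come from trusted schema metadata, since they are interpolated into the SQL text.

```go
package main

import "fmt"

// convertPlaceholder is a hypothetical helper (not gh-ost's actual code).
// It returns the SQL token used for one bound value: a plain "?" when no
// charset change is involved, otherwise a nested convert() that first
// interprets the value in the original charset and then converts it to the
// destination charset.
func convertPlaceholder(fromCharset, toCharset string) string {
	if fromCharset == "" || toCharset == "" || fromCharset == toCharset {
		return "?"
	}
	return fmt.Sprintf("convert(convert(? using %s) using %s)", fromCharset, toCharset)
}

func main() {
	fmt.Println(convertPlaceholder("latin1", "utf8mb4"))
	// convert(convert(? using latin1) using utf8mb4)
	fmt.Println(convertPlaceholder("utf8mb4", "utf8mb4"))
	// ?
}
```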
this test assumes a latin1-encoded table with content containing bytes
in the \x80-\xFF range, which are invalid single-byte characters in utf8
and cannot be inserted in the altered table when the column containing
these characters is changed to utf8(mb4).
since these characters cannot be inserted, gh-ost fails.
* Add a go.mod file
* run go mod vendor again
* Move to a well-supported ini file reader
* Remove GO111MODULE=off
* Use go 1.16
* Rename github.com/outbrain/golib -> github.com/openark/golib
* Remove *.go-e files
* Fix for `strconv.ParseInt: parsing "": invalid syntax` error
* Add test for '[osc]' section
Co-authored-by: Nate Wernimont <nate.wernimont@workiva.com>
* v1.1.0
* WIP: copying AUTO_INCREMENT value to ghost table
Initial commit: towards setting up a test suite
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* grepping for 'expect_table_structure' content
* Adding simple test for 'expect_table_structure' scenario
* adding tests for AUTO_INCREMENT value after row deletes. Should initially fail
* clear event beforehand
* parsing AUTO_INCREMENT from alter query, reading AUTO_INCREMENT from original table, applying AUTO_INCREMENT value onto ghost table if applicable and user has not specified AUTO_INCREMENT in alter statement
* support GetUint64
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* minor update to test
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* adding test for user defined AUTO_INCREMENT statement
* Generated column as part of UNIQUE (or PRIMARY) KEY
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* skip analysis of generated column data type in unique key
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* All MySQL DBs limited to max 3 concurrent/idle connections
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* hooks: reporting GH_OST_ETA_SECONDS. ETA stored as part of migration context
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* GH_OST_ETA_NANOSECONDS
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* N/A denoted by negative value
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* ETAUnknown constant
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* Converting enum to varchar
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* test: not null
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* first attempt at setting enum-to-string right
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* fix insert query
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* store enum values, use when populating
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* apply EnumValues to mapped column
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* fix compilation error
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* gofmt
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* Add GO111MODULE=off to build.sh
* Use golang 1.16
* Update go version in README.md
* Add missing GO111MODULE=off
* Add missing GO111MODULE=off again
* Use go1.16.3 explicitly
* Use 1.16 for CI test
* Update min go version
* Use go 1.16.4
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
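
The AUTO_INCREMENT handling listed above (parse the value from the alter query, read it from the original table, apply it onto the ghost table only if the user has not specified one) might look roughly like the following. The function names and the regular expression are assumptions for illustration, not the code in this commit.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Hypothetical pattern: matches "AUTO_INCREMENT=7" or "auto_increment 7"
// as a table option, but not the bare column attribute "auto_increment".
var autoIncrementRegexp = regexp.MustCompile(`(?i)\bauto_increment(?:\s*=\s*|\s+)(\d+)`)

// parseAutoIncrement reports the AUTO_INCREMENT value given in an alter
// statement, if any.
func parseAutoIncrement(alter string) (uint64, bool) {
	m := autoIncrementRegexp.FindStringSubmatch(alter)
	if m == nil {
		return 0, false
	}
	v, err := strconv.ParseUint(m[1], 10, 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

// ghostAutoIncrement decides whether the original table's value should be
// applied onto the ghost table. A user-supplied value in the alter statement
// takes precedence, in which case nothing extra is applied.
func ghostAutoIncrement(alter string, original uint64) (uint64, bool) {
	if _, userSet := parseAutoIncrement(alter); userSet {
		return 0, false
	}
	if original == 0 {
		return 0, false // original table has no AUTO_INCREMENT value
	}
	return original, true
}

func main() {
	fmt.Println(ghostAutoIncrement("engine=innodb", 42))
	fmt.Println(ghostAutoIncrement("engine=innodb AUTO_INCREMENT=7", 42))
}
```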
There are some legitimate retries that can occur during testing, namely
in `logic.ExpectProcess()` (in `applier.go`): we look for a process that
does exist but, timing-wise, doesn't yet have the `state` or `info`
columns populated.
Without this, the test will fail abruptly.