pt-online-schema-change VS gh-ost

Blocking Use Cases of pt-online-schema-change:

  • It is not possible to reduce the overhead of the tool to 0. Pt-online-schema-change gives you an option to define the maximum allowed replication lag and, if that threshold is crossed, it stops to copy data between the old and new table. It is also possible to pause the background process entirely. The problem is that we are talking only about the background process of running INSERTs. It is not possible to reduce the overhead caused by the fact that every operation in “yourtable” is duplicated in “yourtable_new” through triggers. If you remove the triggers, the old and new table would go out of sync without any means to sync them again. Therefore, when you run pt-online-schema-change on your system, it always adds some overhead, even if it is paused or throttled. How big overhead depends on how many writes hit the table which is undergoing a schema change.
  • Another issue is caused again by triggers – precisely by the fact that, to create triggers, one has to acquire a lock on MySQL’s metadata. This can become a serious problem if you have highly concurrent traffic or if you use longer transactions. Under such load, it may be virtually impossible to use pt-online-schema-change due to the fact that it is not able to acquire metadata lock to create the required triggers. Additionally, the process of acquiring metadata can also lock further transactions, basically grinding all database operations to halt.
  • Rename table requires lock on table any long transcation or high load keep rename table operation in waiting for mtadata lock, which cause further transactions to be on hold till completion of rename table, which may cause max connection limit reached in database.
  • Yet another problem are foreign keys – unfortunately, there is no simple way of handling them. Pt-online-schema-change gives you two methods to approach this issue. Neither of those are really good. The main issue here is that a foreign key of a given name can only refer to a single table and it sticks to it – even if you rename the table referred to, the foreign key will follow this change. This leads to the problem: after RENAME TABLE, the foreign key will point to ‘yourtable_old’, not ‘yourtable’.

 

Blocking Use Cases of gho-ost:

  • Does not support SBR(Statment based Replication) or Mixed, Requires atleast one slave with RBR(Row Based Replication).
  • Does not Support Foreign Keys;
  • Does not Support tables schema change with triggers.
  • Does not support Galera Cluster.
  • Minimal row image is not supported (which makes your binlogs grow larger), JSON and generated columns in 5.7 are not supported.
  • Network traffic is increased compared to pt-online-schema-change – not only gh-ost has to copy data but it also has to copy binary logs.

 

Benefits of gho-ost:

  • You case stop 100% load of schema change process.

 

Supporting Links:

https://severalnines.com/blog/online-schema-change-mysql-mariadb-comparing-github-s-gh-ost-vs-pt-online-schema-change

renamecollection in initial sync mongodb

renamecollection or any action using renamecollection like aggregation in oplog drop all data on node in startup2 mode and restarts initial sync from mongo version 3.2.12 and this is up to all versions in 3.4. issue has fix in version 3.6

Related Jira :
https://jira.mongodb.org/browse/SERVER-38524
https://jira.mongodb.org/browse/SERVER-4941

Logs in mongo error log..

Tue Dec 11 11:16:42.601 E REPL [repl writer worker 9] Error applying command ({ ts: Timestamp 1544373391000|135, t: 17, h: -4526450696875989552, v: 2, op: “c”, ns: “xdb.$cmd”, o: { renameCollection: “xdb.tmp.agg_out.136”, to: “xdb.xcollection”, stayTemp: false, dropTarget: true } }): OplogOperationUnsupported: Applying renameCollection not supported in initial sync: { ts: Timestamp 1544373391000|135, t: 17, h: -4526450696875989552, v: 2, op: “c”, ns: “xdb.$cmd”, o: { renameCollection: “xdb.t
mp.agg_out.136”, to: “xdb.xcollection”, stayTemp: false, dropTarget: true } }
Tue Dec 11 11:16:42.601 E REPL [repl writer worker 9] Error applying command ({ ts: Timestamp 1544373391000|135, t: 17, h: -4526450696875989552, v: 2, op: “c”, ns: “xdb.$cmd”, o: { renameCollection: “xdb.tmp.agg_out.136”, to: “xdb.xcollection”, stayTemp: false, dropTarget: true } }): OplogOperationUnsupported: Applying renameCollection not supported in initial sync: { ts: Timestamp 1544373391000|135, t: 17, h: -4526450696875989552, v: 2, op: “c”, ns: “xdb.$cmd”, o: { renameCollection: “xdb.t
mp.agg_out.136”, to: “xdb.xcollection”, stayTemp: false, dropTarget: true } }
Tue Dec 11 11:16:42.601 E REPL [replication-221] Failed to apply batch due to ‘OplogOperationUnsupported: error applying batch: Applying renameCollection not supported in initial sync: { ts: Timestamp 1544373391000|135, t: 17, h: -4526450696875989552, v: 2, op: “c”, ns: “xdb.$cmd”, o: { renameCollection: “xdb.tmp.agg_out.136”, to: “xdb.xcollection”, stayTemp: false, dropTarget: true } }’
Tue Dec 11 11:16:42.601 I REPL [replication-223] Finished fetching oplog during initial sync: CallbackCanceled: Callback canceled. Last fetched optime and hash: { ts: Timestamp 1544507199000|3, t: 18 }[-5401731233052630226]
Tue Dec 11 11:16:42.602 E REPL [replication-223] Initial sync attempt failed — attempts left: 7 cause: OplogOperationUnsupported: error applying batch: Applying renameCollection not supported in initial sync: { ts: Timestamp 1544373391000|135, t: 17, h: -4526450696875989552, v: 2, op: “c”, ns: “xdb.$cmd”, o:{ renameCollection: “xdb.tmp.agg_out.136”, to: “xdb.xcollection”, stayTemp: false, dropTarget: true } }
Tue Dec 11 11:16:43.602 I REPL [replication-222] Starting initial sync (attempt 4 of 10)
Tue Dec 11 11:16:43.620 I REPL [replication-222] sync source candidate: 172.29.126.139:27017
Tue Dec 11 11:16:43.689 I STORAGE [replication-222] dropAllDatabasesExceptLocal 4
Tue Dec 11 11:16:50.019 I REPL [replication-222] ******
Tue Dec 11 11:16:50.019 I REPL [replication-222] ******

MariaDB Replication Errno: 1677 Last_Error: Column 0 of table ‘xxx.xxx’ cannot be converted from type

Replication Broken on slave having different database version and character set with master. Below are table format on master and slave.

Master Details

  • Database version:  mysql 5.5.20
  • Table character Set : latin1
  • Table Structure :
    • CREATE TABLE `abc` (
      `a` varchar(100) DEFAULT NULL,
      `b` varchar(100) DEFAULT NULL,
      `c` varchar(100) DEFAULT NULL,
      `d` varchar(100) DEFAULT NULL,
      `e` varchar(100) DEFAULT NULL,
      `f` varchar(100) DEFAULT NULL,
      `g` varchar(100) DEFAULT NULL,
      `h` varchar(100) DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1

Slave Details

  • Database version:  MariaDB 10.2.14
  • Table character Set : utf8mb4
    • CREATE TABLE `abc ` (
      `a` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
      `b` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
      `c` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
      `d` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
      `e` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
      `f` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
      `g` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
      `h` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Replication on slave broken with below error :

MariaDB [xxx]> show slave status\G

*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: XYZ
Master_User: XYZ

Last_Errno: 1677
Last_Error: Column 0 of table ‘xxx.xxx’ cannot be converted from type ‘varchar(100)’ to type ‘varchar(100)’

SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:

To resolve the same, changed variable “slave_type_conversions” to “ALL_NON_LOSSY” and restarted replication.

MariaDB [xxx]> show global variables like ‘slave_type_conversions’;
+————————+——-+
| Variable_name | Value |
+————————+——-+
| slave_type_conversions | |
+————————+——-+

MariaDB [xxx]> show global variables like ‘slave_type_conversions’;
+————————+—————+
| Variable_name | Value |
+————————+—————+
| slave_type_conversions | ALL_NON_LOSSY |
+————————+—————+

MariaDB [xxx]> stop slave; start slave;

MariaDB [xxx]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: XYZ
Master_User: XYZ
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000163
Read_Master_Log_Pos: 451597492
Relay_Log_File: relay-log.000050
Relay_Log_Pos: 402
Relay_Master_Log_File: mysql-bin.000163
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 451597492
Relay_Log_Space: 2411116
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 34103
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: conservative
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
1 row in set (0.00 sec)

Note : To know more about variable refer below links.

Issue related to variable :