Replication is one of the most flexible technologies available for databases. We are implementing a new open-source, database-neutral replication product that works with MySQL, Oracle, and PostgreSQL. Naturally we've done a lot of thinking about the feature set. It's tough to pick any single feature as the most important, but one that really stands out is optional statement replication. Here's why.
Database replication products tend to replicate row changes and DDL. However, Mark Callaghan has a great example of why you want to replicate statements as well--it enables Maatkit distributed consistency checking to work. If you dissect the mk-table-checksum --replicate command you will see that it uses a nice trick. The SQL queries generate checksums into the master table and then replicate as statements rather than row updates out to slaves. That way the slaves recompute the checksum locally at the same point in the overall transaction history. Very elegant!
Replicated consistency checks are a wonderful feature for large systems that can't afford to stop in order to compare tables between servers. However, you cannot use it if your database cannot replicate statements. As Mark points out, not even all MySQL engines do this. The proposed replication additions for PostgreSQL won't support it either.
Optional statement replication is really the best kind of feature: it is useful on its own, but also enables features like consistency checking and other nice administrative tricks. We're going to put a "worm-hole" in our replication engine that allows applications to invoke statement replication at the SQL level. Can you guess how we are going to do it? If not, you'll have to wait until we release. :)
So what's your favorite database replication feature?
Measuring Docker IO overhead
17 hours ago