hacking with MySQL replication

Posted by Julian Dunn on February 04, 2006
Databases

First off, I have to be honest about my biases: I am not a big fan of MySQL. It has a long list of gotchas; in many cases, it violates what I term “the principle of least surprise”. One quick example: what is the difference between this:

$ mysql -h localhost -u root mysql

and

$ mysql -h 127.0.0.1 -u root mysql

?

Answer: The first command will use the local domain socket (mysql.sock) to connect to the database, and the latter will actually open a TCP/IP connection to localhost.

Why this is relevant will become clear in a moment.

My challenge this week: Try to replicate our production databases using MySQL replication in order to get a read-only backup at the office. Actually, setting this up wasn’t too hard. I don’t think it’s a very high quality solution because the slave is kind of a dumb listener; you can’t write to the slave copy, so it’s not really clustering. MySQL Max has clustering features, but first of all, we’re not running 5.x, and second of all, I don’t really need two writeable copies.

Problems arose when I tried to use secure connections with MySQL. Frankly, this doesn’t work. I didn’t even try to do this with the replication because even manually-created client connections did not work despite the fact that I followed the very lengthy certificate generation instructions to the letter. All it ever seemed to say at me was ERROR 2026 (HY000): SSL connection error.

As you can see in my forum posting, I even tried to strace(1) the mysqld process, but because the main thread just hands off control to a child thread, I couldn’t see what it was doing. If you know how to strace a process and follow forked children, I’d love to know. (strace -f didn’t work for me — it hung mysqld — but I suspect this is because it forks a child and wait(2)s for it to complete before returning, which the strace man page says will suspend the main thread.)

Alrighty, I thought, MySQL is broken. Let’s try to use stunnel to create an SSL-secured tunnel across the Internet, and make the local slave connect to some local port — say, 43306 — which stunnel will accept, encrypt the traffic, and fire it to the master which also has an stunnel running in daemon mode. (Now you can see why I was tearing my hair out over the failure to obey the “principle of least surprise” with the -h flag to mysql. Typing

mysql -h localhost --port 99999999999999999 -u root mysql

would presumably still get you a connection through the mysql.sock even though you’d asked for a totally bogus TCP/IP port.)

I quickly discovered a bug in MySQL: the slave will not connect to any port other than 3306, even if you specify it in a CHANGE MASTER statement! This causes real problems if you’re trying to use stunnel to work around such problems as SSL being seriously broken in MySQL, and need to run it on a local port which forwards to the remote.

Of course, I filed a bug report. But it really makes me nervous to think that SSL simply does not work — I suppose you could set up an SSH tunnel, but again, you have the problem where the slave won’t connect to a port that’s not 3306 (a real problem if you have mysqld running on both ends). (Just for the record, I decided to abandon the replication exercise, so there’s no need to start sniffing our inbound office traffic for unencrypted SQL data!)

The whole manner in which SSL connections to MySQL servers works (or is supposed to work) smacks of a bolt-on solution, too. Apparently the way it works is as follows:

  1. Client connects to MySQL server on port 3306 using regular MySQL binary protocol
  2. Server asks client if it can speak SSL (or maybe it’s the other way around)
  3. If not, server may disconnect client based on the GRANT options in the user table (e.g. if REQUIRE SSL was set)

Contrast this to the manner in which PostgreSQL establishes SSL connections. Note that SSL is built into PostgreSQL and that the server will automatically detect what is being spoken. Presumably this means you could connect to PostgreSQL with openssl s_client and debug the connection that way, a method which fails with MySQL because it starts speaking its own binary protocol first even before doing any sort of SSL negotiation.

The bottom line is that it’s yet another thing, like stored procedures and transactions, which MySQL has implemented late or poorly (I’ll save my comments about why MySQL doesn’t have transactions by default for another day), and which cause me to question its robustness as an enterprise database solution.

Sadly, poor design doesn’t seem to stop most people.