MySQL MyISAM table lock issue

Debugging War Stories

MyISAM is one of the fastest storage engines out there. It lacks a lot of things some might consider cruicial but it's good enough for most of us (Using MyISAM in production) not lucky enough to be into banking business.

Designers of MyISAM made a few corner stone decisions: use table level locking and throw out transcation support.

Table locks are by their words the fastest method of doing things in a general public relational database. What this means is that we can expect some very problematic behaviour at times.

We recently had a profiling case that involved troubleshooting a centralized DB solution with 10 web worker PHP-FPM/Nginx machines. The app was running fine most of the day and occasionally it just got stuck for like 30 seconds and nothing was being able to load. We narrowed the problem down to the database server (as usual) but it still seemed weird enough. Database server was giving them high I/O for months but since the machine was quite powerful (16x 2.4Ghz SSD RAID MySQL dedicated server with 50 Gigs of RAM) they got away with it until now. They've ruled out the backup process. (Check this out for a great resources on how to use LVM to backup MySQL consistently (here and here)

Recently however the machine started experiencing frequent hiccups leading them to belive it was a hardware issue having to do with faulty disks or SSD stuttering. After ruling the impossible out we turned back at the code. It turned out we were having a classic table locking issue.

We had a table of millions of records being constatly read and written to (SELECT and INSERT). And all queries were working out great. Except that sometimes an INSERT took about 30 seconds to complete!!! Blocking INSERTs were the key discovery that lead us to isolating the problem. After doing a SET profiling=1, then a "slow" INSERT, and SHOW PROFILE it turned out that the 30 seconds freeze was in fact 99% due to a table locking issue ( Table lock | 34.200235 ). We did a bit of googling and found no tools for MyISAM deadlocks as they exist for InnoDb so we were basically left on our own.

We did a quick look at the mysql-slow.log looking for long running queries. Turned out that an old xxxxxx xxxxxxxx developer had made s huge SELECT and then fetched millions of rows processing them on the application side. To make things worst that SELECT contained a non-indexed ORDER statement that caused the query to take some time. And due to table level locking no write (INSERT) statement was allowed to run until the read (SELECT) query is finished. As it turns out that beautiful table level locking scheme has did us a bad joke. Everything was running fine when only a few rows were SELECTed but now that rare SELECT was shutting us down every 30 minutes.

See the most probable causes of locking issues here: MySQL Table Locking Reference

And if you care for a better explanation just visit Mysql Lock Contention.

Veselin Kulov
26 Aug 2011