Wed, 09 Sep 2009

Mimetypes and Threading don't mix

I've just spent weeks (yes, weeks) battling a bug that turns out to have been caused by everyone's favourite broken stdlib module, mimetypes. I'm far from the first to be bitten by this module's strangeness – Jacob Rus has compiled a long list of reasons why the mimetypes module is pathologically broken, while Armin Ronacher recently got a 1000% speedup just by changing the way he imported things from the module (yes, 1000%).

So consider this another little heads-up about the mimetypes module: it doesn't play nice with threads. If two threads call mimetypes.guess_type at the same time, and the module still needs to initialise its internal database, then one of the threads will fall into infinite recursion and blow your stack. What fun!

To be fair, the mimetypes module is slowly being nursed back to health, and this particular bug will be fixed in the next release. But in the meantime, if you need to do mimetype guesswork in Python, make sure you do it very carefully.
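
For what it's worth, here is a minimal sketch of what "carefully" might look like. It assumes the trouble is confined to the lazy initialisation described above, so it does two things: it calls mimetypes.init() once up front, before any threads exist, and it funnels every lookup through a lock for good measure. The names safe_guess_type and _guess_lock are made up for this example; they are not part of the stdlib.

```python
import mimetypes
import threading

# Build the module's internal database once, while we are still
# single-threaded, so no worker thread ever triggers the lazy init.
mimetypes.init()

# Hypothetical module-level lock; belt and braces on top of the eager init.
_guess_lock = threading.Lock()


def safe_guess_type(url, strict=True):
    """Hypothetical wrapper: serialise calls to mimetypes.guess_type."""
    with _guess_lock:
        return mimetypes.guess_type(url, strict)
```

Worker threads would then call safe_guess_type("index.html") instead of going to mimetypes.guess_type directly. The lock costs a little concurrency, but a mimetype lookup is cheap enough that this is unlikely to matter; if it does, the eager init() call on its own may well be enough.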