Imports in functions? You sure about that?
First, we can be glad that Python has such a flexible import system, allowing imports anywhere in the code. For some applications that's quite nice: it lets you avoid hard dependencies on optional modules, or load modules on demand to speed up the common case.
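As a minimal sketch of the on-demand case, a function-level import might look like the following. The function name and the choice of simplejson as the optional dependency are illustrative assumptions, not taken from the application discussed in this post.

```python
def to_json(obj):
    """Serialize obj, using an optional third-party encoder if it is installed."""
    try:
        # Hypothetical optional dependency: used only if it happens to be present.
        import simplejson as encoder
    except ImportError:
        # Standard-library fallback, imported only when this function runs.
        import json as encoder
    return encoder.dumps(obj)
```

Nothing is imported until to_json is first called, which also means the import statement executes in whatever thread makes that call.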
However, if you combine that with threading you can get bitten: Python has an import lock, which lets only one thread at a time import a module; that's pretty essential because of the large amount of shared globals involved in importing (think sys.modules, sys.path_importer_cache and friends). And the import lock is even acquired when the requested module is already loaded (i.e. present in sys.modules).
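The blocking can be reproduced with a rough sketch like the one below. It is written for Python 3, where the single global lock has been replaced by per-module locks, so both threads import the same module to trigger the wait. The module name slow_side_effect_mod and the helper demo_blocked_import are made up for illustration.

```python
import importlib
import os
import sys
import tempfile
import threading
import time


def demo_blocked_import(delay=0.5):
    """Return (module_ready, seconds_waited) for an import racing a slow one."""
    # Write a throwaway module whose import has a slow side effect.
    tmpdir = tempfile.mkdtemp()
    with open(os.path.join(tmpdir, "slow_side_effect_mod.py"), "w") as f:
        f.write("import time\ntime.sleep(%r)\nREADY = True\n" % delay)
    sys.path.insert(0, tmpdir)
    importlib.invalidate_caches()
    try:
        worker = threading.Thread(
            target=importlib.import_module, args=("slow_side_effect_mod",)
        )
        worker.start()
        time.sleep(delay / 5.0)  # let the worker get into the import first
        start = time.monotonic()
        # This import cannot return until the module is fully initialized.
        mod = importlib.import_module("slow_side_effect_mod")
        waited = time.monotonic() - start
        worker.join()
        return mod.READY, waited
    finally:
        sys.path.remove(tmpdir)
```

Calling demo_blocked_import() from the main thread shows the main thread's import waiting for most of the delay, even though it does no slow work of its own.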
A while ago, I wrote a server application that, on startup, starts a thread that imports modules, and one of these imports has a side-effect that takes quite a long time. (Which is not good practice but required by the environment it runs in.) And curiously, during that startup, no client could connect. What happened?
A bit of debugging showed: the server is a SocketServer.ThreadingTCPServer subclass, and its process_request method executes import threading, which has to wait for the import lock held by the startup thread. This was fixed easily by overriding that method.
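A sketch of such an override, under some assumptions: the class name is invented, the try/except only exists so the snippet runs under both the Python 2 module name used in this post and the Python 3 one, and the body mirrors what the mixin did at the time rather than the extra thread bookkeeping newer Python versions perform.

```python
import threading  # imported once at module level, before any request arrives

try:
    import socketserver  # Python 3 name
except ImportError:
    import SocketServer as socketserver  # Python 2 name, as used in this post


class EagerImportThreadingTCPServer(socketserver.ThreadingTCPServer):
    """Hypothetical subclass whose process_request runs no import statement."""

    def process_request(self, request, client_address):
        # Same idea as the mixin's method: handle each request in its own thread,
        # but rely on the module-level import above.
        t = threading.Thread(
            target=self.process_request_thread,
            args=(request, client_address),
        )
        t.daemon = self.daemon_threads
        t.start()
```

Because threading is already bound as a module global, accepting a connection no longer touches the import machinery at all.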
After adding and changing some code, I noticed the problem was there again. And again, I found an import, this time in Queue.__init__. (It imports threading as well.) While the local import in the SocketServer.ThreadingMixIn makes some sense, this one does not, at least to me, since after all, if you import the Queue module you intend to use it, and therefore to instantiate a Queue.
My hackish workaround was to pre-create a freelist of queues and use them in my handler ;)
But much more importantly, PEP 8 says this is a no-no.
— D on Wednesday, March 4, 2009 19:25 #
— Jesse on Wednesday, March 4, 2009 19:43 #
Why does it do it this way?
— Justus on Thursday, March 5, 2009 0:57 #
— PJE on Thursday, March 5, 2009 5:14 #
So a less intrusive hack would be to create one Queue and throw it away - then you wouldn't need other bits of your code to know about the free list.
@PJE - could you elaborate? Wouldn't any top-level loop (or background worker thread) in an application be "long-running module-level code"? How would you avoid it?
— xtian on Thursday, March 5, 2009 11:02 #
Of course, avoiding long-running module code is the best solution, but in this instance it was not easily done.
— Georg on Thursday, March 5, 2009 11:15 #
— xtian on Thursday, March 5, 2009 11:17 #
— PJE on Wednesday, March 11, 2009 16:33 #