Consider the following Python script, which uses SQLAlchemy and the Python multiprocessing module. This is with Python 2.6.6-8+b1 (default) and SQLAlchemy 0.6.3-3 (default) on Debian squeeze. It is a simplified version of some actual code.
import multiprocessing
from sqlalchemy import *
from sqlalchemy.orm import *

dbuser = ...
password = ...
dbname = ...
dbstring = "postgresql://%s:%s@localhost:5432/%s"%(dbuser, password, dbname)
db = create_engine(dbstring)
m = MetaData(db)

def make_foo(i):
    t1 = Table('foo%s'%i, m, Column('a', Integer, primary_key=True))

conn = db.connect()
for i in range(10):
    conn.execute("DROP TABLE IF EXISTS foo%s"%i)
conn.close()
db.dispose()

for i in range(10):
    make_foo(i)

m.create_all()

def do(kwargs):
    i, dbstring = kwargs['i'], kwargs['dbstring']

    db = create_engine(dbstring)
    Session = scoped_session(sessionmaker())
    Session.configure(bind=db)
    Session.execute("COMMIT; BEGIN; TRUNCATE foo%s; COMMIT;")
    Session.commit()
    db.dispose()

pool = multiprocessing.Pool(processes=5)
This script hangs with the following error message.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 259, in _handle_results
    task = get()
TypeError: ('__init__() takes at least 4 arguments (2 given)', <class 'sqlalchemy.exc.ProgrammingError'>, ('(ProgrammingError) syntax error at or near "%"\nLINE 1: COMMIT; BEGIN; TRUNCATE foo%s; COMMIT;\n ^\n',))
Of course, the syntax error here is the TRUNCATE foo%s; statement. My question is: why is the process hanging, and can I convince it to exit with an error instead, without doing major surgery on my code? This behavior is very similar to that of my actual code.
Note that the hang does not occur if the statement is replaced with something like print foobarbaz. However, the hang still occurs if we replace
Session.execute("COMMIT; BEGIN; TRUNCATE foo%s; COMMIT;")
Session.commit()
db.dispose()
with just
Session.execute("TRUNCATE foo%s;")
I am using the former version because it is closer to what my actual code is doing.
In addition, removing multiprocessing from the picture and looping over the tables sequentially makes the hang go away, and the script just exits with an error.
I am also puzzled by the form of the error, especially the TypeError: ('__init__() takes at least 4 arguments (2 given)' bit. Where does this error come from? It seems to be somewhere in the multiprocessing code.
PostgreSQL logs do not help. I see a lot of lines, for example
2012-01-09 14:16:34.174 IST [7810] 4f0aa96a.1e82/1 12/583 0 ERROR: syntax error at or near "%" at character 28
2012-01-09 14:16:34.175 IST [7810] 4f0aa96a.1e82/2 12/583 0 STATEMENT: COMMIT; BEGIN; TRUNCATE foo%s; COMMIT;
but nothing else seems relevant.
UPDATE 1: Thanks to lbolla and his insightful analysis, I was able to file a Python bug report about this. See sbt's analysis in that report, as well as here. See also the Python bug report "Fix exception pickling". So, following sbt's explanation, we can reproduce the original error with
import sqlalchemy.exc
e = sqlalchemy.exc.ProgrammingError("", {}, None)
type(e)(*e.args)
which gives
Traceback (most recent call last):
  File "<stdin>", line 9, in <module>
TypeError: __init__() takes at least 4 arguments (2 given)
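The same mechanism can be shown without SQLAlchemy. The following is only a sketch, written in Python 3 syntax rather than the 2.6 used above, and StatementFailure is a hypothetical stand-in class, not SQLAlchemy's actual exception. Its __init__ requires three arguments but forwards only one to Exception.__init__, so self.args holds a single element and the exception cannot be rebuilt from its args, which is what unpickling tries to do.

```python
import pickle


class StatementFailure(Exception):
    """Hypothetical stand-in for an exception with extra required arguments."""

    def __init__(self, message, statement, params):
        # Only `message` is forwarded, so self.args == (message,).
        Exception.__init__(self, message)
        self.statement = statement
        self.params = params


err = StatementFailure("syntax error", "TRUNCATE foo%s;", {})

# Pickling succeeds: it records the class and err.args.
payload = pickle.dumps(err)

try:
    # Unpickling calls StatementFailure(*err.args), i.e. with one argument.
    pickle.loads(payload)
    unpickle_error = None
except TypeError as exc:
    unpickle_error = exc

print(type(unpickle_error).__name__, unpickle_error)
```

In the multiprocessing case, the failing step happens in the parent process when it reads the worker's result, which fits the _handle_results frame in the traceback above.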
UPDATE 2: This has been fixed, at least for SQLAlchemy, by Mike Bayer; see the bug report "StatementError Exceptions un-pickable". At Mike's suggestion, I also reported a similar bug to psycopg2, although I did not (and do not) have an actual example of breakage. Regardless, they have apparently fixed it, although they gave no details of the fix. See "psycopg exceptions cannot be pickled". For good measure, I also reported a Python bug, "ConfigParser exceptions are not pickleable", corresponding to the SO question lbolla mentioned. It seems they want to look into it.
In any case, it looks like this will remain a problem for the foreseeable future, since Python developers by and large do not seem to be aware of the issue and therefore do not guard against it. Surprisingly, it seems that not enough people use multiprocessing for this to be a well-known problem, or maybe they just put up with it. I hope the Python developers get around to fixing it, at least for Python 3, because it is annoying.
I accepted lbolla's answer, since without his explanation of how the problem is related to exception handling, I would most likely not have understood this. I also want to thank sbt for explaining that the underlying problem is Python being unable to pickle these exceptions. I am very grateful to both of them; please vote their answers up. Thanks.
UPDATE 3: I posted a follow-up question: "Catching unpickleable exceptions and re-raising".
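One possible shape for such a catch-and-re-raise workaround is sketched below. It is an assumption of mine, not the approach from any of the linked answers, and it uses Python 3 syntax. The names WorkerError, reraise_picklable and Unpicklable are hypothetical. The idea is to wrap the worker function so that any exception is converted, inside the worker, into a plain exception carrying only the formatted traceback as a string, which pickles without trouble.

```python
import functools
import pickle
import traceback


class WorkerError(Exception):
    """Hypothetical wrapper exception; a single string argument pickles cleanly."""


def reraise_picklable(func):
    """Convert any exception raised by func into a WorkerError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Keep the original traceback as text, since the original
            # exception object may not survive pickling.
            raise WorkerError(traceback.format_exc())
    return wrapper


class Unpicklable(Exception):
    """Hypothetical exception whose __init__ needs more than self.args holds."""

    def __init__(self, message, statement):
        Exception.__init__(self, message)
        self.statement = statement


@reraise_picklable
def do(kwargs):
    # Stand-in for the real worker body, which would run the SQL statement.
    raise Unpicklable("syntax error", "TRUNCATE foo%s;")


try:
    do({'i': 0})
except WorkerError as exc:
    # The wrapper exception survives a pickle round trip.
    restored = pickle.loads(pickle.dumps(exc))
    print(type(restored).__name__)
```

With a wrapper like this applied to the function handed to the pool, the parent should receive a WorkerError it can unpickle and re-raise, at the cost of losing the original exception type.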