When to close cursors with MySQLdb

I am creating a WSGI web application and I have a MySQL database. I use MySQLdb, which provides cursors for executing statements and getting results. What is the standard practice for getting and closing cursors? In particular, how long should my cursors last? Should I get a new cursor for each transaction?

I believe that you need to close the cursor before committing the connection. Is there a significant advantage in finding sets of transactions that do not require intermediate commits, so that you do not have to get new cursors for each transaction? Is there a lot of overhead in getting new cursors, or is it just not a big deal?

+74
python mysql mysql-python
Apr 14 '18
5 answers

Instead of asking what standard practice is, since that is often unclear and subjective, you might try looking to the module itself for guidance. In general, using the with keyword as another user suggested is a great idea, but in this specific circumstance it may not give you quite the functionality you expect.

As of version 1.2.5 of the module, MySQLdb.Connection implements the context manager protocol with the following code (source on GitHub):

 def __enter__(self):
     if self.get_autocommit():
         self.query("BEGIN")
     return self.cursor()

 def __exit__(self, exc, value, tb):
     if exc:
         self.rollback()
     else:
         self.commit()

There are several existing Q&As about with already, or you can read Understanding Python's "with" statement, but essentially what happens is that __enter__ executes at the start of the with block, and __exit__ executes upon leaving it. You can use the optional syntax with EXPR as VAR to bind the object returned by __enter__ to a name, if you intend to reference that object later. So, given the above implementation, here is a simple way to query your database:

 connection = MySQLdb.connect(...)
 with connection as cursor:              # connection.__enter__ executes at this line
     cursor.execute('select 1;')
     result = cursor.fetchall()          # connection.__exit__ executes after this line
 print result                            # prints "((1L,),)"

The question is, what are the states of the connection and the cursor after exiting the with block? The __exit__ method shown above calls only self.rollback() or self.commit(), and neither of those methods goes on to call close(). The cursor itself has no __exit__ method defined — and it would not matter if it did, because with is only managing the connection. Therefore, both the connection and the cursor remain open after exiting the with block. This is easily confirmed by adding the following code to the above example:

 try:
     cursor.execute('select 1;')
     print 'cursor is open;',
 except MySQLdb.ProgrammingError:
     print 'cursor is closed;',
 if connection.open:
     print 'connection is open'
 else:
     print 'connection is closed'

You should see "cursor is open; connection is open" printed to standard output.

I believe that you need to close the cursor before committing the connection.

Why? The MySQL C API, which is the basis for MySQLdb, does not implement any cursor object, as implied in the module documentation: "MySQL does not support cursors; however, cursors are easily emulated." Indeed, the MySQLdb.cursors.BaseCursor class inherits directly from object and imposes no such restriction on cursors with respect to commit/rollback. An Oracle developer had this to say:

cnx.commit() before cur.close() sounds most logical to me. Maybe you can follow the rule: "Close the cursor if you do not need it anymore." Thus commit() before closing the cursor. In the end, for Connector/Python it does not make much difference, but for other databases it might.

I expect that's as close as you're going to get to "standard practice" on this subject.

Is there a significant advantage in finding sets of transactions that do not require intermediate commits so that you do not have to get new cursors for each transaction?

I doubt it very much, and in trying to do so you may introduce additional human error. Better to decide on a convention and stick with it.

Is there a lot of overhead for getting new cursors, or is it just not a big deal?

The overhead is negligible and does not touch the database server at all; it is entirely within the implementation of MySQLdb. You can look at BaseCursor.__init__ on GitHub if you are really curious to know what happens when you create a new cursor.

Going back to our earlier discussion of with, perhaps now you can understand why the MySQLdb.Connection class __enter__ and __exit__ methods give you a brand-new cursor object in every with block and do not bother keeping track of it or closing it at the end of the block. It is fairly lightweight and exists purely for your convenience.

If it is really important to you to micromanage the cursor object, you can use contextlib.closing to make up for the fact that the cursor object has no defined __exit__ method. For that matter, you can also use it to force the connection object to close itself upon exiting the with block. This should output "my_curs is closed; my_conn is closed":

 from contextlib import closing
 import MySQLdb

 with closing(MySQLdb.connect(...)) as my_conn:
     with closing(my_conn.cursor()) as my_curs:
         my_curs.execute('select 1;')
         result = my_curs.fetchall()
 try:
     my_curs.execute('select 1;')
     print 'my_curs is open;',
 except MySQLdb.ProgrammingError:
     print 'my_curs is closed;',
 if my_conn.open:
     print 'my_conn is open'
 else:
     print 'my_conn is closed'

Note that with closing(arg_obj) will not call the argument object's __enter__ and __exit__ methods; it will only call the argument object's close method at the end of the with block. (To see this in action, simply define a class Foo with __enter__, __exit__, and close methods containing simple print statements, and compare what happens when you do with Foo(): pass to what happens when you do with closing(Foo()): pass.) This has two significant implications:
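Here is a sketch of that experiment; the Foo class is purely illustrative, and the print() calls simply make each protocol method visible when it runs.

```python
from contextlib import closing

class Foo(object):
    def __enter__(self):
        print('__enter__')
        return self
    def __exit__(self, exc, value, tb):
        print('__exit__')
    def close(self):
        print('close')

with Foo():
    pass            # prints "__enter__" then "__exit__"

with closing(Foo()):
    pass            # prints only "close"
```

The second with block never touches Foo's __enter__ or __exit__ at all; closing wraps the object in its own context manager, whose only job is to call close on the way out.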

First, if autocommit mode is enabled, MySQLdb will BEGIN an explicit transaction on the server when you use with connection, and commit the transaction at the end of the block. These are MySQLdb's default behaviors, intended to protect you from MySQL's default behavior of immediately committing any and all DML statements. MySQLdb assumes that when you use a context manager, you want a transaction, and uses the explicit BEGIN to bypass the autocommit setting on the server. If you are used to using with connection, you might think autocommit is disabled when actually it was only being bypassed. You may get an unpleasant surprise if you add closing to your code and lose transactional integrity; you will not be able to roll back changes, you may start seeing concurrency bugs, and it may not be immediately obvious why.
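To make that first point concrete without a live server, here is a minimal sketch with a hypothetical FakeConnection class (it and its log attribute are inventions for illustration only) that copies the __enter__/__exit__ logic quoted earlier, so the BEGIN/commit behavior can be observed directly:

```python
# Hypothetical stand-in for a MySQLdb connection; not part of any real library.
class FakeConnection(object):
    def __init__(self, autocommit=True):
        self._autocommit = autocommit
        self.log = []                 # records "queries" sent to the "server"
    def get_autocommit(self):
        return self._autocommit
    def query(self, sql):
        self.log.append(sql)
    def cursor(self):
        return object()               # a dummy cursor
    def commit(self):
        self.log.append('COMMIT')
    def rollback(self):
        self.log.append('ROLLBACK')
    def __enter__(self):              # same logic as the MySQLdb source quoted above
        if self.get_autocommit():
            self.query("BEGIN")
        return self.cursor()
    def __exit__(self, exc, value, tb):
        if exc:
            self.rollback()
        else:
            self.commit()

conn = FakeConnection(autocommit=True)
with conn as cursor:
    pass
print(conn.log)  # ['BEGIN', 'COMMIT']: the block ran as one explicit transaction
```

With autocommit enabled on the "server", the with block still brackets its work in BEGIN ... COMMIT, which is exactly the bypassing behavior described above.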

Second, with closing(MySQLdb.connect(user, pass)) as VAR binds the connection object to VAR, in contrast to with MySQLdb.connect(user, pass) as VAR, which binds a new cursor object to VAR. In the latter case you have no direct access to the connection object! Instead, you have to use the cursor's connection attribute, which provides proxy access to the original connection. When the cursor is closed, its connection attribute is set to None. This results in an abandoned connection that will stick around until one of the following happens:

  • All references to the cursor are removed
  • The cursor goes out of scope
  • The connection times out
  • The connection is closed manually via server administration tools

You can test this by monitoring open connections (in Workbench or by using SHOW PROCESSLIST) while executing the following lines one at a time:

 with MySQLdb.connect(...) as my_curs:
     pass
 my_curs.close()
 my_curs.connection          # None
 my_curs.connection.close()  # throws AttributeError, but connection still open
 del my_curs                 # connection will close here
+67
Mar 24 '14 at 19:26

It's better to rewrite this using the with keyword. with will take care of closing the cursor automatically (this matters because it is an unmanaged resource). The added advantage is that it closes the cursor even if an exception is raised.

 from contextlib import closing
 import MySQLdb

 '''
 At the beginning you open a DB connection. The particular moment when
 you open the connection depends on your approach:
 - it can be inside the same function where you work with cursors
 - in the class constructor
 - etc.
 '''
 db = MySQLdb.connect("host", "user", "pass", "database")
 with closing(db.cursor()) as cur:
     cur.execute("somestuff")
     results = cur.fetchall()
     # do stuff with results

     cur.execute("insert operation")
     # call commit if you do INSERT, UPDATE or DELETE operations
     db.commit()

     cur.execute("someotherstuff")
     results2 = cur.fetchone()
     # do stuff with results2

 # at some point, when you decide that you do not need
 # the open connection anymore, you close it
 db.close()
+28
May 23 '13 at 15:59

Note: this answer is for PyMySQL, which is a drop-in replacement for MySQLdb and effectively the latest version of MySQLdb, since MySQLdb stopped being maintained. I believe everything here is also true of the legacy MySQLdb, but have not checked.

First of all, some facts:

  • Python's with syntax calls the context manager's __enter__ method before executing the body of the with block, and its __exit__ method afterwards.
  • Connections have an __enter__ method that does nothing but create and return a cursor, and an __exit__ method that either commits or rolls back (depending on whether an exception was thrown). It does not close the connection.
  • Cursors in PyMySQL are purely an abstraction implemented in Python; there is no equivalent concept in MySQL itself.¹
  • Cursors have an __enter__ method that does nothing and an __exit__ method that "closes" the cursor (which just means nulling the cursor's reference to its parent connection and throwing away any data stored on the cursor).
  • Cursors hold a reference to the connection that spawned them, but connections do not hold a reference to the cursors they have created.
  • Connections have a __del__ method that closes them.
  • Per https://docs.python.org/3/reference/datamodel.html , CPython (the default Python implementation) uses reference counting and automatically deletes an object once the number of references to it hits zero.
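The last two facts above can be seen directly in pure Python. This minimal sketch uses a hypothetical FakeConn class standing in for a connection object, with a module-level list recording when __del__ fires:

```python
closed_log = []  # records when the "connection" gets force-closed

class FakeConn(object):
    # Hypothetical stand-in for a PyMySQL connection, for illustration only.
    def __del__(self):
        # stands in for the library force-closing the server connection
        closed_log.append('closed')

c = FakeConn()
print(closed_log)        # []: the object is still referenced, so not closed yet
c = 'arbitrary value'    # refcount hits zero -> __del__ runs immediately (CPython)
print(closed_log)        # ['closed']
```

Note this immediacy is a CPython implementation detail; a garbage-collected implementation such as PyPy may run __del__ much later, which is exactly why relying on it is fragile.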

Putting these facts together, we see that naive code like this is, in theory, problematic:

 # Problematic code, at least in theory!
 import pymysql
 with pymysql.connect() as cursor:
     cursor.execute('SELECT 1')
 # ... happily carry on and do something unrelated

The problem is that nothing has closed the connection. Indeed, if you paste the code above into a Python shell and then run SHOW FULL PROCESSLIST at a MySQL shell, you will be able to see the idle connection you created. Since MySQL's default number of connections is 151, which is not huge, you could theoretically start running into problems if you had many processes keeping these connections open.

However, in CPython there is a saving grace that ensures code like my example above probably will not leave you with loads of open connections. That saving grace is that as soon as the cursor goes out of scope (for example, the function in which it was created finishes, or the cursor gets another value assigned to it), its reference count hits zero, which causes it to be deleted, which drives the connection's reference count down to zero, which causes the connection's __del__ method to be called, which force-closes the connection. If you already pasted the code above into your Python shell, you can now simulate this by running cursor = 'arbitrary value'; as soon as you do so, the connection you opened will vanish from the SHOW PROCESSLIST output.

However, relying upon this is inelegant, and could in theory fail in Python implementations other than CPython. Cleaner, in theory, would be to explicitly .close() the connection (to release the database connection without waiting for Python to destroy the object). This more robust code looks like this:

 import contextlib
 import pymysql
 with contextlib.closing(pymysql.connect()) as conn:
     with conn as cursor:
         cursor.execute('SELECT 1')

This is ugly, but does not rely upon Python destroying your objects to free up your (finite number of available) database connections.

Note that closing the cursor explicitly is completely pointless if you are already closing the connection.

Finally, to answer minor questions here:

Is there a lot of overhead for getting new cursors, or is it just not a big deal?

No, creating a cursor does not talk to MySQL at all and does practically nothing.

Is there any significant advantage in finding sets of transactions that do not require intermediate commits so you don't have to get new cursors for each transaction?

It is situational and hard to give a general answer. As https://dev.mysql.com/doc/refman/en/optimizing-innodb-transaction-management.html puts it: "an application might encounter performance issues if it commits thousands of times per second, and different performance issues if it commits only every 2-3 hours." You pay a performance overhead for every commit, but by leaving transactions open longer you increase the likelihood of other connections having to spend time waiting for locks, increase your risk of deadlocks, and potentially increase the cost of some lookups performed by other connections.
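One side of that trade-off is easy to demonstrate with the standard library's sqlite3 module as a stand-in (it follows the same DB-API; the table and values here are invented for illustration): committing frequently limits how much work a single rollback can destroy.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.isolation_level = None           # manage transactions explicitly
cur = conn.cursor()
cur.execute('CREATE TABLE t (x INTEGER)')

# Commit frequently: earlier work is already safe on disk.
cur.execute('BEGIN')
cur.execute('INSERT INTO t VALUES (1)')
cur.execute('COMMIT')

# A later transaction goes wrong and gets rolled back...
cur.execute('BEGIN')
cur.execute('INSERT INTO t VALUES (2)')
cur.execute('ROLLBACK')               # ...losing only row 2

cur.execute('SELECT COUNT(*) FROM t')
print(cur.fetchone()[0])  # 1: the committed row survived the rollback
```

Had both inserts lived in one long transaction, the rollback would have wiped out both; the flip side, per the MySQL docs quoted above, is that each of those commits costs server work, so the right batching is workload-dependent.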




¹ MySQL does have a construct it calls a cursor, but these exist only inside stored procedures; they are completely different from PyMySQL cursors and are not relevant here.

+6
Mar 23 '17 at 0:19

I think you would be better off using one cursor for all of your executions and closing it at the end of your code. It is easier to work with, and it may have efficiency benefits as well (don't quote me on that one).

 conn = MySQLdb.connect("host", "user", "pass", "database")
 cursor = conn.cursor()

 cursor.execute("somestuff")
 results = cursor.fetchall()
 # do stuff with results

 cursor.execute("someotherstuff")
 results2 = cursor.fetchall()
 # do stuff with results2

 cursor.close()

The point is that you can store the results of a cursor's execution in another variable, thereby freeing your cursor to make a second execution. You run into problems this way only if you are using fetchone() and need to make a second cursor execution before you have iterated through all the results of the first query.
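That fetchone() pitfall is easy to demonstrate with the standard library's sqlite3 module as a stand-in (same DB-API; the table and data are invented for illustration): a second execute() on the same cursor discards whatever rows you had not yet fetched.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE t (x INTEGER)')
cur.executemany('INSERT INTO t VALUES (?)', [(1,), (2,), (3,)])

cur.execute('SELECT x FROM t ORDER BY x')
first = cur.fetchone()                   # (1,) - rows 2 and 3 still pending
cur.execute('SELECT COUNT(*) FROM t')    # second execute on the same cursor...
print(cur.fetchone())                    # (3,): the count; the pending rows are gone
print(first)                             # (1,): results saved to a variable survive
```

Fetching everything into a variable with fetchall() before the next execute(), as in the snippet above, sidesteps the problem entirely.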

Otherwise, I would say just close your cursors as soon as you are done getting all the data from them. That way you do not have to worry about tying up loose ends later in your code.

+5
Jul 30 '11 at 19:06

I suggest doing it like PHP and MySQL: open the connection at the beginning of your code, before printing the first data. That way, if you get a connection error, you can display a 50x error page (I don't remember which internal error it is). Keep the connection open for the whole session and close it when you know you will no longer need it.

-4
Apr 14 '18
