SQLAlchemy: Secondary Relationship Update

I have two tables, say A and B. Both have an integer primary key, id, and there is a many-to-many relationship between them through the association table SEC.

```python
SEC = Table('sec', Base.metadata,
    Column('a_id', Integer, ForeignKey('A.id'), primary_key=True, nullable=False),
    Column('b_id', Integer, ForeignKey('B.id'), primary_key=True, nullable=False)
)

class A():
    ...
    id = Column(Integer, primary_key=True)
    ...
    rels = relationship(B, secondary=SEC)

class B():
    ...
    id = Column(Integer, primary_key=True)
    ...
```

Consider this piece of code.

```python
a = A()
b1 = B()
b2 = B()
a.rels = [b1, b2]
...
# some place later
b3 = B()
a.rels = [b1, b3]  # errors sometimes
```

Sometimes the last line raises an error:

```
duplicate key value violates unique constraint "a_b_pkey"
```

My understanding is that SQLAlchemy is trying to insert (a.id, b1.id) into the sec table again, which violates the unique constraint. Is that what's happening? If so, how can I avoid it? If not, why am I getting this error?

python postgresql orm sqlalchemy relationship
2 answers

The problem is that you need the instances you create to be unique. We can write an alternative constructor that first checks a session-local cache of pending instances, then queries the database for an existing persistent instance, and only creates a new instance as a last resort.

Here is a demonstration of such a method:

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.engine import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

engine = create_engine('sqlite:///:memory:', echo=True)
Session = sessionmaker(engine)
Base = declarative_base(engine)
session = Session()

class Role(Base):
    __tablename__ = 'role'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)

    @classmethod
    def get_unique(cls, name):
        # get the session cache, creating it if necessary
        cache = session._unique_cache = getattr(session, '_unique_cache', {})
        # create a key for memoizing
        key = (cls, name)
        # check the cache first
        o = cache.get(key)
        if o is None:
            # check the database if it's not in the cache
            o = session.query(cls).filter_by(name=name).first()
            if o is None:
                # create a new one if it's not in the database
                o = cls(name=name)
                session.add(o)
            # update the cache
            cache[key] = o
        return o

Base.metadata.create_all()

# demonstrate the cache check
r1 = Role.get_unique('admin')  # this is new
r2 = Role.get_unique('admin')  # from the cache
session.commit()  # doesn't fail

# demonstrate the database check
r1 = Role.get_unique('mod')  # this is new
session.commit()
session._unique_cache.clear()  # empty the cache
r2 = Role.get_unique('mod')  # from the database
session.commit()  # no-op

# show the final state
print(session.query(Role).all())  # two unique instances from four get_unique calls
```

The get_unique method was inspired by the SQLAlchemy wiki's UniqueObject example. This version is much less convoluted, favoring simplicity and flexibility. I have used it in production systems without any problems.

There are obvious improvements that could be made; this is just a simple example. The get_unique method could be factored out into a UniqueMixin and reused for any number of models. More flexible memoization of the arguments could be implemented. Handling the problem Ants Aasma mentions, of multiple threads inserting conflicting data, is more complex, but should be an obvious extension. I leave it to you.
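As a hedged sketch of that mixin idea (the answer only names UniqueMixin; this implementation, its `session` parameter, and its keyword-based lookup are my assumptions, not the answer's code):

```python
# Hypothetical sketch: generalize get_unique into a reusable mixin.
# Any model class can inherit it; kwargs are the uniqueness criteria.
class UniqueMixin:
    @classmethod
    def get_unique(cls, session, **kwargs):
        # per-session cache of pending instances, created on demand
        cache = session._unique_cache = getattr(session, '_unique_cache', {})
        # memoize on the class plus the sorted criteria
        key = (cls, tuple(sorted(kwargs.items())))
        obj = cache.get(key)
        if obj is None:
            # fall back to the database, then to a new instance
            obj = session.query(cls).filter_by(**kwargs).first()
            if obj is None:
                obj = cls(**kwargs)
                session.add(obj)
            cache[key] = obj
        return obj
```

A model would then declare `class Role(UniqueMixin, Base): ...` and call `Role.get_unique(session, name='admin')`.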


You mentioned the error is caused by inserting a conflicting value into the sec table. To verify that it comes from the operation you are looking at, and not from some earlier change, enable SQL logging and check what values it is trying to insert just before the error.
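SQL logging can be enabled either per engine or through the standard logging module (a sketch; the connection URL is a placeholder):

```python
import logging

# Option 1 (shown as a comment; the URL is a placeholder):
# engine = create_engine('postgresql://user:pass@host/db', echo=True)

# Option 2: configure the sqlalchemy.engine logger directly,
# which makes every emitted statement visible at INFO level.
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
```

With either option, the INSERT statements and their parameters appear in the log right before the failure.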

When you overwrite the contents of a many-to-many collection, SQLAlchemy compares the new contents with the state in the database and issues delete and insert statements accordingly. Unless you are poking around inside SQLAlchemy's internals, there are two ways you could hit this error.

The first is concurrent modification: process 1 fetches the value of a.rels and notices it is empty, while process 2 also fetches a.rels, sets it to [b1, b2], and flushes, inserting the pairs (a, b1) and (a, b2). Then process 1 sets a.rels to [b1, b3], still believing the previous contents were empty, and when it tries to flush and insert the pair (a, b1), it gets a duplicate key error. The correct action in such cases is usually to retry the transaction from the top. You can use serializable transaction isolation to instead receive a serialization error in this case, which is distinct from a business-logic error causing a duplicate key error.
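The retry-from-the-top pattern can be sketched generically (a hypothetical helper, not a SQLAlchemy API; in practice you would pass the exception types your driver raises on conflicts, such as psycopg2's IntegrityError or a serialization failure):

```python
# Sketch of retrying a whole transaction on a conflict.
# `session_factory` creates a fresh session, `work(session)` does the
# business logic, `conflict_errors` is a tuple of exception types to retry on.
def run_with_retry(session_factory, work, conflict_errors, attempts=3):
    for attempt in range(attempts):
        session = session_factory()
        try:
            result = work(session)
            session.commit()
            return result
        except conflict_errors:
            # conflicting concurrent write: roll back and start over
            session.rollback()
            if attempt == attempts - 1:
                raise
        finally:
            session.close()
```

The important point is that the retry re-reads a.rels inside the new transaction, so the second attempt sees the rows the other process inserted.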

The second case happens when you manage to convince SQLAlchemy that it doesn't need to know the database state, by setting the loading strategy of the rels attribute to noload. This can be done when defining the relationship, by adding the parameter lazy='noload', or per query, by calling .options(noload(A.rels)). SQLAlchemy will then assume the sec table has no matching rows for objects loaded with this strategy.
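For context, here is how that strategy might look declared on a mapping like the question's (a sketch of the mapping configuration only; lowercase table names are my assumption):

```python
from sqlalchemy import Column, Integer, ForeignKey, Table
from sqlalchemy.orm import relationship, declarative_base

Base = declarative_base()

sec = Table(
    'sec', Base.metadata,
    Column('a_id', Integer, ForeignKey('a.id'), primary_key=True),
    Column('b_id', Integer, ForeignKey('b.id'), primary_key=True),
)

class A(Base):
    __tablename__ = 'a'
    id = Column(Integer, primary_key=True)
    # With lazy='noload', SQLAlchemy assumes there are no existing rows
    # in `sec` for this relationship, so reassigning a.rels tries to
    # INSERT pairs that may already exist -> duplicate key error.
    rels = relationship('B', secondary=sec, lazy='noload')

class B(Base):
    __tablename__ = 'b'
    id = Column(Integer, primary_key=True)
```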

