While I like the whole object-oriented approach with SQLAlchemy, sometimes I find it easier to use SQL directly. And since the entries do not have a key, we need the row id (_ROWID_) to delete the targeted entries, and I don't think the API provides it.
So, first we connect to the database:

    from sqlalchemy import create_engine

    # Engine.execute(), used below, is the SQLAlchemy 1.x API (removed in 2.0)
    db = create_engine(r'sqlite:///C:\temp\example.db')
    eng = db.engine

Then list all entries:

    for row in eng.execute("SELECT * FROM TableA;"):
        print(row)

And to display all duplicate entries where the dates are identical:

    for row in eng.execute("""
        SELECT * FROM {table}
        WHERE {field} IN (SELECT {field} FROM {table}
                          GROUP BY {field} HAVING COUNT(*) > 1)
        ORDER BY {field};
        """.format(table="TableA", field="Date")):
        print(row)

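To try the duplicate query without touching a real file, here is a minimal sketch using the standard-library `sqlite3` module instead of SQLAlchemy. The helper name `find_duplicates` and the sample rows are made up for illustration; the query itself is the one above, with `_ROWID_` added to the output so you can see which physical row each duplicate is.

```python
import sqlite3


def find_duplicates(conn, table, field):
    """Return (rowid, columns...) for every row whose `field` value repeats."""
    # Identifiers cannot be bound as parameters, so they are formatted in.
    # Only pass trusted table and column names here.
    query = """
        SELECT _ROWID_, * FROM {table}
        WHERE {field} IN (SELECT {field} FROM {table}
                          GROUP BY {field} HAVING COUNT(*) > 1)
        ORDER BY {field}, _ROWID_;
    """.format(table=table, field=field)
    return conn.execute(query).fetchall()


# Hypothetical sample data in an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Date TEXT, NormalA INTEGER, specialA INTEGER)")
conn.executemany("INSERT INTO TableA VALUES (?, ?, ?)", [
    ("2016-01-01", 1, 2),
    ("2016-01-02", 3, 4),
    ("2016-01-01", 5, 6),
])

duplicates = find_duplicates(conn, "TableA", "Date")
for row in duplicates:
    print(row)
```

Only the two rows sharing a date come back; the row with a unique date is left out.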
Now that we have identified all the duplicates, they should probably be reconciled first if their other fields differ:

    eng.execute("UPDATE TableA SET NormalA=18, specialA=20 WHERE Date = '2016-18-12';")
    eng.execute("UPDATE TableA SET NormalA=4, specialA=8 WHERE Date = '2015-18-12';")

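If there are many such corrections, it may be easier to bind the values as parameters rather than writing them into the SQL string. A small sketch with the standard-library `sqlite3` module, assuming the same column names; the starting rows and the `fixes` list are invented for the example.

```python
import sqlite3

# Hypothetical starting state: two rows for the same date with different values
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Date TEXT, NormalA INTEGER, specialA INTEGER)")
conn.executemany("INSERT INTO TableA VALUES (?, ?, ?)", [
    ("2016-18-12", 10, 20),
    ("2016-18-12", 18, 0),
])

# One (NormalA, specialA, Date) tuple per correction; values are bound, not formatted in
fixes = [(18, 20, "2016-18-12")]
with conn:  # commits on success, rolls back on error
    conn.executemany(
        "UPDATE TableA SET NormalA = ?, specialA = ? WHERE Date = ?", fixes)

rows = conn.execute("SELECT * FROM TableA ORDER BY _ROWID_").fetchall()
for row in rows:
    print(row)
```

After this, both rows for that date carry identical values, so it no longer matters which one the delete step keeps.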
And finally, keep the first inserted record and delete the more recent duplicates:

    result = eng.execute("""
        DELETE FROM {table}
        WHERE _ROWID_ NOT IN (SELECT MIN(_ROWID_) FROM {table} GROUP BY {field});
        """.format(table="TableA", field="Date"))
    print(result.rowcount)

Or keep the last inserted record and delete the other duplicates:

    result = eng.execute("""
        DELETE FROM {table}
        WHERE _ROWID_ NOT IN (SELECT MAX(_ROWID_) FROM {table} GROUP BY {field});
        """.format(table="TableA", field="Date"))
    print(result.rowcount)

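The two variants differ only in `MIN` versus `MAX`, so they can be folded into one helper. This is just a sketch, again with the standard-library `sqlite3` module rather than SQLAlchemy; `delete_duplicates`, `make_table` and the sample rows are hypothetical names and data for the demo.

```python
import sqlite3


def delete_duplicates(conn, table, field, keep="first"):
    """Delete all but one row per `field` value; return the number deleted."""
    # keep="first" keeps the lowest _ROWID_ per group, keep="last" the highest
    agg = {"first": "MIN", "last": "MAX"}[keep]
    # Identifiers are formatted in, so only pass trusted table/column names
    query = """
        DELETE FROM {table}
        WHERE _ROWID_ NOT IN (SELECT {agg}(_ROWID_) FROM {table} GROUP BY {field});
    """.format(table=table, field=field, agg=agg)
    with conn:  # commits on success, rolls back on error
        return conn.execute(query).rowcount


def make_table():
    """Build a fresh in-memory table with three rows sharing one date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TableA (Date TEXT, NormalA INTEGER, specialA INTEGER)")
    conn.executemany("INSERT INTO TableA VALUES (?, ?, ?)", [
        ("2016-01-01", 1, 2),
        ("2016-01-02", 3, 4),
        ("2016-01-01", 5, 6),
        ("2016-01-01", 7, 8),
    ])
    return conn


first = make_table()
deleted_first = delete_duplicates(first, "TableA", "Date", keep="first")
remaining_first = first.execute("SELECT * FROM TableA ORDER BY _ROWID_").fetchall()

last = make_table()
deleted_last = delete_duplicates(last, "TableA", "Date", keep="last")
remaining_last = last.execute("SELECT * FROM TableA ORDER BY _ROWID_").fetchall()

print(deleted_first, remaining_first)
print(deleted_last, remaining_last)
```

Note that "first" and "last" here mean lowest and highest `_ROWID_`, which matches insertion order only as long as SQLite has not reused row ids after earlier deletes.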
Florent B.