Multi-tenancy with SQLAlchemy

I have a web application built with Pyramid / SQLAlchemy / PostgreSQL that allows users to manage some data, and this data is almost completely independent between users. Say Alice visits alice.domain.com and can upload pictures and documents, and Bob visits bob.domain.com and can also upload pictures and documents. Alice never sees anything created by Bob and vice versa (this is a simplified example; in reality there can be a lot of data in several tables, but the idea is the same).

Now, the easiest way to organize the data on the database server is to use a single database, where each table (pictures and documents) has a user_id field, so, basically, to get all of Alice's pictures I can do something like:

    user_id = _figure_out_user_id_from_domain_name(request)
    pictures = session.query(Picture).filter(Picture.user_id == user_id).all()

All this is easy and simple, but there are some disadvantages:

  • I need to remember to always apply the additional filter condition when executing queries, otherwise Alice may see Bob's pictures (see the helper sketch after this list);
  • If there are many users, the tables can grow huge;
  • It can be difficult to split a web application between multiple machines.
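
As a minimal sketch of how the first point could be mitigated (reusing the hypothetical _figure_out_user_id_from_domain_name helper from above; tenant_query is a made-up name for illustration), the filter can be centralized in one place so it cannot be forgotten at individual call sites:

    def tenant_query(session, model, request):
        """Return a query over `model` restricted to the current tenant.

        Assumes every tenant-scoped model (Picture, Document, ...) has a
        user_id column, as described above.
        """
        user_id = _figure_out_user_id_from_domain_name(request)
        return session.query(model).filter(model.user_id == user_id)

    # usage, equivalent to the earlier snippet:
    # pictures = tenant_query(session, Picture, request).all()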

So, I think it would be very nice to somehow separate the data of each user. I can imagine two approaches:

  • There are separate tables for Alice's and Bob's pictures and documents within one database (PostgreSQL schemas seem to be the right approach here):

     documents_alice
     documents_bob
     pictures_alice
     pictures_bob

    and then, using some dark magic, "dispatch" all queries to one table or the other depending on the domain of the current request (see the sketch after this list):

     _use_dark_magic_to_configure_sqlalchemy('alice.domain.com')
     pictures = session.query(Picture).all()  # selects all of Alice's pictures from the "pictures_alice" table
     ...
     _use_dark_magic_to_configure_sqlalchemy('bob.domain.com')
     pictures = session.query(Picture).all()  # selects all of Bob's pictures from the "pictures_bob" table
  • Use a separate database for each user:

     - database_alice
         - pictures
         - documents
     - database_bob
         - pictures
         - documents

    which seems like a cleaner solution, but I'm not sure whether keeping many database connections open would require much more RAM and other resources, limiting the number of possible "tenants".
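
As an aside, not part of the original question: in newer SQLAlchemy releases (1.1 and later, so not the SQLAlchemy 0.8 setups mentioned in the answers below) the per-tenant-schema "dark magic" from the first option can be expressed with the schema_translate_map execution option, so the tables are defined once and the schema is substituted per connection. A rough sketch, assuming one PostgreSQL schema per tenant named after the user:

    from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

    metadata = MetaData()
    # defined without a hard-coded schema; the map below fills it in per tenant
    pictures = Table('pictures', metadata,
                     Column('id', Integer, primary_key=True),
                     Column('filename', String))

    engine = create_engine('postgresql://localhost/mydb')

    # 'alice' would be derived from the request's domain name
    tenant_engine = engine.execution_options(schema_translate_map={None: 'alice'})
    with tenant_engine.connect() as conn:
        rows = conn.execute(pictures.select()).fetchall()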

So the question is: does any of this make sense? If so, how do I configure SQLAlchemy to dynamically change table names on each HTTP request (for option 1), or to maintain a pool of connections to different databases and use the right connection for each request (for option 2)?
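
For option 2, a common pattern (again just a sketch, not from the original post; the database naming and the get_session helper are assumptions) is to keep a lazily built registry of per-tenant engines and session factories and pick the right one on each request:

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    _session_factories = {}

    def get_session(tenant):
        """Return a session bound to the tenant's own database (option 2)."""
        if tenant not in _session_factories:
            # assumes databases are named database_<tenant>, as in the layout above;
            # a real application would also need tenant-name validation, locking,
            # and an upper bound on the number of engines kept around
            engine = create_engine('postgresql://localhost/database_%s' % tenant,
                                   pool_size=5)
            _session_factories[tenant] = sessionmaker(bind=engine)
        return _session_factories[tenant]()

    # per request:
    # session = get_session('alice')
    # pictures = session.query(Picture).all()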

+8
python postgresql sqlalchemy multi-tenant
3 answers

OK, I ended up setting search_path at the beginning of each request using Pyramid's NewRequest event:

    from pyramid import events

    def on_new_request(event):
        schema_name = _figure_out_schema_name_from_request(event.request)
        # the schema name is interpolated straight into SQL, so it must come from
        # a trusted source (e.g. a known list of tenants), not raw user input
        DBSession.execute("SET search_path TO %s" % schema_name)

    def app(global_config, **settings):
        """ This function returns a WSGI application.

        It is usually called by the PasteDeploy framework during ``paster serve``.
        """
        ....
        config.add_subscriber(on_new_request, events.NewRequest)
        return config.make_wsgi_app()

It works very well as long as you leave transaction management to Pyramid (i.e. do not commit or roll back transactions manually and let Pyramid do it at the end of the request), which is fine, since manual transaction management is not a good approach anyway.

+2

After pondering jd's answer, I was able to achieve the same result with PostgreSQL 9.2, SQLAlchemy 0.8 and Flask 0.9:

    from flask import session
    from sqlalchemy import event
    from sqlalchemy.pool import Pool

    @event.listens_for(Pool, 'checkout')
    def on_pool_checkout(dbapi_conn, connection_rec, connection_proxy):
        tenant_id = session.get('tenant_id')
        cursor = dbapi_conn.cursor()
        if tenant_id is None:
            cursor.execute("SET search_path TO public, shared;")
        else:
            cursor.execute("SET search_path TO t" + str(tenant_id) + ", shared;")
        dbapi_conn.commit()
        cursor.close()
+9
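
A brief usage note, not part of the original answer: something has to put tenant_id into Flask's session before a connection is checked out. A hypothetical sketch (lookup_tenant_id_for_host is a made-up helper):

    from flask import Flask, request, session

    app = Flask(__name__)
    app.secret_key = 'change-me'  # Flask sessions require a secret key

    @app.before_request
    def resolve_tenant():
        # lookup_tenant_id_for_host() is a hypothetical helper that maps
        # e.g. "alice.domain.com" to Alice's numeric tenant id
        session['tenant_id'] = lookup_tenant_id_for_host(request.host)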

It works well for me to set the search path at the connection pool level rather than in the session. This example uses Flask and its thread-local proxies to pass the schema name, so you would have to change schema = current_schema._get_current_object() and the try block around it.

    from sqlalchemy.interfaces import PoolListener

    class SearchPathSetter(PoolListener):
        '''Dynamically sets the search path on connections checked out from a pool.'''

        def __init__(self, search_path_tail='shared, public'):
            self.search_path_tail = search_path_tail

        @staticmethod
        def quote_schema(dialect, schema):
            return dialect.identifier_preparer.quote_schema(schema, False)

        def checkout(self, dbapi_con, con_record, con_proxy):
            try:
                schema = current_schema._get_current_object()
            except RuntimeError:
                search_path = self.search_path_tail
            else:
                if schema:
                    search_path = self.quote_schema(con_proxy._pool._dialect, schema) + ', ' + self.search_path_tail
                else:
                    search_path = self.search_path_tail
            cursor = dbapi_con.cursor()
            cursor.execute("SET search_path TO %s;" % search_path)
            dbapi_con.commit()
            cursor.close()

At the time of engine creation:

 engine = create_engine(dsn, listeners=[SearchPathSetter()]) 
+3
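
For completeness: current_schema is not defined in the snippet above. One way it could be set up (an assumption, not part of the original answer) is with Werkzeug's thread locals, populated once per request:

    from werkzeug.local import Local

    _local = Local()
    current_schema = _local('schema')  # a LocalProxy; _get_current_object() raises RuntimeError while unset

    # e.g. in a Flask before_request handler (the lookup helper is hypothetical):
    # _local.schema = _figure_out_schema_name_from_request(request)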
