Project

General

Profile

Actions

Task #6205

open

Review encapsulation of SQLAlchemy and Psycopg2 DBALs

Added by Jan Mach over 1 year ago. Updated 6 months ago.

Status:
New
Priority:
Normal
Category:
Development - Core
Target version:
Start date:
01/24/2020
Due date:
% Done:

0%

Estimated time:
To be discussed:

Description

Working on issue #4251 revealed, that the SQLAlchemy and/or Psycopg2 DBALs may not be optimally used within the project. For example cursors or sessions my be abused in a way that goes against best practices, which can cause troubles like those in #4251, because sessions/transaction can get stuck.

It would be wise to review current implementation.


Files

test_flask_sqlalchemy.py (1.33 KB) test_flask_sqlalchemy.py Radko Krkoš, 01/04/2021 12:43 PM

Related issues

Related to Mentat - Bug #4251: User data access through SQLAlchemy keeps a transaction openClosedJan Mach08/09/2018

Actions
Actions #1

Updated by Jan Mach over 1 year ago

  • Related to Bug #4251: User data access through SQLAlchemy keeps a transaction open added
Actions #2

Updated by Jan Mach over 1 year ago

  • Target version changed from 2.7 to 2.8
Actions #3

Updated by Rajmund Hruska 9 months ago

  • Assignee changed from Jan Mach to Rajmund Hruska
Actions #4

Updated by Rajmund Hruska 8 months ago

  • To be discussed changed from No to Yes
Actions #5

Updated by Radko Krkoš 8 months ago

This issue stems from #4251, where a workaround for a specific transaction handling bug was found and recommended. Based on the videocall discussion a more streamlined approach is required, possibly by managing the transaction lifetime by limiting the life of a Session object.
Following the official documentation [1], the upcoming version of SQLAlchemy (v1.4), currently in beta, includes a Python's context manager based workflow with Session as a transaction manager and Engine as some form of a connection manager.
Based on [1], the best approach would be to keep a reference to Engine, instantiate a Session from it using a context manager and perform the data-focused actions inside that context. This is perhaps better documented by a code excerpt from [1]:

# create session and add objects
with Session(engine) as session, session.begin():
    session.add(some_object)
    session.add(some_other_object)
# inner context calls session.commit(), if there were no exceptions
# outer context calls session.close()

The current version 1.3 does not support the context manager approach while supporting the Engine/Session distinction, so a similar solution is possible [2], but the user (programmer) must both keep a reference to sessionmaker and carefully manage the lifetime of the Session object, what is error prone. It should nevertheless be a feasible approach.

Waiting for the next version might make sense here, although it is of note that the minimum required version of Python is bumped to 3.6 there [3] (not an issue for us as we are currently using 3.7.3, but might cause problems elsewhere). The migration document also promises various performance improvements in the next version.

The required changes to the code are non-trivial for both versions and the migration to v1.4 also seems to require some rewrites. I advise to do both at the same time and postpone this issue until after the SQLAlchemy v1.4 release.

[1] https://docs.sqlalchemy.org/en/14/orm/session_basics.html
[2] https://docs.sqlalchemy.org/en/13/orm/session_basics.html
[3] https://docs.sqlalchemy.org/en/14/changelog/migration_14.html

Actions #6

Updated by Radko Krkoš 8 months ago

As for the flask-sqlalchemy, it is just a simple adaptor [1] and seems to take the sessionmaker route [2]. It also seems to manage the lifetime of the session subobject. Although it is not obvious from the example code [3] (why would an object's attribute lifetime be different from the object itself?), the documentation refers to the session as scoped and declares [4]:

You have to commit the session, but you don’t have to remove it at the end of the request, Flask-SQLAlchemy does that for you.

So the way of flask-sqlalchemy should be a viable approach to correctly access the database and if not, it is a bug somewhere in the flask-sqlalchemy\SQLAlchemy\psycopg2 stack and should be treated as such.

This is all based on code review and study of the documentation, not tested hard evidence.

[1] https://github.com/pallets/flask-sqlalchemy/blob/master/src/flask_sqlalchemy/model.py
[2] https://flask-sqlalchemy.palletsprojects.com/en/2.x/api/#flask_sqlalchemy.SQLAlchemy
[3] https://flask-sqlalchemy.palletsprojects.com/en/2.x/quickstart/#a-minimal-application
[4] https://flask-sqlalchemy.palletsprojects.com/en/2.x/quickstart/#road-to-enlightenment

Actions #7

Updated by Radko Krkoš 7 months ago

Further investigating the flask-sqlalchemy, a simple testing script was prepared (attached), to determine the session management behaviour. The output, especially section "Commit - transaction start", shows that commit() in fact does not immediately start another transaction and that is deferred until the next query. Therefore, a viable approach using flask-sqlalchemy is:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
db = SQLAlchemy(app)

# New transaction
db.session.query(...)
db.session.commit()

# New transaction
db.session.query(...)
db.session.commit()

...

as opposed to rollback() that is required using the current architecture based on pure SQLAlchemy.

Actions #8

Updated by Pavel Kácha 7 months ago

Consensus on 2021-01-11 meeting was that the correct way most probably is switching to flask-sqlalchemy, which hides some sqlalchemy idiosyncracies. We would than be able to remove weird rollback introduced in #4251.

Actions #9

Updated by Pavel Kácha 6 months ago

  • To be discussed deleted (Yes)
Actions #10

Updated by Pavel Kácha 6 months ago

  • Target version changed from 2.8 to 2.9
Actions

Also available in: Atom PDF