Task #6205
closedReview encapsulation of SQLAlchemy and Psycopg2 DBALs
100%
Description
Working on issue #4251 revealed, that the SQLAlchemy and/or Psycopg2 DBALs may not be optimally used within the project. For example cursors or sessions my be abused in a way that goes against best practices, which can cause troubles like those in #4251, because sessions/transaction can get stuck.
It would be wise to review current implementation.
Files
Related issues
Updated by Jan Mach almost 5 years ago
- Related to Bug #4251: User data access through SQLAlchemy keeps a transaction open added
Updated by Rajmund Hruška about 4 years ago
- Assignee changed from Jan Mach to Rajmund Hruška
Updated by Rajmund Hruška about 4 years ago
- To be discussed changed from No to Yes
Updated by Radko Krkoš about 4 years ago
This issue stems from #4251, where a workaround for a specific transaction handling bug was found and recommended. Based on the videocall discussion a more streamlined approach is required, possibly by managing the transaction lifetime by limiting the life of a Session
object.
Following the official documentation [1], the upcoming version of SQLAlchemy
(v1.4), currently in beta, includes a Python
's context manager based workflow with Session
as a transaction manager and Engine
as some form of a connection manager.
Based on [1], the best approach would be to keep a reference to Engine
, instantiate a Session
from it using a context manager and perform the data-focused actions inside that context. This is perhaps better documented by a code excerpt from [1]:
# create session and add objects with Session(engine) as session, session.begin(): session.add(some_object) session.add(some_other_object) # inner context calls session.commit(), if there were no exceptions # outer context calls session.close()
The current version 1.3 does not support the context manager approach while supporting the Engine
/Session
distinction, so a similar solution is possible [2], but the user (programmer) must both keep a reference to sessionmaker
and carefully manage the lifetime of the Session
object, what is error prone. It should nevertheless be a feasible approach.
Waiting for the next version might make sense here, although it is of note that the minimum required version of Python
is bumped to 3.6 there [3] (not an issue for us as we are currently using 3.7.3, but might cause problems elsewhere). The migration document also promises various performance improvements in the next version.
The required changes to the code are non-trivial for both versions and the migration to v1.4 also seems to require some rewrites. I advise to do both at the same time and postpone this issue until after the SQLAlchemy v1.4
release.
[1] https://docs.sqlalchemy.org/en/14/orm/session_basics.html
[2] https://docs.sqlalchemy.org/en/13/orm/session_basics.html
[3] https://docs.sqlalchemy.org/en/14/changelog/migration_14.html
Updated by Radko Krkoš about 4 years ago
As for the flask-sqlalchemy
, it is just a simple adaptor [1] and seems to take the sessionmaker
route [2]. It also seems to manage the lifetime of the session subobject. Although it is not obvious from the example code [3] (why would an object's attribute lifetime be different from the object itself?), the documentation refers to the session
as scoped and declares [4]:
You have to commit the session, but you don’t have to remove it at the end of the request, Flask-SQLAlchemy does that for you.
So the way of flask-sqlalchemy
should be a viable approach to correctly access the database and if not, it is a bug somewhere in the flask-sqlalchemy
\SQLAlchemy
\psycopg2
stack and should be treated as such.
This is all based on code review and study of the documentation, not tested hard evidence.
[1] https://github.com/pallets/flask-sqlalchemy/blob/master/src/flask_sqlalchemy/model.py
[2] https://flask-sqlalchemy.palletsprojects.com/en/2.x/api/#flask_sqlalchemy.SQLAlchemy
[3] https://flask-sqlalchemy.palletsprojects.com/en/2.x/quickstart/#a-minimal-application
[4] https://flask-sqlalchemy.palletsprojects.com/en/2.x/quickstart/#road-to-enlightenment
Updated by Radko Krkoš almost 4 years ago
- File test_flask_sqlalchemy.py test_flask_sqlalchemy.py added
Further investigating the flask-sqlalchemy
, a simple testing script was prepared (attached), to determine the session management behaviour. The output, especially section "Commit - transaction start", shows that commit()
in fact does not immediately start another transaction and that is deferred until the next query. Therefore, a viable approach using flask-sqlalchemy
is:
from flask import Flask from flask_sqlalchemy import SQLAlchemy app = Flask(__name__) db = SQLAlchemy(app) # New transaction db.session.query(...) db.session.commit() # New transaction db.session.query(...) db.session.commit() ...
as opposed to rollback()
that is required using the current architecture based on pure SQLAlchemy
.
Updated by Pavel Kácha almost 4 years ago
Consensus on 2021-01-11 meeting was that the correct way most probably is switching to flask-sqlalchemy, which hides some sqlalchemy idiosyncracies. We would than be able to remove weird rollback introduced in #4251.
Updated by Pavel Kácha almost 4 years ago
- Target version changed from 2.8 to 2.9
Updated by Rajmund Hruška about 3 years ago
- Status changed from New to Feedback
- To be discussed set to Yes
I tried using flask_sqlalchemy to get rid of that rollback in 0d12bd68. Although the code is executable and all tests passed, when I check the running queries I see that the query from #4251 is still running. So the original problem is not solved yet.
From my understanding, I used the flask_sqlalchemy in the same way as in the example posted in #note-7 so it should be working. That is not the case though.
Updated by Radko Krkoš about 3 years ago
Rajmund Hruska wrote in #note-11:
From my understanding, I used the flask_sqlalchemy in the same way as in the example posted in #note-7 so it should be working.
Not quite. You need to signal an end of transaction to the ORM somehow. In #note-7, it is done by calling db.session.commit()
. In whois.py
, after being done with the loop of queries (somewhere about where previously the storage.session.rollback()
was), the storage.session.commit()
method should be called. That should close the running transaction, without it, the transaction will remain open. The symptoms are the same as for #4251, but the cause is not a deficiency in the new ORM (as was with the prior one), but in the way it is used, see #note-6 and original the source [4,#note-6] for more details.
I did not run any code now, so take all this with a grain of salt. Is my reasoning understandable?
Updated by Radko Krkoš about 3 years ago
As was discussed on the meeting, with regards also to #4569, access to SQLAlchemy's internals is needed even in the flask_sqlalchemy
. Based on the documentation [1], this might work:
from flask_sqlalchemy import BaseQuery
[1] https://flask-sqlalchemy.palletsprojects.com/en/2.x/customizing/#query-class
Updated by Rajmund Hruška about 3 years ago
- Status changed from Feedback to In Progress
- % Done changed from 0 to 50
- To be discussed changed from Yes to No
Radko Krkoš wrote in #note-13:
As was discussed on the meeting, with regards also to #4569, access to SQLAlchemy's internals is needed even in the
flask_sqlalchemy
. Based on the documentation [1], this might work:[...]
[1] https://flask-sqlalchemy.palletsprojects.com/en/2.x/customizing/#query-class
I found out that Query
can be found at flask_sqlalchemy.orm.Query
, so I used that class.
I can't fully remove sqlalchemy
module because sqlalchemy.exc.OperationalError
is used and exc
module is not present in the flask_sqlalchemy
module.
In 7c1dce70 I also added the RetryingQuery
back, which I removed in the previous commit just for test purposes.
As it was discussed during the meeting, I will check if flask_sqlalchemy
can be used at other places where insert/delete queries are used.
Updated by Radko Krkoš about 3 years ago
Rajmund Hruska wrote in #note-14:
I can't fully remove
sqlalchemy
module becausesqlalchemy.exc.OperationalError
is used andexc
module is not present in theflask_sqlalchemy
module.
I have found:
flask_sqlalchemy.orm.exc.sa_exc.OperationalError
Not thoroughly tested.
Also, there seems to be proper support for model definition:
https://flask-sqlalchemy.palletsprojects.com/en/2.x/models/
Lastly, I would advise to redo the patches from scratch, the commit history is getting hard to understand.
Updated by Rajmund Hruška about 3 years ago
Radko Krkoš wrote in #note-15:
Rajmund Hruska wrote in #note-14:
I can't fully remove
sqlalchemy
module becausesqlalchemy.exc.OperationalError
is used andexc
module is not present in theflask_sqlalchemy
module.I have found:
[...]
Not thoroughly tested.
I tried running the following code:
import flask_sqlalchemy
import sqlalchemy
def func():
raise sqlalchemy.exc.OperationalError('Test', orig=None, params=None)
def func2():
try:
func()
except flask_sqlalchemy.orm.exc.sa_exc.OperationalError:
print('caught')
func2()
After executing the code, 'caught' was printed on the screen so I assume it is possible to use
flask_sqlalchemy.orm.exc.sa_exc.OperationalError
instead of sqlalchemy.exc.OperationalError
.
Also, there seems to be proper support for model definition:
https://flask-sqlalchemy.palletsprojects.com/en/2.x/models/
I will look into that in more detail.
Lastly, I would advise to redo the patches from scratch, the commit history is getting hard to understand.
I will create a new branch.
Updated by Rajmund Hruška about 3 years ago
I checked the eventstorage.py
service, which uses Psycopg2. I don't have any experience with Psycopg2, I only read the official documentation. From my point of view, the Psycopg2 is used correctly and I wouldn't change anything there.
Updated by Radko Krkoš about 3 years ago
Rajmund Hruska wrote in #note-17:
I checked the
eventstorage.py
service, which uses Psycopg2. I don't have any experience with Psycopg2, I only read the official documentation. From my point of view, the Psycopg2 is used correctly and I wouldn't change anything there.
Yes, this is about an idiosyncrasy of the SQLAlchemy
ORM. With psycopg2
, the transactions are managed differently (manually), so there should be no gotchas there.
Updated by Rajmund Hruška about 3 years ago
Radko Krkoš wrote in #note-18:
Rajmund Hruska wrote in #note-17:
I checked the
eventstorage.py
service, which uses Psycopg2. I don't have any experience with Psycopg2, I only read the official documentation. From my point of view, the Psycopg2 is used correctly and I wouldn't change anything there.Yes, this is about an idiosyncrasy of the
SQLAlchemy
ORM. Withpsycopg2
, the transactions are managed differently (manually), so there should be no gotchas there.
I thought so but I checked psycopg2
because it was mentioned in the title and description of the issue. Anyway, at least I understand it better now
Updated by Rajmund Hruška about 3 years ago
- To be discussed changed from No to Yes
Updated by Rajmund Hruška almost 3 years ago
- Status changed from In Progress to Feedback
- % Done changed from 50 to 100
So, in ed1b8f23 I started using flask_sqlalchemy
for sessions in lib/mentat
. In lib/hawat
the flask_sqlalchemy
is already used. And lib/vial
will be merged into lib/hawat
. I guess the issue is resolved now, isn't it?
Updated by Radko Krkoš almost 3 years ago
Rajmund Hruska wrote in #note-21:
So, in ed1b8f23 I started using
flask_sqlalchemy
for sessions inlib/mentat
. Inlib/hawat
theflask_sqlalchemy
is already used. Andlib/vial
will be merged intolib/hawat
. I guess the issue is resolved now, isn't it?
From a quick glance, not all issues from the previous review have been dealt with. At least I can see the sqlalchemy.exc.OperationalError
one unaddressed.
In addition, I do not like the inconsistency of going to full path in code on lines 31 & 44 (after diff) while going in the other direction on lines 32 & 53. Why?
Updated by Rajmund Hruška almost 3 years ago
Radko Krkoš wrote in #note-22:
Rajmund Hruska wrote in #note-21:
So, in ed1b8f23 I started using
flask_sqlalchemy
for sessions inlib/mentat
. Inlib/hawat
theflask_sqlalchemy
is already used. Andlib/vial
will be merged intolib/hawat
. I guess the issue is resolved now, isn't it?From a quick glance, not all issues from the previous review have been dealt with. At least I can see the
sqlalchemy.exc.OperationalError
one unaddressed.
I think we talked about that in one of the meetings but I don't remember the reason why I didn't the OperationalError
from flask_sqlalchemy
. I guess I can use the other one, as it seems to be working anyway (#note-16).
In addition, I do not like the inconsistency of going to full path in code on lines 31 & 44 (after diff) while going in the other direction on lines 32 & 53. Why?
There wasn't really any particular reason at all it was just how it was convenient at the time. I only needed the OperationalError
in lines 32 & 53 and not full sqlalchemy
as it was before diff. And I needed multiple objects from flask_sqlalchemy
(lines 76, 78).
Updated by Radko Krkoš almost 3 years ago
Rajmund Hruska wrote in #note-23:
Radko Krkoš wrote in #note-22:
Rajmund Hruska wrote in #note-21:
So, in ed1b8f23 I started using
flask_sqlalchemy
for sessions inlib/mentat
. Inlib/hawat
theflask_sqlalchemy
is already used. Andlib/vial
will be merged intolib/hawat
. I guess the issue is resolved now, isn't it?From a quick glance, not all issues from the previous review have been dealt with. At least I can see the
sqlalchemy.exc.OperationalError
one unaddressed.I think we talked about that in one of the meetings but I don't remember the reason why I didn't the
OperationalError
fromflask_sqlalchemy
. I guess I can use the other one, as it seems to be working anyway (#note-16).In addition, I do not like the inconsistency of going to full path in code on lines 31 & 44 (after diff) while going in the other direction on lines 32 & 53. Why?
There wasn't really any particular reason at all it was just how it was convenient at the time. I only needed the
OperationalError
in lines 32 & 53 and not fullsqlalchemy
as it was before diff. And I needed multiple objects fromflask_sqlalchemy
(lines 76, 78).
OK.
Updated by Pavel Kácha almost 3 years ago
As the most important aim of this task was to get rid transaction dancing kludge to sidestep bare SQLAlchemy idiosyncracies, I guess we're done here - time for polishing and merging.
There are still cases of questionable *SQLAlchemy usage, I've created another ticket for long winter evenings: #7545.
Updated by Rajmund Hruška almost 3 years ago
- Status changed from Feedback to In Review
- To be discussed deleted (
Yes)
Merged into devel.
Updated by Pavel Kácha almost 3 years ago
- Status changed from In Review to Closed
Updated by Jakub Judiny over 1 year ago
- Related to Task #7619: Flask Deprecation warnings added
Updated by Jakub Judiny over 1 year ago
- Related to Support #7642: Upgrade Flask-SQLAlchemy added