Actions
Bug #7241
openHawat fails with 500 on DB restart, Mentat (Storage) also breaks
Start date:
04/23/2021
Due date:
% Done:
0%
Estimated time:
To be discussed:
No
Description
This was dealt with before, but only partially. Recent events have shown that the connection management code is in case of DB restart still failing to reconnect. I cannot find the original bug now. Attaching log for posterity:
Request: /events/search?dt_from=2021-04-16+10%3A00%3A00&dt_to=2021-04-23+10%3A00%3A00&source_addrs=&source_ports=&submit=Search, referer: https://mentat-hub.cesnet.cz/mentat/eve Traceback:, referer: https://mentat-hub.cesnet.cz/mentat/events/search Traceback (most recent call last):, referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app, referer: https://mentat-hub.cesnet.cz/mentat/events/search response = self.full_dispatch_request(), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request, referer: https://mentat-hub.cesnet.cz/mentat/events/search rv = self.handle_user_exception(e), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception, referer: https://mentat-hub.cesnet.cz/mentat/events/search reraise(exc_type, exc_value, tb), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise, referer: https://mentat-hub.cesnet.cz/mentat/events/search raise value, referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request, referer: https://mentat-hub.cesnet.cz/mentat/events/search rv = self.dispatch_request(), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request, referer: https://mentat-hub.cesnet.cz/mentat/events/search return self.view_functions[rule.endpoint](**req.view_args), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/flask_login/utils.py", line 272, in decorated_view, referer: https://mentat-hub.cesnet.cz/mentat/events/search return func(*args, **kwargs), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/flask/views.py", line 89, in view, referer: https://mentat-hub.cesnet.cz/mentat/events/search return self.dispatch_request(*args, **kwargs), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/vial/view/__init__.py", line 909, in dispatch_request, referer: https://mentat-hub.cesnet.cz/mentat/events/search items = self.search(form_data), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/hawat/base.py", line 372, in search, referer: https://mentat-hub.cesnet.cz/mentat/events/search self._check_search_query_quota(), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/hawat/base.py", line 350, in _check_search_query_quota, referer: https://mentat-hub.cesnet.cz/mentat/events/search int(flask_login.current_user.get_id()), referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1710, in queries_status, referer: https://mentat-hub.cesnet.cz/mentat/events/search qpattern, referer: https://mentat-hub.cesnet.cz/mentat/events/search File "/var/mentat/venv/lib/python3.7/site-packages/psycopg2/extras.py", line 319, in execute, referer: https://mentat-hub.cesnet.cz/mentat/events/search return super(NamedTupleCursor, self).execute(query, vars), referer: https://mentat-hub.cesnet.cz/mentat/events/search psycopg2.InterfaceError: cursor already closed, referer: https://mentat-hub.cesnet.cz/mentat/events/search, referer: https://mentat-hub.cesnet.cz/mentat/events/search
Updated by Pavel Kácha over 2 years ago
Seems storage daemon is not happy with DB restart:
2022-03-22 09:45:56,665 mentat-storage.py [10370] ERROR: Component 'storage': Unable to store IDEA message 'mentat-alt.2891.1647938749.568712.42.36522548.idea' into database : 'Traceback (most recent call last): File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1272, in insert_event_bulkci self.savepoint_create() File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 947, in wrapped_f return func(other_self, *args, **kwargs) File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1053, in savepoint_create self.savepoint File "/var/mentat/venv/lib/python3.7/site-packages/psycopg2/extras.py", line 319, in execute return super(NamedTupleCursor, self).execute(query, vars) psycopg2.OperationalError: terminating connection due to administrator command SSL connection has been closed unexpectedly server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/var/mentat/venv/lib/python3.7/site-packages/mentat/daemon/component/storage.py", line 260, in cbk_event_message_process self.event_gateway(daemon, args) File "/var/mentat/venv/lib/python3.7/site-packages/mentat/daemon/component/storage.py", line 104, in _event_insert_bulk self.event_service.insert_event_bulkci(args['idea']) File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 947, in wrapped_f return func(other_self, *args, **kwargs) File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1282, in insert_event_bulkci self.__init__() File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 966, in __init__ self.connection = psycopg2.connect(**self.dsn) File "/var/mentat/venv/lib/python3.7/site-packages/psycopg2/__init__.py", line 126, in connect conn = _connect(dsn, connection_factory=connection_factory, **kwasync) psycopg2.OperationalError: connection to server at "localhost" (::1), port 5432 failed: FATAL: the database system is shutting down ' 2022-03-22 09:45:56,666 mentat-storage.py [10370] INFO: Component 'storage': Commit timeout '5' elapsed, performing commit. 2022-03-22 09:45:56,666 mentat-storage.py [10370] ERROR: Exception <class 'psycopg2.InterfaceError'>: cursor already closed 2022-03-22 09:45:56,667 mentat-storage.py [10370] ERROR: File "/var/mentat/venv/lib/python3.7/site-packages/pyzenkit/zendaemon.py", line 1209, in _sub_stage_process self._event_loop() File "/var/mentat/venv/lib/python3.7/site-packages/pyzenkit/zendaemon.py", line 1188, in _event_loop (flag, args) = handler(self, args) File "/var/mentat/venv/lib/python3.7/site-packages/mentat/daemon/component/storage.py", line 296, in cbk_event_database_commit self._commit_pending() File "/var/mentat/venv/lib/python3.7/site-packages/mentat/daemon/component/storage.py", line 323, in _commit_pending self.event_service.commit_bulk() File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 947, in wrapped_f return func(other_self, *args, **kwargs) File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1030, in commit_bulk self.savepoint_release() File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 947, in wrapped_f return func(other_self, *args, **kwargs) File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1071, in savepoint_release self.savepoint File "/var/mentat/venv/lib/python3.7/site-packages/psycopg2/extras.py", line 319, in execute return super(NamedTupleCursor, self).execute(query, vars)
Updated by Radko Krkoš over 2 years ago
Further failure points:
- Administration\Database Status (psycopg2.InterfaceError: cursor already closed, referer: https://mentat-alt.cesnet.cz/mentat/status/view).
- Search (psycopg2.OperationalError: terminating connection due to administrator command, referer: https://mentat-alt.cesnet.cz/mentat/events/search)
Updated by Pavel Kácha over 1 year ago
- Subject changed from Hawat fails with 500 on DB restart to Hawat fails with 500 on DB restart, Mentat (Storage) also breaks
Updated by Pavel Kácha over 1 year ago
Seems problem is in both SQLAlchemy backends and psycopg (events) backend.
Actions