Project

General

Profile

Actions

Bug #7241

open

Hawat fails with 500 on DB restart, Mentat (Storage) also breaks

Added by Radko Krkoš about 3 years ago. Updated about 1 month ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
Start date:
04/23/2021
Due date:
% Done:

0%

Estimated time:
To be discussed:
No

Description

This was dealt with before, but only partially. Recent events have shown that the connection management code is in case of DB restart still failing to reconnect. I cannot find the original bug now. Attaching log for posterity:

Request: /events/search?dt_from=2021-04-16+10%3A00%3A00&dt_to=2021-04-23+10%3A00%3A00&source_addrs=&source_ports=&submit=Search, referer: https://mentat-hub.cesnet.cz/mentat/eve
Traceback:, referer: https://mentat-hub.cesnet.cz/mentat/events/search
Traceback (most recent call last):, referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    response = self.full_dispatch_request(), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    rv = self.handle_user_exception(e), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    reraise(exc_type, exc_value, tb), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    raise value, referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    rv = self.dispatch_request(), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    return self.view_functions[rule.endpoint](**req.view_args), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/flask_login/utils.py", line 272, in decorated_view, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    return func(*args, **kwargs), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/flask/views.py", line 89, in view, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    return self.dispatch_request(*args, **kwargs), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/vial/view/__init__.py", line 909, in dispatch_request, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    items = self.search(form_data), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/hawat/base.py", line 372, in search, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    self._check_search_query_quota(), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/hawat/base.py", line 350, in _check_search_query_quota, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    int(flask_login.current_user.get_id()), referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1710, in queries_status, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    qpattern, referer: https://mentat-hub.cesnet.cz/mentat/events/search
  File "/var/mentat/venv/lib/python3.7/site-packages/psycopg2/extras.py", line 319, in execute, referer: https://mentat-hub.cesnet.cz/mentat/events/search
    return super(NamedTupleCursor, self).execute(query, vars), referer: https://mentat-hub.cesnet.cz/mentat/events/search
psycopg2.InterfaceError: cursor already closed, referer: https://mentat-hub.cesnet.cz/mentat/events/search, referer: https://mentat-hub.cesnet.cz/mentat/events/search
Actions #1

Updated by Pavel Kácha about 2 years ago

Seems storage daemon is not happy with DB restart:

2022-03-22 09:45:56,665 mentat-storage.py [10370] ERROR: Component 'storage': Unable to store IDEA message 'mentat-alt.2891.1647938749.568712.42.36522548.idea' into database
: 'Traceback (most recent call last):
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1272, in insert_event_bulkci
    self.savepoint_create()
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 947, in wrapped_f
    return func(other_self, *args, **kwargs)
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1053, in savepoint_create
    self.savepoint
  File "/var/mentat/venv/lib/python3.7/site-packages/psycopg2/extras.py", line 319, in execute
    return super(NamedTupleCursor, self).execute(query, vars)
psycopg2.OperationalError: terminating connection due to administrator command
SSL connection has been closed unexpectedly
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/daemon/component/storage.py", line 260, in cbk_event_message_process
    self.event_gateway(daemon, args)
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/daemon/component/storage.py", line 104, in _event_insert_bulk
    self.event_service.insert_event_bulkci(args['idea'])
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 947, in wrapped_f
    return func(other_self, *args, **kwargs)
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1282, in insert_event_bulkci
    self.__init__()
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 966, in __init__
    self.connection = psycopg2.connect(**self.dsn)
  File "/var/mentat/venv/lib/python3.7/site-packages/psycopg2/__init__.py", line 126, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "localhost" (::1), port 5432 failed: FATAL:  the database system is shutting down

'
2022-03-22 09:45:56,666 mentat-storage.py [10370] INFO: Component 'storage': Commit timeout '5' elapsed, performing commit.
2022-03-22 09:45:56,666 mentat-storage.py [10370] ERROR: Exception <class 'psycopg2.InterfaceError'>: cursor already closed
2022-03-22 09:45:56,667 mentat-storage.py [10370] ERROR: 
  File "/var/mentat/venv/lib/python3.7/site-packages/pyzenkit/zendaemon.py", line 1209, in _sub_stage_process
    self._event_loop()
  File "/var/mentat/venv/lib/python3.7/site-packages/pyzenkit/zendaemon.py", line 1188, in _event_loop
    (flag, args) = handler(self, args)
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/daemon/component/storage.py", line 296, in cbk_event_database_commit
    self._commit_pending()
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/daemon/component/storage.py", line 323, in _commit_pending
    self.event_service.commit_bulk()
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 947, in wrapped_f
    return func(other_self, *args, **kwargs)
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1030, in commit_bulk
    self.savepoint_release()
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 947, in wrapped_f
    return func(other_self, *args, **kwargs)
  File "/var/mentat/venv/lib/python3.7/site-packages/mentat/services/eventstorage.py", line 1071, in savepoint_release
    self.savepoint
  File "/var/mentat/venv/lib/python3.7/site-packages/psycopg2/extras.py", line 319, in execute
    return super(NamedTupleCursor, self).execute(query, vars)
Actions #2

Updated by Radko Krkoš about 2 years ago

Further failure points:
- Administration\Database Status (psycopg2.InterfaceError: cursor already closed, referer: https://mentat-alt.cesnet.cz/mentat/status/view).
- Search (psycopg2.OperationalError: terminating connection due to administrator command, referer: https://mentat-alt.cesnet.cz/mentat/events/search)

Actions #3

Updated by Pavel Kácha 11 months ago

  • Subject changed from Hawat fails with 500 on DB restart to Hawat fails with 500 on DB restart, Mentat (Storage) also breaks
Actions #4

Updated by Pavel Kácha 11 months ago

Seems problem is in both SQLAlchemy backends and psycopg (events) backend.

Actions #5

Updated by Pavel Kácha about 1 month ago

  • Priority changed from Low to Normal
Actions

Also available in: Atom PDF