Actions
Bug #7733
closedmentat-storage may break during shutdown
Start date:
05/02/2024
Due date:
% Done:
0%
Estimated time:
To be discussed:
No
Description
Found (and forgotten) during server upgrade - when shutting down, storage lost (or reused after remove) a savepoint.
2024-03-11 10:11:32,577 mentat-storage.py [1499] INFO: Component 'storage': Committing all pending messages and switching to immediate commit mode. 2024-03-11 10:11:32,577 mentat-storage.py [1499] ERROR: Exception <class 'mentat.services.eventstorage.EventStorageException'>: Savepoint does not exist in transaction. 2024-03-11 10:11:32,581 mentat-storage.py [1499] ERROR: File "/var/mentat/venv/lib/python3.11/site-packages/pyzenkit/zendaemon.py", line 1209, in _sub_stage_process self._event_loop() File "/var/mentat/venv/lib/python3.11/site-packages/pyzenkit/zendaemon.py", line 1188, in _event_loop (flag, args) = handler(self, args) ^^^^^^^^^^^^^^^^^^^ File "/var/mentat/venv/lib/python3.11/site-packages/mentat/daemon/component/storage.py", line 238, in cbk_event_stop self._commit_pending() File "/var/mentat/venv/lib/python3.11/site-packages/mentat/daemon/component/storage.py", line 323, in _commit_pending self.event_service.commit_bulk() File "/var/mentat/venv/lib/python3.11/site-packages/mentat/services/eventstorage.py", line 1010, in wrapped_f return func(other_self, *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/var/mentat/venv/lib/python3.11/site-packages/mentat/services/eventstorage.py", line 1097, in commit_bulk self.savepoint_release() File "/var/mentat/venv/lib/python3.11/site-packages/mentat/services/eventstorage.py", line 1010, in wrapped_f return func(other_self, *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/var/mentat/venv/lib/python3.11/site-packages/mentat/services/eventstorage.py", line 1131, in savepoint_release raise EventStorageException("Savepoint does not exist in transaction.")
Updated by Rajmund Hruška 6 months ago
- Status changed from New to In Progress
- Assignee set to Rajmund Hruška
- Target version changed from Backlog to 2.13
I think I know what the issue was and luckily, we didn't lose any events.
To reproduce this bug, try stopping the storage just after this line appears in the log:
2024-06-13 11:28:50,289 mentat-storage.py [5269] INFO: Component 'storage': Commit timeout '5' elapsed, performing commit.
That is produced by this function:
def cbk_event_database_commit(self, daemon, args):
"""
Check, whether there are any messages waiting to be committed for greater than
configured time period. Commit them if necessary to avoid loss of data.
"""
daemon.logger.debug(
"Component '{}': Checking whether commit needs to be enforced".format(
self.cid
)
)
if self.events_uncommitted and ((time.time() - self.last_commit) > self.commit_bulkintv):
daemon.logger.info(
"Component '{}': Commit timeout '{}' elapsed, performing commit.".format(
self.cid,
self.commit_bulkintv
)
)
self._commit_pending()
here is _commit_pending
:
def _commit_pending(self):
self.event_service.commit_bulk()
self.inc_statistic(self.STATS_CNT_COMMIT_TIMEOUT, self.events_uncommitted)
self.inc_statistic(self.STATS_CNT_COMMITS_TIMEOUT)
self.events_uncommitted = 0
self.last_commit = time.time()
and commit_bulk
:
def commit_bulk(self):
"""
Release and commit currently pending savepoint changes.
"""
self.savepoint_release()
self.commit()
Now in the function handling stopping of the storage:
def cbk_event_stop(self, daemon, args):
"""
Stop the component.
"""
daemon.logger.debug(
"Component '{}': Stopping the component".format(
self.cid
)
)
# In case we are running in bulk commit mode.
if self.commit_bulk:
daemon.logger.info(
"Component '{}': Committing all pending messages and switching to immediate commit mode.".format(
self.cid
)
)
# Commit all currently pending IDEA messages.
self._commit_pending()
# Switch to immediate commit mode for the rest of the messages in the queue.
self._setup_insert_now(daemon)
one can see that _commit_pending
is once again called resulting in calling savepoint_release
twice.
Updated by Rajmund Hruška 6 months ago
- Status changed from In Progress to Resolved
Updated by Rajmund Hruška 6 months ago
- Status changed from Resolved to In Review
Actions