Bug #7648
closedEnricher and Storage modules are too slow
0%
Description
Approximately since the warden outage, the real-time processing modules mentat-storage
and mentat.enricher
are too slow to process the incoming messages. This happens irregularly, probably based on the amount of events coming from warden.
Files
Updated by Rajmund Hruška over 1 year ago
The first module in the pipeline is actually mentat-inspector
. Surprisingly, this module doesn't have such performance issues.
Updated by Rajmund Hruška over 1 year ago
- Status changed from New to In Progress
The bottleneck is actually mentat-storage.py
. In mentat-enricher
, when 5000 messages limit in the output queue (input queue of mentat-storage
) is reached, the enricher
module is waiting until the amount of messages decreases below this limit. Therefore, mentat-enricher
is mostly sleeping.
Updated by Rajmund Hruška over 1 year ago
Sometimes, mentat-storage.py
is idle for a couple of minutes.
2023-05-05 10:43:44,656 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.364001.42.20144884.idea':'e53e18c4-9c86-4300-9bc1-5d122ea4ab06' into database (bulk mode).
2023-05-05 10:43:44,665 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.363383.42.20144878.idea':'dd8c921d-4808-4b48-8f15-c7e276a56efc' into database (bulk mode).
2023-05-05 10:43:44,707 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.361595.42.20144868.idea':'5f6e1099-c194-4296-921e-215cd5692232' into database (bulk mode).
2023-05-05 10:43:44,744 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.361945.42.20144869.idea':'0b45605d-8e55-4f0b-be37-346beb65732f' into database (bulk mode).
2023-05-05 10:43:44,786 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.364102.42.20144885.idea':'e82575bc-2859-4680-9f7d-9504ae329d61' into database (bulk mode).
2023-05-05 10:43:44,816 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.362930.42.20144876.idea':'09e5da1f-161b-4cdf-bc3b-2427ab0adb7e' into database (bulk mode).
2023-05-05 10:43:44,856 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.364796.42.20144887.idea':'887b5059-1f25-453d-b034-30654a86a43a' into database (bulk mode).
2023-05-05 10:43:44,891 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.365145.42.20144888.idea':'0eaf85ad-56e4-4370-8d43-cb37a20e3e8e' into database (bulk mode).
2023-05-05 10:43:44,973 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.364449.42.20144886.idea':'ec8bad9d-bf71-423f-9384-6d9dbcdab4ea' into database (bulk mode).
2023-05-05 10:43:44,973 mentat-storage.py [23692] INFO: Component 'storage': Commit timeout '5' elapsed, performing commit.
2023-05-05 10:43:45,022 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.365507.42.20144889.idea':'4687d6cb-8a3b-46a9-aee2-92328134c02f' into database (bulk mode).
2023-05-05 10:43:45,052 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.365849.42.20144890.idea':'91198035-2d1e-4ce5-9513-08cfb413deb8' into database (bulk mode).
2023-05-05 10:43:45,083 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.366309.42.20144891.idea':'326635c4-5e3e-4b5c-9b36-75010be88591' into database (bulk mode).
2023-05-05 10:46:59,537 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.366818.42.20144892.idea':'7612666a-1cc8-4e34-9206-d8cabcdd86d4' into database (bulk mode).
2023-05-05 10:46:59,537 mentat-storage.py [23692] INFO: Component 'storage': Commit timeout '5' elapsed, performing commit.
2023-05-05 10:46:59,539 mentat-storage.py [23692] INFO: Component 'filer': *** Processing statistics ***
cnt_enqueued 0 ( +0, 0.00 #/s)
cnt_fetched 8,750,612 ( +508, 2.50 #/s)
cnt_updated 0 ( +0, 0.00 #/s)
cnt_committed 8,750,611 ( +508, 2.50 #/s)
cnt_banished 0 ( +0, 0.00 #/s)
cnt_canceled 0 ( +0, 0.00 #/s)
cnt_dispatched 0 ( +0, 0.00 #/s)
cnt_duplicated 0 ( +0, 0.00 #/s)
cnt_errors 0 ( +0, 0.00 #/s)
2023-05-05 10:46:59,540 mentat-storage.py [23692] INFO: Component 'parser': *** Processing statistics ***
cnt_parsed 8,750,612 ( +508, 2.50 #/s)
cnt_encoded 0 ( +0, 0.00 #/s)
cnt_errors 0 ( +0, 0.00 #/s)
2023-05-05 10:46:59,540 mentat-storage.py [23692] INFO: Component 'storage': *** Processing statistics ***
cnt_stored 8,750,612 ( +508, 2.50 #/s)
cnt_errors 0 ( +0, 0.00 #/s)
cnt_eci_timeout 8,495,612 ( +611, 3.01 #/s)
cnt_eci_threshold 255,000 ( +0, 0.00 #/s)
cnt_cis_timeout 26,918 ( +3, 0.01 #/s)
cnt_cis_threshold 510 ( +0, 0.00 #/s)
2023-05-05 10:46:59,540 mentat-storage.py [23692] INFO: Component 'commiter': *** Processing statistics ***
cnt_commits 8,750,612 ( +508, 2.50 #/s)
2023-05-05 10:46:59,541 mentat-storage.py [23692] INFO: Application runlog saved to file '/var/mentat/run/mentat-storage.py/202305031329.05c86.runlog'
2023-05-05 10:46:59,582 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.367627.42.20144894.idea':'ab5c7467-7e49-4d0f-b227-c90706c4b952' into database (bulk mode).
2023-05-05 10:46:59,624 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.367974.42.20144895.idea':'daba791d-e99a-4f12-8770-37abf34f71f6' into database (bulk mode).
2023-05-05 10:46:59,663 mentat-storage.py [23692] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683275939.367277.42.20144893.idea':'9f24ec86-d28e-4179-bd50-cb9d9ee587a8' into database (bulk mode).
Updated by Rajmund Hruška over 1 year ago
- File difference.txt difference.txt added
Mentat-hub also has such problem. I checked the log from the past 6 days and I looked for commits which took longer than 5 seconds.
Here is a couple of the longer ones (the full list is in the attached file):
Difference: 0:04:13.735000
2023-04-30 18:38:04,090 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1682872674.215313.42.95108556.idea':'5d26766f-1b6c-4464-9a45-abbb2b1f0d8b' into database (bulk mode).
2023-04-30 18:42:17,825 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1682872674.213048.42.95108554.idea':'d8f041ee-1f2a-4c8c-a923-37db9db0e37a' into database (bulk mode).
Difference: 0:00:37.145000
2023-05-02 18:07:35,152 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683043546.844466.42.104630444.idea':'9996e774-4258-4e34-bb48-7a8718b20e2d' into database (bulk mode).
2023-05-02 18:08:12,297 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683043546.844523.42.104630445.idea':'7008923c-99cf-415f-9021-6312f003bf7e' into database (bulk mode).
Difference: 0:01:14.004000
2023-05-02 18:10:57,553 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683043765.774413.42.104638710.idea':'5bc7bbf9-c9e9-49e4-b4f6-18dec1698ddf' into database (bulk mode).
2023-05-02 18:12:11,557 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683043765.772501.42.104638709.idea':'0b2834ce-cf75-4586-8ce3-2465f48e8205' into database (bulk mode).
Difference: 0:03:15.174000
2023-05-04 06:24:21,431 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683174131.527206.42.112273032.idea':'fc24b1bc-935f-4b50-824d-7b9c38f752cb' into database (bulk mode).
2023-05-04 06:27:36,605 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683174131.529052.42.112273033.idea':'86b913c1-7b9e-4066-88d5-34e18f1339d2' into database (bulk mode).
Difference: 0:02:13.985000
2023-05-04 06:28:06,507 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683174168.153324.42.112256350.idea':'78c428bd-dcbc-4785-a50e-353842d30c39' into database (bulk mode).
2023-05-04 06:30:20,492 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683174168.153397.42.112256351.idea':'e327639f-f183-45c9-96ce-47fed564f5cf' into database (bulk mode).
Difference: 0:01:08.315000
2023-05-04 11:54:07,147 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683194042.539961.42.113508866.idea':'4d2310e2-c85b-4691-8765-6964013cbd27' into database (bulk mode).
2023-05-04 11:55:15,462 mentat-storage.py [4248] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683194042.540401.42.113508874.idea':'884da1eb-9e3a-4268-87bd-aa317fa16ec1' into database (bulk mode).
The respective IDEA events are:
https://mentat-hub.cesnet.cz/mentat/events/d8f041ee-1f2a-4c8c-a923-37db9db0e37a/show
https://mentat-hub.cesnet.cz/mentat/events/7008923c-99cf-415f-9021-6312f003bf7e/show
https://mentat-hub.cesnet.cz/mentat/events/0b2834ce-cf75-4586-8ce3-2465f48e8205/show
https://mentat-hub.cesnet.cz/mentat/events/86b913c1-7b9e-4066-88d5-34e18f1339d2/show
https://mentat-hub.cesnet.cz/mentat/events/e327639f-f183-45c9-96ce-47fed564f5cf/show
https://mentat-hub.cesnet.cz/mentat/events/884da1eb-9e3a-4268-87bd-aa317fa16ec1/show
Updated by Rajmund Hruška over 1 year ago
I think it's actually not related to the size of the stored event. For example event with id '049c4bc7-6b9f-4fa4-9067-5973ff79d194' was stored both on mentat-hub and mentat-alt.
I checked logs on mentat-hub
2023-05-09 09:35:59,147 mentat-storage.py [67933] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683617735.348093.42.135677343.idea':'7c810a13-346d-48da-9f4b-9e074c30bacb' into database (bulk mode).
2023-05-09 09:35:59,162 mentat-storage.py [67933] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683617735.348316.42.135677344.idea':'049c4bc7-6b9f-4fa4-9067-5973ff79d194' into database (bulk mode).
2023-05-09 09:35:59,177 mentat-storage.py [67933] INFO: Component 'storage': Stored message 'mentat-hub.3937.1683617735.348763.42.135677346.idea':'53f9accd-179f-4cfe-ae06-c939cdaffa92' into database (bulk mode).
and mentat-alt
2023-05-09 09:36:11,572 mentat-storage.py [5435] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683617733.891316.42.48657867.idea':'3afcc9cc-b36c-44ea-a99f-a7180e1c477e' into database (bulk mode).
2023-05-09 09:36:11,592 mentat-storage.py [5435] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683617733.896053.42.48657898.idea':'1a7c73b4-be52-4f0c-9de7-fc5b92c44bd5' into database (bulk mode).
2023-05-09 09:36:11,592 mentat-storage.py [5435] INFO: Component 'filer': Scheduling next queue check after '3' seconds
2023-05-09 09:36:11,593 mentat-storage.py [5435] INFO: Waiting for '2' seconds until next scheduled event
2023-05-09 09:36:13,593 mentat-storage.py [5435] INFO: Received wakeup signal (14)
2023-05-09 09:36:13,593 mentat-storage.py [5435] INFO: Waiting for '1' seconds until next scheduled event
2023-05-09 09:36:14,593 mentat-storage.py [5435] INFO: Received wakeup signal (14)
2023-05-09 09:36:14,593 mentat-storage.py [5435] INFO: Component 'storage': Commit timeout '5' elapsed, performing commit.
and later, a few minutes after stopping the module with mentat-controller
2023-05-09 10:18:14,746 mentat-storage.py [5435] WARNING: Received signal 'SIGINT' (2)
2023-05-09 10:18:14,770 mentat-storage.py [5435] INFO: Component 'storage': Stored message 'mentat-alt.3017.1683617733.896821.42.48657900.idea':'049c4bc7-6b9f-4fa4-9067-5973ff79d194' into database (bulk mode).
2023-05-09 10:18:14,770 mentat-storage.py [5435] INFO: Component 'storage': Commit timeout '5' elapsed, performing commit.
2023-05-09 10:18:14,771 mentat-storage.py [5435] INFO: Component 'filer': *** Processing statistics ***
2023-05-09 10:18:14,772 mentat-storage.py [5435] INFO: Component 'parser': *** Processing statistics ***
2023-05-09 10:18:14,772 mentat-storage.py [5435] INFO: Component 'storage': *** Processing statistics ***
2023-05-09 10:18:14,772 mentat-storage.py [5435] INFO: Component 'commiter': *** Processing statistics ***
2023-05-09 10:18:14,773 mentat-storage.py [5435] INFO: Application runlog saved to file '/var/mentat/run/mentat-storage.py/202305051102.01535.runlog'
I was also lucky to see the query in the database as shown in the attached screenshot.
Updated by Rajmund Hruška over 1 year ago
So, the issue was with the database. I should still check enricher, though, which seems to be the bottleneck. Maybe there is some low hanging fruit.
Updated by Rajmund Hruška over 1 year ago
I looked for some easy ways to improve enricher
but I found next to nothing. The only reasonable thing I saw is using country database instead of city database in geoip
plugin. It only gives about 0.5 miliseconds of improvement per event, but it's a trivial change which also makes more sense, so I would go for it. And in a day it can save about 25 minutes.
Updated by Pavel Kácha over 1 year ago
- Target version changed from 2.11 to Backlog
Updated by Rajmund Hruška 8 months ago
- Status changed from In Progress to Closed
No longer an issue on the new mentat-hub.