Feature #4274
closed
Minimize whole JSON IDEA events usage (jsonb column)
100%
Description
Mentat relies too heavily on the whole original JSON data (stored and encoded in the jsonb column). Many queries spend cycles decoding data that is already stored in the metadata columns, and PostgreSQL wastes RAM on the jsonb column.
The first step of the solution is to implement a lightweight API that serves results satisfiable from the metadata columns directly, without resorting to jsonb.
However, a couple of features depend on the IDEA format, namely filtering. This could be sidestepped by:
- creating and returning incomplete lightweight IDEA-like events on the fly from the metadata columns, and allowing the caller to ask only for the data it needs (thus freeing the database from fetching whole rows into memory and pushing jsonb data into the app, and freeing the caller from ingesting, parsing and converting JSON)
- creating a complementary part of the API that allows "extending" the data, i.e. "upgrading" incomplete lightweight IDEA events to full-blown data fetched from the db by ID
- converting as much of the code using the current API as possible to work with the minimum data it needs (incomplete events), extending to full data only after all the hard work is done
This in itself would help the majority of the bigger queries and would probably mostly solve the 'big events' problem: a set of simple deterministic conversions from the metadata table would replace costly JSON unmarshalling. It might even speed up the parts where the 'extend' step is necessary, provided all the costly processing and filtering is done beforehand on the incomplete events and the number of complete events is trimmed to tens or hundreds (Hawat 'show' event, reporter creating mails).
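The idea described above can be illustrated with a minimal sketch. It is not Mentat's actual API: the table layout, the column names (`id`, `detecttime`, `category`, `source_ip`, `event`) and the function names are all hypothetical, and SQLite from the Python standard library stands in for PostgreSQL so that the example is self-contained.

```python
import json
import sqlite3

# Hypothetical metadata columns; the real Mentat schema may differ.
METADATA_COLUMNS = ("id", "detecttime", "category", "source_ip")


def fetch_ghost_events(conn, columns=METADATA_COLUMNS, limit=100):
    """Return lightweight IDEA-like dicts built from metadata columns only."""
    unknown = set(columns) - set(METADATA_COLUMNS)
    if unknown:
        raise ValueError(f"unknown metadata columns: {sorted(unknown)}")
    # Always include the ID so the event can be extended later; keep order.
    wanted = list(dict.fromkeys(("id",) + tuple(columns)))
    sql = f"SELECT {', '.join(wanted)} FROM events ORDER BY id LIMIT ?"
    rows = conn.execute(sql, (limit,)).fetchall()
    return [dict(zip(wanted, row)) for row in rows]


def extend_event(conn, ghost):
    """Upgrade a ghost event to the full event, fetched by its ID."""
    row = conn.execute(
        "SELECT event FROM events WHERE id = ?", (ghost["id"],)
    ).fetchone()
    if row is None:
        raise KeyError(ghost["id"])
    return json.loads(row[0])


def demo():
    """Filter on ghost events first, then extend only the survivors."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id TEXT PRIMARY KEY, detecttime TEXT, "
        "category TEXT, source_ip TEXT, event TEXT)"
    )
    categories = ["Recon.Scanning", "Attempt.Login", "Recon.Scanning"]
    for i, cat in enumerate(categories):
        full = {"ID": f"ev{i}", "Category": [cat], "Note": "full payload"}
        conn.execute(
            "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
            (f"ev{i}", "2018-08-01T00:00:00Z", cat, "192.0.2.1", json.dumps(full)),
        )
    ghosts = fetch_ghost_events(conn, columns=("category",))
    scans = [g for g in ghosts if g["category"] == "Recon.Scanning"]
    return [extend_event(conn, g) for g in scans]
```

In this sketch the full JSON payload is decoded only for the events that survive filtering, which is the point of doing the hard work on incomplete events first.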
Related issues
Updated by Pavel Kácha over 6 years ago
- Related to Bug #4253: Handling of too big events added
Updated by Pavel Kácha over 6 years ago
- Blocked by Feature #4275: Split jsonb column into its own table added
Updated by Pavel Kácha over 6 years ago
- Blocked by deleted (Feature #4275: Split jsonb column into its own table)
Updated by Pavel Kácha over 6 years ago
- Follows Feature #4275: Split jsonb column into its own table added
Updated by Pavel Kácha over 6 years ago
Note from Radko at #4253:
iprange types are returned as string representations in single address, prefix or arbitrary range (min-max) forms (best match). This is because iprange is a nonstandard extension and the standard connector does not understand its data type. Anyways, there is no native/standard library data type that would fit iprange (not the arbitrary range part at least).
Which is cool, as it might work as direct input to ipranges.
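For illustration, the three string forms mentioned in the note (single address, prefix, arbitrary min-max range) could be normalized as in the sketch below. It uses only the standard `ipaddress` module, not the ipranges library, and the exact string syntax returned by the database (in particular `-` as the range separator) is an assumption. The function name `parse_iprange` is hypothetical.

```python
import ipaddress


def parse_iprange(text):
    """Return (first, last) addresses for 'addr', 'addr/prefix' or 'min-max'.

    The accepted syntax is an assumption based on the note above.
    """
    text = text.strip()
    if "-" in text:
        low, high = (ipaddress.ip_address(p.strip()) for p in text.split("-", 1))
        if low.version != high.version or low > high:
            raise ValueError(f"invalid range: {text!r}")
        return low, high
    if "/" in text:
        # strict=False tolerates host bits set in the prefix form.
        net = ipaddress.ip_network(text, strict=False)
        return net[0], net[-1]
    addr = ipaddress.ip_address(text)
    return addr, addr
```

Returning a (first, last) pair keeps the arbitrary range form representable, which a plain network object could not do.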
Updated by Pavel Kácha almost 6 years ago
After meeting: it seems the only thing remaining is the API for extending a ghost IDEA event to a full one.
Updated by Pavel Kácha about 5 years ago
- Target version changed from Backlog to 2.6
Updated by Pavel Kácha about 5 years ago
Query: 195.113.0.0/16 on Mentat-Alt runs away - seems like missing limit somewhere?
Updated by Jan Mach about 5 years ago
- % Done changed from 0 to 100
- To be discussed changed from No to Yes
The reported bug should be fixed now; the limit on the size of the returned result set is in place again.
Are we going to perform some impact measurement?
Updated by Jan Mach about 5 years ago
- Status changed from New to Feedback
- Assignee changed from Jan Mach to Radko Krkoš
Updated by Jan Mach about 5 years ago
- Related to Task #6054: Explore the use of PostgreSQL views for easier event storing and querying added
Updated by Jan Mach about 5 years ago
- What are the actual performance impacts of this change?
Updated by Radko Krkoš about 5 years ago
- Status changed from Feedback to Resolved
There are obvious differences in performance between the former and the new version. Right now, each is deployed on a different server, but looking at mentat-alt (new) and mentat-hub (former), the plans generated are the same. The only difference therefore comes from reading, keeping in cache, processing and marshaling the event BYTEA, which is omitted in the new version. The runtime went from 16s to 9s on mentat-alt for a pathological query. Based on the information above, a similar speedup is expected in production.
Updated by Jan Mach about 5 years ago
- Status changed from Resolved to Closed
- To be discussed changed from Yes to No
Thank you, Radko, for your comment. I think we are done with this issue.