Feature #4274: Minimize whole JSON IDEA events usage (jsonb column) - Mentat - Homeproj: Redmine for CESNET

Added by Pavel Kácha over 6 years ago. Updated about 5 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Radko Krkoš

Category:

Development - Core

Target version:

2.6

Start date:

08/22/2018

Due date:

08/22/2018

% Done:

100%

Estimated time:

To be discussed:

Description

Mentat relies too heavily on the whole original JSON data (stored in encoded form in the jsonb column). Many queries spend cycles decoding data that is already stored in the metadata columns, and PostgreSQL wastes RAM on the jsonb column.

The first step of the solution is to implement a lightweight API that serves results satisfiable from the metadata columns directly, without resorting to jsonb.

However, a couple of features depend on the IDEA format, namely filtering. This could be sidestepped by:

- creating and returning incomplete, lightweight IDEA-like events on the fly from the metadata columns, and letting the caller ask only for the data it needs (freeing the database from fetching whole rows into memory and from pushing jsonb data into the application, and freeing the caller from ingesting, parsing and converting JSON)
- creating a complementary part of the API that allows "extending" the data, i.e. "upgrading" incomplete lightweight IDEA events to full-blown data fetched from the database by ID
- converting as much of the code using the current API as possible to work with the minimum data it needs (incomplete events), and extending to full data only after all the hard work is done
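
The three steps above can be illustrated with a minimal sketch. It is not Mentat's actual API: the table layout, the column names (`detecttime`, `category`, `source_ip`, `event`) and the function names are hypothetical, and an in-memory SQLite database stands in for PostgreSQL so that the example is self-contained.

```python
import json
import sqlite3

# Hypothetical metadata columns; the real Mentat schema may differ.
METADATA_COLUMNS = ("id", "detecttime", "category", "source_ip")


def setup_demo_db():
    """Create an in-memory stand-in for the events table (SQLite, not PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id TEXT PRIMARY KEY, detecttime TEXT, "
        "category TEXT, source_ip TEXT, event TEXT)"
    )
    full = {
        "ID": "e1",
        "DetectTime": "2018-08-22T10:00:00Z",
        "Category": ["Recon.Scanning"],
        "Source": [{"IP4": ["195.113.0.1"]}],
        "Note": "only present in the full event",
    }
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
        ("e1", full["DetectTime"], "Recon.Scanning", "195.113.0.1", json.dumps(full)),
    )
    return conn


def row_to_ghost(row):
    """Deterministically convert a metadata row into an incomplete IDEA-like dict."""
    ghost = {"ID": row["id"]}
    if "detecttime" in row:
        ghost["DetectTime"] = row["detecttime"]
    if "category" in row:
        ghost["Category"] = [row["category"]]
    if "source_ip" in row:
        ghost["Source"] = [{"IP4": [row["source_ip"]]}]
    return ghost


def fetch_ghost_events(conn, columns):
    """Fetch only the requested metadata columns; the full event is never read."""
    unknown = set(columns) - set(METADATA_COLUMNS)
    if unknown:
        raise ValueError(f"unknown metadata columns: {sorted(unknown)}")
    # The ID is always selected so that the event can be extended later.
    cols = ["id"] + [c for c in columns if c != "id"]
    rows = conn.execute(f"SELECT {', '.join(cols)} FROM events").fetchall()
    return [row_to_ghost(dict(zip(cols, row))) for row in rows]


def extend_event(conn, ghost):
    """Upgrade an incomplete event to the full one, fetched by its ID."""
    row = conn.execute(
        "SELECT event FROM events WHERE id = ?", (ghost["ID"],)
    ).fetchone()
    if row is None:
        raise KeyError(ghost["ID"])
    return json.loads(row[0])


if __name__ == "__main__":
    conn = setup_demo_db()
    ghosts = fetch_ghost_events(conn, ["category"])
    # Filter on the cheap incomplete events first, extend only the survivors.
    selected = [g for g in ghosts if "Recon.Scanning" in g["Category"]]
    print([extend_event(conn, g) for g in selected])
```

In this sketch, JSON decoding happens only in `extend_event`, so a caller that filters on the incomplete events first pays the decoding cost only for the events it keeps.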

This alone would help the majority of the bigger queries and would probably mostly solve the 'big events' problem: a set of simple deterministic conversions from the metadata table will replace costly JSON unmarshalling. It might even speed up the parts where the 'extend' step is necessary, provided all the costly processing and filtering is done beforehand on the incomplete events and the number of complete events is trimmed to tens or hundreds (Hawat 'show' event, reporter creating mails).

Related issues

Updated by Pavel Kácha over 6 years ago

Related to Bug #4253: Handling of too big events added

Updated by Pavel Kácha over 6 years ago

Blocked by Feature #4275: Split jsonb column into its own table added

Updated by Pavel Kácha over 6 years ago

Blocked by deleted (Feature #4275: Split jsonb column into its own table)

Updated by Pavel Kácha over 6 years ago

Follows Feature #4275: Split jsonb column into its own table added

Updated by Pavel Kácha over 6 years ago

Note from Radko at #4253:

iprange values are returned as string representations in single-address, prefix, or arbitrary-range (min-max) form, whichever matches best. This is because iprange is a nonstandard extension and the standard connector does not understand its data type. In any case, no native or standard-library data type would fit iprange (not the arbitrary-range part, at least).

That is convenient, as the strings might work as direct input to ipranges.
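
To illustrate the three string forms mentioned in the note, here is a minimal sketch that normalizes each of them to a (first, last) address pair. It uses only Python's standard `ipaddress` module, not the ipranges library, and the exact strings the database returns are assumed rather than confirmed; `parse_iprange` is a hypothetical helper name.

```python
import ipaddress


def parse_iprange(text):
    """Return (first, last) addresses for 'addr', 'addr/prefix' or 'min-max'."""
    text = text.strip()
    if "-" in text:
        # Arbitrary range form, e.g. "10.0.0.1-10.0.0.9" (assumed format).
        low, high = (ipaddress.ip_address(p.strip()) for p in text.split("-", 1))
        if low.version != high.version or low > high:
            raise ValueError(f"invalid range: {text!r}")
        return low, high
    if "/" in text:
        # Prefix form; strict=False tolerates host bits set in the address.
        net = ipaddress.ip_network(text, strict=False)
        return net[0], net[-1]
    # Single address form.
    addr = ipaddress.ip_address(text)
    return addr, addr
```

For example, `parse_iprange("195.113.0.0/16")` yields the pair 195.113.0.0 and 195.113.255.255, which shows why the arbitrary-range form has no direct counterpart among the standard network types: a min-max pair need not align with any prefix.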

Updated by Pavel Kácha almost 6 years ago

After the meeting: it seems the only thing remaining is the API for extending a ghost IDEA event to a full one.

Updated by Pavel Kácha about 5 years ago

Target version changed from Backlog to 2.6

Updated by Pavel Kácha about 5 years ago

The query for 195.113.0.0/16 on Mentat-Alt runs away; it seems a limit is missing somewhere?

Updated by Jan Mach about 5 years ago

% Done changed from 0 to 100
To be discussed changed from No to Yes

The reported bug should be fixed now; the limit on the size of the returned result set is in place again.

Are we going to perform some impact measurement?

#10

Updated by Jan Mach about 5 years ago

Status changed from New to Feedback
Assignee changed from Jan Mach to Radko Krkoš

#11

Updated by Jan Mach about 5 years ago

Related to Task #6054: Explore the use of PostgreSQL views for easier event storing and querying added

#12

Updated by Jan Mach about 5 years ago

Things to discuss:

What are the actual performance impacts of this change?

#13

Updated by Radko Krkoš about 5 years ago

Status changed from Feedback to Resolved

There are obvious differences in performance between the former and the new version. Right now, each is deployed on a different server, but comparing mentat-alt (new) and mentat-hub (former), the generated query plans are the same. The only difference therefore comes from reading, caching, processing and marshaling the event BYTEA, which the new version omits. The runtime of a pathological query on mentat-alt went from 16s to 9s. Based on the information above, a similar speedup is expected in production.

#14

Updated by Jan Mach about 5 years ago

Status changed from Resolved to Closed
To be discussed changed from Yes to No

Thank you, Radko, for your comment. I think we are done with this issue.
