Feature #4274

closed

Minimize whole JSON IDEA events usage (jsonb column)

Added by Pavel Kácha about 3 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Development - Core
Target version:
Start date: 08/22/2018
Due date: 08/22/2018
% Done: 100%
Estimated time:
To be discussed: No

Description

Mentat relies too much on the whole original JSON data (stored and encoded in the jsonb column). A lot of queries spend cycles decoding data that is already stored in the metadata columns, and PostgreSQL wastes RAM on the jsonb column.

The first step of the solution is to implement a lightweight API that serves results satisfiable from the metadata columns directly, without resorting to jsonb.

However, a couple of features depend on the IDEA format, namely filtering. Sidestepping this could involve:
  • creating and returning incomplete, lightweight IDEA-like events on the fly from the metadata columns, and allowing the caller to ask only for the data it needs (freeing the database from fetching whole rows into memory and pushing jsonb data into the app, and freeing the caller from ingesting, parsing and converting JSON)
  • creating a complementary part of the API that allows "extending" the data, i.e. "upgrading" incomplete lightweight IDEA events to full-blown data fetched from the db by ID
  • converting as much of the code using the current API as possible to work with the minimum data it needs (incomplete events), extending to full data only after all the hard work is done
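
A minimal sketch of the three steps above, not the actual Mentat API. The column names (`id`, `detecttime`, `category`, `source_ip`), the functions `ghost_from_row` and `extend`, and the `fetch_full` callable are all hypothetical; the real schema and storage layer may differ. In-memory dicts stand in for the metadata table and the jsonb storage.

```python
import json

def ghost_from_row(row, wanted):
    """Build an incomplete IDEA-like event from one metadata row (a dict)."""
    event = {"ID": row["id"]}  # always kept so the event can be extended later
    if "detecttime" in wanted:
        event["DetectTime"] = row["detecttime"]
    if "category" in wanted:
        event["Category"] = list(row["category"])
    if "source_ip" in wanted:
        # Simplification for this sketch: all addresses are filed under IP4.
        event["Source"] = [{"IP4": list(row["source_ip"])}]
    return event

def extend(ghosts, fetch_full):
    """Upgrade incomplete events to full ones.

    fetch_full is a hypothetical callable that takes a list of IDs and
    returns {id: json_text}; in practice it would be one query for the
    stored JSON of just those IDs.
    """
    raw = fetch_full([g["ID"] for g in ghosts])
    return [json.loads(raw[g["ID"]]) for g in ghosts]

# Stand-ins for the metadata table and the stored full events.
rows = [
    {"id": "a1", "detecttime": "2018-08-22T10:00:00Z",
     "category": ["Recon.Scanning"], "source_ip": ["195.113.1.1"]},
    {"id": "b2", "detecttime": "2018-08-22T11:00:00Z",
     "category": ["Abusive.Spam"], "source_ip": ["192.0.2.7"]},
]
full_store = {
    r["id"]: json.dumps({"ID": r["id"], "Category": r["category"],
                         "Note": "full event"})
    for r in rows
}

def fetch_full(ids):
    return {i: full_store[i] for i in ids}

# Ask only for the column the filter needs, filter, then extend the survivors.
ghosts = [ghost_from_row(r, wanted={"category"}) for r in rows]
hits = [g for g in ghosts if "Recon.Scanning" in g["Category"]]
full = extend(hits, fetch_full)
```

Filtering runs on the incomplete events, so JSON is parsed only for the events that survive it.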

This alone would help the majority of the bigger queries and would probably mostly solve the 'big events' problem: a set of simple deterministic conversions from the metadata table would replace costly JSON unmarshalling. It might even speed up the parts where the 'extend' step is necessary, provided all the costly processing and filtering is done beforehand on the incomplete events and the number of complete events is trimmed to tens or hundreds (Hawat 'show' event, reporter creating mails).


Related issues

Related to Mentat - Bug #4253: Handling of too big events (Closed, Jan Mach, 08/09/2018)
Related to Mentat - Task #6054: Explore the use of PostgreSQL views for easier event storing and querying (Rejected, Jan Mach, 11/12/2019)
Follows Mentat - Feature #4275: Split jsonb column into its own table (Closed, Pavel Kácha)

Actions #1

Updated by Pavel Kácha about 3 years ago

  • Related to Bug #4253: Handling of too big events added
Actions #2

Updated by Pavel Kácha about 3 years ago

  • Blocked by Feature #4275: Split jsonb column into its own table added
Actions #3

Updated by Pavel Kácha about 3 years ago

  • Blocked by deleted (Feature #4275: Split jsonb column into its own table)
Actions #4

Updated by Pavel Kácha about 3 years ago

  • Follows Feature #4275: Split jsonb column into its own table added
Actions #5

Updated by Pavel Kácha about 3 years ago

Note from Radko at #4253:

iprange types are returned as string representations in single address, prefix or arbitrary range (min-max) forms (best match). This is because iprange is a nonstandard extension and the standard connector does not understand its data type. Anyways, there is no native/standard library data type that would fit iprange (not the arbitrary range part at least).

Which is cool, as it might work as direct input to ipranges.
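
As an illustration of the three string forms mentioned in the note (single address, prefix, min-max range), here is a minimal parsing sketch. It uses only Python's standard `ipaddress` module, not the ipranges library, and the function name `parse_iprange` is hypothetical. It assumes the range form is written as `min-max` with a plain hyphen.

```python
import ipaddress

def parse_iprange(text):
    """Return (first, last) addresses for 'addr', 'addr/prefix' or 'min-max'."""
    text = text.strip()
    if "-" in text:
        # Arbitrary range: both ends must be of the same IP version and ordered.
        low, high = (ipaddress.ip_address(part.strip())
                     for part in text.split("-", 1))
        if low.version != high.version or low > high:
            raise ValueError("invalid range: %r" % text)
        return low, high
    if "/" in text:
        # Prefix form: the first and last address of the network.
        net = ipaddress.ip_network(text, strict=False)
        return net[0], net[-1]
    # Single address: a range of length one.
    addr = ipaddress.ip_address(text)
    return addr, addr
```

For example, `parse_iprange("195.113.0.0/16")` yields the pair 195.113.0.0 and 195.113.255.255 as `IPv4Address` objects.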

Actions #6

Updated by Pavel Kácha over 2 years ago

After the meeting: it seems the only thing remaining is the API for extending a ghost IDEA event to a full one.

Actions #7

Updated by Pavel Kácha almost 2 years ago

  • Target version changed from Backlog to 2.6
Actions #8

Updated by Pavel Kácha almost 2 years ago

The query for 195.113.0.0/16 on Mentat-Alt runs away; seems like a limit is missing somewhere?

Actions #9

Updated by Jan Mach almost 2 years ago

  • % Done changed from 0 to 100
  • To be discussed changed from No to Yes

The reported bug should be fixed now; the limit on the size of the returned result set is in place again.

Are we going to perform some impact measurement?

Actions #10

Updated by Jan Mach almost 2 years ago

  • Status changed from New to Feedback
  • Assignee changed from Jan Mach to Radko Krkoš
Actions #11

Updated by Jan Mach almost 2 years ago

  • Related to Task #6054: Explore the use of PostgreSQL views for easier event storing and querying added
Actions #12

Updated by Jan Mach almost 2 years ago

Things to discuss:
  1. What are the actual performance impacts of this change?
Actions #13

Updated by Radko Krkoš almost 2 years ago

  • Status changed from Feedback to Resolved

There are obvious differences in performance between the former and the new version. Right now, each is deployed on a different server, but comparing mentat-alt (new) with mentat-hub (former), the generated query plans are the same. The only difference therefore comes from reading, keeping in cache, processing and marshalling the event BYTEA, which is omitted in the new version. The runtime went from 16s to 9s on mentat-alt for a pathological query. Based on the information above, a similar speedup is expected in production.

Actions #14

Updated by Jan Mach almost 2 years ago

  • Status changed from Resolved to Closed
  • To be discussed changed from Yes to No

Thank you, Radko, for your comment. I think we are done with this issue.
