Bug #6210: Index usage in mentat_main database - Mentat - Homeproj: Redmine for CESNET

Actions

Copy link

Bug #6210

closed

Index usage in mentat_main database

Added by Jan Mach almost 5 years ago. Updated almost 4 years ago.

Status:

Rejected

Priority:

Normal

Assignee:

Radko Krkoš

Category:

Development - Database

Target version:

Rejected

Start date:

02/03/2020

Due date:

% Done:

Estimated time:

To be discussed:

Description

I have noticed, that there are following tables in mentat_main database containing time sequential data without index and clustering on createtime column. Should we add them?

reports_events
statistics_events
changelogs_items

Actions

Copy link

Updated by Radko Krkoš almost 5 years ago

To be discussed changed from No to Yes

That depends. I do not know the access characteristics. Are the rows therein accessed by createtime? Either way, those tables are rather small, the index would probably not help much (a filtering query on the largest one - statistics_events with a month worth of data takes under 200ms). I am also puzzled by their meaning. They seem to contain some generated data since (Mentat's) big bang. Are those tables expected to just grow forever, with no purge over time?

Actions

Copy link

Updated by Jan Mach almost 5 years ago

Yes, these tables are accessed according to the createtime. The statistics_events table contains precalculated IDEA event statistics for 5 minute time windows and they are used on /events/dashboard view. Worst case scenario is generating statistics for the whole previous year, that are required by our manager annually.

So far we haven`t been deleting anything to keep history, which may prove useful one day to generate some long term overview. Once we reach certain (yet unknown) threshold, we will have to start purging.

Actions

Copy link

Updated by Radko Krkoš almost 5 years ago

To be discussed deleted (~~Yes~~)

Actions

Copy link

Updated by Radko Krkoš almost 5 years ago

To be discussed set to Yes

The reports_events and statistics_events tables seem to be accessed by interval. That is implemented as a text field. I advise to drop it and instead use a combined index on (dt_from, dt_to). The attribute delta is also redundant as it is the difference of dt_to - dt_from, createtime was discussed and should be kept for bookkeeping's sake. The id seems to serve no purpose than being an artificial primary key, this could be switched to the interval pair. Also, the JSON documents seem to only be stored, not used in queries, therefore I advise json instead of jsonb.
So, for statistics_events I would use a structure like this:

CREATE TABLE statistics_events (
  dt_from datetime NOT NULL,
  dt_to datetime NOT NULL CHECK (dt_to > dt_from),
  count integer NOT NULL,
  createtime datetime NOT NULL CHECK (createtime > dt_to),
  stats_overall json,
  stats_internal json,
  stats external json,
  PRIMARY KEY (dt_from, dt_to)
);

Actions

Copy link