Project

General

Profile

Actions

Bug #6211

closed

Missing indices on aggregated columns

Added by Pavel Kácha almost 5 years ago. Updated almost 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Development - Database
Target version:
Start date:
02/07/2020
Due date:
% Done:

100%

Estimated time:
To be discussed:
No

Description

There are no indices on

  • source_ip_aggr_ip4
  • target_ip_aggr_ip4
  • source_ip_aggr_ip6
  • target_ip_aggr_ip6

on both mentat-hub and mentat-alt.

It might have vanished with some migration, anyway it doesn't seem right.


Related issues

Related to Mentat - Config #6251: Rethink the indices and planner cost optimizations in light of the current usage patternsDeferredRadko Krkoš03/05/2020

Actions
Related to Mentat - Task #6610: Upgrade PostgreSQL to v13ClosedRadko Krkoš10/30/2020

Actions
Actions #1

Updated by Jan Mach almost 5 years ago

  • To be discussed changed from No to Yes
Actions #2

Updated by Pavel Kácha almost 5 years ago

  • To be discussed deleted (Yes)

Needs some more investigation (one index? more indices? works at all?)

Actions #3

Updated by Radko Krkoš over 4 years ago

  • Related to Config #6251: Rethink the indices and planner cost optimizations in light of the current usage patterns added
Actions #4

Updated by Radko Krkoš over 4 years ago

  • Assignee changed from Radko Krkoš to Pavel Kácha

The indices are indeed missing, probably forgotten during schema update. Adding them now is of course trivial, but in the end futile. For the indices to have possitive effect, planner costs would have to be altered. This was done on mentat-alt experimentally, but it breaks the fragile equilibrium, breaking many other cases and leading to suboptimal plans there.
Based on the work on this issue, #6251 was devised as the best course of action, the work should continue there and this issue should be abandoned. Reassigning to Pavel for consideration.

Actions #5

Updated by Pavel Kácha over 4 years ago

  • Status changed from New to Rejected

Radko Krkoš wrote:

#6251 was devised as the best course of action, the work should continue there and this issue should be abandoned. Reassigning to Pavel for consideration.

Okay, let's see what comes up from #6251 (it might be that different set of indices will solve/mitigate the issue, or we pursue different form, or we drop the idea - and columns - altoghether).

Actions #6

Updated by Radko Krkoš over 4 years ago

  • Status changed from Rejected to Deferred
  • Assignee changed from Pavel Kácha to Radko Krkoš
  • Target version changed from 2.7 to Backlog
  • To be discussed set to Yes

With PG13, this might actually be finally possible. The multi-column statistics are supposed to be extended for the ANY operator, leading to better plans with sensible (compatible with other queries) planner costs. Details here: https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development, a long discussion about three separate improvements, the first one is of interest in this case.

Actions #7

Updated by Radko Krkoš over 4 years ago

  • Status changed from Deferred to In Progress
  • To be discussed changed from Yes to No

The conclusion from video is that this should be tested with self-compiled PG13beta1 on mentat-dev with data exported from a running Mentat instance.

Actions #8

Updated by Radko Krkoš over 4 years ago

  • % Done changed from 0 to 50
  • To be discussed changed from No to Yes

The database on mentat-dev was updated to PGv13beta1, there are actually testing debian packages available from the postgres apt repository. The events table with all of its configuration was migrated using pg_dump/pg_restore from mentat-alt. The difference is that mentat-dev is of a much lower perforance budget and also the table does not fit into the RAM there.
The testing shows, that the improvements in PG13 are helping tremendously with this workload. Out of the box (with events table and it's settings transferred), the querying works as designed:

                                                                                                         QUERY PLAN                                                                                                         
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=442687.13..442687.38 rows=100 width=581) (actual time=1661.499..1661.502 rows=6 loops=1)
   ->  Sort  (cost=442687.13..442695.19 rows=3225 width=581) (actual time=1643.662..1643.663 rows=6 loops=1)
         Sort Key: detecttime DESC
         Sort Method: quicksort  Memory: 31kB
         ->  Bitmap Heap Scan on events "_mentatq(83_iteeik)_"  (cost=201791.51..442563.87 rows=3225 width=581) (actual time=1547.127..1643.640 rows=6 loops=1)
               Recheck Cond: ((source_ip_aggr_ip4 && '195.113.252.59'::ip4r) AND (detecttime >= '2020-05-22 12:00:00+02'::timestamp with time zone) AND (detecttime <= '2020-05-29 12:00:00+02'::timestamp with time zone))
               Rows Removed by Index Recheck: 129936
               Filter: ('195.113.252.59'::iprange && ANY (source_ip))
               Rows Removed by Filter: 119
               Heap Blocks: exact=724 lossy=9323
               ->  BitmapAnd  (cost=201791.51..201791.51 rows=65958 width=0) (actual time=1513.217..1513.217 rows=0 loops=1)
                     ->  Bitmap Index Scan on ip_aggr_idx  (cost=0.00..13969.77 rows=329780 width=0) (actual time=437.759..437.759 rows=279233 loops=1)
                           Index Cond: (source_ip_aggr_ip4 && '195.113.252.59'::ip4r)
                     ->  Bitmap Index Scan on events_detecttime_idx  (cost=0.00..187819.87 rows=13191543 width=0) (actual time=1052.043..1052.043 rows=12965074 loops=1)
                           Index Cond: ((detecttime >= '2020-05-22 12:00:00+02'::timestamp with time zone) AND (detecttime <= '2020-05-29 12:00:00+02'::timestamp with time zone))
 Planning Time: 0.649 ms
 JIT:
   Functions: 5
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 3.807 ms, Inlining 0.000 ms, Optimization 1.531 ms, Emission 16.037 ms, Total 21.375 ms
 Execution Time: 1668.019 ms

The absolute run-times are not comparable. The tests show that it is best to use the extended query form with aggregation comparison for prefixes longer than /16. For shorter prefixes, the original form offers better performance (as matching events are quite abundant).
All in all, this seems like a viable approach with PG13, as was intended originally.

Actions #9

Updated by Pavel Kácha over 4 years ago

Looks cool! Thanks for testing.

Did you test with full data (not fitting into RAM), or only part of it (fitting into RAM)? Because if you can get queries under 2 secs on full data on this iron, I'm curious what final times on our boxes will be.

Anyway, seems we can postpone more convoluted solution attempts for now.

Actions #10

Updated by Radko Krkoš over 4 years ago

Pavel Kácha wrote:

Did you test with full data (not fitting into RAM), or only part of it (fitting into RAM)? Because if you can get queries under 2 secs on full data on this iron, I'm curious what final times on our boxes will be.

This was with full data, the events table as exported from mentat-alt is 39GB, 53GB with indices, the machine has 32GB of RAM. Not that it matters, this is from cache and the indices and corresponding pages can fit into RAM.
The run times on mentat-hub for such a query were about 250ms if I remember correctly.

Pavel Kácha wrote:

Anyway, seems we can postpone more convoluted solution attempts for now.

Especially as this is practically done, it just was not usable because of query planner internals until PG13.

Actions #11

Updated by Radko Krkoš over 4 years ago

  • To be discussed changed from Yes to No
Actions #12

Updated by Radko Krkoš about 4 years ago

  • Related to Task #6610: Upgrade PostgreSQL to v13 added
Actions #13

Updated by Pavel Kácha about 4 years ago

Indices have been recreated and succesfuly tested on production on PSQL13. However, it's necessary to check if they are correctly created/taken into account on installation and Mentat upgrades.

Actions #14

Updated by Radko Krkoš about 4 years ago

  • Status changed from In Progress to Feedback
  • Assignee changed from Radko Krkoš to Pavel Kácha
  • % Done changed from 50 to 90

Implemented and to be reviewed in repository branch source:@krkos-bugfix-#6211-ip_aggr_idx. This branch also contains documentation update for #6610, as that one is already closed and the update is thematically similar.
By the way, this still has "Blacklog" assigned as the target version, this should be fixed.

Actions #15

Updated by Radko Krkoš almost 4 years ago

  • To be discussed changed from No to Yes
Actions #16

Updated by Pavel Kácha almost 4 years ago

  • Target version changed from Backlog to 2.8
Actions #17

Updated by Pavel Kácha almost 4 years ago

  • To be discussed changed from Yes to No
Actions #18

Updated by Pavel Kácha almost 4 years ago

  • Status changed from Feedback to In Review
Actions #19

Updated by Jan Mach almost 4 years ago

  • Status changed from In Review to Closed
  • % Done changed from 90 to 100

Merged into devel.

Actions

Also available in: Atom PDF