Project

General

Profile

Bug #6211

Missing indices on aggregated columns

Added by Pavel Kácha 5 months ago. Updated 19 days ago.

Status:
In Progress
Priority:
Normal
Assignee:
Category:
Development - Database
Target version:
Start date:
02/07/2020
Due date:
% Done:

50%

Estimated time:
To be discussed:
No

Description

There are no indices on

  • source_ip_aggr_ip4
  • target_ip_aggr_ip4
  • source_ip_aggr_ip6
  • target_ip_aggr_ip6

on both mentat-hub and mentat-alt.

It might have vanished with some migration, anyway it doesn’t seem right.


Related issues

Related to Mentat - Config #6251: Rethink the indices and planner cost optimizations in light of the current usage patternsNewRadko Krkoš03/05/2020

Actions
#1

Updated by Jan Mach 5 months ago

  • To be discussed changed from No to Yes
#2

Updated by Pavel Kácha 5 months ago

  • To be discussed deleted (Yes)

Needs some more investigation (one index? more indices? works at all?)

#3

Updated by Radko Krkoš 4 months ago

  • Related to Config #6251: Rethink the indices and planner cost optimizations in light of the current usage patterns added
#4

Updated by Radko Krkoš 3 months ago

  • Assignee changed from Radko Krkoš to Pavel Kácha

The indices are indeed missing, probably forgotten during schema update. Adding them now is of course trivial, but in the end futile. For the indices to have possitive effect, planner costs would have to be altered. This was done on mentat-alt experimentally, but it breaks the fragile equilibrium, breaking many other cases and leading to suboptimal plans there.
Based on the work on this issue, #6251 was devised as the best course of action, the work should continue there and this issue should be abandoned. Reassigning to Pavel for consideration.

#5

Updated by Pavel Kácha 3 months ago

  • Status changed from New to Rejected

Radko Krkoš wrote:

#6251 was devised as the best course of action, the work should continue there and this issue should be abandoned. Reassigning to Pavel for consideration.

Okay, let’s see what comes up from #6251 (it might be that different set of indices will solve/mitigate the issue, or we pursue different form, or we drop the idea - and columns - altoghether).

#6

Updated by Radko Krkoš about 2 months ago

  • Status changed from Rejected to Deferred
  • Assignee changed from Pavel Kácha to Radko Krkoš
  • Target version changed from 2.7 to Backlog
  • To be discussed set to Yes

With PG13, this might actually be finally possible. The multi-column statistics are supposed to be extended for the ANY operator, leading to better plans with sensible (compatible with other queries) planner costs. Details here: https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development, a long discussion about three separate improvements, the first one is of interest in this case.

#7

Updated by Radko Krkoš about 2 months ago

  • Status changed from Deferred to In Progress
  • To be discussed changed from Yes to No

The conclusion from video is that this should be tested with self-compiled PG13beta1 on mentat-dev with data exported from a running Mentat instance.

#8

Updated by Radko Krkoš about 2 months ago

  • % Done changed from 0 to 50
  • To be discussed changed from No to Yes

The database on mentat-dev was updated to PGv13beta1, there are actually testing debian packages available from the postgres apt repository. The events table with all of its configuration was migrated using pg_dump/pg_restore from mentat-alt. The difference is that mentat-dev is of a much lower perforance budget and also the table does not fit into the RAM there.
The testing shows, that the improvements in PG13 are helping tremendously with this workload. Out of the box (with events table and it’s settings transferred), the querying works as designed:

                                                                                                         QUERY PLAN                                                                                                         
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=442687.13..442687.38 rows=100 width=581) (actual time=1661.499..1661.502 rows=6 loops=1)
   ->  Sort  (cost=442687.13..442695.19 rows=3225 width=581) (actual time=1643.662..1643.663 rows=6 loops=1)
         Sort Key: detecttime DESC
         Sort Method: quicksort  Memory: 31kB
         ->  Bitmap Heap Scan on events "_mentatq(83_iteeik)_"  (cost=201791.51..442563.87 rows=3225 width=581) (actual time=1547.127..1643.640 rows=6 loops=1)
               Recheck Cond: ((source_ip_aggr_ip4 && '195.113.252.59'::ip4r) AND (detecttime >= '2020-05-22 12:00:00+02'::timestamp with time zone) AND (detecttime <= '2020-05-29 12:00:00+02'::timestamp with time zone))
               Rows Removed by Index Recheck: 129936
               Filter: ('195.113.252.59'::iprange && ANY (source_ip))
               Rows Removed by Filter: 119
               Heap Blocks: exact=724 lossy=9323
               ->  BitmapAnd  (cost=201791.51..201791.51 rows=65958 width=0) (actual time=1513.217..1513.217 rows=0 loops=1)
                     ->  Bitmap Index Scan on ip_aggr_idx  (cost=0.00..13969.77 rows=329780 width=0) (actual time=437.759..437.759 rows=279233 loops=1)
                           Index Cond: (source_ip_aggr_ip4 && '195.113.252.59'::ip4r)
                     ->  Bitmap Index Scan on events_detecttime_idx  (cost=0.00..187819.87 rows=13191543 width=0) (actual time=1052.043..1052.043 rows=12965074 loops=1)
                           Index Cond: ((detecttime >= '2020-05-22 12:00:00+02'::timestamp with time zone) AND (detecttime <= '2020-05-29 12:00:00+02'::timestamp with time zone))
 Planning Time: 0.649 ms
 JIT:
   Functions: 5
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 3.807 ms, Inlining 0.000 ms, Optimization 1.531 ms, Emission 16.037 ms, Total 21.375 ms
 Execution Time: 1668.019 ms

The absolute run-times are not comparable. The tests show that it is best to use the extended query form with aggregation comparison for prefixes longer than /16. For shorter prefixes, the original form offers better performance (as matching events are quite abundant).
All in all, this seems like a viable approach with PG13, as was intended originally.

#9

Updated by Pavel Kácha about 2 months ago

Looks cool! Thanks for testing.

Did you test with full data (not fitting into RAM), or only part of it (fitting into RAM)? Because if you can get queries under 2 secs on full data on this iron, I’m curious what final times on our boxes will be.

Anyway, seems we can postpone more convoluted solution attempts for now.

#10

Updated by Radko Krkoš about 2 months ago

Pavel Kácha wrote:

Did you test with full data (not fitting into RAM), or only part of it (fitting into RAM)? Because if you can get queries under 2 secs on full data on this iron, I’m curious what final times on our boxes will be.

This was with full data, the events table as exported from mentat-alt is 39GB, 53GB with indices, the machine has 32GB of RAM. Not that it matters, this is from cache and the indices and corresponding pages can fit into RAM.
The run times on mentat-hub for such a query were about 250ms if I remember correctly.

Pavel Kácha wrote:

Anyway, seems we can postpone more convoluted solution attempts for now.

Especially as this is practically done, it just was not usable because of query planner internals until PG13.

#11

Updated by Radko Krkoš 19 days ago

  • To be discussed changed from Yes to No

Also available in: Atom PDF