Bug #6211
closedMissing indices on aggregated columns
100%
Description
There are no indices on
- source_ip_aggr_ip4
- target_ip_aggr_ip4
- source_ip_aggr_ip6
- target_ip_aggr_ip6
on both mentat-hub and mentat-alt.
It might have vanished with some migration, anyway it doesn't seem right.
Related issues
Updated by Pavel Kácha almost 5 years ago
- To be discussed deleted (
Yes)
Needs some more investigation (one index? more indices? works at all?)
Updated by Radko Krkoš over 4 years ago
- Related to Config #6251: Rethink the indices and planner cost optimizations in light of the current usage patterns added
Updated by Radko Krkoš over 4 years ago
- Assignee changed from Radko Krkoš to Pavel Kácha
The indices are indeed missing, probably forgotten during schema update. Adding them now is of course trivial, but in the end futile. For the indices to have possitive effect, planner costs would have to be altered. This was done on mentat-alt
experimentally, but it breaks the fragile equilibrium, breaking many other cases and leading to suboptimal plans there.
Based on the work on this issue, #6251 was devised as the best course of action, the work should continue there and this issue should be abandoned. Reassigning to Pavel for consideration.
Updated by Pavel Kácha over 4 years ago
- Status changed from New to Rejected
Radko Krkoš wrote:
#6251 was devised as the best course of action, the work should continue there and this issue should be abandoned. Reassigning to Pavel for consideration.
Okay, let's see what comes up from #6251 (it might be that different set of indices will solve/mitigate the issue, or we pursue different form, or we drop the idea - and columns - altoghether).
Updated by Radko Krkoš over 4 years ago
- Status changed from Rejected to Deferred
- Assignee changed from Pavel Kácha to Radko Krkoš
- Target version changed from 2.7 to Backlog
- To be discussed set to Yes
With PG13, this might actually be finally possible. The multi-column statistics are supposed to be extended for the ANY
operator, leading to better plans with sensible (compatible with other queries) planner costs. Details here: https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development, a long discussion about three separate improvements, the first one is of interest in this case.
Updated by Radko Krkoš over 4 years ago
- Status changed from Deferred to In Progress
- To be discussed changed from Yes to No
The conclusion from video is that this should be tested with self-compiled PG13beta1 on mentat-dev
with data exported from a running Mentat instance.
Updated by Radko Krkoš over 4 years ago
- % Done changed from 0 to 50
- To be discussed changed from No to Yes
The database on mentat-dev
was updated to PGv13beta1, there are actually testing debian packages available from the postgres apt repository. The events
table with all of its configuration was migrated using pg_dump
/pg_restore
from mentat-alt
. The difference is that mentat-dev
is of a much lower perforance budget and also the table does not fit into the RAM there.
The testing shows, that the improvements in PG13 are helping tremendously with this workload. Out of the box (with events
table and it's settings transferred), the querying works as designed:
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=442687.13..442687.38 rows=100 width=581) (actual time=1661.499..1661.502 rows=6 loops=1) -> Sort (cost=442687.13..442695.19 rows=3225 width=581) (actual time=1643.662..1643.663 rows=6 loops=1) Sort Key: detecttime DESC Sort Method: quicksort Memory: 31kB -> Bitmap Heap Scan on events "_mentatq(83_iteeik)_" (cost=201791.51..442563.87 rows=3225 width=581) (actual time=1547.127..1643.640 rows=6 loops=1) Recheck Cond: ((source_ip_aggr_ip4 && '195.113.252.59'::ip4r) AND (detecttime >= '2020-05-22 12:00:00+02'::timestamp with time zone) AND (detecttime <= '2020-05-29 12:00:00+02'::timestamp with time zone)) Rows Removed by Index Recheck: 129936 Filter: ('195.113.252.59'::iprange && ANY (source_ip)) Rows Removed by Filter: 119 Heap Blocks: exact=724 lossy=9323 -> BitmapAnd (cost=201791.51..201791.51 rows=65958 width=0) (actual time=1513.217..1513.217 rows=0 loops=1) -> Bitmap Index Scan on ip_aggr_idx (cost=0.00..13969.77 rows=329780 width=0) (actual time=437.759..437.759 rows=279233 loops=1) Index Cond: (source_ip_aggr_ip4 && '195.113.252.59'::ip4r) -> Bitmap Index Scan on events_detecttime_idx (cost=0.00..187819.87 rows=13191543 width=0) (actual time=1052.043..1052.043 rows=12965074 loops=1) Index Cond: ((detecttime >= '2020-05-22 12:00:00+02'::timestamp with time zone) AND (detecttime <= '2020-05-29 12:00:00+02'::timestamp with time zone)) Planning Time: 0.649 ms JIT: Functions: 5 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 3.807 ms, Inlining 0.000 ms, Optimization 1.531 ms, Emission 16.037 ms, Total 21.375 ms Execution Time: 1668.019 ms
The absolute run-times are not comparable. The tests show that it is best to use the extended query form with aggregation comparison for prefixes longer than /16. For shorter prefixes, the original form offers better performance (as matching events are quite abundant).
All in all, this seems like a viable approach with PG13, as was intended originally.
Updated by Pavel Kácha over 4 years ago
Looks cool! Thanks for testing.
Did you test with full data (not fitting into RAM), or only part of it (fitting into RAM)? Because if you can get queries under 2 secs on full data on this iron, I'm curious what final times on our boxes will be.
Anyway, seems we can postpone more convoluted solution attempts for now.
Updated by Radko Krkoš over 4 years ago
Pavel Kácha wrote:
Did you test with full data (not fitting into RAM), or only part of it (fitting into RAM)? Because if you can get queries under 2 secs on full data on this iron, I'm curious what final times on our boxes will be.
This was with full data, the events
table as exported from mentat-alt
is 39GB, 53GB with indices, the machine has 32GB of RAM. Not that it matters, this is from cache and the indices and corresponding pages can fit into RAM.
The run times on mentat-hub
for such a query were about 250ms if I remember correctly.
Pavel Kácha wrote:
Anyway, seems we can postpone more convoluted solution attempts for now.
Especially as this is practically done, it just was not usable because of query planner internals until PG13.
Updated by Radko Krkoš about 4 years ago
- Related to Task #6610: Upgrade PostgreSQL to v13 added
Updated by Pavel Kácha about 4 years ago
Indices have been recreated and succesfuly tested on production on PSQL13. However, it's necessary to check if they are correctly created/taken into account on installation and Mentat upgrades.
Updated by Radko Krkoš about 4 years ago
- Status changed from In Progress to Feedback
- Assignee changed from Radko Krkoš to Pavel Kácha
- % Done changed from 50 to 90
Implemented and to be reviewed in repository branch source:@krkos-bugfix-#6211-ip_aggr_idx. This branch also contains documentation update for #6610, as that one is already closed and the update is thematically similar.
By the way, this still has "Blacklog" assigned as the target version, this should be fixed.
Updated by Radko Krkoš almost 4 years ago
- To be discussed changed from No to Yes
Updated by Pavel Kácha almost 4 years ago
- Target version changed from Backlog to 2.8
Updated by Pavel Kácha almost 4 years ago
- To be discussed changed from Yes to No
Updated by Pavel Kácha almost 4 years ago
- Status changed from Feedback to In Review
Updated by Jan Mach almost 4 years ago
- Status changed from In Review to Closed
- % Done changed from 90 to 100
Merged into devel.