Task #4100
closed
Move datetime evaluations to DB where appropriate
Added by Radko Krkoš over 6 years ago.
Updated about 6 years ago.
Category:
Development - Core
Description
There are queries for all events from a time window <T-4h, T), which use externally supplied timestamp, e.g.:
2018-05-21 09:55:04.113 CEST statement: SELECT * FROM events WHERE "cesnet_storagetime" >= '2018-05-21T05:55:01.362722'::timestamp
Such queries are currently planned based on ANALYZE statistics which get out of date very quickly, leading to suboptimal plans.
Replacing those queries by:
SELECT * FROM events WHERE "cesnet_storagetime" >= NOW() - INTERVAL '4h'
would allow the planner to understand the true meaning and choose the optimal plan (single threaded sequential scan over storagetime index).
- The origin of these queries is currently unknown, reporter is the suspect.
- This query seems to be ran every hour at n:55, n=0,1,2,...,23.
After more investigation (by Jan Mach), watchdog was confirmed to be the source of the query. The purpose is to detect whether any data were stored in the last 2 hours (stored in GMT).
The most efficient way to do this is:
SELECT MAX > CURRENT_TIMESTAMP AT TIME ZONE 'GMT' - INTERVAL '2h' FROM events;
returning boolean TRUE/FALSE, with run time of <1ms, therefore an improvement of 4 orders of magnitude.
Unmangled query, sorry:
SELECT MAX(cesnet_storagetime) > CURRENT_TIMESTAMP AT TIME ZONE 'GMT’ - INTERVAL '2h’ FROM events;
- Target version set to Backlog
- Status changed from New to Resolved
This query was used by the watchdog and is being migrated to a more efficient one like:
SELECT max(cesnet_storagetime) > NOW() AT TIME ZONE 'GMT' - INTERVAL '2h' FROM events;
returning the boolean value that watchdog is interested in, with runtime of <1ms.
- Status changed from Resolved to In Progress
This issue is not resolved yet, implementation was not yet done.
- Target version changed from Backlog to 2.1
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Attached commit successfully resolves the issue. Accepted solution:
params = ('{:d}s'.format(interval),)
query = psycopg2.sql.SQL("SELECT max({}) > NOW() AT TIME ZONE 'GMT' - INTERVAL %s AS watchdog FROM events").\
format(psycopg2.sql.Identifier('cesnet_storagetime'))
Issue is in the resolved state, waiting for the verification after building and deploying the packages on the server before closing it for good.
- Assignee changed from Jan Mach to Radko Krkoš
- Status changed from Resolved to Closed
Accepted solution matches the recommendation. If the watchdog works as intended, then it is an improvement from the DB point of view. "Less is less". Closing the issue.
Also available in: Atom
PDF