Task #4100

Move datetime evaluations to DB where appropriate

Added by Radko Krkoš 9 months ago. Updated 5 months ago.

Status:ClosedStart date:05/22/2018
Priority:NormalDue date:
Assignee:Radko Krkoš% Done:

100%

Category:Development - Core
Target version:2.1

Description

There are queries for all events from a time window <T-4h, T), which use externally supplied timestamp, e.g.:

2018-05-21 09:55:04.113 CEST statement: SELECT * FROM events WHERE "cesnet_storagetime" >= '2018-05-21T05:55:01.362722'::timestamp

Such queries are currently planned based on ANALYZE statistics which get out of date very quickly, leading to suboptimal plans.
Replacing those queries by:

SELECT * FROM events WHERE "cesnet_storagetime" >= NOW() - INTERVAL '4h'

would allow the planner to understand the true meaning and choose the optimal plan (single threaded sequential scan over storagetime index).
  • The origin of these queries is currently unknown, reporter is the suspect.

Associated revisions

Revision 54788cbe
Added by Jan Mach 5 months ago

Optimized event table watchdog query for better performance.

According to the recommendation of our database specialist the event table watchdog query was rewritten for better performance. New approach makes better use of database and results in better database query plan and procesing time. More detail on the issue in appropriate Redmine ticket. (Redmine issue: #4100)

History

#1 Updated by Radko Krkoš 9 months ago

  • This query seems to be ran every hour at n:55, n=0,1,2,...,23.

#2 Updated by Radko Krkoš 9 months ago

After more investigation (by Jan Mach), watchdog was confirmed to be the source of the query. The purpose is to detect whether any data were stored in the last 2 hours (stored in GMT).
The most efficient way to do this is:

SELECT MAX > CURRENT_TIMESTAMP AT TIME ZONE 'GMT’ - INTERVAL '2h’ FROM events;

returning boolean TRUE/FALSE, with run time of <1ms, therefore an improvement of 4 orders of magnitude.

#3 Updated by Radko Krkoš 9 months ago

Unmangled query, sorry:

SELECT MAX(cesnet_storagetime) > CURRENT_TIMESTAMP AT TIME ZONE 'GMT’ - INTERVAL '2h’ FROM events;

#4 Updated by Jan Mach 7 months ago

  • Target version set to Future

#5 Updated by Radko Krkoš 7 months ago

  • Status changed from New to Resolved

This query was used by the watchdog and is being migrated to a more efficient one like:

SELECT max(cesnet_storagetime) > NOW() AT TIME ZONE 'GMT' - INTERVAL '2h' FROM events;

returning the boolean value that watchdog is interested in, with runtime of <1ms.

#6 Updated by Jan Mach 7 months ago

  • Status changed from Resolved to In Progress

This issue is not resolved yet, implementation was not yet done.

#7 Updated by Jan Mach 5 months ago

  • Target version changed from Future to 2.1

#8 Updated by Jan Mach 5 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Attached commit successfully resolves the issue. Accepted solution:

params = ('{:d}s'.format(interval),)
query = psycopg2.sql.SQL("SELECT max({}) > NOW() AT TIME ZONE 'GMT' - INTERVAL %s AS watchdog FROM events").\
    format(psycopg2.sql.Identifier('cesnet_storagetime'))

Issue is in the resolved state, waiting for the verification after building and deploying the packages on the server before closing it for good.

#9 Updated by Pavel Kácha 5 months ago

  • Assignee changed from Jan Mach to Radko Krkoš

#10 Updated by Radko Krkoš 5 months ago

  • Status changed from Resolved to Closed

Accepted solution matches the recommendation. If the watchdog works as intended, then it is an improvement from the DB point of view. “Less is less”. Closing the issue.

Also available in: Atom PDF