Task #7538
openStreamline documentation concerning VACUUM/CLUSTER/ANALYZE
0%
Description
Stemmed from #7215:
Rajmund Hruska wrote in #note-12:
I fixed the unit tests so the only thing left is to discuss whether or not to put
CLUSTER
in the migration.
Pavel Kácha wrote in #note-13:
In the Upgrading docs we have "step 4", where we want admin to run VACUUM/CLUSTER/ANALYZE unconditionally on any upgrade - which we think solves it and we do not need CLUSTER in the migration.
However, we might make "step 4" conditional - "run this only if Alembic runs any migrations".
Radko, could you please confirm?
Radko Krkoš wrote in #note-15:
It is more complicated. The first part of step 4 of the "Upgrading" section indeed runs
alembic
migrations, first formentat_main
throughflask
's CLI, then onmentat_events
through a helper script, then setsCLUSTER
-ing indices and autovacuum parameters. All of that is idempotent and effectively no-op if there are no changes.
The second part instructs the user to runVACUUM FREEZE
,CLUSTER
andANALYZE
by hand. At leastVACUUM FREEZE
is either useful or quick, if it was not or was run recently, respectively. I would keep that one. The last one,ANALYZE
is also useful and takes a short time, so skipping it adds less value than it takes.
The only non-straightforward one is theCLUSTER
. That is never no-op, it can and will take a lot of time. Also, one canCLUSTER
a freshlyCLUSTER
-ed table, running the whole process again with the same run time but no benefits. The trouble is that it is hard to decide whether aCLUSTER
might be useful, based on information that the administrator has available. First, it is not obvious whether anyalembic
migrations will be run until the above mentioned commands are run. Next, even if there are any migrations, they might not affect the tables in a way thatCLUSTER
would be beneficial. Last, it is a good idea to runCLUSTER
from time to time on workloads such as Mentat, where data are piped (as in running window) through the database. The need forCLUSTER
should be determined based on table/index bloat.
I would go with extending thesqldb-optimize.sh
script and adding code for conditionallyCLUSTER
-ing only tables with excess bloat. A bloat level detection code does exist, it is reported by thesystem-status
script.
Pavel Kácha wrote in #note-16:
Mmmkay. After discussion - I guess we might revisit tuning section of Database doc - akin to "add recommended CLUSTER run based on high bloat". Or, as you mentioned, we could consider adding this into sqldb-optimize.sh (however I'm unsure about that, this script only sets recommended values for arbitrary runs of VACUUM/CLUSTER/ANALYZE).
Current bloat reporting script is 11k of dense Perl code from Nagios tools, Radko, isn't there something ligher on stomach?
Related issues
Updated by Pavel Kácha about 3 years ago
- Related to Bug #7215: Wrong usage of DB array in mentat_main/settings_reporting/emails added
Updated by Pavel Kácha almost 3 years ago
- Target version changed from 2.9 to 2.10
Updated by Pavel Kácha over 2 years ago
Mostly solved in Support #7555: Upgrade PostgreSQL to v14, but needs to be incorporated from Database upgrade section also to Mentat upgrade section.
Updated by Rajmund Hruška over 2 years ago
- Target version changed from 2.10 to 2.11
Updated by Rajmund Hruška over 1 year ago
- Target version changed from 2.11 to Backlog