Feature #4276
closedTest database conversion time
100%
Description
- crate new table
- alter table old jsonb NULL
- repeat: select old jsonb into new jsonb limit XXX; until done
- alter table drop old jsonb
- vacuum
- cluster
Related issues
Updated by Radko Krkoš about 6 years ago
- % Done changed from 0 to 50
Results:
- crate new table
Instant, only metadata are written. - alter table old jsonb NULL
Hard to predict for mentat-hub. Dropping NOT NULL seems to be much slower than setting it. Needs to be measured later. - repeat: select old jsonb into new jsonb limit XXX; until done
Dependent on other load. Single run looks very fast, the whole table will take some time. It is advised to do this in a single transaction, iterate over storage time with LIMIT + OFFSET. The final CLUSTER will take care of the "mess" created that way. - alter table drop old jsonb
Instant, only metadata are written. - vacuum
Does not take long for tables with no (few) DELETEs and/or UPDATEs. Autovacuum as set up now should be sufficient during the new table population, one final vacuum on both tables before CLUSTERing them is advised. Also, disable old data removal for the migration period to reduce the background load. - cluster
Very dependent on disk subsystem throughput, will take at least an hour on mentat-hub based on tests from before it became production system (not possible to test now without some serious downtime). It will be faster if Mentat is shut down (including cron jobs - in mentat-controller command terminology: disabled) for this.
Updated by Radko Krkoš about 6 years ago
Radko Krkoš wrote:
- alter table old jsonb NULL
Hard to predict for mentat-hub. Dropping NOT NULL seems to be much slower than setting it. Needs to be measured later.
This is a non-issue as event BYTEA is already declared as Nullable.
Just for reference, ALTER TABLE x ALTER COLUMN y DROP NOT NULL;
which would be otherwise required is an instant operation. The SET NOT NULL
inverse operation requires a full scan (see remark below) to check if there are any NULL values.
SET NOT NULL
Updated by Radko Krkoš about 6 years ago
- % Done changed from 50 to 90
To sum this up, only SELECT INTO data transfer and CLUSTER are time consuming, with only CLUSTER requiring downtime. The data transfer can benefit from stopped import and cron jobs if this is acceptable. Ready for discussion and closure at the next VC.
Updated by Pavel Kácha almost 6 years ago
- Status changed from New to Resolved
- Priority changed from Normal to Low
So, we need to take care of incremental conversion, may take days. Cluster would need several hours of outage. Leaving open for reference.
Updated by Radko Krkoš almost 6 years ago
- Priority changed from Low to Normal
Based on latest tests, lessons learned from behaviour while running out of disk cache and the recent reduction of stored event time interval, the best course of action would be:
1. Prepare Mentat for the new model.
2. Stop event import, deploy the updated mentat packages.
3. Create the data table (id + event bytea).
4. Insert events into the new data table.
5. ALTER TABLE events DROP COLUMN event;
6. CLUSTER events;
7. Start event import.
This will require downtime (2-7), that will be similar to the one during PostgreSQL upgrade to v11, even shorter as the amount of stored events is quite a bit lower. I guess such downtime is acceptable. This reduces the implementation complexity considerably.
Measurements show that the split results in 1:2 table size reduction. That would mean fitting about 3x as much data before running out of cache.
Priority increase as this is showing to be more acute.
Please comment. Or should we plan a VC? There is quite a bit to discuss.
Updated by Radko Krkoš almost 6 years ago
- Status changed from Resolved to Feedback
Updated by Radko Krkoš almost 6 years ago
- Status changed from Feedback to Deferred
- Parent task deleted (
#4275)
Feedback received, the conversion will be performed using the steps in #4276#note-5 once the code in Mentat is ready.
Updated by Radko Krkoš almost 6 years ago
- Blocked by Feature #4275: Split jsonb column into its own table added
Updated by Radko Krkoš almost 6 years ago
CLUSTER
time on mentat-hub
for 4 weeks of data: 27 minutes. This was done before the wal_level=minimal
change as discussed in #4515, so a lower run time should be expected.
Updated by Radko Krkoš about 5 years ago
- Status changed from Deferred to Resolved
- % Done changed from 90 to 100
- To be discussed changed from No to Yes
- Stop import (warden_filer)
- Perform
VACUUM FREEZE
- Stop mentat
- Perform migration
- Perform
CLUSTER
- Perform
ANALYZE
- Start mentat and warden_filer
This can limit the web interface and services downtime to below 1h. If new event import and processing is paramount, swap the first two steps. This will prolong the procedure somewhat, not terribly.
Updated by Jan Mach about 5 years ago
- Target version changed from Backlog to 2.6
Updated by Pavel Kácha about 5 years ago
- Status changed from Resolved to Closed
Closed after consensus