Task #3362
closed
Migrate Mentat system to new hardware
100%
Description
Overall migration status (periodically updated)
- (DONE) Install new base server
- New server mentat-alt.cesnet.cz is ready and installed using Ansible
- (DONE) Install Mentat system and Warden client on the server
- Development version of Mentat system is installed on the server using Debian package system
- Warden client is installed on the server and connected to production instance of Warden server
- (IN PROGRESS) Perform data and service migration to new server
- Prepare database migration scripts
- Prepare filesystem migration scripts
- Prepare utility migration scripts
- Migrate the Mentat service
- Verify functionality
General guidelines for migration process
- The day before migration, lower the TTL of the relevant DNS records for the mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers
- Presynchronize filesystem data (rsync), so that the actual migration will be much quicker later.
- report attachments
- RRD databases and chart images
- cache files
- persistent state files
- runlog files? (maybe not necessary)
- log files? (maybe not necessary)
- Shut down Warden client on mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers and let Mentat empty all queues.
- Shut down Mentat systems on mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers
- Perform database migration
- users
- groups
- filters
- networks
- reports
- event statistics
- Perform filesystem migration - same data as above
- Perform configuration migration
- synchronize content of /etc/mentat configuration directory
- Switch warden client certificates between mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers
- Switch shibd configuration between mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers
- Switch hostnames and IP addresses between mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers
- Reboot both servers and pray to your favorite god, or as an atheist sit quietly with your hands in your lap
- Log in to new mentat-hub.cesnet.cz and launch everything
- Launch Mentat backend services (daemons and scripts)
- Launch Warden client and verify messages are being stored into database
- Verify that the web interface is accessible
- Synchronize crontab for root
Event migration might not be necessary. If Mentat runs on the new server for some time without any downtime, we could skip the slow migration of events from MongoDB to PostgreSQL.
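The TTL step from the guidelines above can be checked before the switch. A minimal sketch, assuming dig is available; the function name ttl_too_high and the 300 second limit are illustrative assumptions, not values from this ticket:

```shell
# List records from `dig +noall +answer` output whose TTL exceeds a limit.
# Answer lines have the form: NAME TTL CLASS TYPE RDATA
# The 300 second default limit is an assumption, not a value from this ticket.
ttl_too_high() {
    limit="${1:-300}"
    awk -v limit="$limit" 'NF >= 5 && $2 + 0 > limit { print $1, $2 }'
}

# Hypothetical usage against live DNS (requires dig and network access):
#   dig +noall +answer mentat-hub.cesnet.cz A | ttl_too_high 300
#   dig +noall +answer mentat-alt.cesnet.cz A | ttl_too_high 300
```

Empty output means every returned record is already at or below the limit. Records cached by resolvers before the TTL was lowered may still live for their old TTL, which is why the guidelines lower it a day ahead.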
Migration process checklist
Trial period before migration
During this period the mentat-alt.cesnet.cz server works as an independent and fully operational instance of the Mentat system, which can be used for testing and development purposes.
- (DONE) mentat-alt.cesnet.cz: Install base server.
- mentat-alt.cesnet.cz: Configure server monitoring with Nagios.
- mentat-alt.cesnet.cz: Configure server backup.
- (DONE) mentat-alt.cesnet.cz: Install development version of Mentat system. Keep it running and updated during trial period.
- (DONE) mentat-hub.cesnet.cz: Write script for periodical dump of MongoDB.
- Script is called /root/mentatdb-dump-all.sh.
- Verified that the script is working properly.
- (DONE) mentat-alt.cesnet.cz: Write script for periodical import of MongoDB dumps from mentat-hub.cesnet.cz.
- Script is called /root/mentat-sync-mongodb.sh.
- Verified that the script is working properly.
- (DONE) mentat-hub.cesnet.cz: Install cronjob for script /root/mentat-sync-mongodb.sh to periodically test the import process.
- Installed with the following root crontab record:
5 */4 * * * /root/mentat-sync-mongodb.sh
- The script will perform a fresh dump using /root/mentatdb-dump-all.sh on mentat-hub.cesnet.cz, fetch the result and import it into the local MongoDB instance.
- Verified that the cronjob is working properly.
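For reference, the crontab record 5 */4 * * * fires at minute 5 of every fourth hour, six times a day. A small sketch that enumerates those start times (the function name cron_run_times is only illustrative):

```shell
# Enumerate the daily start times implied by the crontab record
#   5 */4 * * * /root/mentat-sync-mongodb.sh
# "*/4" in the hour field matches hours 0, 4, 8, ..., 20; "5" is the minute.
cron_run_times() {
    hour=0
    while [ "$hour" -lt 24 ]; do
        printf '%02d:05\n' "$hour"
        hour=$((hour + 4))
    done
}

cron_run_times
# prints 00:05, 04:05, 08:05, 12:05, 16:05 and 20:05, one per line
```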
From day before migration until migration time.
After this period the mentat-alt.cesnet.cz server is being prepared for the migration process. All Mentat modules will be stopped and data will be synchronized to the local filesystem. Only the web interface will be operational to some extent and can be used to verify that migrated data will be accessible.
- (DONE) Stop all Warden client daemons.
- (DONE) Stop all Mentat modules.
- (DONE) mentat-alt.cesnet.cz: Write script for periodical Mentat filesystem data synchronization.
- Script is called /root/mentat-sync-files.sh.
- Verified that the script is working properly.
- (DONE) mentat-hub.cesnet.cz: Install cronjob for script /root/mentat-sync-files.sh to periodically prefetch filesystem data to target server.
- Installed with the following root crontab record:
35 * * * * /root/mentat-sync-files.sh --skip-install
- Verified that the cronjob is working properly.
- (DONE) mentat-alt.cesnet.cz: Prepare new networking configuration in file /etc/network/interfaces.new, back up current settings to file /etc/network/interfaces.old.
- New networking configuration can be enabled with the command cp /etc/network/interfaces.new /etc/network/interfaces and by restarting the networking service.
- (DONE) mentat-alt.cesnet.cz: Write script for quick renaming of the server to different name.
- Script will replace all occurrences of mentat-alt with mentat-hub in a list of selected configuration files.
- Script is called /root/system-rename.sh.
- Verified that the script is working properly.
- (DONE) mentat-alt.cesnet.cz: Write script for quick switching of most important configurations.
- Send various configuration files to source server and fetch corresponding ones from it.
- Configurations like server certificates, shibboleth configurations, Warden client configurations, etc.
- Script is called /root/mentat-sync-config.sh.
- Verified that the script is working properly.
- (DONE) Perform initial database migration to be more efficient and less time consuming later.
- Fetched current database with /root/mentat-sync-mongodb.sh.
- Launch migration with /etc/mentat/scripts/sqldb-migrate-data.py --drop in tmux terminal.
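The content of /root/system-rename.sh is not shown in this ticket. The renaming it is described as doing (replace mentat-alt with mentat-hub in selected configuration files) could look roughly like the sketch below. The function name rename_host and the file list in the usage comment are hypothetical:

```shell
# Replace every occurrence of one host name with another in the given files.
# Keeps a ".bak" copy of each file next to the original.
# The function name and the file list are hypothetical; the real
# /root/system-rename.sh is not shown in this ticket.
rename_host() {
    old="$1"; new="$2"; shift 2
    for file in "$@"; do
        sed -i.bak "s/${old}/${new}/g" "$file"
    done
}

# Hypothetical usage:
#   rename_host mentat-alt mentat-hub /etc/hostname /etc/hosts /etc/mailname
```

The old name is used as a sed regular expression, so names containing characters such as "." or "/" would need escaping first. The .bak copies make it easy to roll back if the reboot does not go as planned.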
Actual migration process
# Disable utility migration scripts that were installed before so that they do not mess with migration process.

root@mentat-alt$ /root/mentat-sync-config.sh
root@mentat-alt$ /root/mentat-sync-files.sh
root@mentat-alt$ /root/mentat-sync-mongodb.sh
root@mentat-alt$ /etc/mentat/scripts/sqldb-migrate-data.py --clear --from-timestamp 1532304000
2018-07-24 13:59:48,096 sqldb-migrate-data.py INFO: Data migration results:
2018-07-24 13:59:48,096 sqldb-migrate-data.py INFO: --------------------------------------------------
2018-07-24 13:59:48,102 sqldb-migrate-data.py INFO: User count: 133
2018-07-24 13:59:48,105 sqldb-migrate-data.py INFO: Group count: 295
2018-07-24 13:59:48,107 sqldb-migrate-data.py INFO: Network count: 1,833
2018-07-24 13:59:48,110 sqldb-migrate-data.py INFO: Filter count: 55
2018-07-24 13:59:48,114 sqldb-migrate-data.py INFO: Setting count: 295
2018-07-24 13:59:48,157 sqldb-migrate-data.py INFO: Event reports count: 163,900
2018-07-24 13:59:48,240 sqldb-migrate-data.py INFO: Event stats count: 375,573
2018-07-24 13:59:48,240 sqldb-migrate-data.py INFO: --------------------------------------------------
2018-07-24 13:59:48,240 sqldb-migrate-data.py INFO: Migration started at: 2018-07-24 13:15:49.078338
2018-07-24 13:59:48,240 sqldb-migrate-data.py INFO: Migration finished at: 2018-07-24 13:59:48.096645
2018-07-24 13:59:48,240 sqldb-migrate-data.py INFO: Migration duration: 0:43:59.018307
root@mentat-alt$ psql -f mentat-tweakdb.sql mentat_main
root@mentat-alt$ /root/system-rename.sh

# Configure all Mentat modules by comparing configuration files from old production server.
# Configure all Warden modules, switch warden client certificates.
# Reconfigure IP address settings to values of mentat-hub.cesnet.cz.

root@mentat-alt$ reboot
After rebooting bring the whole system back up:
root@mentat-alt$ mentat-controller.py --command start
root@mentat-alt$ mentat-controller.py --command enable
root@mentat-alt$ /etc/init.d/warden_filer_receiver start
root@mentat-alt$ /etc/init.d/warden_filer_sender start
root@mentat-alt$ update-rc.d warden_filer_receiver defaults
root@mentat-alt$ update-rc.d warden_filer_sender defaults
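The value passed to --from-timestamp above is a Unix epoch. A short sketch for turning it into a readable date, assuming GNU date as shipped with Debian (the function name epoch_to_utc is only illustrative):

```shell
# Convert the epoch value passed to --from-timestamp into a readable UTC date.
# Uses the GNU date "@SECONDS" syntax, which is available on Debian.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

epoch_to_utc 1532304000
# prints: 2018-07-23 00:00:00 UTC
```

So 1532304000 is midnight UTC on the day before the migration run logged above. The reverse direction is date -u -d '2018-07-23 00:00:00' +%s.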
Related issues
Updated by Jan Mach over 6 years ago
- Status changed from New to In Progress
- Priority changed from Low to High
Development version of Mentat system is installed on new hardware. Currently it is being used for debugging and testing purposes before releasing new stable version. Database and filesystem migration scripts are ready, but might need one more revision.
Updated by Jan Mach over 6 years ago
- Related to Task #3752: Migration from MongoDB to PostgreSQL added
Updated by Jan Mach over 6 years ago
- Related to Task #3734: Migrate Hawat web user interface from Perl-based to Python-based Mentat framework added
Updated by Jan Mach over 6 years ago
- Related to Task #3374: Migrate all core modules from legacy Mentat added
Updated by Jan Mach over 6 years ago
- Description updated (diff)
- Status changed from In Progress to Feedback
- Assignee changed from Jan Mach to Pavel Kácha
- % Done changed from 0 to 30
Updated by Pavel Kácha over 6 years ago
- Assignee changed from Pavel Kácha to Jan Mach
The actual process of migration will be done according to the following checklist:
Hint: Set short (~minutes) TTL on all related A/AAAA/CNAME/PTR RRs.
- Presynchronize filesystem data (rsync), so that the actual migration will be much quicker.
Except db perhaps?
- Shut down Mentat and Warden systems on mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers
Hint: Disable automatic start of whatever does state changes - warden-filer, cron scripts, automatic downloads, etc.
Hint: Also disable start of Mentat itself...
- Perform database migration
So real migration of data or just run with month of already saved data? (No hard opinion here, we can import older data later if we find it important.)
- Perform filesystem migration
rsync again? Or do you mean something else?
- Perform configuration migration
- Switch warden client certificates between mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers
- Switch hostnames and IP addresses between mentat-hub.cesnet.cz and mentat-alt.cesnet.cz servers
- Reboot both servers and pray to your favorite god, or as an atheist sit quietly with your hands in your lap
Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn!
- Login to new mentat-hub.cesnet.cz, launch all services
Hint: If only basic system started automatically, daemons start can be tested by hand from the end (starting from storage), and data inflow (warden-filer) and disruptive scripts can be started only when everything is checked as ok.
Updated by Jan Mach over 6 years ago
- Description updated (diff)
- Status changed from Feedback to In Progress
Updated by REST Automat Admin over 6 years ago
Remarks regarding database migration
MongoDB database dump on server mentat-hub.cesnet.cz:
/root/mentatdb-dump-all.sh
(dump script for Mentat databases)
MongoDB database restore on server mentat-alt.cesnet.cz:
/root/mentat-sync-db.sh
(executed regularly at 8am by cron to verify functionality)
MongoDB -> PostgreSQL database migration on server mentat-alt.cesnet.cz:
/etc/mentat/scripts/sqldb-migrate-data.py
(migrate metadata database containing users, groups, reports, statistics, etc.)
/etc/mentat/scripts/sqldb-migrate-events.py
(migrate IDEA messages, might not be necessary)
At this point database migration should be ready.
Remarks regarding data migration
Migrate data:
rsync --archive --update --delete --progress /var/mentat root@target:/var
Cleanup runlogs and logs (might cause issue with new version):
find /var/mentat/log -name '*.log*' -delete
find /var/mentat/run -name '*.runlog' -delete
find /var/mentat/run -name '*.pstate' -delete
find /var/mentat/run -name '*.state' -delete
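Because -delete is irreversible, it may be worth listing the matches first. A minimal sketch of a helper that prints what it finds and deletes only in its default mode; the function name cleanup_files and the list/delete mode argument are hypothetical helpers, not part of Mentat:

```shell
# Delete regular files under a directory that match a name pattern,
# printing each match first. Pass "list" as third argument to only print.
# The function name and the mode argument are hypothetical helpers.
cleanup_files() {
    dir="$1"; pattern="$2"; mode="${3:-delete}"
    find "$dir" -type f -name "$pattern" -print
    if [ "$mode" = "delete" ]; then
        find "$dir" -type f -name "$pattern" -delete
    fi
}

# Hypothetical usage on the target server:
#   cleanup_files /var/mentat/run '*.runlog' list   # preview only
#   cleanup_files /var/mentat/run '*.runlog'        # preview and delete
```

Quoting the pattern matters: an unquoted *.runlog would be expanded by the shell in the current directory before find ever sees it.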
Updated by Pavel Kácha over 6 years ago
user#373 wrote:
At this point database migration should be ready.
should implies it might not. What if something goes awry?
Cleanup runlogs and logs (might cause issue with new version):
What issue? Something critical?
Updated by Jan Mach over 6 years ago
Pavel Kácha wrote:
user#373 wrote:
At this point database migration should be ready.
should implies it might not. What if something goes awry?
You can never be 100% sure. I have tested that many, many times, so that the should is as close to will as possible.
Cleanup runlogs and logs (might cause issue with new version):
What issue? Something critical?
Some modules have additional runlog attributes. Everything is written with backwards compatibility in mind, but some really old runlogs could cause problems. However, these problems will only show when evaluating runlogs using the --action=runlogs-evaluate module action. So these possible problems are not critical, they just make the developer look bad.
Updated by Jan Mach over 6 years ago
New Mentat installation guide in official documentation:
https://alchemist.cesnet.cz/mentat/doc/development/html/_doclib/installation.html
New Mentat migration guide in official documentation:
https://alchemist.cesnet.cz/mentat/doc/development/html/_doclib/migration.html
New Mentat reporting guide in official documentation:
https://alchemist.cesnet.cz/mentat/doc/development/html/_doclib/reporting.html
Updated by Jan Mach over 6 years ago
- Description updated (diff)
- Category changed from Installation to Documentation
Updated by Jan Mach over 6 years ago
- Category changed from Documentation to Installation
- Status changed from In Progress to Feedback
- % Done changed from 30 to 100
Migration was successfully performed on 24. 7. 2018. Waiting for any feedback from users before closing as successful.
Updated by Jan Mach over 6 years ago
- Related to Task #4210: Release and deploy Mentat package version 2.0 added
Updated by Jan Mach about 6 years ago
- Status changed from Feedback to In Progress
- All Ansible roles related to Mentat server management were improved and polished.
- Automated build system Alchemist received a big overhaul and is now back online. It builds packages for the newly introduced release suite, which sits between development and production. This will enable us to test the Mentat code in our production environment before releasing it as true production-level code.
- I am now waiting for confirmation from the manager of our monitoring system based on Nagios, that he updated the monitoring configuration according to new requirements.
Updated by Jan Mach about 6 years ago
- Status changed from In Progress to Closed
Migration complete, all Nagios monitoring scripts are fixed, up and running. Closing issue as resolved, this also completes the work on version 2.0.