Wednesday, May 17, 2017

How do I keep MySQL from ever-increasing its disk space usage when used with puppet-dashboard?



The setup



We have a Debian Linux server set up with MySQL v5.1.73 (InnoDB storage engine) and puppet-dashboard version 1.2.23. As you probably guessed, puppet-dashboard is using MySQL as its backend.



Also, it shouldn't be relevant, but this is a VMware virtual machine on vSphere 5.5.




The problem



The problem is that, despite the number of puppet nodes and run frequency staying relatively the same, the disk space used by MySQL keeps on increasing in a disturbing fashion to the point where it is now becoming an issue.



The following graph illustrates the issue.



[Graph: MySQL disk space usage over time, climbing steadily with periodic small drops]



We have put in place the two cron jobs that should allow disk space to be freed. Both run daily:





  • rake RAILS_ENV=production db:raw:optimize

  • rake RAILS_ENV=production reports:prune:orphaned upto=3 unit=mon
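For reference, a sketch of how such jobs could be laid out in cron (the times, the www-data user, and the /usr/share/puppet-dashboard path are illustrative assumptions, not taken from our setup):

# /etc/cron.d/puppet-dashboard-maintenance (illustrative schedule)
0 4 * * * www-data cd /usr/share/puppet-dashboard && rake RAILS_ENV=production reports:prune:orphaned upto=3 unit=mon
0 5 * * * www-data cd /usr/share/puppet-dashboard && rake RAILS_ENV=production db:raw:optimize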



The drops you can see in the graph correspond to the cron jobs running; paradoxically, they end up consuming even more space while trying to free some.



MySQL binary logs are not enabled. 95% of the disk space used on this server is under /var/lib/mysql/dashboard_production, the directory where the MySQL data for the dashboard database is stored.
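For reference, a quick generic check (not part of the original diagnosis) to confirm where the space is going:

# Show which database directories (and the shared ibdata/log files) use the most space
du -sh /var/lib/mysql/* | sort -h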




We have had this issue before with a different application (Zabbix monitoring) and had to dump the DB and re-import in order to free up space. This was a very painful process and not a very elegant solution but it ultimately worked.



Is there any way we can reclaim this disk space? What can we do to stop this behavior?



Edit 1



We are indeed using InnoDB, and we are not using the "innodb_file_per_table" configuration directive.



As requested by Felix, the output of the command is the following:




+----------------------+-------------------+-------------+
| table_schema         | table_name        | data_length |
+----------------------+-------------------+-------------+
| dashboard_production | resource_statuses | 39730544640 |
| dashboard_production | metrics           |   643825664 |
| dashboard_production | report_logs       |   448675840 |
| dashboard_production | timeline_events   |    65634304 |
| dashboard_production | reports           |    50937856 |
| dashboard_production | resource_events   |    38338560 |
| glpidb               | glpi_crontasklogs |    21204608 |
| ocsweb               | softwares         |     8912896 |
| ocsweb               | deploy            |     5044208 |
| phpipam              | logs              |     1269584 |
+----------------------+-------------------+-------------+
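The comment that requested this output is not quoted above; a query along these lines against information_schema (a standard approach, not necessarily the exact one Felix suggested) produces it:

mysql -u root -p -e "
  SELECT table_schema, table_name, data_length
  FROM   information_schema.tables
  ORDER  BY data_length DESC
  LIMIT  10;"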


Also, I will be trying the reports:prune task without the "orphaned" option as mentioned, as well as the other alternatives, and will keep this question updated.



Edit 2




I ran the reports:prune rake task and, despite deleting 230,000 reports, it kept on consuming more disk space... I will therefore move on to the other options.






The solution



After deleting two thirds of the entries in the database, only 200 MB of disk space was freed, which made no sense. We ended up dumping the database contents and re-importing them, taking care to enable "innodb_file_per_table" first.
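For reference, enabling that option is a one-line change in my.cnf under the [mysqld] section; it requires a MySQL restart and only affects tables created or rebuilt afterwards, which is why the dump and re-import were needed:

[mysqld]
innodb_file_per_table = 1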



We will just have to wait and see if this fixes the issue in the long term, but it seems to for the moment.


Answer




I found this article, which seems to address the issue pretty well:



http://ximunix.blogspot.co.uk/2014/01/howto-cleanup-puppet-reports-and-db.html



posted by Ximena Cardinali



The short story is: start deleting reports in small batches, then reclaim the space from MySQL.







HOWTO Cleanup Puppet Reports and DB



If the database for Puppet Dashboard is using several GB and getting larger every day, this is a way to get some of the space back.



There are two rake jobs you should be running everyday as part of daily maintenance for Puppet Dashboard.



cd /usr/share/puppet-dashboard
env RAILS_ENV=production rake reports:prune upto=5 unit=day
env RAILS_ENV=production rake reports:prune:orphaned



You can change the RAILS_ENV and the number of days (day), weeks (wk), months (mon), etc. to match your system and its needs.




  1. Stop incoming reports:



    cd /path/to/puppet-dashboard



    env RAILS_ENV=production script/delayed_job -p dashboard -m stop


  2. Start deleting reports in small batches





Keep working your way in towards the length of time you want to keep reports for. The reason for this is that InnoDB tables have poor performance when deleting more than 10k rows at a time. If you try to delete a few hundred thousand rows at once, it will time out and you will have to break it up into smaller deletes anyway. Also, the Ruby rake process will probably use all your RAM and will likely get killed by the kernel before it finishes. A progression like the one below should work for most people, but if you have many months of data you may want to start with a month or two of your earliest records. In our case, we are keeping just two weeks of reports (14 days).



env RAILS_ENV=production rake reports:prune upto=6 unit=mon
env RAILS_ENV=production rake reports:prune upto=4 unit=mon
env RAILS_ENV=production rake reports:prune upto=2 unit=mon
env RAILS_ENV=production rake reports:prune upto=3 unit=wk
env RAILS_ENV=production rake reports:prune upto=1 unit=wk
env RAILS_ENV=production rake reports:prune upto=5 unit=day




  3. Determine the best method to reclaim space from MySQL



There are two methods to reclaim space, depending on how MySQL was configured. Run this command to determine whether "innodb_file_per_table" is enabled; it will report "ON" if it is.
NOTE: I recommend using InnoDB on your MySQL server for cases like this one.



mysqladmin variables -u root -p | grep innodb_file_per_table
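Equivalently, the same check can be run from the mysql client:

mysql -u root -p -e "SHOW VARIABLES LIKE 'innodb_file_per_table';"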



You can also list the database directory to see whether there are large data files. The file most likely to be large is resource_statuses.ibd.



ls -lah /var/lib/mysql/dashboard_production
...
-rw-rw---- 1 mysql mysql 8.9K Jan 08 12:50 resource_statuses.frm
-rw-rw---- 1 mysql mysql 15G Jan 08 12:50 resource_statuses.ibd
...




  4. Reclaiming space the easy way



If MySQL was configured with innodb_file_per_table and your Dashboard DB shows that your data is in large per-table files, do the following:



mysql -u root -p
use puppet_dashboard;
OPTIMIZE TABLE resource_statuses;



This will create a new table based on the current data and copy it into place. If you do a listing while this is in progress you should see something like this:



-rw-rw---- 1 mysql mysql 8.9K Jan 08 12:50 resource_statuses.frm
-rw-rw---- 1 mysql mysql  15G Jan 08 12:50 resource_statuses.ibd
-rw-rw---- 1 mysql mysql 8.9K Jan 08 12:50 #sql-379_415.frm
-rw-rw---- 1 mysql mysql 238M Jan 08 12:51 #sql-379_415.ibd



When it finishes, it copies the temporary file into place. In this case we went from 15 GB to 708 MB.



-rw-rw---- 1 mysql mysql 8.9K Jan 08 13:01 resource_statuses.frm
-rw-rw---- 1 mysql mysql 708M Jan 08 13:03 resource_statuses.ibd



  5. Reclaiming space the hard way




If your system was not configured with innodb_file_per_table, or all the current data resides in a large ibdata file, the only way to reclaim space is to wipe the entire installation and re-import all the data.
The overall method is: first configure innodb_file_per_table, dump all the databases, then stop MySQL, delete /var/lib/mysql, run mysql_install_db to recreate /var/lib/mysql, start MySQL, and finally re-import the data. There is no need for the OPTIMIZE step because the import rebuilds the tables.
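A minimal sketch of that sequence, assuming a Debian-style service script and a root login, with paths and credentials adapted to your system:

# 0. Enable innodb_file_per_table in my.cnf before the re-import
# 1. Dump everything while MySQL is still running
mysqldump -u root -p --all-databases --routines > /root/all-databases.sql

# 2. Stop MySQL and move the old datadir aside (keep it until the import is verified)
service mysql stop
mv /var/lib/mysql /var/lib/mysql.old

# 3. Recreate an empty datadir and start MySQL again
mysql_install_db --user=mysql --datadir=/var/lib/mysql
service mysql start

# 4. Re-import the dump (a freshly initialised 5.1 root account has no password yet)
mysql -u root < /root/all-databases.sql
mysql -u root -e "FLUSH PRIVILEGES;"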




  6. Finally, restart the delayed_job:



    cd /path/to/puppet-dashboard



    env RAILS_ENV=production script/delayed_job -p dashboard -n 2 -m start


  7. Daily Reports Cleanup and DB Maintenance:





For a daily reports cleanup, you can create a simple Bash script that finds the reports under /var/lib/puppet/reports by age (mtime +14 in our case), removes them, and then cleans up the DB (upto=2 unit=wk), and schedule it in your crontab.
An example of such a script:



#!/bin/bash
# Remove report YAML files older than 14 days
REPORTS=`find /var/lib/puppet/reports -type f -mtime +14`
for i in $REPORTS; do rm -f $i; done

# Prune the corresponding entries from the database
cd /usr/share/puppet-dashboard
env RAILS_ENV=production rake reports:prune upto=2 unit=wk

