replication - Restoring a slave MySQL database from raw backups of master gives InnoDB tablespace errors

Wednesday, March 4, 2015

I have a master/slave replication setup with InnoDB and MyISAM tables spread over more than 7000 databases, and I want to copy them from my master to the slave to restore replication.

Both servers were running Ubuntu 10.04.2 LTS (which uses the mysql-server 5.1.41-3ubuntu12 package). Recently I tried to upgrade MySQL in the hope that I was hitting some bug that a newer version had resolved -- so my slave is now Ubuntu 10.10. However, the problem appears to be the same.

I'd prefer not to disrupt my master, so I have tried taking an LVM snapshot of my entire disc so that I can copy my data and log directory via rsync to my slave:
/var/lib/mysql : Where my ibdata1 and ib_logfile0, as well as all my .ibd and .frm files, are stored. I use innodb_file_per_table, so there are a lot of .ibd files.
/var/log/mysql : Where I keep all my binary logs
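
For reference, the snapshot-and-copy step can be sketched roughly as below. This is only an illustration of the approach: the volume group, logical volume, snapshot name, snapshot size, mount point and slave host name are all hypothetical placeholders rather than my real values, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: every name below (volume group, LV, snapshot, mount point,
# slave host) is a hypothetical placeholder, not the real configuration.
VG=${VG:-vg0}
LV=${LV:-root}
SNAP=${SNAP:-mysql-snap}
MNT=${MNT:-/mnt/mysql-snap}
SLAVE=${SLAVE:-slave.example.com}

# With DRY_RUN=1 (the default) commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

snapshot_and_copy() {
    # Snapshot the volume holding /var/lib/mysql and /var/log/mysql.
    # The 10G copy-on-write size is an arbitrary example value.
    run lvcreate --snapshot --size 10G --name "$SNAP" "/dev/$VG/$LV"
    run mkdir -p "$MNT"
    run mount -o ro "/dev/$VG/$SNAP" "$MNT"
    # Copy the data directory and the binary logs from the same snapshot.
    run rsync -a "$MNT/var/lib/mysql/" "$SLAVE:/var/lib/mysql/"
    run rsync -a "$MNT/var/log/mysql/" "$SLAVE:/var/log/mysql/"
    run umount "$MNT"
    run lvremove -f "/dev/$VG/$SNAP"
}
```

Calling snapshot_and_copy with the default DRY_RUN=1 just lists the seven commands, which makes it easy to review them before running anything on the master.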

Once copied, I reset the permissions:

chown -R mysql:mysql /var/lib/mysql
chown -R mysql:mysql /var/log/mysql

I remove the master.info and relay-log.info files from the /var/lib/mysql directory (since my master is itself a slave to another master, for certain tables).

Then I try to start mysql on the slave. Soon, I start to see lots and lots of errors like the following in /var/log/mysql.err:

InnoDB: Error: tablespace id is 150238 in the data dictionary  
InnoDB: but in file ./1_107789/email.ibd it is 150747!

or:


InnoDB: Error: trying to add tablespace 148302 of name './23_4377/link.ibd'
InnoDB: to the tablespace memory cache, but tablespace
InnoDB: 148302 of name './1_68522/open.ibd' already exists in the tablespace
InnoDB: memory cache!
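
To see what the files themselves claim, the tablespace ID can be read straight out of each .ibd file. The sketch below assumes the ID is the 4-byte big-endian integer at byte offset 38 of the first page (the start of the tablespace header); the function names are made up for this example. It only shows the file side of the comparison, not what the data dictionary in ibdata1 holds.

```shell
#!/bin/sh
# Print the tablespace ID stored in the header of one .ibd file.
# Assumption: the ID is the 4-byte big-endian integer at byte offset 38
# of the first page.
ibd_space_id() {
    hex=$(od -An -tx1 -j 38 -N 4 "$1" | tr -d ' \n')
    printf '%d\n' "0x$hex"
}

# Print "space_id path" for every .ibd file under a data directory.
list_space_ids() {
    find "$1" -name '*.ibd' | while IFS= read -r f; do
        printf '%s %s\n' "$(ibd_space_id "$f")" "$f"
    done
}

# Print each tablespace ID that is claimed by more than one file.
duplicate_space_ids() {
    list_space_ids "$1" | cut -d' ' -f1 | sort -n | uniq -d
}
```

For example, ibd_space_id /var/lib/mysql/1_107789/email.ibd prints the ID that the first error message reports as being "in file", and duplicate_space_ids /var/lib/mysql lists IDs shared by several files, which is the situation the second error complains about.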

And then every now and then:

110207 13:55:45  InnoDB: Assertion failure in thread 2979265392 in file ../../../storage/innobase/fil/fil0fil.c line 603
InnoDB: Failing assertion: 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html
InnoDB: about forcing recovery.
110207 13:55:45 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=1
max_threads=10000
threads_connected=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 868418 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0xbc5a7138
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xb193f13c thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2d) [0xb7638c4d]
/usr/sbin/mysqld(handle_segfault+0x494) [0xb7304854]
[0xb707f400]
/lib/tls/i686/cmov/libc.so.6(abort+0x182) [0xb6d89a82]
/usr/sbin/mysqld(+0x477790) [0xb7514790]
/usr/sbin/mysqld(+0x47795e) [0xb751495e]
/usr/sbin/mysqld(fil_space_get_size+0xdc) [0xb751966c]
/usr/sbin/mysqld(buf_read_page+0xad) [0xb75015dd]
/usr/sbin/mysqld(buf_page_get_gen+0x331) [0xb74fab21]
/usr/sbin/mysqld(btr_get_size+0x190) [0xb75b02b0]
/usr/sbin/mysqld(dict_update_statistics_low+0x50) [0xb7503e70]
/usr/sbin/mysqld(dict_table_get+0xec) [0xb750682c]
/usr/sbin/mysqld(+0x4cde5f) [0xb756ae5f]
/usr/sbin/mysqld(row_ins+0x157) [0xb756d3c7]
/usr/sbin/mysqld(row_ins_step+0x110) [0xb756d710]
/usr/sbin/mysqld(row_insert_for_mysql+0x37e) [0xb75754de]
/usr/sbin/mysqld(ha_innobase::write_row(unsigned char*)+0xf9) [0xb74e1299]
/usr/sbin/mysqld(handler::ha_write_row(unsigned char*)+0x6d) [0xb7412d3d]
/usr/sbin/mysqld(write_record(THD*, st_table*, st_copy_info*)+0x3ba) [0xb7391e2a]
/usr/sbin/mysqld(mysql_insert(THD*, TABLE_LIST*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool)+0x1122) [0xb73967c2]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0xc85) [0xb7317c95]
/usr/sbin/mysqld(mysql_parse(THD*, char const*, unsigned int, char const**)+0x3ae) [0xb731f45e]
/usr/sbin/mysqld(Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int)+0x47d) [0xb73dbe9d]
/usr/sbin/mysqld(Query_log_event::do_apply_event(Relay_log_info const*)+0x26) [0xb73dca76]
/usr/sbin/mysqld(apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*)+0x137) [0xb7463cc7]
/usr/sbin/mysqld(handle_slave_sql+0x1094) [0xb74662e4]
/lib/tls/i686/cmov/libpthread.so.0(+0x596e) [0xb706396e]
/lib/tls/i686/cmov/libc.so.6(clone+0x5e) [0xb6e29a4e]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0xb183bdc6 is an invalid pointer
thd->thread_id=2
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

I have been fiddling with various options and trying to understand why InnoDB thinks there is a tablespace mismatch. As far as I can tell there should be no mismatch, because I'm copying ibdata1 and the InnoDB log files as well as the .ibd files. So why doesn't it just recover and carry on, so that I can restore replication? I'm clearly missing something, but I cannot find it.
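
One thing that can at least be checked is whether the copy on the slave really is byte-for-byte the same as the snapshot. A minimal sketch, assuming sha256sum from GNU coreutils is available and using made-up function names:

```shell
#!/bin/sh
# Print "checksum  ./relative/path" for every regular file under a
# directory, sorted by path so two listings can be compared directly.
tree_checksums() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# Succeeds only if both trees hold the same files with the same contents;
# otherwise prints the differing entries and returns non-zero.
same_tree() {
    a=$(mktemp) && b=$(mktemp) || return 2
    tree_checksums "$1" > "$a"
    tree_checksums "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

same_tree compares two directories on one machine. For the master snapshot and the slave, tree_checksums can instead be run on each host against its copy of /var/lib/mysql, with the slave's mysqld stopped, and the two listings diffed afterwards. With this many files it will take a while.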

Any clues or suggestions appreciated.
Thanks
