Monday, May 3, 2010

Simple Backup Restore Trick

I don't usually post these simple tricks, but it came to my attention today and it's very simple and have seen issues when trying to get around it. This one tries to solve the question: How do I restore my production backup to a different schema? It looks obvious, but I haven't seen many people thinking about it.

Most of the time backups using mysqldump will include the following line:

USE `schema`;

This is OK when you're trying to either (re)build a slave or restore a production database. But what about restoring it to a test server in a different schema?

The actual trick


Using vi (or similar) editors to edit the line will most likely result in the editor trying to load the whole backup file into memory, which might cause paging or even crash the server if the backup is big enough (I've seen it happen). Using sed (or similar) might take some time with a big file. The quick and dirty trick I like is:

grep -v "USE \`schema\`" backup.sql | mysql -u user -p new_schema

Adapt the mysql command options to your needs. It's necessary to escape the backticks (`), otherwise the shell might interpret it as your trying to execute schema and use the output as the actual schema name. Also, make sure that new_schema already exists in the server.

This method is quick and dirty and leaves the original backup intact. I hope you find it useful.

12 comments:

  1. Don't you mean "grep -v" here? You want to remove the line matching "USE `schema`", not return only lines matching that.

    ReplyDelete
  2. I think you meant grep -v "USE \`schema\`" backup.sql, not grep "USE \`schema\`" backup.sql

    ReplyDelete
  3. Don't you mean grep -v ? Isn't the intent to exclude the USE line from the input to the restore? Maybe worth also pinning the pattern to the start of the line, just in case....

    ReplyDelete
  4. Surely you mean grep -v ?

    Your method won't work very well in the general case, especially when you have binary data which happens to include the string "USE `schema`", for example, this blog post article.

    A more general solution would be to use "tail" to skip the "USE" line, if there is only one near the top of the file.

    ReplyDelete
  5. Pretty sure you mean `grep -v […]` there.

    ReplyDelete
  6. Did you mean "grep -v USE... " - the -v returning every line apart from the 'USE...' ?

    Without the -v, you would just be returning the USE lines and pass that to mysql - but nothing else.

    ReplyDelete
  7. This makes more sense to me if you meant to say use grep with the -v switch.

    From the man page:
    -v, --invert-match
    Invert the sense of matching, to select non-matching lines.

    So using grep -v in your example, the USE `schema` would be removed and all other lines from the backup.sql file would be piped into mysql.

    ReplyDelete
  8. Hi,

    The "USE" statement does not appear by default when the dump contains only one schema (unless you explicitly used --databases d1).
    So, when there's a USE, it's probably because there is more than one database, which leads to the conclusion there will be *another* USE statement - not handled by the script.

    The trick is good. But I do believe sed would be far more powerful here, being able not only to ignore, but also to replace the USE statements to your liking.
    I do not see that sed would perform any worse than grep, when the outcome is ... | mysql -u user

    Issuing the statements in mysql will make any difference between grep & sed (if any) to be insignificant.

    Regards

    ReplyDelete
  9. I think if you dont use -B or --databases option in mysqldump then use `schema`; will not be visible in the dump sql file.

    ReplyDelete
  10. To all who noticed the missing '-v', thank you for reading. I caught the typo once Planet MySQL picked it up and made the correction.

    @Schlomi, correct, it won't work in every circumstance. I prefer to backup single databases, including the CREATE statement to rebuild the schema if needed, for regular backups. When you have more than 1 schema and need consistency among schemas, then you're backup will contain more than 1 USE statement and this method won't work. The difference I noticed with 'sed' is when you have to process a backup file that is 100G or bigger. I'd have to retry it.

    @Saurabh, correct. I was trying an example when you already have the nightly dump. If you're creating a dump for testing, then use the additional options to accommodate to your needs. But keep in mind that a dump can take several hours and create load on the server for the cases this tip was intended.

    Thank you all for reading and comment.

    ReplyDelete
  11. BTW, I agree with Ronald -- he slipped in the ^ at the beginning there....

    grep -vi '^use \`dbname\`;'

    otherwise you might grep out something you don't want to, though it's a pretty unlikely scenario.

    ReplyDelete