This is a story with a happy ending, eventually.  It’s the tale of how
Google and some script-fu saved my bacon.  There’s a moral, too: don’t be
stupid, because you’re stupider than you think you are.  That was my case,
anyway, when a bear mauled my database.

I was running 4 websites on FreeBSD 9.0, and after two years of routine
upgrades, it was time to move the system to FreeBSD 10.0.  That was an upgrade
with some risks, and in my case, it went poorly.  The system had been getting
quirky anyway, so I figured it was time to just wipe it clean and install
FreeBSD 10 fresh.  I had downloaded a full set of backups to my local machine,
so I was ready to go.  As the new machine came up, disaster struck.  I’d
settled on a set of bash scripts to back up the databases via mysqldump, one
script per database.  For the wildspaces database, I had simply copied another
database’s script and changed the destination file.  Unfortunately, I hadn’t
changed the database name itself, so the script had been producing a
wildspaces.sql.tz file with the contents of another database.  My backup was
worthless: disaster.  I looked around to see what else I could work with.  I
found an extremely old database backup from three years earlier, from when I
was running Joomla instead of Serendipity.  It was a different database, but
it at least held the content of the first ten years of my site.  I also had a
folder of text files I’d kept, each one containing a single unformatted entry,
without hyperlinks or any later corrections.  But for the last year I hadn’t
saved any text files, and I feared those entries were totally lost.  Still,
there were at least some means of restoring my fourteen years’ worth of
content, although it was going to be laborious.
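
For the record, the broken backup script looked more or less like this (the
paths, the password, and the other database’s name are placeholders, but the
failure mode is exact):

#! /usr/local/bin/bash
# Copied from another site's backup script.  I changed the output
# filename to wildspaces.sql.tz but left the database argument alone,
# so every run faithfully dumped the WRONG database.
mysqldump -u root -pPASSWORD otherdb | gzip > /backups/wildspaces.sql.tz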

First, a fresh install of Serendipity and some comparison showed where the
content was coming from, and where it would have to go.  The table fields
matched up like this:

Joomla (table: jos_content)    Serendipity (table: serendipity_entries)
id                             id
title                          title
created                        timestamp
introtext                      body
fulltext                       extended
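
The comparison itself was nothing fancy: a describe of each table from the
mysql prompt lays the two layouts side by side.

mysql> describe joomla.jos_content;
mysql> describe wildspaces.serendipity_entries;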

It would’ve been easier to just dump the Joomla content into the wildspaces
(Serendipity) database with a single ‘insert into ... select’ statement, but
unfortunately I needed the recreated system to stay true to the original URLs.
Since Serendipity builds each entry’s permalink around its ID, it was
essential to map each Joomla article onto the ID of its now-disappeared
Serendipity counterpart.  Google came to my rescue.
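
If URL preservation hadn’t mattered, the whole migration could have been one
statement; a rough sketch, using the column mapping above (a real run would
also have to fill in Serendipity’s other required fields):

-- The one-shot migration I couldn't use: it hands out brand-new
-- auto-increment IDs, and with them brand-new URLs.
insert into wildspaces.serendipity_entries (title, timestamp, body, extended)
select title, unix_timestamp(created), introtext, `fulltext`
from joomla.jos_content;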

I had to act fast: my system had eaten shit only a day or two before, and
Google had cached my whole site.  Working from a list of articles I’d written,
I could search for each title, and Google would show me its cached copy of the
entry.  Time was passing, though, and the next time Google’s robots came
through, they would overwrite that cache with the bear-mauled wreckage.  I did
270 Google searches (so many, in fact, that Google made me fill out a captcha
after about 100 to see whether I was human or a script: something I never even
knew happened).

But in order to get the old content into the right slots in the database, I
figured the easiest approach was to create 270 ‘stub’ entries and then update
each one with the proper content.  I created one legitimate entry with the
text "This website was mauled by a bear.  Please be patient as we
rebuild."  Then, from the mysql command prompt, I copied it 270 times, so
I’d have a database full of stub entries into which I could insert the real
text from the old (Joomla) table.

The trick to duplicating an existing row in a table is to copy it to a separate
(temporary) table, drop the ID column from that temporary table, and then
re-insert it into the original table.  With the ID column gone, the original
table assigns each copy a fresh auto-incremented ID as it receives the data.
To run this four-line script 270 times, I saved it as a text file, and then
sourced it from the mysql command prompt.

mysql> source MYFILE.sql

Then I pressed the up arrow to retrieve the previous command in mysql’s
command history (i.e., to rerun this command).  Repeat 270 times.  Here’s
the SQL:


-- copy the stub row (id 1) into a scratch table
create temporary table tempo select * from serendipity_entries where id='1';
-- drop the primary key column so the copy carries no fixed ID
alter table tempo drop id;
-- re-insert; the null lets auto_increment hand out the next free ID
insert into serendipity_entries select null,tempo.* from tempo;
drop table tempo;
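
In hindsight, those 270 rounds of up-arrow could have been one shell loop;
something like this, assuming the SQL above is saved as MYFILE.sql:

#! /usr/local/bin/bash
# Clone the stub row 270 times without wearing out the up arrow.
for i in $(seq 1 270); do
        mysql -u root -pPASSWORD wildspaces < MYFILE.sql
done

But up-arrow-and-enter was mindless enough, and it worked.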


Now that I had 270 blank entries and a map from each Joomla article ID to its
Serendipity ID, I could run a bash loop that prompted me for each pair of IDs
and transferred the article’s text and date, one at a time.  The only nuance
in the bash script and SQL below is the unix_timestamp() call, which converts
Joomla’s datetime into the Unix-epoch timestamp that Serendipity expects.


#! /usr/local/bin/bash
while :
do
        echo "Enter ID number from Joomla we will move."
        read JOOMLA
        echo "Enter ID number in S9Y to which we'll move it."
        read SEREN

        # Show the Joomla article's title and date, so I can eyeball
        # that I've typed the right ID before committing anything.
        echo "Article title, date:"
        mysql -u root -pPASSWORD -e "select id, title, created from jos_content where id=$JOOMLA;" joomla

        echo "From:  $JOOMLA     To: $SEREN"
        echo "Hit 0 to commit or anything else to bail out with shit pants"
        read ANYFUCKINGKEY
        if [ "$ANYFUCKINGKEY" = "0" ]; then
                # Cross-database update: copy title, date (converted to a
                # Unix timestamp), and both body fields from the Joomla row
                # into the matching Serendipity stub.
                mysql -u root -pPASSWORD -e "update wildspaces.serendipity_entries,
                joomla.jos_content set wildspaces.serendipity_entries.title =
                joomla.jos_content.title, wildspaces.serendipity_entries.timestamp =
                unix_timestamp(joomla.jos_content.created), wildspaces.serendipity_entries.body
                = joomla.jos_content.introtext, wildspaces.serendipity_entries.extended =
                joomla.jos_content.fulltext where (wildspaces.serendipity_entries.id='$SEREN'
                and joomla.jos_content.id='$JOOMLA');" wildspaces
        else
                break
        fi
done


There it was: at the end of the day, I had a 270-row Serendipity database into
which I’d transferred everything left in that ancient backup.  I used the
collection of text files to repopulate another three years’ worth of writing.
That left the final year, in which I’d neither saved text files nor been able
to restore anything from the old Joomla backup.  Again I turned to Google.

By searching again, I was able to quickly retrieve cached versions of the
remaining pages, save one.  That last search turned up not my original article
but the new stub ("This website was mauled by a bear").  So I was glad
I’d acted quickly: Google’s robots were already updating their cache, and that
old article was off to the bit bucket in the sky.  It was easy enough to
recreate from memory, as it wasn’t a difficult article.

I’ve still got a lot of work to do: the next step is reassigning each
article’s pictures, updating the hit counter where possible, and doing a lot
of formatting.  That will give me several weeks’ penance in which to reflect
upon what a bonehead I was – and to think about the morals of this story:


- You are not as clever as you think you are.  Or I’m not, anyway.  Never
  assume you have mastered a Unix system well enough to avoid disaster.

- Don’t trust your backups.  You’ve got to check them, especially before
  relying on them!  (A sketch of the kind of check I mean follows this list.)

- Copying one backup script in order to produce a new one is a risky move.
  You’ve got to really check that you’ve correctly modified everything that
  needs modification.

- The discipline of saving each posted article as a text or HTML file seemed
  like extra work at first, even redundant, but it turned out to be useful
  and valuable.  I’ll keep doing it (starting with this one!).

- I do a lot of bitching about Google’s avaricious collecting and caching of
  data, but they sure saved my bacon this time.  So thanks, Google, really!
  (But can we talk about Android permissions some day?)
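
Here’s the sort of backup check I mean; a minimal sketch, assuming
gzip-compressed dumps sitting in /backups (the path and filenames are
placeholders):

#! /usr/local/bin/bash
# Print the first table each dump creates: one glance shows whether
# wildspaces.sql.tz really contains wildspaces tables or somebody else's.
for f in /backups/*.sql.tz; do
        echo "== $f =="
        zcat "$f" | grep -m 1 'CREATE TABLE'
done

Ten seconds of that, and the wrong table names would have given the game away.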


Now pardon me as I get back to work cleaning up this site.  You see, a bear
mauled it.