[Patrick here, not Abi.]
As luck would have it, I had to run off to band practice very shortly after the crisis began. I’m back. And I’ve had a life-giving, brain-restoring sandwich. As soon as I post this I’m going to start reading the email, and comments below, from our incredibly generous and helpful readers.
As mentioned previously, we do have a complete backup, downloaded with wget, of the site as of March 1, 2008. Hosting Matters has now set up new server space for us, and presently I’ll be trying to upload the backup; if everything works, Making-Light-as-of-March-1 will appear at the usual URL. Much more likely, however, is that I’ll have a zillion finicky permission problems that I’ll need someone’s help with before we get to that point.
At this point we have HTML of all the missing posts, and quite a lot of the missing comments, but many more comments remain to be excavated. Please, if you have been using an RSS reader to follow ML comments–generally or in specific threads–consider that you may actually have copies of comments from the last two months in your reader’s cache folders. Please take a look.
More to come. My actual suspicion is that once we get the March 1 version of ML restored, the best way to re-attach the missing material is going to be by editing and manipulating the MySQL database in the guts of the site. I know very little about this, and the small amount that’s been explained to me by helpful readers is pretty much worn away by time, but it seems to me that in principle it should be possible to automate, or at least semi-automate, the processes of:
(1) stripping away extraneous cruft from various saves of ML content in HTML and XML form, and–
(2) –wedging it all into the appropriate fields and tables in the MySQL database, so that–
(3) –all of ML reappears, right up to the present, along with all the–
[…]books we bought in college and sold for half-price unread
And sacks and sacks of earring backs lost under someone’s bed
And baseball cards and army men and model planes galore
And every tiny plastic high-heel Barbie ever wore
(Thank you, Austin Lounge Lizards. Music here, if you’re willing to sign up for a free trial of Rhapsody.)
Of course, knowing that something is theoretically possible doesn’t mean it’s practical, but I bring it up for discussion nonetheless.
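To make steps (1) and (2) above concrete, here is a minimal sketch, not a working restore tool. It assumes the saved material is available as (author, HTML body) pairs, and it uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs anywhere. The `comment` table and its columns are invented for illustration and are not Movable Type's real schema.

```python
import sqlite3
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text content; ignores tags and skips <script>/<style> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_cruft(html):
    """Step (1): reduce an HTML fragment to plain, whitespace-normalized text."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def load_comments(conn, saved):
    """Step (2): insert (author, html_body) pairs into a hypothetical table."""
    conn.execute("CREATE TABLE IF NOT EXISTS comment (author TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO comment (author, body) VALUES (?, ?)",
        [(author, strip_cruft(body)) for author, body in saved],
    )
    conn.commit()


# Made-up sample data, only to show the shape of the input.
conn = sqlite3.connect(":memory:")
load_comments(conn, [("A. Reader", "<p>I can write <b>SQL</b>.</p><script>x()</script>")])
rows = conn.execute("SELECT author, body FROM comment").fetchall()
```

A real restore would probably want to keep some of the HTML (links, italics) rather than flatten everything to text, and would have to target whatever tables the blog software actually reads.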
– and a huge mound of unmatched socks.
Patrick, I’m not a database guru, but I have worked with MySQL, and I can write SQL scripts if they don’t have to do really complex and clever things. If you figure out some way to play divide and conquer with the mass of data you’ve got, I’ll be glad to help.
Bruce
It’s possible, since MT, like all blog software, outputs stuff in regular enough form that it can be screen-scraped.
It might take really ugly perl regexps to do so, and it might even be the case that we have to take out the XML parsing tools and whack the HTML/XML over the head, but it’s not impossible.
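As a rough illustration of that kind of regexp screen-scraping, here is a sketch in Python rather than Perl. The markup it matches (a `div` with class `comment`, a `posted` paragraph reading "Posted by ... at ...") is invented for the example; the real Making Light pages would need their own pattern.

```python
import re

# Hypothetical comment markup; adjust the pattern to the real page structure.
COMMENT_RE = re.compile(
    r'<div class="comment" id="c(?P<id>\d+)">\s*'
    r'(?P<body>.*?)\s*'
    r'<p class="posted">Posted by (?P<author>.*?) at (?P<date>.*?)</p>\s*'
    r'</div>',
    re.DOTALL,
)


def scrape_comments(page):
    """Return one dict per comment block found in the page text."""
    return [
        {
            "id": int(m["id"]),
            "author": m["author"],
            "date": m["date"],
            "body": m["body"],
        }
        for m in COMMENT_RE.finditer(page)
    ]


# Made-up sample page fragment.
SAMPLE = """
<div class="comment" id="c1">
<p>First!</p>
<p class="posted">Posted by Alice at April 2, 2008 10:00 AM</p>
</div>
<div class="comment" id="c2">
<p>Second, with an "at" in it.</p>
<p class="posted">Posted by Bob at April 2, 2008 10:05 AM</p>
</div>
"""

comments = scrape_comments(SAMPLE)
```

The pattern is deliberately brittle: an author name containing " at " would be split in the wrong place, which is exactly the sort of case where an HTML/XML parser earns its keep.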
Shoving it all back into MySQL is along the lines of: print out the SQL, then have MySQL chomp it all in. Alternatively, there are perl/ruby/your-favorite-scripting-language hooks into MySQL that can do it from the script, but I prefer the “print, look, human-check for sanity, then read in” approach.
But it is possible. I can help if you guys don’t already have tons of helpers on hand.
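A sketch of the print-then-review approach, in Python for illustration: build the INSERT statements as plain text, write them to a file, read them over, and only then feed the file to MySQL. The table name `restored_comment` and its columns are hypothetical, and the quoting below assumes MySQL's default mode, where backslash is an escape character inside string literals.

```python
def sql_quote(value):
    """Quote a string as a MySQL literal: double backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def sql_value(value):
    """Render integers bare and everything else as a quoted string."""
    if isinstance(value, int):
        return str(value)
    return sql_quote(str(value))


def insert_statement(table, row):
    """Build one INSERT statement from a dict of column name -> value."""
    columns = ", ".join(row)
    values = ", ".join(sql_value(v) for v in row.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"


def dump_sql(table, rows):
    """Return a reviewable SQL script, one INSERT per line."""
    return "\n".join(insert_statement(table, r) for r in rows) + "\n"


# Made-up rows; table and column names are placeholders.
script = dump_sql(
    "restored_comment",
    [
        {"id": 7, "author": "O'Brien", "body": "It's back."},
        {"id": 8, "author": "Alice", "body": "path\\to\\file"},
    ],
)
```

The resulting text can be saved to a file and eyeballed before loading. Table and column names are inserted unescaped here, so they should come from the script itself and never from scraped data.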
By the way, WordPress: I highly suggest. There is no end of community support for WP. There is unfortunately less of it for most other blogging systems out there.
And I’m pretty sure WordPress can import from nearly any damn thing. 🙂
Oh… and before we all cheer that this is possible… someone will have to analyze and understand the Movable Type database tables and the logic in the blog software itself that reads from the tables.
This can be easy. Or MT may have decided to be really bizarre about its architecture. Reverse-engineering is still possible; but also possibly painful.
There are other ways around lacking a library from MT to shove stuff into the database.
On the other hand, lots of people know how to shove stuff into a WP database. So you could bring Making Light all the way over to WP, have WP import the older entries you restored, and then have some WP hacker write the code to insert stuff into a DB schema that’s open.
Heck, if you move to WP, we don’t even need the hacker. Just someone who can manipulate the XML into WP’s XML-based import format, which can also snarf in comments and such things.
We’ve got any number of ways to stomp on problems, in other words. I think the best ones involve dropping MT, but that carries its own costs (e.g., you run the risk of the site not working for a bit. Of course, the site’s already not working…).
Hope this helps.
MT’s database structure’s easy. I already offered in email, but I stand ready to help out with the repopulation. (Personally, I really like MT.)
Michael Roberts,
Well thereyago.
I’m a fan of anything that works. If MT to MT will work, that’s beautiful.
I’m currently writing a perl script that will read one of the Making Light post pages and gather the data in such a way that it can be populated into a database. I can hand it to you and you can work your MT magic. 🙂
(Of course, you may have already written it, in which case I will feel silly… but less worried.)
It might not be good to upload the backup yet, I’ve been told, because Google will start repopulating its cache with it. Better to return a blank page.
I just noticed something. The cache truncates for some reason, or at least it does on mine. So a post with 485 comments will have only 237 or something like that.
Maybe it’s just Mac OS X + Firefox 3.05 beta?
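One way to spot that kind of truncation when merging saved copies is to compare the comment count a thread is known to have against the comment numbers actually recovered. A tiny sketch, assuming comments in a thread are numbered sequentially from 1 (the function name is made up for the example):

```python
def missing_comments(expected_count, recovered_numbers):
    """Return the comment numbers in 1..expected_count that were not recovered."""
    have = set(recovered_numbers)
    return [n for n in range(1, expected_count + 1) if n not in have]


# Example: a thread known to have 5 comments, with only the first 3 found.
gaps = missing_comments(5, [1, 2, 3])
```

Running this per thread over each person's cached copy would show which copies fill which gaps.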
Hi all,
I wrote the perl script. It has lots of comments documenting what I do. It’s more hacky than I’d like, but it works on the Google cache pages I tried, and it doesn’t require any extraneous libraries other than Data::Dumper, which comes with perl.
http://files.spontaneousderivation.com/parse-ml-post.txt
People can pick it apart and use it for their own needs, like printing out SQL to load into MT.
I have not written it — I make it a policy to wait until somebody else does the work, then step in and take the credit. Much easier that way. For me.
Michael,
There you go then! 🙂
It’s nice to be 25 and young forever.
(No, really, I’m nearly 30. But a girl can dream of youthful days.)
I’m probably dead short on time (due to travel) for the next few days, but my perl and mysql are tolerably reasonable, and I’m certainly willing to add my voice/efforts to the reassembly process (I’ve got at least one “pull stuff out of text files and munge it into mysql based on given fields & params” script kicking about).
So, Patrick, are you rethinking the wisdom of publishing Little Brother yet? Ever since we picked up our copy today, I’ve noticed my laptop seems to be running a bit hot.