Rebuilding the threads

Here’s what I’ve got for Making Light since March 1: original post date, abbreviated name of the post, and the number of comments I have actual copies of.


3/1 Who’s surprised? 66
3/3 All come singing 69
3/3 Can you read this 53
3/4 Greyhawk 253
3/11 Collect underpants 265
3/13 Open thread 103 936
3/16 Just do it 38
3/16 Literary divination 106
3/18 Arthur C. Clarke 177
3/20 Bigger laser 174
3/28 Divided by errors 34
3/28 Open thread 104 931
3/30 London photograph 204
3/31 Deep value 434
4/1 Amsterdam 70
4/2 Pity the Times 167
4/4 Forty years gone 70
4/6 Heads they win 320
4/6 Employ the scythe 126
4/9 SFWA deadline 25
4/11 Future of publishing 32
4/12 Book by its cover 37
4/13 Bury my acorns 87
4/13 Goose-stepping (actually 469) 468
4/14 Open thread 105 906
4/16 Housekeeping 7
4/16 Newsweek 245
4/17 Little Brother 180
4/22 Penn for Hillary 124
4/23 Font game 125
4/23 TNH in San Francisco 18
4/25 Indistinguishable from parody 186
4/26 Clapton 107
4/26 Feeling the heat 31
4/26 SFWA election 45
4/26 TNH in the Observer 105
4/27 Open thread 106 288

In addition, I have a 131-comment version of the Clay Shirky post, but in fact I know there were at least 254 comments; if anyone has a 254-comment version, please do send it on. (Apologies if you did already. I may have lost it. Processing all this stuff on the fly has been a challenge.)

I also have the recently-posted comments to old threads “Worldcongoing,” “New Magics,” and “Abi on catz.” I do not have the recently-posted comments to “Darwin fish found”; the same apologies apply as in the previous paragraph.

26 thoughts on “Rebuilding the threads”

  1. I have a 251 comment version of “Where do people find the time?” (the Clay Shirky post) and a 334 comment version of “Darwin fish found”. The latter is definitely incomplete: there are later posts in people’s View All By. I’m slowly tracking as many of those as possible.

  2. Does anyone know of a general search engine other than the Big Three (Google, Yahoo and Microsoft)? Everything else I can remember is highly specialized. Would be nice if we could find something with a spider that ran after May 1.

    Librarians?

  3. I sent a 244-comment version of the Shirky post to TNH – digging for other things now.

    Other searches – lycos, altavista. Dogpile is a conglomerate I think.

  4. I’ve decided on this as the most efficient search string for comments:

    “comments posted to making light by a” “08 on entry” site:nielsenhayden.com

    The letter ‘a’ there will pick up all names beginning with a and A. The “08 on entry” gets only recently updated threads. The site:nielsenhayden.com gets rid of some sploggage.

    I’m going to start with the letter z on Yahoo! and work backwards.
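
    The letter-by-letter approach above could be scripted. This is only a rough sketch, not what anyone in the thread actually ran: the function name `letter_queries` and its parameters are made up for illustration, and it assumes the engine wants plain straight quotes around each phrase.

```python
import string


def letter_queries(year_suffix="08", site="nielsenhayden.com"):
    """Build one search string per initial letter, working from z back to a.

    The function name and parameters are hypothetical; the phrases
    themselves are copied from the search string quoted in the comment.
    """
    queries = []
    for letter in reversed(string.ascii_lowercase):
        queries.append(
            f'"comments posted to making light by {letter}" '
            f'"{year_suffix} on entry" site:{site}'
        )
    return queries


if __name__ == "__main__":
    # Print the 26 strings so they can be pasted into a search box one at a time.
    for query in letter_queries():
        print(query)
```

    Each string still has to be pasted into the search engine by hand; the sketch only saves retyping the phrases for all 26 letters.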

  5. Mary Dell: Windows Live and MSN bring back the same results. They’re both Microsoft sites, so I guess it isn’t too surprising.

    Jon Meltzer: I’ve tried a number of different engines/spiders without success. It would be great if somebody found one.

  6. The latest Shirky cache I’ve found is the 251-posting collected by Microsoft’s crawler (live.com, msn.com) on Friday the 2nd about 4 PM. So there’s about a day of comments yet to be found. The rest of the comment threads were all cached between April 24 or so and May 2nd. Losing one week, while bad, is still a lot better than losing a couple of months, though …

  7. However, this does seem to be working:

    “posted 05.02.08” “comments posted to making light by” “08 on entry” site:nielsenhayden.com

    I only found two with 05.03.08 in it. The idea of this strategy is to get the most recent comments.
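
    The date-stamped variant above could be generated for a run of days, newest first. Again a sketch under assumptions: `dated_queries` is a hypothetical helper, the MM.DD.YY stamp format is inferred from the “posted 05.02.08” example, and straight quotes are assumed.

```python
from datetime import date, timedelta


def dated_queries(start, end, site="nielsenhayden.com"):
    """Build one search string per day from `end` back to `start`, inclusive.

    Hypothetical helper: the MM.DD.YY stamp is inferred from the
    "posted 05.02.08" example in the comment.
    """
    queries = []
    day = end
    while day >= start:
        stamp = day.strftime("%m.%d.%y")  # e.g. 05.02.08
        year_suffix = day.strftime("%y")  # e.g. 08
        queries.append(
            f'"posted {stamp}" "comments posted to making light by" '
            f'"{year_suffix} on entry" site:{site}'
        )
        day -= timedelta(days=1)
    return queries


if __name__ == "__main__":
    # The week before the outage, most recent day first.
    for query in dated_queries(date(2008, 4, 27), date(2008, 5, 3)):
        print(query)
```

    Listing the newest day first matches the stated aim of finding the most recent comments before older ones.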

  8. Microsoft search also returns only 2 comments for 5/3. One is on the Shirky thread and it references comment 251.

  9. Patrick,
    I just re-sent the 254-comment version of ‘Where do they find the time?’ (Also, see the request in the 2nd email.)

    This email / zip file also has the main page from then, which shows who had made those latest comments: there probably weren’t too many others before the system went down.

    The latest 10 then were:
    Terry Karney on Open thread 106 [#288 May 03, 2008, 03:10 AM]
    Linkmeister on Open thread 106
    David Goldfarb on Grease Monkey
    Marilee on “Where do people find the time?” [#254 May 03, 2008, 02:07 AM]
    Marilee on Open thread 106
    Marilee on Teresa in the Observer
    Michael Roberts on Open thread 106
    Bruce Cohen (SpeakerToManagers) on Open thread 106
    Mary Dell on “Where do people find the time?”
    Matt McIrvin on Eric Clapton, White Power enthusiast

  10. I pulled a list of recent comments from google cache as of April 30, and another list from live.com cache as of April 25, and got a list (no content, just the list) of 3000 comments. There’s no overlap between the two lists so I’m assuming there’s a gap.

    I put everything in an Excel workbook and sorted out unique commenters and unique threads onto the 2nd and 3rd sheets in the workbook.

    If it’s useful, grab it: http://www.canary3d.com/marys/list.zip

    I also mailed it to Abi, PNH & TNH.

  11. I’ve retrieved my comments that Google had cached (in HTML, of course), if anyone actually wants them. Last one it had is 4/30, but I doubt there’s anything missing other than a couple of days of unimportant comments.

  12. What about posts from before 3/1 that have comments after then?

    I don’t seem to have anything that isn’t in Patrick’s list at the top of the post.

  13. Actually, I take that back. I just got 2/18, “This can’t be good for one’s soul”, which had a couple of post-3/1 comments. I’ll see what else I can pull starting off that.

  14. Posts which I’ve got, which have post-3/1 comments.

    2/18 Can’t be good for one’s soul 791
    2/21 Blog writers we can’t get enough of 209
    2/23 Robert Legault 166
    2/24 Why Does Nader Hate America? 359
    2/25 Secret Service writes off security 121
    2/26 Cold Weather Drinks 69
    2/26 Memorial 4
    2/26 Turkey radically revising Hadith 96
    2/26 Fascist Octopus 113
    2/26 Art links 24
    2/27 William F. Buckley, dead 493
    2/28 Hugos, 2008 51
    2/28 Open thread 102 928

    These are all off MSN Live search cache. Patrick, let me know if you want me to email you a .tar.gz of them.

  15. “Feeling the heat 31”

    “334 comment version of “Darwin fish found”. The latter is definitely incomplete”

    I have a 32-comment copy of “Feeling the Heat”, and a 386-comment copy of “Darwin fish found”. Are either of these still useful?

    (I also have Sidelights from April 3 to May 2, and Particles from March 2 to April 30. Same question.)

    Everything else I have has the same number of comments as, or fewer than, the list above.

  16. Paul A, please post that info on the “who’s been saved” thread on Making Light – they don’t have the longer version of Darwin Fish up there, I just checked.
