Archiving things on the web
Posted by ark
I often see images as I'm browsing around that make me laugh a lot (myconfinedspace.com has been an especially good source recently). I'd like to keep them, but it needs to be as easy and seamless as possible. Here's what I do.

If I can, I select the image, open up a Google Notebook and click on 'clip'.
Now it's in my Google Notebook.
Notebook used to serve a cached copy of the image, but it no longer does, so I need one more step to make sure I don't lose the image.
I have a cron job that runs every week and backs up my public notebook URL along with all the images referenced from it. It's easy to do with wget; here's the command line I use:

wget -t 1 -T 15 -N -E -H -k -K -p http://google.com/notebook/public/NOTEBOOK_PATH
That creates a directory for each website and saves all the images to the local disk. If an image vanishes off the net, the local copy stays in those directories and I'll still have it.
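For anyone wanting to set up something similar, here is a minimal sketch of how the weekly job could be wrapped in a small script. It is not the exact script I run: the function name backup_notebook, the WGET and NOTEBOOK_URL variables and the directory argument are all made up for illustration, and NOTEBOOK_PATH is still the placeholder from the command above. The comments spell out what each wget flag does.

```shell
#!/bin/sh
# Sketch of a weekly backup wrapper around the wget command above.
# NOTEBOOK_PATH is the same unfilled placeholder used in the post.
NOTEBOOK_URL="${NOTEBOOK_URL:-http://google.com/notebook/public/NOTEBOOK_PATH}"
WGET="${WGET:-wget}"   # hypothetical override hook, e.g. for a dry run

# backup_notebook DIR [URL]: mirror URL and its images into DIR.
backup_notebook() {
    dir=$1
    url=${2:-$NOTEBOOK_URL}
    mkdir -p "$dir" || return 1
    # Subshell so the caller's working directory is left alone.
    (
        cd "$dir" || exit 1
        # -t 1   one attempt per file     -T 15  15-second timeout
        # -N     skip files that are not newer than the local copy
        # -E     add .html to HTML pages saved without that suffix
        # -H     follow page requisites onto other hosts
        # -k -K  rewrite links for local viewing, keep .orig backups
        # -p     fetch images and other page requisites
        "$WGET" -t 1 -T 15 -N -E -H -k -K -p "$url"
    )
}

# Run only when a target directory is given on the command line.
if [ "$#" -ge 1 ]; then
    backup_notebook "$@"
fi
```

Assuming it were saved as /home/you/bin/backup-notebook.sh (a hypothetical path), a crontab line like this would run it at 03:00 every Sunday:

0 3 * * 0 /bin/sh /home/you/bin/backup-notebook.sh /home/you/notebook-backup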

After it's downloaded you can run my wwwis script over it to fix all the image widths and heights too.

Posted Tuesday 15 April 2008