How to download your photos from LiveJournal Scrapbook using wget

Downloading photos from LiveJournal Scrapbook is not trivial: at the moment (November 2013), no application (e.g. Semagic) appears to be able to do it.
However, there's a way to get the pictures via wget, explained by chebe and extended by joecarnahan.
Unfortunately, their method no longer works. I have therefore updated it to work with the current Scrapbook system and have also written two bash shell scripts that automate the task.

  • If you want to be able to download non-public images, get your cookies as a text file and put them into the folder you want to download your images to (as cookies.txt).
    If you don't know how to do this, follow the instructions by joecarnahan.
  • Now you need to download the images. The following steps can be executed in one go via the script script1.sh, which can be downloaded here.
    Run the script by typing ./script1.sh USERNAME, where USERNAME is of course your own username.
    If you have trouble executing the script, try chmod u+x script1.sh.
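Before starting a long crawl, it can help to check that the cookie file is actually in place. The following is a minimal sketch, not part of script1.sh: the helper name has_lj_cookies is made up for illustration, and it only assumes that cookies.txt is the kind of text export that wget's --load-cookies reads, with the cookie domain written on each line.

```shell
# Hypothetical helper (not from the original scripts): succeeds only if the
# given file exists and contains at least one line mentioning livejournal.com.
has_lj_cookies() {
    [ -f "$1" ] && grep -q 'livejournal\.com' "$1"
}
```

It could be called as has_lj_cookies cookies.txt || echo "no LiveJournal cookies found" >&2 from the download folder. It does not check whether the cookies are still valid, only that they are present.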

In case you want to run the commands manually:
  • The main trick is to bypass LiveJournal's robots.txt. I know this is not entirely fair, but as long as you use it responsibly and only to access your own photos, it should be justifiable.

    Get robots.txt (in all these instructions, replace USERNAME with your own username):
    wget --load-cookies cookies.txt -nc -np -r -o crawl_log.txt http://USERNAME.livejournal.com/pics/catalog/

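    Suppress robots.txt by overwriting the downloaded copy with a blank file (because of -nc, wget will not fetch it again):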
    echo > USERNAME.livejournal.com/robots.txt

    Now re-execute the previous command to get the catalog of all your photos:
    wget --load-cookies cookies.txt -nc -np -r -o crawl_log.txt http://USERNAME.livejournal.com/pics/catalog/
  • Get the links to the images:
    grep -r original.jpg USERNAME.livejournal.com | grep _blank | grep -v -E "http:/url" | cut -d '"' -f 6 | uniq >original_urls.txt
  • In case you want to retain gallery and image names, also execute the following commands:

    Find corresponding image titles:
    grep -r "b-pics-title b-editable-elem" USERNAME.livejournal.com | grep -v "/a" | grep -v -E "/p[1-9]" | grep -v -E "http:/url" | grep -E "[0-9]/[0-9]" | cut -d '>' -f 2 | cut -d '<' -f 1 >image_names.txt

    Find corresponding gallery titles:
    grep -r ">Albums<" USERNAME.livejournal.com | grep -v -E "/p[1-9]" | grep -v -E "http:/url" | grep -E "[0-9]/[0-9]" | cut -d '>' -f 14 | cut -d '<' -f 1 >gallery_names.txt

    Optional: find gallery numbers:
    grep -r ">Albums<" USERNAME.livejournal.com | grep -v -E "/p[1-9]" | grep -v -E "http:/url" | grep -E "[0-9]/[0-9]" | cut -d '>' -f 13 | cut -d '/' -f 6 | cut -d '"' -f 1 >gallery_numbers.txt
  • Download the actual images:
    wget --load-cookies cookies.txt -i original_urls.txt -np -o dl_log.txt -x
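For reference, the core manual steps above (without the optional title and gallery lookups) could be collected into a few shell functions so that USERNAME only has to be typed once. This is only a sketch under that assumption and is not the content of script1.sh; the function names catalog_url, crawl_catalog, suppress_robots and fetch_originals are invented here, while the wget and grep commands are copied from the steps above.

```shell
# Sketch only: function names are hypothetical, commands are the manual ones.

# Print the catalog URL for a given username.
catalog_url() {
    printf 'http://%s.livejournal.com/pics/catalog/\n' "$1"
}

# Crawl the catalog; expects cookies.txt in the current folder.
crawl_catalog() {
    wget --load-cookies cookies.txt -nc -np -r -o crawl_log.txt "$(catalog_url "$1")"
}

# Overwrite the downloaded robots.txt with a blank file.
suppress_robots() {
    echo > "$1.livejournal.com/robots.txt"
}

# Run the whole sequence for one username.
fetch_originals() {
    crawl_catalog "$1"      # first pass: fetches robots.txt
    suppress_robots "$1"
    crawl_catalog "$1"      # second pass: fetches the photo catalog
    grep -r original.jpg "$1.livejournal.com" | grep _blank \
        | grep -v -E "http:/url" | cut -d '"' -f 6 | uniq > original_urls.txt
    wget --load-cookies cookies.txt -i original_urls.txt -np -o dl_log.txt -x
}
```

After pasting these definitions into a shell, fetch_originals USERNAME would run the sequence in the current folder. Nothing is downloaded until that function is called.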

Now that you have the images, here is how you rename them:
  • Make a backup of the folder you downloaded the photos into, in case something goes terribly wrong while renaming.
  • Get the second script script2.sh here.
    Run the script by typing ./script2.sh ic.pics.livejournal.com/USERNAME/GALLERYNUMBER/
    USERNAME is your own username and GALLERYNUMBER is the number of the gallery, which can be identified by looking into the folder ic.pics.livejournal.com/USERNAME/.
    It is very important that you add the "/" to the end of the directory.
    If you have trouble executing the script, try chmod u+x script2.sh.
  • Now all of your files should have been renamed according to the following system:
    GALLERYNAME_-_IMAGENUMBER_IMAGENAME.jpg

  • If something doesn't work, play around with the commands and scripts and I'm sure you'll come up with a solution.
    Don't blame me if you lose any data or break anything; use the scripts and commands at your own risk.
    Note that during last year's migration of photos by LiveJournal, some photos might have ended up in the wrong gallery.
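To make the naming scheme GALLERYNAME_-_IMAGENUMBER_IMAGENAME.jpg concrete, here is a small sketch of how such a name could be assembled. It is not taken from script2.sh: the function target_name is hypothetical, and replacing spaces and slashes with underscores is an assumption about how awkward characters in titles might be handled.

```shell
# Hypothetical helper: build a file name following the scheme
# GALLERYNAME_-_IMAGENUMBER_IMAGENAME.jpg.
# Arguments: gallery name, image number, image name.
target_name() {
    # Assumption: spaces and slashes in titles become underscores.
    gallery=$(printf '%s' "$1" | tr ' /' '__')
    title=$(printf '%s' "$3" | tr ' /' '__')
    printf '%s_-_%s_%s.jpg\n' "$gallery" "$2" "$title"
}
```

For example, target_name "Holiday" 3 "Beach" prints Holiday_-_3_Beach.jpg. Comparing a few expected names against the renamed files is a quick way to spot photos that landed in the wrong gallery.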