Blog Archive

By James Kwak

Last month a reader pointed out that many of the links in the blog archive were broken. The cause is that Feedbooks, the service I was using, no longer allows you to transform RSS feeds into PDFs. I fixed the links up through April 2010, but I no longer have an elegant way of creating new PDF archives. Basically, I need something that will let me type in an RSS feed URL and will generate a PDF from that feed.

For example, this is the feed for May 2010, in forward chronological order. When I type that feed URL into Chrome, I get raw XML. When I type it into Firefox, it gives me truncated versions of each post. When I type it into Safari, it gives me full posts, but in its infinite wisdom, it sorts them in reverse chronological order; the sort by date feature only allows reverse chronological. When I type it into Google Reader, I get full posts in the right order, but I lose the post dates (they are replaced by the current date). Bloglines is like Safari — it insists on reverse chronological. RSS 2 PDF only gives me post titles. FeedShow displays nothing.

If I could get any of these to work in any browser, I could just convert that to a PDF and be done. Any suggestions?

Update: To be clear, saving a web page (or anything) as a PDF is not the problem. It’s trivial on a Mac (and not that hard on a PC, either). The problem is getting the full text of all the posts in the feed, in the proper order, with the proper dates, on one web page.
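
To make the requirement concrete, here is a minimal Python sketch of what such a tool would have to do: read the feed, sort the posts oldest first, and write them to a single HTML page with their original dates. It assumes a standard RSS 2.0 feed in which every item has a `pubDate` with a time zone and carries its full text in `content:encoded` (falling back to `description`). The function name `feed_to_html` is made up for illustration and is not part of any existing service.

```python
import html
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Namespaced tag that many blog feeds use for the full post body (assumption).
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"


def feed_to_html(rss_text):
    """Return one HTML page with every post in forward chronological order."""
    root = ET.fromstring(rss_text)
    posts = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        # Assumes every item has an RFC 822 style pubDate.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        # Prefer the full text; fall back to the (possibly truncated) summary.
        body = item.findtext(CONTENT_ENCODED) or item.findtext("description", default="")
        posts.append((published, title, body))

    posts.sort(key=lambda post: post[0])  # oldest first

    parts = ["<html><body>"]
    for published, title, body in posts:
        parts.append(f"<h2>{html.escape(title)}</h2>")
        parts.append(f"<p><em>{published:%B %d, %Y}</em></p>")
        parts.append(body)  # the post body is already HTML
    parts.append("</body></html>")
    return "\n".join(parts)
```

The resulting page could then be opened in any browser and saved as a PDF in the usual way.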

22 thoughts on “Blog Archive”

  1. After some googling, I found this page, which will convert an RSS feed into a PDF. If you click the options it will allow you to do it in reverse-chronological order. It’s pretty nicely designed too, with columns and everything. I tried it on your feed, and it works fine.

  2. What I meant to say is that the options allow you to do it in regular chronological order, obviously, not reverse-chronological order :)

  3. Paste into an OpenOffice (LibreOffice) document and save as HTML. Seems to be reasonable.

    Not sure how to get to PDF; might try an RTF-to-PDF converter.

  4. You have a very sound foundation for a Book on the crisis based upon the data from the start of the blog. It would make sense to do this if the material was to be inaccessible, but as an anthropologist I would say that the fact that you never intended to do that, and the duration and intensity of this historic crisis, makes it significant in itself.

    Go for It!

  5. @Oskar: PDF Newspaper works beautifully, but even with the paid version there is a limit of 10 items per PDF, which is not enough for a month’s archive. I could theoretically host the software myself, but that is more effort than I want to put in. But thanks for the idea.

  6. The way I would go about it is to select several prominent scholars (multidisciplinary but controlled) and I would hand them the mass of data and ask them to interpret it as raw material (field work) with a final analysis and conclusion about (? crisis / American interactions…whatever …it’s really an open question around the response to crisis…).

    Assess your input and have a symposium of the group presenting their summations and posture/positions…and then an open forum interactive discussion…next to last chapter.

    Finalize with a summary chapter by Johnson & Kwak (or a hybrid combination).

    The impact and results of this crisis are well past the initial climax and are merging into a multiplex of tributaries. This truly would be the ideal timing to try to generate meaning from the archives towards what is next for an agenda and inquiry. The questions raised are greater than any of the resolve. It is a great opportunity to bring these past years to a new emergent potential.

  7. @Bruce:

    Yes, it could show how the collective mind works. Taken over a longer period of time, such analysis is the social scientist’s dream come true–please, no regression lines, cluster analysis should suffice.

    In the end, though, we should still ask ourselves: What is our conversation representative of? Other than network dynamics, that is.

  8. @James: Well, then, I guess if there’s no web-service that satisfies your demands, we’ll have to get a little creative. If you just need a good HTML-version of the page that you can then convert into a PDF, there is something called XSL, which is basically an advanced stylesheet that you attach to the XML file which transforms it into an HTML-document. There are definitely stylesheets available online that’ll just print out an entire rss-feed (I found this example after two seconds of googling, but you might want to do some more searching).

    That is, if you were to do it with this method, these would be the steps:

    1. Download the XML-file to your computer.
    2. Open it in a text-editor, and add a line that attaches the XSL-stylesheet.
    3. Open the XML-file in a (modern) web-browser (which should now look just like regular HTML).
    4. Convert to PDF, using whatever method you prefer.

    There are other ways to do it, but this seemed the most obvious way to do it without having to do any actual programming.
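
Step 2 above can also be scripted. The following is a rough Python sketch, not a tested recipe for this particular feed: it inserts the standard `xml-stylesheet` processing instruction right after the XML declaration of the downloaded file. The helper name `attach_stylesheet` and the stylesheet file name `rss.xsl` are placeholders chosen for illustration.

```python
def attach_stylesheet(xml_text, xsl_href):
    """Return xml_text with an xml-stylesheet instruction pointing at xsl_href."""
    instruction = f'<?xml-stylesheet type="text/xsl" href="{xsl_href}"?>'
    stripped = xml_text.lstrip()
    # The instruction has to come after the XML declaration, if there is one.
    # The trailing space avoids matching an existing "<?xml-stylesheet" line.
    if stripped.startswith("<?xml "):
        end = stripped.index("?>") + 2
        return stripped[:end] + "\n" + instruction + stripped[end:]
    return instruction + "\n" + stripped


if __name__ == "__main__":
    feed = '<?xml version="1.0" encoding="UTF-8"?>\n<rss version="2.0"></rss>'
    print(attach_stylesheet(feed, "rss.xsl"))
```

Saving the result next to the stylesheet and opening it in a browser corresponds to step 3. Some browsers restrict stylesheets loaded from local files, so serving both files from a local web server may be necessary.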

  9. Then again, it might just be simpler to do it manually :). Being a programmer tends to skew your perspective on what is “easy” to regular people.

  10. I may not understand your problem but here is what I learned.
    1. I opened the supplied link in IE-9. It displayed the May 2010 Baseline Scenario in descending order starting with May 7, 2010 and ending on May 1, 2010.

    2. In the browser, at top right, is a box headed “Displaying” followed by 15/15. Below that is a heading “Sort by:” followed by “Date, Title, Author.” “Date” has a blue down arrow next to it. Clicking on “Date” changes the arrow to up and reverses the sort order.

    3. I then copy the page, paste it into MS Word 2010 (this works with 2007, maybe others) and save as PDF.

    Hope this is what you are looking for.

  11. @Speed: That sounds right, except that I’m a Mac person so I can’t use IE9 (and I don’t have Parallels, either).

    I could go find a Windows machine and use it, but I’m surprised there isn’t an easier solution. Safari is so, so close.

  12. @Oskar: Thanks, but before I do that I would just use Safari and live with the fact that it’s in reverse-chronological order.

  13. update: I had only diagonally read; apparently the newspaper is not enough. Good luck with changing the php files!

  14. Would BlogBooker work for you?

    I took a look at your feed urls, and you’ve got some problems with them–I couldn’t see one example of being able to return a full-month’s feed of articles by which I could suck in the xml into a custom app and format out what you need to gen the pdf.

  15. @JC: You’re not getting a full month’s worth of feeds because there’s a server-side setting that only returns 15 items, but I can change that myself.

  16. Vienna RSS reader for OS X seems to do what you want. Put in the feed URL, and all the article titles come up–well at least the 15 you have it spec’d to show.

    You can sort on the date asc/desc. Select all of the titles, and all the full articles open up in a pane. Print to pdf works just like it should. And it’s free, open source software!
