Anthony Bailey @ 2007-05-25 21:57:00
Entry tags: software_development
A couple of weeks ago I noticed that I was avoiding subscribing to some informative feeds because they were a bit too informative. I don't want to see as many as a dozen separate news items across the course of a single day, because it makes too much noise in the feed aggregator and gets in the way of rarer, more valuable entries. (In my case, the aggregator takes the form of my LJ Friends page.)
I realized I wanted the equivalent of a digest e-mail for feeds - a daily summary or similar. This sounded like a job for a web service, so after failing to find any existing solution, I registered feedsum.com and coded up a little Rails app that sucks in a source feed, collates items into daily batches, and generates a derived feed in which each item summarizes all the source items for one day.
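The collation step described above can be sketched in a few lines of plain Ruby. This is only an illustration, not the app's actual code: the `Item` struct, the `daily_digest` name, the summary format and the choice of UTC day boundaries are all assumptions made for the example.

```ruby
require "date"

# Hypothetical item shape; the real app's model is not shown in the post.
Item = Struct.new(:title, :link, :published)

# Collate source items into daily batches and build one summary entry per day.
# Day boundaries are taken in UTC here, which is an assumption.
def daily_digest(items)
  batches = items.group_by { |item| item.published.getutc.to_date }
  batches.sort_by { |day, _batch| day }.map do |day, batch|
    lines = batch.sort_by(&:published).map { |i| "#{i.title} <#{i.link}>" }
    { title: "Summary for #{day.iso8601}", date: day, body: lines.join("\n") }
  end
end
```

Each returned hash stands in for one item of the derived feed; a real version would render these as RSS or Atom entries.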
At first this was simply a stateless service, but I found that source feeds often turn over very quickly - by the end of the day, the earliest items could already have been pushed out of the feed. So now I stash source items in a database whenever the source feed is pulled. I also found that syndicators such as LiveJournal were not patient enough to wait for my app to spin up in clunky development mode, starting a fresh Ruby process before pulling the source and generating a summary - they often timed out before the service was done. So finally I have had to learn how to deploy a proper production Rails app using Mongrel proxies and all that. There were some fiddly one-time set-up details, but once it is up and running, Capistrano is pure deploy joy.
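To illustrate why stashing helps, here is a minimal sketch of the idea, using an in-memory hash as a stand-in for the database. The `ItemStore` class and the `:guid` key are hypothetical names for this example; the post does not say how the real app identifies or stores items.

```ruby
# Hypothetical in-memory stand-in for the database table of stashed items.
class ItemStore
  def initialize
    @items = {} # guid => item hash, in first-seen order
  end

  # Record every item seen in this pull; items already stored are left alone.
  # Returns the number of newly stored items.
  def stash(pulled_items)
    added = 0
    pulled_items.each do |item|
      next if @items.key?(item[:guid])
      @items[item[:guid]] = item
      added += 1
    end
    added
  end

  # Everything seen so far, including items the source feed has since dropped.
  def all
    @items.values
  end
end
```

If a morning pull sees items a and b and an evening pull sees only b and c, the store still holds all three, so the day's summary does not lose the early item.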
Currently I'm using the service to source three LJ syndications: slashdot_daily, machinifeedaily and arstechnicdaily.
The service can probably handle some further light load, so others (LJ users or any other feed consumer) are welcome to give it a beta test.
It's easy to use: to get a daily summary of some feed
http://example.com/path/index.xml
simply request
http://feedsum.com/daily/example.com/path/index.xml
(You can do other neat stuff like http://feedsum.com/every/8/hours/... and so forth but the daily summaries tend to be the useful ones.)
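For anyone scripting this, the daily URL pattern above can be built mechanically. The sketch below is an illustration only: `daily_summary_url` is a hypothetical helper, it handles plain http sources with no query string (how the service treats anything else is not stated here), and it leaves out the every-N-hours form because the rest of that URL is not spelled out.

```ruby
require "uri"

FEEDSUM_HOST = "feedsum.com"

# Build the daily-summary URL for a source feed by dropping the scheme and
# prefixing the service's /daily/ path, following the example in the post.
def daily_summary_url(source_url)
  uri = URI.parse(source_url)
  unless uri.scheme == "http" && uri.host
    raise ArgumentError, "expected an http feed URL, got #{source_url.inspect}"
  end
  "http://#{FEEDSUM_HOST}/daily/#{uri.host}#{uri.path}"
end
```

For the example feed, `daily_summary_url("http://example.com/path/index.xml")` gives back the request URL shown above.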
Do try it out, and please tell me if, for example, it completely chokes on any source feeds. (And to avoid a rush of dupes, could anyone who uses it to register further LJ syndication accounts please add a link in a comment below?)