My recent computer woes led to some corruption that keeps Python from running on my MacBook, which means I can’t use Juice as my podcatcher anymore. To be honest, I’ve used Juice for years without ever liking it, but I had no real alternative since I refuse to use iTunes as my podcatcher. In a way, losing Python was a positive because it forced me off the fence and into looking for something better.
Luckily, I found it on the first try. I decided to try out Linc Fessenden’s bashpodder. It’s a 50-line bash script that takes a simple text file of feed URLs and fetches them. No muss, no fuss, no BS. RSS feeds in, podcasts out. I like that. There are now many variations as hackers have fiddled with the functionality, but I’m running the core vanilla mainline version. This one collects shows into a date-based directory. Because of the way it uses wget to fetch the actual files, in most cases it preserves the timestamp of the server’s version of the file. This helps me out a lot in my attempts to listen to shows in chronological order. I did make my own little hack to it, changing where it logs a show’s URL to the history. The original script does it unconditionally; mine checks the exit code of wget and only puts the URL in the history if the download succeeded. This way, a failed download will be retried on a later run.
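For the curious, here is a minimal sketch of the idea behind that hack. It is not the actual bashpodder code: the function names download and fetch_and_log, and their arguments, are made up for illustration, and the wget options are just the plain quiet (-q) and directory prefix (-P) flags.

```shell
# Hypothetical helper: fetch one episode with wget into a directory.
# $1 is the episode URL, $2 is the directory to save into.
download() {
    # -q: quiet, -P: directory prefix for the saved file
    wget -q -P "$2" "$1"
}

# Hypothetical wrapper: append the URL to the log only when the
# download returns exit status 0. On failure nothing is logged,
# so the next run sees the URL as new and tries again.
# $1 is the URL, $2 is the directory, $3 is the log file.
fetch_and_log() {
    url=$1
    destdir=$2
    log=$3
    if download "$url" "$destdir"; then
        printf '%s\n' "$url" >> "$log"
    else
        echo "download failed, will retry next run: $url" >&2
        return 1
    fi
}
```

The whole trick is the if around the download: the history only ever learns about files that actually arrived.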
Switching from one podcatcher to another is always a bit dicey at first. Some of these feeds do the insane thing of keeping hundreds of episodes in them, so if you aren’t careful bashpodder will fetch every one of those and fill up your hard drive. Here’s how I handled the transition. It was a bit labor intensive and required me to watch it, but after the first run everything was perfect. The thing to be aware of is that there are two files: podcast.log and temp.log. The first is the permanent list of fetched files. The second is a working copy; at the end of the run the two are combined, duplicates are filtered out, and the whole thing is saved back to podcast.log. Before fetching a file, bashpodder checks whether its URL is already in podcast.log, and if it is, skips it.
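The two-file bookkeeping can be sketched roughly like this. This is my own illustration of the behavior described above, not a copy of the script: the function names already_fetched and merge_logs are hypothetical, and removing the working file after the merge is an assumption.

```shell
# Hypothetical check: succeeds if the URL ($1) appears as a whole
# line in the log file ($2). -x matches whole lines, -F treats the
# URL as a fixed string rather than a regular expression.
already_fetched() {
    grep -qxF -- "$1" "$2" 2>/dev/null
}

# Hypothetical merge: fold the working log ($2) into the permanent
# log ($1), dropping duplicate lines along the way.
merge_logs() {
    permanent=$1
    working=$2
    # Make sure both files exist so sort does not complain.
    touch "$permanent" "$working"
    # sort -u writes each distinct line once; go through a scratch
    # file so the permanent log is never read and written at once.
    sort -u "$permanent" "$working" > "$permanent.new" &&
        mv "$permanent.new" "$permanent" &&
        rm -f "$working"
}
```

Because duplicates are squeezed out at merge time, nothing has to be careful about adding a URL twice during the run.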
I ran the script from my MacBook in a terminal window. I ran it via:
sh -x bashpodder.shell
so that it echoed each command, with its variables expanded, as it worked. When it got to a new feed, it would splat out the list of file URLs that were parsed out of the RSS feed. I’d copy the URLs I didn’t want downloaded from that list and put them directly into podcast.log with a text editor. You can be somewhat sloppy with this. When in doubt, I let it fetch the file and deleted it later. If a URL goes into podcast.log more than once, no problem. The duplicates get filtered out when the logs are merged at the end of the run. This required me to ride the script for 45 minutes or so, but I got most of the old shows into podcast.log manually. After the first run succeeded, I ran the script one more time. It fetched a few at the edges that I had missed but then was completely caught up. I deleted the files that I knew I had already listened to and away I went.
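If pasting into an editor gets tedious, the same pre-seeding could be done from the shell. This is only a sketch of an alternative to what I actually did; the function name mark_fetched is hypothetical.

```shell
# Hypothetical helper: read URLs from standard input, one per line,
# and append them to the log file named in $1 so that they are
# treated as already fetched. Blank lines are dropped. Duplicates
# are not a concern here since they get filtered at merge time.
mark_fetched() {
    log=$1
    sed '/^[[:space:]]*$/d' >> "$log"
}

# Example use (not run here): paste the unwanted URLs, then press
# Ctrl-D to end the input.
#   mark_fetched podcast.log
```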
Now when I run it, I get only the new files. They go into that day’s directory, and they sort themselves out somewhat by timestamp. I set up a cron job to run this at 5 AM. All the scripts that I use to put the files on my Insignia MP3 player work fine with the new directory structure, and I’m back in business. Thanks, Linc. This workflow is better than what I had: I no longer have Juice bogging down my machine and eating a lot of memory to do this simple task, and the whole thing runs in a simple bash process that I’m comfortable modifying if I want to. Right on.
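For anyone setting up the same thing, a 5 AM cron entry could look something like the line built below. The install directory is an assumption, not my actual path, and the cd is there on the assumption that the script expects to find its files in the current directory. The snippet only prints the line; the install command is left as a comment.

```shell
# Hypothetical install location; change to wherever the script lives.
script_dir="$HOME/bashpodder"

# Crontab fields: minute hour day-of-month month day-of-week command.
# "0 5 * * *" means 05:00 every day.
cron_line="0 5 * * * cd $script_dir && sh bashpodder.shell"

# Show the entry for review.
printf '%s\n' "$cron_line"

# To install it alongside any existing entries (not run here):
#   ( crontab -l 2>/dev/null; printf '%s\n' "$cron_line" ) | crontab -
```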