- LRH Enterprises - http://lrh.net/wpblog_lrh -

How to PODCast for Automation-Part 2

This is the second part of an article about some problems that you, as a POD Caster, should be aware of when trying to provide compatibility with ‘automation’ of POD grabbing (downloading) for broadcast by other stations.

At WUMD, we use PODGrabber to collect the podcasts we air. PODGrabber has two URLs per podcast, the second being considered the ‘backup’ site. On the primary URL, you set the number of tries and the interval between tries. If the pod is not grabbed before the tries are exhausted, the secondary URL is used. PODGrabber uses the RSS fields to determine if a new pod is available. You can optionally set which fields to check: Title, PubDate or Guid. You can also set it to get only the newest item, or all new items, when something new is detected.
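The check described above can be sketched roughly as follows. This is only an illustration of the idea, not PODGrabber’s actual code: the `Item` class, the `item_key` and `new_items` functions, and the sample titles, dates and guids are all made up for the example, and the feed is assumed to list its newest item first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One RSS item, reduced to the three fields a grabber might compare."""
    title: str
    pub_date: str
    guid: str


def item_key(item, fields):
    """Build a comparison key from only the fields selected for checking."""
    return tuple(getattr(item, name) for name in fields)


def new_items(feed_items, history, fields=("title",), newest_only=True):
    """Return the items whose key is not in the download history.

    feed_items is assumed to be ordered newest first.
    history is a set of keys built with the same fields.
    """
    fresh = [it for it in feed_items if item_key(it, fields) not in history]
    return fresh[:1] if newest_only else fresh


# Hypothetical feed contents, newest first.
feed = [
    Item("Here and There – March 30", "30 Mar", "ep-3"),
    Item("Here and There – March 23", "23 Mar", "ep-2"),
    Item("Here and There – March 16", "16 Mar", "ep-1"),
]

# Only the March 16 show has been downloaded so far.
history = {("Here and There – March 16",)}

newest = new_items(feed, history)                          # newest item only
everything = new_items(feed, history, newest_only=False)   # all new items
```

With a Title-only check, `newest` holds just the March 30 item, while `everything` holds both March 30 and March 23. Whatever fields are chosen, the history has to be built from the same fields, which is why those fields need to be stable.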

When there is only one feed URL, this works fabulously! A problem can arise when there is a backup feed site that is not co-ordinated with the primary. At least one of those three fields must be consistent across the two feeds. This is something that the poster of the podcast must manage.

For example, it seems that podcasts feeding from Audioport add the series name to the title field. If the series is titled “Here and There” and the podcast is named “Here and There – March 30”, then the RSS title in the feed ends up being “Here and There:Here and There – March 30”. That is a waste of space, but the real trouble comes when the series originator posts the same podcast on the backup feed and the title there stays “Here and There – March 30”. This makes it difficult to watch for a new release: since the titles don’t match, the backup item will look “different” and will download, even if it is last week’s release. Because this ‘satisfies’ the download of the pod, the wrong audio cut will be used.
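To make the mismatch concrete, here is a small sketch of a Title-only check being fooled. The `looks_new` function is hypothetical, and the “March 23” episode is an invented stand-in for “last week’s release”; only the title pattern comes from the example above.

```python
def looks_new(title, downloaded_titles):
    """A Title-only check: anything not seen before counts as a new release."""
    return title not in downloaded_titles


# Last week's show was downloaded from the primary feed, which prepends
# the series name to the title.
downloaded_titles = {"Here and There:Here and There – March 23"}

# This week the primary fails, so the backup is consulted. It still
# carries last week's show, but under the shorter title.
backup_title = "Here and There – March 23"
wrongly_new = looks_new(backup_title, downloaded_titles)

# If the backup used the same title as the primary, the repeat is caught.
matching_title = "Here and There:Here and There – March 23"
correctly_old = not looks_new(matching_title, downloaded_titles)
```

Here `wrongly_new` comes out true, so last week’s audio would be downloaded again and treated as this week’s show.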

The same kind of issue exists with PubDate and Guid: they can differ between feeds of the same podcast file. (Not sure why, except that the poster might not have any control in some cases.)

The RSS Guid field will be unique within one site’s feed, but would not be expected to be the same across the primary and backup feeds, so the Guid check must be disabled when using a backup feed with PODGrabber.

PODGrabber tries to solve these issues, but it’s tricky. A flag can be set on either feed to force a separate download history to be kept. This means that we won’t download any file twice from either feed, but there’s no guarantee that the backup feed is the latest one, especially if the primary feed has been flawless for quite some time. The backup feed history will be out-of-date since we wouldn’t have checked it.
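The stale-history problem can be shown with a toy model. This is a sketch under assumptions, not a description of how PODGrabber stores its history: the `grab_newest` function and the “Ep 1” to “Ep 3” names are invented, and each feed is assumed to list its newest item first.

```python
def grab_newest(feed_titles, history):
    """Download the newest title not in this feed's own history.

    feed_titles is assumed newest first; history is a set that is updated.
    Returns the title that was grabbed, or None if nothing looked new.
    """
    for title in feed_titles:
        if title not in history:
            history.add(title)
            return title
    return None


primary_history = set()
backup_history = set()   # never touched while the primary works

# Three good weeks on the primary feed.
for week in (["Ep 1"], ["Ep 2", "Ep 1"], ["Ep 3", "Ep 2", "Ep 1"]):
    grab_newest(week, primary_history)

# Week 4: the primary fails, and the backup has not been updated past Ep 3.
backup_feed = ["Ep 3", "Ep 2", "Ep 1"]
grabbed = grab_newest(backup_feed, backup_history)
```

Because the backup history is empty, `grabbed` ends up as “Ep 3”, the show that already aired last week from the primary feed. The separate history prevents a second download from the same feed, but says nothing about whether the backup is current.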

It could also happen that a primary feed fails or is late, so we get the podcast from the backup feed. The following week the primary feed is working again, but it still has the podcast we already got from the backup feed. If Title or PubDate don’t match what we already have, we’ll download it and have the wrong cut again. Or, we might notice there are several files and download them all, but we’ve still downloaded a duplicate, wasting time, and we may have no idea which file is “this week”.

Much of this trouble will be avoided if the Title or PubDate is consistent across the primary and secondary feeds, so please make that happen! It seems to me that if you put yourself on the other side of the product, as a podgrabber, you would see these problems and realize they need to be addressed.
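A podcaster could check this before publishing with something like the sketch below. It uses Python’s standard `xml.etree.ElementTree` module; the `field_values` and `mismatches` functions and the two sample feeds (including the date and time in `pubDate`) are made up for illustration, and a real check would fetch the two feed URLs instead of using strings.

```python
import xml.etree.ElementTree as ET


def field_values(rss_xml, field):
    """Collect one field (e.g. 'title' or 'pubDate') from every <item>."""
    root = ET.fromstring(rss_xml)
    return {(item.findtext(field) or "").strip() for item in root.iter("item")}


def mismatches(primary_xml, backup_xml, fields=("title", "pubDate")):
    """For each field, list the values that appear in only one of the feeds.

    An empty list means that field is consistent across both feeds.
    """
    report = {}
    for field in fields:
        primary_vals = field_values(primary_xml, field)
        backup_vals = field_values(backup_xml, field)
        report[field] = sorted(primary_vals ^ backup_vals)
    return report


# Made-up sample feeds: same episode, same pubDate, different titles.
PRIMARY = """<rss version="2.0"><channel><title>Here and There</title>
<item><title>Here and There:Here and There – March 30</title>
<pubDate>Mon, 30 Mar 2015 12:00:00 GMT</pubDate></item>
</channel></rss>"""

BACKUP = """<rss version="2.0"><channel><title>Here and There</title>
<item><title>Here and There – March 30</title>
<pubDate>Mon, 30 Mar 2015 12:00:00 GMT</pubDate></item>
</channel></rss>"""

report = mismatches(PRIMARY, BACKUP)
```

In this sample, `report["pubDate"]` is empty, so PubDate would be a safe field for stations to check, while `report["title"]` lists both versions of the title, so Title would not be.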

In summary, if you are a podcaster creating product for other stations to air, you need to make your feed automation-friendly.