Kjetil's Information Center: A Blog About My Projects

XSPF Integrity Check

This is a Python script to check the integrity of XSPF playlist files. By integrity check, I mean checking whether the files that the playlists reference actually exist.

You may notice that I use a regular expression with a lambda expression to decode the URL encoding instead of the standard urllib.unquote() routine. There is a good reason for this: the text that minidom hands back is a "unicode" string, and urllib.unquote() then also returns a "unicode" string instead of a regular "str" Python string. I happen to use Latin-1 encoding on my filenames, and in order to decode these properly the decode() method must be used, but that one only works properly on regular "str" strings!
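For comparison, on Python 3 the regular expression should not be needed, because urllib.parse.unquote() takes an encoding argument. The following is only a minimal sketch under that assumption. The helper name decode_location and the sample paths are made up for illustration and are not part of my script:

```python
from urllib.parse import unquote


def decode_location(location, encoding="utf-8"):
    """Turn a percent-encoded file:// URL into a plain filesystem path.

    Hypothetical helper for illustration (Python 3 only).
    """
    # unquote() converts each %XX escape to a byte and decodes the
    # resulting byte sequence with the requested encoding.
    path = unquote(location, encoding=encoding)
    # Strip only a leading scheme, not every "file://" in the string.
    prefix = "file://"
    if path.startswith(prefix):
        path = path[len(prefix):]
    return path


# Escapes written as UTF-8 bytes (the default):
utf8_path = decode_location("file:///music/bl%C3%A5.mp3")
# Escapes written as Latin-1 bytes:
latin1_path = decode_location("file:///music/bl%E5.mp3", encoding="latin-1")
```

Both calls should give the same path here, since %C3%A5 in UTF-8 and %E5 in Latin-1 encode the same character.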

Anyway, here's my script:

#!/usr/bin/python

import xml.dom.minidom
import re
import os.path

def xspf_parse(playlist_filename, handler):
    # Call handler(playlist_filename, track_filename) for every track location.
    xml_data = xml.dom.minidom.parse(playlist_filename)
    for playlist in xml_data.getElementsByTagName("playlist"):
        for tracklist in playlist.getElementsByTagName("trackList"):
            for track in tracklist.getElementsByTagName("track"):
                for location in track.getElementsByTagName("location"):
                    if location.firstChild is None:
                        continue  # skip empty <location/> elements
                    # Work on a byte string so each %XX escape becomes one raw byte.
                    data = re.sub("%([0-9a-fA-F]{2})",
                        lambda x: chr(int(x.group(1), 16)),
                        location.firstChild.data.encode("utf-8"))
                    track_filename = data.decode("utf-8").replace("file://", "")
                    handler(playlist_filename, track_filename)

def file_check(playlist_filename, track_filename):
    if not os.path.isfile(track_filename):
        print playlist_filename, "-->", track_filename

if __name__ == "__main__":
    import sys

    if len(sys.argv) < 2:
        print "Usage: %s <xspf file> ... <xspf file>" % (sys.argv[0])
        sys.exit(1)

    for filename in sys.argv[1:]:
        xspf_parse(filename, file_check)

    sys.exit(0)
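The script above is written for Python 2. A rough Python 3 equivalent of the same check could look like the sketch below. It assumes the percent-escapes in the playlist are UTF-8 bytes, and the function name missing_tracks is made up for this example:

```python
import os.path
import xml.dom.minidom
from urllib.parse import unquote


def missing_tracks(playlist_filename, encoding="utf-8"):
    """Yield every track location in the playlist that is not an existing file."""
    dom = xml.dom.minidom.parse(playlist_filename)
    for track in dom.getElementsByTagName("track"):
        for location in track.getElementsByTagName("location"):
            if location.firstChild is None:
                continue  # empty <location/> element, nothing to check
            # Decode the %XX escapes, then drop a leading file:// scheme.
            path = unquote(location.firstChild.data.strip(), encoding=encoding)
            if path.startswith("file://"):
                path = path[len("file://"):]
            if not os.path.isfile(path):
                yield path
```

Looping over missing_tracks("some.xspf") and printing each result gives roughly the same report as the file_check handler above, minus the playlist name.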
          


Topic: Scripts and Code, by Kjetil @ 11/12-2014