I've had discussions with +Daniel Conover over the relative merits of a tightly controlled taxonomy vs. a non-controlled "folksonomy." In general, I have leaned more heavily toward the latter than Dan has. I often favor a hybrid approach, where systems accept any kind of tag but, when needed, a backend authority promotes some of those tags into a controlled taxonomy.
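
To make the hybrid idea concrete, here is a minimal sketch, assuming a hypothetical in-memory `TagStore` (the class and its method names are made up for illustration, not taken from any real system). Anyone can add any tag, and a backend authority can later promote chosen tags to a controlled term.

```python
from collections import Counter


class TagStore:
    """Hypothetical hybrid store: free-form tags in, curated terms out."""

    def __init__(self):
        self.usage = Counter()  # how often each free-form tag has been used
        self.canonical = {}     # free-form tag -> controlled taxonomy term

    def add(self, tag):
        """Accept any tag; return its controlled term if one exists."""
        key = tag.strip().lower()
        self.usage[key] += 1
        return self.canonical.get(key, key)

    def promote(self, tag, term):
        """Backend authority maps a free-form tag onto a controlled term."""
        self.canonical[tag.strip().lower()] = term


store = TagStore()
store.add("Hip Hop")
store.add("hip-hop")

# The authority decides both spellings mean the same controlled term.
store.promote("hip hop", "Hip-Hop")
store.promote("hip-hop", "Hip-Hop")

print(store.add("HIP HOP"))
```

The usage counts are there so the authority can see which free-form tags are popular enough to be worth promoting. Tags nobody has promoted simply pass through in lowercase.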

You know what has really changed my mind? Digitizing my CD collection. It is amazing what complete bullshit the data that comes in from CDDB is. I mean, sure, it is usually accurate as to the titles of the songs and the name of the artist (but not 100% of the time). In the other stuff, though, it is crap. Try getting the metadata for both discs of any two-disc set. There is no guarantee the artist name is even the same from one disc to the other, much less the title of the album. Capitalization, use of numbers, ampersands vs. "and," and many other stylistic bits vary wildly.

As much as I value user-entered data, I manually edit every single CD just to clean up the crap. Over and over I say, "CDDB is a commercial entity. Can't Sony/Gracenote pay someone to clean up this embarrassing mess?"

#blog #cddb

via Google+