Plucker Experiments Continue

A few more cycles and I think I have the Plucker/Sitescooper stuff the
way I want it. I also figured out why Fucked Company never rendered
properly (it didn’t on the old setup either). I’m not sure whether the
problem is in the original HTML or is an artifact of the Sitescooper
conversion, but by the time Plucker gets the page, it has multiple <head> tags. I
tried removing the extras by hand, and FC then converted correctly to
Plucker format. I wrote a simple Perl script that removes every <head>
tag from an HTML document and am running it against the affected
pages. I tried to write a regex that would remove only the head tags after
the first (using the Regex Coach) but found that variable-length
look-behinds aren’t implemented. Damn. I could have done some kind of
procedural looping and all that, but I didn’t care that much and just
bulk-deleted every <head> I found. I’m assuming that having zero head
tags will cause Plucker fewer problems than having several. We shall
see. I’m hotsyncing that experiment now.
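
For the record, the "keep the first, drop the rest" cleanup doesn't strictly need look-behinds: a substitution callback that remembers what it has already seen should do the job. Below is a rough sketch in Python, not the Perl script described above (which isn't shown here). The function names are made up for illustration, and it assumes "head tags" means the literal <head ...> and </head> markers, with whatever sits between them left in place.

```python
import re

# Matches an opening or closing head tag: <head>, <HEAD profile="...">, </head>.
# The \b keeps it from matching <header>.
HEAD_TAG = re.compile(r"</?head\b[^>]*>", re.IGNORECASE)


def strip_all_head_tags(html: str) -> str:
    """Bulk approach: delete every head tag, keeping the content between them."""
    return HEAD_TAG.sub("", html)


def keep_first_head_tags(html: str) -> str:
    """Keep the first opening and the first closing head tag; delete the rest.

    Assumption: the duplicates are plain repeated tags, so dropping the
    later markers (but not their contents) is what we want.
    """
    seen = set()

    def replace(match: re.Match) -> str:
        tag = match.group(0)
        kind = "close" if tag.startswith("</") else "open"
        if kind in seen:
            return ""  # a repeat: drop the tag itself
        seen.add(kind)
        return tag  # first of its kind: keep it untouched

    return HEAD_TAG.sub(replace, html)
```

Calling keep_first_head_tags(page_text) on a scooped page would leave one <head>...</head> pair in place, while strip_all_head_tags(page_text) mirrors the bulk delete. Whether Plucker is any happier with one pair than with none is still an open question.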

Published by

dave

Dave Slusher is a blogger, podcaster, computer programmer, author, science fiction fan and father. Member of the Podcast Hall of Fame class of 2022.