In my entry last week, written 30 minutes before I got The Phone Call,
I said that life seemed to be getting back to normal after a bunch of
uproar. I guess that showed me, huh? I did distract myself
from the stuff of recent days by finishing up my CRM114 news system. I
need to write a README doc explaining the idiosyncrasies of
configuring this deal, and then I’ll tar it up and make it available
for download. I’ve been using it for a few days now and it seems to be
working pretty well. I wrote a training script in Perl that takes two
mbox files, one of “interesting” posts and one of “uninteresting”
posts, and repeats the training cycle until everything in each file is
classified correctly. Any article that gets classified the wrong way
is trained again, and the script keeps going until there are zero
misclassified articles. I have about 45 in my interesting set and 100
in my uninteresting set, and that seems to be enough to do a really
damn good job of it. I was a little worried that if two articles with
mostly the same text (like for-sale posts from the same person with
the same footer) ended up in opposite training sets, the training
would oscillate between the two and never settle on a final state. Thus
far, that hasn’t happened. I’m now reclassifying around one article in
50, which unscientifically works out to about 98% accuracy without much
use or training. Later on I might add in something to have it mail me
the articles that exceed some threshold of “interestingness”. So far
so good on the AI learning, and it is really nice to be able to use
Mutt as my newsreader for this class of stuff (not general reading,
but things I might want to buy).
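
For anyone curious what the train-until-clean loop looks like, here is
a rough sketch in Python. It is not the actual Perl script, and it does
not call CRM114 at all: ToyClassifier is a made-up word-weight scorer
standing in for the real classifier, and train_until_clean and
max_passes are names invented for this illustration. The cap on passes
is an assumed safeguard against the oscillation worry described above,
not something the real script is known to have.

```python
from collections import defaultdict


class ToyClassifier:
    """Hypothetical stand-in for the real classifier (NOT CRM114).

    Keeps one integer weight per word; a positive total means "interesting".
    """

    def __init__(self):
        self.weights = defaultdict(int)

    def classify(self, text):
        score = sum(self.weights[w] for w in text.lower().split())
        return "interesting" if score > 0 else "uninteresting"

    def train(self, text, label):
        # Nudge every word in the article toward the correct label.
        step = 1 if label == "interesting" else -1
        for w in text.lower().split():
            self.weights[w] += step


def train_until_clean(classifier, examples, max_passes=100):
    """Retrain only the misclassified examples until a pass has zero errors.

    Returns the number of passes used, or None if max_passes is reached
    without a clean pass (for instance when the same text appears under
    both labels and training keeps flip-flopping).
    """
    for passes in range(1, max_passes + 1):
        errors = 0
        for text, label in examples:
            if classifier.classify(text) != label:
                classifier.train(text, label)
                errors += 1
        if errors == 0:
            return passes
    return None


if __name__ == "__main__":
    # Tiny invented training set; the real one would come from two mbox files.
    examples = [
        ("vintage lens for sale", "interesting"),
        ("couch for sale", "uninteresting"),
        ("camera body and lens", "interesting"),
        ("free kittens", "uninteresting"),
    ]
    passes = train_until_clean(ToyClassifier(), examples)
    print("clean after", passes, "passes")
```

With this toy scorer, feeding the exact same text in under both labels
never produces a clean pass, so the function gives up and returns None
instead of looping forever. Whether the real classifier behaves the
same way with near-duplicates is a separate question.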

