No Title | Evil Genius Chronicles

February 01 2003 | 2 min read

Last week, 30 minutes before I got The Phone Call, I had written an entry saying that life seemed to be getting back to normal after a bunch of uproar. I guess that showed me, huh? I did distract myself from the stuff of recent days by finishing up my CRM114 news system. I need to write a README doc explaining the idiosyncrasies of configuring this deal, and then I'll tar it up and make it available for download.

I've been using it for a few days now and it seems to be working pretty well. I wrote a training script in Perl that takes two mbox files, one of "interesting" posts and one of "uninteresting" posts, and goes through the training cycle until everything in each file gets classified correctly. If an article gets classified the wrong way, it gets trained again, and the script keeps doing that until there are zero misclassified articles. I have about 45 articles in my interesting set and 100 in my uninteresting set, and that seems to be enough to do a really damn good job of it. I was a little worried that if I had two articles with mostly the same stuff (like for sale posts from the same person with the same footer and most of the same text), one in each of the training sets, it would oscillate between the two and never arrive at a final training. Thus far, that hasn't happened.

I'm now reclassifying around one article in 50, which unscientifically gives me about 98% accuracy without much use or training. Later on I might add something to have it mail me the articles that exceed some threshold of "interestingness". So far so good on the AI learning, and it is really nice to be able to use Mutt as my newsreader for this class of stuff (not general reading, but things I'm interested in purchasing).
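
For anyone curious what the train-until-zero-errors cycle described above looks like, here is a rough sketch. It is in Python rather than the Perl of the actual script, and it is not that script. `ToyClassifier` is a made-up word-count stand-in for CRM114, the articles are plain strings rather than messages read from mbox files, and the `max_passes` cap is my own assumption, added so the oscillation case gives up instead of looping forever.

```python
from collections import Counter


class ToyClassifier:
    """Hypothetical stand-in for CRM114: a crude word-count scorer."""

    def __init__(self):
        self.counts = {"interesting": Counter(), "uninteresting": Counter()}

    def classify(self, text):
        words = text.lower().split()
        scores = {
            label: sum(seen[w] for w in words)
            for label, seen in self.counts.items()
        }
        # Ties (including the untrained state) fall to "uninteresting".
        if scores["interesting"] > scores["uninteresting"]:
            return "interesting"
        return "uninteresting"

    def train(self, text, label):
        self.counts[label].update(text.lower().split())


def train_until_clean(clf, interesting, uninteresting, max_passes=50):
    """Train only on mistakes; stop after a full pass with zero errors.

    Returns the number of passes used, or None if max_passes ran out,
    which is what an endless oscillation would look like.
    """
    labelled = [(t, "interesting") for t in interesting]
    labelled += [(t, "uninteresting") for t in uninteresting]
    for pass_number in range(1, max_passes + 1):
        errors = 0
        for text, label in labelled:
            if clf.classify(text) != label:
                clf.train(text, label)  # retrain only the misclassified article
                errors += 1
        if errors == 0:
            return pass_number
    return None


if __name__ == "__main__":
    good = ["for sale vintage synth keyboard", "selling used guitar amp"]
    bad = ["wanted ride to airport", "lost cat reward offered"]
    print(train_until_clean(ToyClassifier(), good, bad))
```

With a real setup, the two lists would presumably be filled from the mbox files (Python's standard `mailbox.mbox` class can iterate over one), and `classify` and `train` would hand the article to CRM114 instead of counting words. Feeding the exact same text into both sets makes this toy version hit the cap and return None, which is the oscillation worry in miniature.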