talideon.com

...randomly generated messages

Assigning Categories with Bayesian Filtering

September 20, 2004 at 12:26PM

Here’s a mad idea I had a few months back: if we can use Bayesian filtering on spam, why not use it to automagically categorise entries? It’s not going to be foolproof, but it would be an interesting experiment, and it may just work! We wouldn’t need a ferociously sophisticated implementation; a simple naïve Bayes classifier would do. Once it’s been trained on a decent number of already-categorised entries, I’d expect it to work out what an entry’s about at least 95% of the time. Anybody willing to try the idea out? I might if I’ve the time, though there’s no guarantee that I will.
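To give a feel for how little code the naïve version needs, here’s a rough sketch in Python. Everything in it is made up for illustration: the names (tokenize, NaiveBayesCategoriser, train, categorise), the crude tokeniser, and the choice of add-one smoothing are all my assumptions, and it hasn’t been tried on real entries, so treat it as a starting point rather than a finished implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesCategoriser:
    """Hypothetical multinomial naive Bayes categoriser for blog entries."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.entry_counts = Counter()            # category -> entries seen
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, category):
        """Record the words of one entry that is already categorised."""
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.entry_counts[category] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Return log P(category) + sum of log P(word | category) per category."""
        words = tokenize(text)
        total_entries = sum(self.entry_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for category, n_entries in self.entry_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_entries / total_entries)
            for word in words:
                # Add-one smoothing (an assumption) so a word never seen in a
                # category lowers its score instead of ruling it out entirely.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[category] = score
        return result

    def categorise(self, text):
        """Pick the highest-scoring category; needs at least one trained entry."""
        scores = self.scores(text)
        return max(scores, key=scores.get)
```

You’d call train() once for each entry you’ve already filed by hand, then categorise() on a new one. Working in log-probabilities keeps long entries from underflowing to zero, which is the usual reason for doing it that way.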

Update: It’s after occurring to me that you could also use this to generate a list of closely-related entries. It might be overkill, but in places where you’ve got a large number of entries, where cycles are relatively inexpensive, and other methods might not be effective, it might be a useful way of relating them.
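One way the related-entries idea might be sketched, and this is only my guess at a workable approach: treat every other entry as its own one-entry "category" and rank them by how well each one’s word counts explain the words of the entry you’re looking at. The function name related_entries, the dictionary of entry ids to text, and the add-one smoothing are all assumptions for the sake of the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def related_entries(entries, target_id, top_n=3):
    """Rank the other entries by how well their words explain the target entry.

    `entries` is a hypothetical mapping of entry id -> entry text.
    """
    bags = {entry_id: Counter(tokenize(text)) for entry_id, text in entries.items()}
    vocab_size = len(set().union(*bags.values()))
    target_words = tokenize(entries[target_id])
    ranked = []
    for entry_id, counts in bags.items():
        if entry_id == target_id:
            continue
        total_words = sum(counts.values())
        # Smoothed log-likelihood of the target's words given this entry's
        # word counts; every entry gets the same prior, so it is left out.
        score = sum(
            math.log((counts[word] + 1) / (total_words + vocab_size))
            for word in target_words
        )
        ranked.append((score, entry_id))
    ranked.sort(reverse=True)
    return [entry_id for _, entry_id in ranked[:top_n]]
```

This compares the target against every other entry each time it’s called, which is where the cheap cycles come in. It also leans slightly towards shorter entries, since a long entry spreads its probability over more words, so a real version would probably want some kind of length normalisation.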

