: The beginnings of an auto-classifying junk filter are in place. There is now a 'filter junk' option over at littluns.ning.com/bsrch . This option uses the auto-classifier, and anything that matches is put up for consideration to be committed to the permanent list of junk at someblogs.ning.com . I ran a query designed to bring up junk with this checkbox checked and got many entries committed to the permanent junkblog list. I may eventually add a filter to make sure no splogs end up listed on junkblogs, but realistically, since both are exclude lists, I'm not sure it matters. Anonymous blog adding / pinging has been enabled here on littluns, but if you are not logged in, your URL is run against the auto-classifier first before the ping is allowed to go through. Suggestions for stopwords / short phrases to use in identifying junk would be welcome :) While the system works if left alone, it works better if users frequently go over to someblogs.ning.com and verify/unlist/suggest blogs to go on the permanent junk list :)
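To show the kind of suggestion that would help, here is a rough sketch of phrase-based junk matching. It is not the actual classifier running on littluns: the phrase list, the function names, and the threshold are all made up for illustration, and the input could be a URL, a title, or fetched page text.

```python
import re

# Hypothetical phrase list, for illustration only; the real list used by
# the classifier is not given in this post.
JUNK_PHRASES = ["buy cheap", "online casino", "free ringtones"]


def junk_score(text, phrases=JUNK_PHRASES):
    """Count how many of the junk phrases occur in the text.

    Matching is case-insensitive and collapses runs of whitespace so that
    "buy   CHEAP" still matches "buy cheap".
    """
    normalized = re.sub(r"\s+", " ", text.lower())
    return sum(1 for phrase in phrases if phrase in normalized)


def looks_like_junk(text, threshold=1):
    """Flag the text when at least `threshold` phrases match (assumed rule)."""
    return junk_score(text) >= threshold
```

A higher threshold would cut down on false positives at the cost of letting more junk through, which is why short, distinctive phrases make the most useful suggestions.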
: In the spirit of the SplogSpot (http://splogspot.com/) project, I have started http://someblogs.ning.com/, a search engine for so-called 'junk' blogs. This app is prone to moving at any time, but the basic framework is there. Next on the list is an automatic junk classifier. When untrusted pings are run through Littl'uns, they will be matched against a soon-to-arrive API from Some Blogs. If they are on the someblogs list, they will not be added to / shown on littluns. If they match the auto-classifier, they will not be added to / shown on littluns and will be added to the someblogs 'suggest' list. This will help prevent content from personal blogs, sexblogs, etc. from showing up on littluns. I suppose these blogs are not always 'junk' (I run a personal blog myself), but they certainly do not appeal to a wide audience and should not show up in our listings (my personal blog is read by my friends only, because they're the only ones who care what goes on in my life).
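The planned flow for untrusted pings could be sketched roughly as below. The Some Blogs API does not exist yet, so the `JunkRegistry` class, the `handle_ping` function, and the idea that trusted pings skip the checks entirely are all assumptions for illustration, not the real interface.

```python
from dataclasses import dataclass, field


@dataclass
class JunkRegistry:
    """Hypothetical in-memory stand-in for the someblogs lists."""
    permanent: set = field(default_factory=set)  # confirmed junk blogs
    suggested: set = field(default_factory=set)  # awaiting user review


def handle_ping(url, registry, classifier, trusted=False):
    """Return True if the ping should be added to / shown on littluns.

    `classifier` is any callable taking a URL and returning True for junk.
    """
    if trusted:
        # Assumption: pings from logged-in users are not filtered.
        return True
    if url in registry.permanent:
        # Already on the permanent junk list: reject.
        return False
    if classifier(url):
        # Matches the auto-classifier: reject and queue for review.
        registry.suggested.add(url)
        return False
    return True
```

Rejected-by-classifier URLs only land on the 'suggest' list, so a person still has to verify them before they reach the permanent list.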
Junk Filter