Hi,

I've just uploaded a new version of AI::Categorize to CPAN. You'll
find the new version (0.06) at http://www.cpan.org/modules/by-module/AI/
as soon as CPAN propagates the upload.

Changes in version 0.06:

- Fixed a bug which resulted in incorrect probabilities in
NaiveBayes categorize() calculations.

- The threshold for the Naive Bayes categorizer is now a settable
parameter, letting you trade precision against recall to suit your
needs. The default threshold is 0.3 (it used to be fixed at 0.5).

- Added the precision() and recall() methods, which give two more
measures of how good a categorizer is.

- Wrote documentation for the VectorBased superclass - what was there
before was vestigial docs left over from the kNN module (oops).

- No changes made to the kNN categorizer - however, the precision
and recall scores below show that clearly some changes are needed.
The main problem is the setting of thresholds, and I've done some
work in this area that's already improved scores, but it's not
ready yet.

- Current scores on the drmath-1.00 corpus with features_kept => 0.1:
******************************* Summary ********************************
* Name                            miR    miP    miF1   error      time *
* 01-NaiveBayes (threshold=0.3):  0.226  0.280  0.239  0.018    79 sec *
* 01-NaiveBayes (threshold=0.5):  0.161  0.213  0.176  0.017    93 sec *
* 02-kNN:                         0.650  0.109  0.178  0.105  2069 sec *
************************************************************************
* miR = micro-avg. recall           miP = micro-avg. precision         *
* miF1 = micro-avg. F1              error = micro-avg. error rate      *
************************************************************************
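
For readers who want to see how a threshold interacts with the
micro-averaged measures in the table, here is a minimal sketch in
Python. It is not the module's Perl code and does not use its API. The
function name micro_scores, the toy documents and the category names
are all hypothetical, and the formulas are the usual textbook
definitions, which may differ in detail from what AI::Categorize
computes. The sketch assumes a category is assigned whenever its score
is at or above the threshold, and that the error rate counts wrong
decisions over all (document, category) pairs.

```python
def micro_scores(scores, gold, categories, threshold):
    """Micro-averaged recall, precision, F1 and error rate at one threshold.

    scores: {doc_id: {category: score}}   (hypothetical categorizer output)
    gold:   {doc_id: set of correct categories}
    """
    tp = fp = fn = tn = 0
    for doc_id, doc_scores in scores.items():
        for cat in categories:
            # Assumption: assign the category if its score reaches the threshold.
            assigned = doc_scores.get(cat, 0.0) >= threshold
            correct = cat in gold[doc_id]
            if assigned and correct:
                tp += 1
            elif assigned:
                fp += 1
            elif correct:
                fn += 1
            else:
                tn += 1

    # Micro-averaging pools the counts over all decisions before dividing.
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = tp + fp + fn + tn
    error = (fp + fn) / total if total else 0.0
    return {"miR": recall, "miP": precision, "miF1": f1, "error": error}


# Toy data, invented for illustration only (not taken from the drmath corpus).
categories = ["algebra", "geometry", "calculus"]
scores = {
    "d1": {"algebra": 0.90, "geometry": 0.40, "calculus": 0.10},
    "d2": {"algebra": 0.35, "geometry": 0.60, "calculus": 0.20},
    "d3": {"algebra": 0.20, "geometry": 0.10, "calculus": 0.45},
}
gold = {"d1": {"algebra", "geometry"}, "d2": {"geometry"}, "d3": {"calculus"}}

for threshold in (0.5, 0.3):
    result = micro_scores(scores, gold, categories, threshold)
    print(threshold, {name: round(value, 3) for name, value in result.items()})
```

On this toy data, dropping the threshold from 0.5 to 0.3 raises recall
and lowers precision, which is the trade-off the settable threshold is
meant to expose. The numbers it prints say nothing about the drmath
results above.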

group: ai @
categories: perl
posted: Nov 7, '01 at 5:47p
website: perl.org

Ken Williams: 1 post