I used the Categorizer::SVM library for large data classification (great
tool); however, I'm having trouble analyzing the results from the SVM:
- Categories have scores of either 0 or 1, with 1 meaning that the document
belongs to the category, and 0 otherwise. Are there any scores representing
the probability or confidence level of belonging to a given category, other
than these 0/1 values?
- Suppose a document could belong to 3 possible categories: cat1, cat2,
and cat3. The best_category method simply picks the first category as the
classification decision. If you call $hypothesis->categories, the
categories returned don't seem to be ordered by probability or confidence
level. They seem to be in a fixed order, and whichever is listed first is
favored.
I hope someone can clear up my confusion about the scores of categories in the
Thank you very much in advance,