LinearSVC MultiLinearSVC classification

Richard Townsend edited this page Apr 11, 2017 · 2 revisions

GoLearn includes liblinear, which it uses for logistic regression and linear support vector classification. It is best suited to datasets made up entirely of numeric attributes, usually a large number of them, as often occurs in natural language processing.

LinearSVC

  • The LinearSVC classifier outputs a binary class value (either 0 or 1).
  • Only FloatAttributes are used as input.
  • Only one class Attribute is supported.
  • Training requires conversion of the dataset, which may cause memory pressure (see issue #94).
  • Prediction requires conversion of only the current row.
  • Penalty and loss parameters can be either "l1" or "l2". Not all combinations are supported.
  • The dual parameter decides whether liblinear optimises the primal or the dual form. Some choices are incompatible with particular combinations of "l1" and "l2".
  • C is roughly the cost of a training error: larger values penalise misclassified training instances more heavily.
  • eps decides when to stop iterating. Smaller values typically take longer.
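The compatibility rules between penalty, loss and dual can be sketched as a lookup. This is a minimal illustration based on the solver types that liblinear itself ships for support vector classification; `solverName` is a hypothetical helper, not part of GoLearn, and GoLearn's own constructor may report unsupported combinations differently.

```go
package main

import "fmt"

// solverName is a hypothetical helper (not a GoLearn function). It maps a
// penalty/loss/dual combination to the liblinear SVC solver that implements
// it, and returns an error for combinations liblinear has no solver for.
func solverName(penalty, loss string, dual bool) (string, error) {
	switch {
	case penalty == "l2" && loss == "l2" && dual:
		return "L2R_L2LOSS_SVC_DUAL", nil
	case penalty == "l2" && loss == "l2" && !dual:
		return "L2R_L2LOSS_SVC", nil
	case penalty == "l2" && loss == "l1" && dual:
		return "L2R_L1LOSS_SVC_DUAL", nil
	case penalty == "l1" && loss == "l2" && !dual:
		return "L1R_L2LOSS_SVC", nil
	}
	return "", fmt.Errorf("unsupported combination: penalty=%q loss=%q dual=%t",
		penalty, loss, dual)
}

func main() {
	// Enumerate every combination and show which ones have a solver.
	for _, penalty := range []string{"l1", "l2"} {
		for _, loss := range []string{"l1", "l2"} {
			for _, dual := range []bool{true, false} {
				name, err := solverName(penalty, loss, dual)
				if err != nil {
					fmt.Println(err)
					continue
				}
				fmt.Printf("penalty=%s loss=%s dual=%t -> %s\n",
					penalty, loss, dual, name)
			}
		}
	}
}
```

Under these assumptions, four of the eight combinations are usable: an "l1" penalty only works with "l2" loss in the primal form, and "l1" loss only works with an "l2" penalty in the dual form.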

MultiLinearSVC

The MultiLinearSVC can output a categorical class value, using the OneVsAllModel meta-classifier to produce any CategoricalAttribute value. It works by training n binary LinearSVC classifiers, one for each class, and classifying an instance as a given class when the underlying LinearSVC classifier for that class reports 1. Its parameters and behaviour are otherwise the same as those of the LinearSVC.
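The one-vs-all scheme described above can be sketched in a few lines of plain Go. This is an illustration of the idea only, not GoLearn's OneVsAllModel: every name here (`binaryClassifier`, `fitOneVsAll`, `fitCentroid`) is hypothetical, and a toy nearest-centroid learner stands in for the binary LinearSVC so that the example runs without liblinear.

```go
package main

import "fmt"

// binaryClassifier stands in for a trained binary LinearSVC: it returns 1
// when a row belongs to its class and 0 otherwise.
type binaryClassifier func(row []float64) int

// oneVsAll pairs each class label with its own binary classifier.
type oneVsAll struct {
	labels      []string
	classifiers []binaryClassifier
}

// binarize relabels the targets for one class: 1 for rows of that class,
// 0 for every other row.
func binarize(targets []string, positive string) []int {
	y := make([]int, len(targets))
	for i, t := range targets {
		if t == positive {
			y[i] = 1
		}
	}
	return y
}

// fitOneVsAll trains one binary classifier per distinct class label, in
// order of first appearance, using fit as the underlying binary learner.
func fitOneVsAll(rows [][]float64, targets []string,
	fit func(rows [][]float64, y []int) binaryClassifier) oneVsAll {
	var m oneVsAll
	seen := map[string]bool{}
	for _, t := range targets {
		if seen[t] {
			continue
		}
		seen[t] = true
		m.labels = append(m.labels, t)
		m.classifiers = append(m.classifiers, fit(rows, binarize(targets, t)))
	}
	return m
}

// predict returns the label of the first classifier that reports 1. The
// boolean is false when no classifier claims the row.
func (m oneVsAll) predict(row []float64) (string, bool) {
	for i, c := range m.classifiers {
		if c(row) == 1 {
			return m.labels[i], true
		}
	}
	return "", false
}

// fitCentroid is a toy binary learner (NOT a linear SVM): it reports 1 when
// a row is closer to the mean of the positive rows than to the mean of the
// negative rows. It assumes both groups are non-empty.
func fitCentroid(rows [][]float64, y []int) binaryClassifier {
	dim := len(rows[0])
	pos, neg := make([]float64, dim), make([]float64, dim)
	var nPos, nNeg float64
	for i, r := range rows {
		sum, n := neg, &nNeg
		if y[i] == 1 {
			sum, n = pos, &nPos
		}
		*n++
		for j, v := range r {
			sum[j] += v
		}
	}
	for j := range pos {
		pos[j] /= nPos
		neg[j] /= nNeg
	}
	return func(row []float64) int {
		var dPos, dNeg float64
		for j, v := range row {
			dPos += (v - pos[j]) * (v - pos[j])
			dNeg += (v - neg[j]) * (v - neg[j])
		}
		if dPos < dNeg {
			return 1
		}
		return 0
	}
}

func main() {
	rows := [][]float64{{1, 0}, {0, 1}, {-1, -1}}
	targets := []string{"a", "b", "c"}
	model := fitOneVsAll(rows, targets, fitCentroid)
	label, ok := model.predict([]float64{0.1, 0.9})
	fmt.Println(label, ok)
}
```

Returning the first classifier that reports 1 is one simple way to resolve the case where several binary classifiers claim the same row; whether GoLearn breaks such ties this way is not something this sketch asserts.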

Limitations

  • Currently, per-class weights are unsupported.
  • Only FloatAttributes are currently supported.