SVMmulticlass
Multi-Class Support Vector Machine
Author: Thorsten Joachims <thorsten@joachims.org>
Cornell University
Department of Computer Science
Version: 1.01
SVMmulticlass is an implementation of the multi-class Support Vector Machine (SVM) described in [1]. While the optimization problem is the same as in [1], this implementation uses a different algorithm which is described in [2].
This implementation is an instance of SVMstruct. It was not optimized for speed by exploiting special properties of the multi-class optimization problem, but serves primarily as an easy tutorial example of how to use the SVMstruct programming interface. So, for most problems you are probably better off learning a one-against-the-rest binary classifier for each class using SVMlight or SVMperf. More information on SVMstruct is available here.
http://download.joachims.org/svm_multiclass/v1.01/svm_multiclass.tar.gz
Please send me email and let me know that you got it. The archive contains the source code of the most recent version of SVMmulticlass, which includes the source code of SVMstruct and the SVMlight quadratic optimizer. Unpack the archive using the shell command:
gunzip -c svm_multiclass.tar.gz | tar xvf -
Then compile the system with:
make
This will produce the executables svm_multiclass_learn and svm_multiclass_classify. If the system does not compile properly, check this FAQ.
SVMmulticlass consists of a learning module (svm_multiclass_learn) and a classification module (svm_multiclass_classify). The classification module can be used to apply the learned model to new examples. See also the examples below for how to use svm_multiclass_learn and svm_multiclass_classify.
Usage is much like SVMlight. You call it like
svm_multiclass_learn -c 1.0 train.dat model.dat
which trains an SVM on the training set train.dat and writes the learned rule to model.dat, using the regularization parameter C set to 1.0. Other options are:
General options:
     -?          -> this help
     -v [0..3]   -> verbosity level (default 1)
     -y [0..3]   -> verbosity level for svm_light (default 0)
Learning options:
     -c float    -> C: trade-off between training error
                    and margin (default 0.01)
     -p [1,2]    -> L-norm to use for slack variables. Use 1 for L1-norm,
                    use 2 for squared slacks. (default 1)
     -o [1,2]    -> Slack rescaling method to use for loss.
                    1: slack rescaling
                    2: margin rescaling
                    (default 1)
     -l [0..]    -> Loss function to use.
                    0: zero/one loss
                    (default 0)
Kernel options:
     -t int      -> type of kernel function:
                    0: linear (default)
                    1: polynomial (s a*b+c)^d
                    2: radial basis function exp(-gamma ||a-b||^2)
                    3: sigmoid tanh(s a*b + c)
                    4: user defined kernel from kernel.h
     -d int      -> parameter d in polynomial kernel
     -g float    -> parameter gamma in rbf kernel
     -s float    -> parameter s in sigmoid/poly kernel
     -r float    -> parameter c in sigmoid/poly kernel
     -u string   -> parameter of user defined kernel
Optimization options (see [1]):
     -q [2..]    -> maximum size of QP-subproblems (default 10)
     -n [2..q]   -> number of new variables entering the working set
                    in each iteration (default n = q). Set n<q to prevent
                    zig-zagging.
     -m [5..]    -> size of cache for kernel evaluations in MB (default 40)
                    The larger the faster...
     -e float    -> eps: Allow that error for termination criterion
                    (default 0.01)
     -h [5..]    -> number of iterations a variable needs to be
                    optimal before considered for shrinking (default 100)
     -k [1..]    -> number of new constraints to accumulate before
                    recomputing the QP solution (default 100)
     -# int      -> terminate optimization, if no progress after this
                    number of iterations. (default 10000)
Output options:
     -a string   -> write all alphas to this file after learning
                    (in the same order as in the training set)
Structure learning options:
     none
For more details on the meaning of these options consult the description of SVMlight and reference [2].
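The kernel formulas listed under the -t option can be illustrated with a short Python sketch. This is not the SVMlight implementation; it only evaluates the same formulas on sparse vectors stored as dictionaries from feature number to value. The function names and the default parameter values are assumptions made for the illustration, not the defaults of the program.

```python
import math


def dot(a, b):
    """Sparse dot product; a and b map feature number -> value."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(value * b[key] for key, value in a.items() if key in b)


def linear(a, b):
    # -t 0: a*b
    return dot(a, b)


def polynomial(a, b, s=1.0, c=1.0, d=3):
    # -t 1: (s a*b + c)^d; s, c, d correspond to -s, -r, -d
    # (default values here are illustrative assumptions)
    return (s * dot(a, b) + c) ** d


def rbf(a, b, gamma=1.0):
    # -t 2: exp(-gamma ||a-b||^2), using ||a-b||^2 = a*a - 2 a*b + b*b
    squared_distance = dot(a, a) - 2.0 * dot(a, b) + dot(b, b)
    return math.exp(-gamma * squared_distance)


def sigmoid(a, b, s=1.0, c=1.0):
    # -t 3: tanh(s a*b + c)
    return math.tanh(s * dot(a, b) + c)
```

Storing only the non-zero features mirrors the input file format, where features with value zero can be skipped.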
The input file example_file contains the training examples. The file format is the same as for SVMlight, except that the target value is now a positive integer that indicates the class. The first lines may contain comments and are ignored if they start with #. Each of the following lines represents one training example and is of the following format:
<target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>
The target value and each of the feature/value pairs are separated by a space character. Feature/value pairs MUST be ordered by increasing feature number. Features with value zero can be skipped. The target value denotes the class of the example via a positive integer. So, for example, the line
3 1:0.43 3:0.12 9284:0.2 # abcdef
specifies an example of class 3 for which feature number 1 has the value 0.43, feature number 3 has the value 0.12, feature number 9284 has the value 0.2, and all the other features have value 0. In addition, the string abcdef is stored with the vector, which can serve as a way of providing additional information when adding user defined kernels.
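As a minimal sketch of the line format described above, the following Python helpers write and read a single example line. The names format_example and parse_example are hypothetical and not part of SVMmulticlass; the sketch assumes features are held in a dictionary from feature number to value.

```python
def format_example(target, features, info=None):
    """Build one example line: target, then feature:value pairs, then # info."""
    if int(target) != target or target < 1:
        raise ValueError("target must be a positive integer class label")
    # Pairs must be ordered by increasing feature number; zeros can be skipped.
    pairs = [
        f"{number}:{float(value)!r}"
        for number, value in sorted(features.items())
        if value != 0
    ]
    line = " ".join([str(int(target))] + pairs)
    if info is not None:
        line += f" # {info}"
    return line


def parse_example(line):
    """Split one example line into (target, features, info)."""
    body, separator, info = line.partition("#")
    tokens = body.split()
    target = int(tokens[0])
    features = {}
    previous = None
    for token in tokens[1:]:
        number_text, value_text = token.split(":", 1)
        number = int(number_text)
        if previous is not None and number <= previous:
            raise ValueError("feature numbers must be strictly increasing")
        features[number] = float(value_text)
        previous = number
    return target, features, (info.strip() if separator else None)
```

For instance, format_example(3, {9284: 0.2, 1: 0.43, 3: 0.12}, "abcdef") reproduces the example line shown above, and parse_example reverses it.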
The result of svm_multiclass_learn is the model which is learned from the training data in example_file. The model is written to model_file. To make predictions on test examples, svm_multiclass_classify reads this file. svm_multiclass_classify is called as follows:
svm_multiclass_classify [options] example_file model_file output_file
For all test examples in example_file the predicted classes are written to output_file. There is one line per test example in output_file in the same order as in example_file.
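Because the output file has one line per test example in the same order, the predictions can be compared against the true classes in the test file. The Python sketch below does this under two assumptions that the text above does not spell out: the predicted class is the first whitespace-separated token on each output line, and every line starting with # is treated as a comment. The function names are hypothetical.

```python
def read_labels(lines):
    """Return the leading integer of every non-empty, non-comment line."""
    labels = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: skip blank lines and comment lines
        labels.append(int(line.split()[0]))
    return labels


def accuracy(example_lines, prediction_lines):
    """Fraction of test examples whose predicted class equals the target."""
    truth = read_labels(example_lines)
    predicted = read_labels(prediction_lines)
    if len(truth) != len(predicted):
        raise ValueError("example file and output file differ in length")
    if not truth:
        raise ValueError("no examples found")
    correct = sum(t == p for t, p in zip(truth, predicted))
    return correct / len(truth)
```

A call could look like accuracy(open("test.dat"), open("predictions")), since both functions accept any iterable of lines.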
You will find an example problem with 7 classes at
http://download.joachims.org/svm_multiclass/examples/example4.tar.gz
Download this file into your svm_multiclass directory and unpack it with
gunzip -c example4.tar.gz | tar xvf -
This will create a subdirectory example4. There are 300 examples in the file train.dat and 2000 in the file test.dat. To run the example, execute the commands:
svm_multiclass_learn -c 0.1 example4/train.dat example4/model
svm_multiclass_classify example4/test.dat example4/model example4/predictions
The accuracy on the test set is printed to stdout.
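To try several values of the regularization parameter C on this example, the two commands can be generated in a loop. The Python sketch below only builds and prints the command lines; it assumes the executables are on the search path and uses hypothetical output file names (model_c..., predictions_c...) that are not part of the example archive. The C values chosen are arbitrary.

```python
import shlex


def learn_command(c, train_file, model_file):
    """Argument list for training with regularization parameter C."""
    return ["svm_multiclass_learn", "-c", str(c), train_file, model_file]


def classify_command(test_file, model_file, output_file):
    """Argument list for applying a learned model to test examples."""
    return ["svm_multiclass_classify", test_file, model_file, output_file]


def sweep(c_values, directory="example4"):
    """Return learn/classify command pairs, one pair per value of C."""
    commands = []
    for c in c_values:
        model = f"{directory}/model_c{c}"  # hypothetical file name
        predictions = f"{directory}/predictions_c{c}"  # hypothetical file name
        commands.append(learn_command(c, f"{directory}/train.dat", model))
        commands.append(
            classify_command(f"{directory}/test.dat", model, predictions)
        )
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them.
    for command in sweep([0.01, 0.1, 1.0]):
        print(shlex.join(command))
```

Each argument list can be passed to subprocess.run to execute the command once the executables have been built.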
This software is free only for non-commercial use. It must not be distributed without prior permission of the author. The author is not responsible for implications from the use of this software.
[1] K. Crammer and Y. Singer. On the Algorithmic Implementation of Multi-class SVMs, JMLR, 2001.
[2] I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces, ICML, 2004.
[3] T. Joachims. Making Large-Scale SVM Learning Practical. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, and A. Smola (eds.), MIT Press, 1999.