SVMstructSupport Vector Machine for Complex OutputsAuthor: Thorsten Joachims <thorsten@joachims.org>Cornell University Department of Computer Science Version: 3.10 |
SVMstruct is a Support Vector Machine (SVM) algorithm for predicting multivariate or structured outputs. It performs supervised learning by approximating a mapping
using labeled training examples (x1,y1), ..., (xn,yn). Unlike regular SVMs, however, which consider only univariate predictions like in classification and regression, SVMstruct can predict complex objects y like trees, sequences, or sets. Examples of problems with complex outputs are natural language parsing, sequence alignment in protein homology detection, and markov models for part-of-speech tagging. The SVMstruct algorithm can also be used for linear-time training of binary and multi-class SVMs under the linear kernel [4].
The 1-slack cutting-plane algorithm implemented in SVMstruct V3.10 uses a new but equivalent formulation of the structural SVM quadratic program and is several orders of magnitude faster than prior methods. The algorithm is described in [5]. The n-slack algorithm of SVMstruct V2.50 is described in [1][2]. The SVMstruct implementation is based on the SVMlight quadratic optimizer [3].
SVMstruct can be thought of as an API for implementing different kinds of complex prediction algorithms. Currently, we have implemented the following learning tasks:
https://osmot.cs.cornell.edu/svm_struct/current/svm_struct.tar.gz
If you are not so eager on C programming, then you might want to look at the Python API by Thomas Finley or the Matlab API by Andrea Vedaldi. They make it substantially easier to prototype in than using the original C API, but offer essentially the same functionality and call the original C-code internally. Also, both the Pyton and the Matlab APIs are identical in their structure to the C API described below, so it is easy to switch between them.
If you decide to use the C version, the file you downloaded above contains the source code of the most recent version of SVMstruct as well as the source code of the SVMlight quadratic optimizer. Unpack the archive using the shell command:
gunzip –c svm_struct.tar.gz | tar xvf –This expands the archive into the current directory, which now contains all relevant files. You can compile SVMstruct with the empty API using the command
makein the root directory of the archive. It will output some warnings, since the functions of the API are only templates and do not return values as required. However, it should produce the executables svm_empty_learn svm_empty_classify. "empty" is a placeholder where you can substitute a meaningful name for your particular instance of SVMstruct. To implement your own instantiation, you will need to edit the following files:
Finally, you can also implement your own structural SVM training algorithm in SVMstruct using the file svm_struct_learn_custom.c. By putting your algorithm into the empty function there, you can access the API and all the instance-specific functions that the algorithms already implemented in SVMstruct are using. Your custom algorithm is then selected via the -w 9 option. This makes it easy to test new algorithms and compare them against the existing algorithms.
svm_empty_learn -c 1.0 train.dat model.datwhich trains an SVM on the training set train.dat and outputs the learned rule to model.dat using the regularization parameter C set to 1.0 (note that this crashes for the empty API -- use one of the other instantiations from above for a working example). The format of the train file and the model file depend on the particular instantiation of SVMstruct. Other options are:
General Options: -? -> this help -v [0..3] -> verbosity level (default 1) -y [0..3] -> verbosity level for svm_light (default 0) Learning Options: -c float -> C: trade-off between training error and margin (default 0.01) -p [1,2] -> L-norm to use for slack variables. Use 1 for L1-norm, use 2 for squared slacks. (default 1) -o [1,2] -> Rescaling method to use for loss. 1: slack rescaling 2: margin rescaling (default 2) -l [0..] -> Loss function to use. 0: zero/one loss ?: see below in application specific options (default 0) Optimization Options (see [2][5]): -w [0,..,9] -> choice of structural learning algorithm (default 3): 0: n-slack algorithm described in [1] 1: n-slack algorithm with shrinking heuristic 2: 1-slack algorithm (primal) described in [5] 3: 1-slack algorithm (dual) described in [5] 4: 1-slack algorithm (dual) with constraint cache [5] 9: custom algorithm in svm_struct_learn_custom.c -e float -> epsilon: allow that tolerance for termination criterion (default 0.100000) -k [1..] -> number of new constraints to accumulate before recomputing the QP solution (default 100) (-w 0 and 1 only) -f [5..] -> number of constraints to cache for each example (default 5) (used with -w 4) -b [1..100] -> percentage of training set for which to refresh cache when no epsilon violated constraint can be constructed from current cache (default 100%) (used with -w 4) SVM-light Options for Solving QP Subproblems (see [3]): -n [2..q] -> number of new variables entering the working set in each svm-light iteration (default n = q). Set n < q to prevent zig-zagging. -m [5..] -> size of svm-light cache for kernel evaluations in MB (default 40) (used only for -w 1 with kernels) -h [5..] -> number of svm-light iterations a variable needs to be optimal before considered for shrinking (default 100) -# int -> terminate svm-light QP subproblem optimization, if no progress after this number of iterations. (default 100000) Kernel Options: -t int -> type of kernel function: 0: linear (default) 1: polynomial (s a*b+c)^d 2: radial basis function exp(-gamma ||a-b||^2) 3: sigmoid tanh(s a*b + c) 4: user defined kernel from kernel.h -d int -> parameter d in polynomial kernel -g float -> parameter gamma in rbf kernel -s float -> parameter s in sigmoid/poly kernel -r float -> parameter c in sigmoid/poly kernel -u string -> parameter of user defined kernel Output Options: -a string -> write all alphas to this file after learning (in the same order as in the training set) Application-Specific Options: --* string -> custom parameters that can be adapted for struct learning. The * can be replaced by any character and there can be multiple options starting with --.For more details on the meaning of these options consult references [1][3][5] and the description of SVMlight. The options starting with -- are those specific to the instantiation and are specified via the API.
This software is free only for non-commercial use. It must not be distributed without prior permission of the author. The author is not responsible for implications from the use of this software.
[1] I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces, ICML, 2004. [Postscript (gz)] [PDF] [BibTeX]
[2] T. Joachims. Learning to Align Sequences: A Maximum Margin Approach, Technical Report, August, 2003. [Postscript (gz)] [PDF] [BibTeX]
[3] T. Joachims, Making Large-Scale SVM Learning Practical. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf and C. Burges and A. Smola (ed.), MIT Press, 1999. [Postscript (gz)] [PDF] [BibTeX]
[4] T. Joachims, Training Linear SVMs in Linear Time, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), 2006. [Postscript (gz)] [PDF] [BibTeX]
[5] T. Joachims, T. Finley, Chun-Nam Yu, Cutting-Plane Training of Structural SVMs, Machine Learning Journal, 77(1):27-59, 2009. [PDF] [BibTeX]