Simplified PAC-Bayesian Margin Bounds
Abstract
The theoretical understanding of support vector machines is largely based on margin bounds for linear classifiers with unit-norm weight vectors and unit-norm feature vectors. Unit-norm margin bounds have been proved previously using fat-shattering arguments and Rademacher complexity. Recently Langford and Shawe-Taylor proved a dimension-independent unit-norm margin bound using a relatively simple PAC-Bayesian argument. Unfortunately, the Langford-Shawe-Taylor bound is stated in a variational form making direct comparison to fat-shattering bounds difficult. This paper provides an explicit solution to the variational problem implicit in the Langford-Shawe-Taylor bound and shows that the PAC-Bayesian margin bounds are significantly tighter. Because a PAC-Bayesian bound is derived from a particular prior distribution over hypotheses, a PAC-Bayesian margin bound also seems to provide insight into the nature of the learning bias underlying the bound.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Bartlett, P.: Personal communication (2003)
Bartlett, P., Shawe-Taylor, J.: Generalization performance of support vector machines and other pattern classifiers. In: Schölkopf, B., Burges, C.J.C., Smola, A.J. (eds.) Advances in Kernel Methods - Support Vector Learning. MIT Press, Cambridge (1998)
Bartlett, P.L.: The sample complexity of pattern classification with neural networks: the size of the weights is more important than the the size of the network. IEEE Transactions on Information Theory (March 1998)
Bartlett, P.L., Mendelson, S.: Rademacher and gaussian complexities: risk bounds and structural results. Journal of Machine Learning Research 3, 463–482 (2002)
Catoni, O.: Gibbs estimators. To appear in Probability Theory and Related fields
Herbrich, R., Graepel, T.: A PAC-Bayesian margin bound for linear clasifiers: Why svms work. In: Advances in Neural Information Processing Systems, NIPS (2001)
Hoeffding, W.: Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58, 13–30 (1963)
Koltchinskii, V., Panchenko, D.: Empirical margin distributions and bounding the generalization error of combined classifiers. Annals of Statistics 30 (2002)
Langford, J., Seeger, M., Megiddo, N.: An improved predictive accuracy bound for averaging classifiers. In: ICML 2001 (2001)
Langford, J., Seger, M.: Bounds for averaging classifiers. CMU Technical Report CMU-CS-01-102 (2002)
Langford, J., Shawe-Taylor, J.: PAC-Bayes and margins. In: Neural Information Processing Systems (NIPS) (2002)
McAllester, D.: PAC-Bayesian stochastic model selection. Machine Learning 5, 5–21 (2003); A short version appeared as PAC-Bayesian Model Averaging. In: COLT 1999 (1999)
McAllester, D., Ortiz, L.: Concentration inequalities for the missing mass and for histogram rule error. In: Neural Information Processing systems (NIPS) (2002)
Ruben, H.: A new asymptotic expansion for the normal probability integral and mill’s ratio. Journal of the Royal Statistical Society. Series B (Methodological) 24(1), 177–179 (1962)
Schapire, R., Freund, Y., Bartlett, P., Lee, W.S.: Boosting the margin: A new explanation for the effectiveness of voting methods. In: Machine Learning: Proceedings of the 14th International Conference (1997)
Seeger, M.: PAC-Bayesian generalization bounds for gaussian processes. Journal of Machine Learning Research 3, 233–269 (2002)
Shawe-Taylor, J., Bartlett, P., Williamson, R., Anthony, M.: A Framework for structural risk minimization. IEEE Transactions on Information Theory 44(5), 1926–1940 (1998)
Author information
Authors and Affiliations
Toyota Technological Institute at Chicago,
David McAllester
Authors
- David McAllester
You can also search for this author in PubMed Google Scholar
Editor information
Editors and Affiliations
MPI for Biological Cybernetics, Spemannstr. 38, 72076, Tübingen, Germany
Bernhard Schölkopf
University of California, Santa Cruz
Manfred K. Warmuth
Rights and permissions
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
McAllester, D. (2003). Simplified PAC-Bayesian Margin Bounds. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science(), vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_16
Download citation
DOI: https://doi.org/10.1007/978-3-540-45167-9_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40720-1
Online ISBN: 978-3-540-45167-9
eBook Packages: Springer Book Archive