Learning by Refuting

Abstract

The sample complexity of learning a Boolean-valued function class is precisely characterized by its Rademacher complexity. This has little bearing, however, on the sample complexity of efficient agnostic learning. We introduce refutation complexity, a natural computational analog of Rademacher complexity of a Boolean concept class and show that it exactly characterizes the sample complexity of efficient agnostic learning. Informally, refutation complexity of a class C is the minimum number of example-label pairs required to efficiently distinguish between the case that the labels correlate with the evaluation of some member of C (structure) and the case where the labels are i.i.d. Rademacher random variables (noise). The easy direction of this relationship was implicitly used in the recent framework for improper PAC learning lower bounds of Daniely and co-authors [Dan16, DLS14, DSS16] via connections to the hardness of refuting random constraint satisfaction problems. Our work can be seen as making the relationship between agnostic learning and refutation implicit in their work into an explicit equivalence. In a recent, independent work, Salil Vadhan [Vad17] discovered a similar relationship between refutation and PAC-learning in the realizable (i.e. noiseless) case.

Topics

    0 Figures and Tables

      Download Full PDF Version (Non-Commercial Use)