How good is your model? Guilt-free interactive data analysis.
Reliable tools for model selection and validation are indispensable in almost all applications of machine learning and statistics. Decades of theory support a widely used set of techniques, such as holdout sets, bootstrapping, and cross-validation. Yet much of this theory breaks down in the now-common situation where the data analyst works interactively with the data, iteratively choosing which methods to use by probing the same data many times. Good examples are data science competitions in which…
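
To make the failure mode above concrete, here is a minimal simulation sketch of how repeatedly probing one holdout set can inflate its accuracy estimate. It is an illustrative toy construction, not a method taken from this abstract: the function name `simulate`, its parameters, and the sign-based feature weighting are all assumptions chosen for the example. Features and labels are independent coin flips, so no classifier can truly beat 50% accuracy on fresh data.

```python
import random


def simulate(n_holdout=200, n_features=500, n_fresh=1000, seed=0):
    """Hypothetical toy experiment: adaptively reuse a holdout set on pure noise."""
    rng = random.Random(seed)

    def sample(n):
        # Features and labels are independent fair coin flips in {-1, +1},
        # so the true accuracy of any classifier is exactly 50%.
        X = [[rng.choice((-1, 1)) for _ in range(n_features)] for _ in range(n)]
        y = [rng.choice((-1, 1)) for _ in range(n)]
        return X, y

    X_hold, y_hold = sample(n_holdout)
    X_fresh, y_fresh = sample(n_fresh)

    # Adaptive step: probe the holdout once per feature and keep the sign
    # of that feature's correlation with the holdout labels.
    weights = []
    for j in range(n_features):
        corr = sum(row[j] * label for row, label in zip(X_hold, y_hold))
        weights.append(1 if corr >= 0 else -1)

    def accuracy(X, y):
        # Classify by the sign of the weighted vote over all features.
        correct = 0
        for row, label in zip(X, y):
            score = sum(w * v for w, v in zip(weights, row))
            pred = 1 if score >= 0 else -1
            correct += pred == label
        return correct / len(y)

    # Accuracy on the reused holdout vs. accuracy on untouched data.
    return accuracy(X_hold, y_hold), accuracy(X_fresh, y_fresh)


if __name__ == "__main__":
    holdout_acc, fresh_acc = simulate()
    print(f"holdout accuracy: {holdout_acc:.3f}")
    print(f"fresh accuracy:   {fresh_acc:.3f}")
```

Under these assumptions, the accuracy measured on the reused holdout should typically come out well above 50%, while accuracy on the untouched sample stays near chance. The gap arises only because every weight was chosen by looking at the holdout labels.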