Validating a model using statistics
As noted above, quantitative variables measure how much or how many; qualitative variables represent types or categories. For instance, suppose it is of interest to predict sales of an iced tea that is available in either bottles or cans.
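One common way to bring a qualitative variable such as container type into a regression is to code it as a 0/1 indicator (dummy) variable. The following is a minimal sketch of that idea, not a prescribed method: the helper name `encode_container`, the choice of bottle as the baseline category, and the price values are all assumptions made up for illustration.

```python
def encode_container(container):
    """Map a container type to a 0/1 indicator; 'bottle' is the assumed baseline."""
    codes = {"bottle": 0, "can": 1}
    return codes[container]


# Hypothetical observations: (price in dollars, container type).
observations = [(1.50, "bottle"), (1.25, "can"), (1.75, "bottle")]

# Each design row holds: intercept term, quantitative price, indicator for "can".
design_rows = [(1.0, price, encode_container(kind)) for price, kind in observations]

for row in design_rows:
    print(row)
```

With this coding, the coefficient on the indicator would be read as the expected difference in sales between cans and bottles at the same price.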
Cross-validation (CV) is a method for estimating the accuracy of a model's predictions on unobserved cases; when you optimize your model using CV, you're selecting a final model based on its ability to make predictions.
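The selection step can be sketched with k-fold cross-validation written from scratch. This is an illustrative sketch under stated assumptions: the two candidate models (a constant mean and a straight line), the synthetic data, the interleaved fold assignment, and the function names `fit_mean`, `fit_line`, and `k_fold_mse` are hypothetical choices, not taken from the text.

```python
def fit_mean(xs, ys):
    """Candidate 1: ignore x and always predict the training mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate 2: simple least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def k_fold_mse(fit, xs, ys, k=5):
    """Average squared prediction error over k held-out folds."""
    n = len(xs)
    total_sq_error = 0.0
    for fold in range(k):
        # Every k-th observation, starting at `fold`, is held out.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        predict = fit(train_x, train_y)
        total_sq_error += sum((ys[i] - predict(xs[i])) ** 2 for i in test_idx)
    return total_sq_error / n


# Synthetic data (assumed): a linear trend with a small alternating disturbance.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

cv_scores = {
    "mean": k_fold_mse(fit_mean, xs, ys),
    "line": k_fold_mse(fit_line, xs, ys),
}
best_name = min(cv_scores, key=cv_scores.get)
print(cv_scores, "selected:", best_name)
```

Each model is scored only on observations it was not fitted to, so the comparison reflects predictive ability rather than fit to the training data.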
If the residual analysis does not indicate that the model assumptions are satisfied, it often suggests ways in which the model can be modified to obtain better results.
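As a small illustration of how residuals can point toward a modification, the sketch below fits a straight line to data that actually follow a curve, inspects the sign pattern of the residuals, and then refits with a squared predictor. The data, the choice of a squared term, and the helper names `fit_line` and `residuals` are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(xs, ys, intercept, slope):
    """Observed minus fitted values."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Synthetic data (assumed): y depends on x squared, not on x itself.
xs = [float(i) for i in range(11)]
ys = [3.0 + 0.5 * x ** 2 for x in xs]

# Straight-line fit in x: residuals are positive at both ends and negative
# in the middle, a U-shaped pattern that signals missing curvature.
b0, b1 = fit_line(xs, ys)
linear_resid = residuals(xs, ys, b0, b1)

# Modification suggested by that pattern: use x squared as the predictor.
zs = [x ** 2 for x in xs]
c0, c1 = fit_line(zs, ys)
squared_resid = residuals(zs, ys, c0, c1)

print("linear fit residuals: ", [round(r, 2) for r in linear_resid])
print("squared fit residuals:", [round(r, 2) for r in squared_resid])
```

In practice the residuals would usually be plotted against the fitted values or the predictor; the printed lists here stand in for that plot.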
In regression analysis, model building is the process of developing a probabilistic model that best describes the relationship between the dependent and independent variables.
The statistician will consider things like the following:
From the machine learning perspective, classical statisticians worry too much about a lot of unimportant details; the only important thing is to get the predictions right. (Compare Breiman.)
From the classical statistics perspective, machine learning methods are opportunistic and unreliable.