Circular reasoning and your model

Here’s a quote I enjoyed [1] from an interview with the inimitable John D. Cook:

It’s easy to get caught in circular reasoning. For example, how do you decide what data points are outliers? They are points that have low probability under your model. So you throw them out. Then, lo and behold, everything that’s left fits your model!

So how do you break out of the circle? You can start by visualizing your data. And after you select a model, validate it. If you’re fitting a model in order to make predictions, and your model indeed does make good predictions on new data, you can have some confidence that you’re not just playing mental games and that your model may be an approximation of reality. (emphasis added)

Full interview here.
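
To make the circle concrete, here is a minimal Python sketch of the trap Cook describes. Everything in it is an assumption of mine for illustration, not something from the interview: the heavy-tailed mixture that generates the data, the 2-sigma outlier rule, and the helper names `fit_normal`, `tail_fraction`, and `heavy_tailed_sample`. It fits a normal model, throws out the points the model dislikes, and then compares how the trimmed model looks on the data it was cleaned with versus on fresh data.

```python
import random
import statistics


def fit_normal(xs):
    """Fit a normal model by estimating its mean and standard deviation."""
    return statistics.fmean(xs), statistics.stdev(xs)


def tail_fraction(xs, mu, sigma, k=2.0):
    """Fraction of points lying more than k standard deviations from mu."""
    return sum(abs(x - mu) > k * sigma for x in xs) / len(xs)


def heavy_tailed_sample(n, rng):
    """Hypothetical data source: mostly N(0, 1), with 20% drawn from N(0, 5)."""
    return [rng.gauss(0.0, 5.0 if rng.random() < 0.2 else 1.0) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
train = heavy_tailed_sample(2000, rng)
fresh = heavy_tailed_sample(2000, rng)  # new data, never used for fitting

# Step 1: fit the model, then discard whatever the model calls an outlier.
mu, sigma = fit_normal(train)
kept = [x for x in train if abs(x - mu) <= 2 * sigma]

# Step 2 (the circle): check the surviving points against the same rule.
# This is zero by construction, so it tells us nothing about the model.
circular = tail_fraction(kept, mu, sigma)

# Step 3 (breaking the circle): refit on the cleaned data, then validate
# on fresh data. A normal model expects roughly 5% of points beyond 2 sigma.
mu_kept, sigma_kept = fit_normal(kept)
honest = tail_fraction(fresh, mu_kept, sigma_kept)

print(f"outliers left after trimming:    {circular:.1%}")
print(f"outliers in fresh data:          {honest:.1%}")
print(f"sigma before / after trimming:   {sigma:.2f} / {sigma_kept:.2f}")
```

The check in step 2 can never fail, which is exactly the problem. Only the comparison against fresh data in step 3 has a chance of telling you that the normal model was a poor description of this source to begin with.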

  1. I found this quote buried in an old draft post when I imported everything into the new site. There are over 200 of these drafts – what else will I find in there??