For my purposes, I’m less concerned with whether I have the best logistic regression model of coach firing, and more focused on whether my final matched subgroup reflects similar types of teams.

As for your second question, it might not be, although I’d worry that within each win-percentage bucket there would still be issues (GM changes or strength-of-schedule differences, for example) that you’d want to think about.
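One way to check whether the matched subgroup really does reflect “similar types of teams” is a covariate balance diagnostic. The sketch below is a minimal, hypothetical example (the data and covariate names are invented stand-ins, not the actual 25 predictors from the post) that computes standardized mean differences between the fired and non-fired groups:

```python
import numpy as np
import pandas as pd

def standardized_mean_diff(treated: pd.Series, control: pd.Series) -> float:
    """Standardized mean difference, a common balance check after matching."""
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
    if pooled_sd == 0:
        return 0.0
    return (treated.mean() - control.mean()) / pooled_sd

# Hypothetical matched sample: 'fired' marks the treated teams; the
# covariates are illustrative stand-ins for the real predictors.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "fired": rng.integers(0, 2, 200),
    "win_pct": rng.uniform(0.2, 0.8, 200),
    "new_gm": rng.integers(0, 2, 200),
})

for cov in ["win_pct", "new_gm"]:
    smd = standardized_mean_diff(df.loc[df.fired == 1, cov],
                                 df.loc[df.fired == 0, cov])
    # A common rule of thumb: |SMD| < 0.1 suggests acceptable balance.
    print(f"{cov}: SMD = {smd:.3f}")
```

If the post-matching SMDs are small across covariates, the matched subgroup is plausibly comparable even if the propensity model itself isn’t the best possible one.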

A naive question… The crux of your classification (i.e., the likelihood of firing) relies on a 25-variable logistic regression, right?

1. How do we know the model itself is any good at doing what it’s supposed to do (predicting coach firing)? The model seems to be fit on training data with no out-of-sample verification.

2. Is such a complicated model substantially better than a simple classification by win-percentage bucket?

Again, perhaps I am misunderstanding something. Thanks again.
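Both questions above can be checked empirically with cross-validation. The sketch below uses entirely synthetic data (the real team-season data and the actual 25 predictors are not reproduced here) to show the shape of the comparison: out-of-sample AUC for a many-variable logistic regression versus a one-variable baseline, analogous to win percentage alone.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: X would hold the ~25 predictors,
# y would indicate whether the coach was fired.
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 25))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)

# Out-of-sample check: 5-fold cross-validated AUC for the full model...
full_auc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                           cv=5, scoring="roc_auc").mean()

# ...versus a one-variable baseline (analogous to win percentage alone).
simple_auc = cross_val_score(LogisticRegression(max_iter=1000), X[:, :1], y,
                             cv=5, scoring="roc_auc").mean()

print(f"25-variable model AUC: {full_auc:.3f}")
print(f"1-variable model AUC:  {simple_auc:.3f}")
```

If the cross-validated AUCs are close, that would support the simpler bucketing approach; a clear gap would justify the richer model.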

Don’t know about college coaching changes. I know the group that put together this data had to manually search for coach firings, and that would be even more difficult at the college level.

Thanks for reading!

Is it possible to look at year-2 or year-3 win percentage as the outcome? Do you know if anyone has done similar analysis with college coaching changes?

Thanks for this post; it is an excellent primer for causal inference and propensity score matching.
