This is a common situation when building predictive models. The first thing I would do is confirm that this is purely a predictive modeling task, where my goal is to maximize predictive accuracy and I don't need to interpret the model.
Let's assume it is purely predictive.
In that case, I would likely keep all the correlated features in the model rather tha…