Description
I get asked about this roughly once a day, so I think we should just add it. Many people work with time series, and adding cross-validation for them would be really easy. The standard strategy is described for example here.
There are basically two cases: homogeneous time series (one sample every X seconds / days), or heterogeneous time series, where each sample has a time stamp.
For the homogeneous case, we can just put the first n_samples // n_folds samples in the first fold, the next n_samples // n_folds in the second, and so on, so it's a very simple variation of KFold. Fixed in #6586.
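To make the homogeneous case concrete, here is a rough sketch, not the implementation from #6586. The names `homogeneous_time_folds` and `forward_splits` are hypothetical, and two things are my assumptions rather than anything stated above: leftover samples go into the last fold, and each split trains only on folds that come earlier in time than the test fold.

```python
import numpy as np


def homogeneous_time_folds(n_samples, n_folds):
    """Assign contiguous, time-ordered fold ids (hypothetical helper).

    Samples are assumed to already be sorted by time. Leftover samples
    (n_samples % n_folds) are put into the last fold; that is an assumption.
    """
    if not 1 <= n_folds <= n_samples:
        raise ValueError("need 1 <= n_folds <= n_samples")
    fold_size = n_samples // n_folds
    # Integer division gives block ids; clip so the remainder joins the last fold.
    return np.minimum(np.arange(n_samples) // fold_size, n_folds - 1)


def forward_splits(fold_ids):
    """Yield (train, test) index arrays, training only on earlier folds."""
    fold_ids = np.asarray(fold_ids)
    indices = np.arange(len(fold_ids))
    for k in range(1, int(fold_ids.max()) + 1):
        yield indices[fold_ids < k], indices[fold_ids == k]


if __name__ == "__main__":
    for train, test in forward_splits(homogeneous_time_folds(10, 3)):
        print(train, test)
```

With 10 samples and 3 folds this gives two splits, because the first fold is only ever used for training.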
For the heterogeneous case, we need to get a labels array of time stamps and split accordingly. If we cast that array to integers, people could actually provide pandas time series, and they would be handled correctly (they would be converted to nanoseconds).
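A possible sketch of the heterogeneous case, using only NumPy. The function name `time_ordered_folds` is hypothetical, and cutting the time-sorted samples into equally sized folds is just one assumed way to "split accordingly"; ties between identical time stamps are not kept together here.

```python
import numpy as np


def time_ordered_folds(timestamps, n_folds):
    """Assign fold ids by time order for irregularly spaced samples.

    `timestamps` can be anything NumPy can convert to datetime64, such as
    a datetime64 array or ISO date strings. Hypothetical helper.
    """
    # Cast to integer nanoseconds since the epoch, as suggested above.
    as_int = np.asarray(timestamps, dtype="datetime64[ns]").astype("int64")
    # Stable sort so samples with equal time stamps keep their input order.
    order = np.argsort(as_int, kind="stable")
    fold_ids = np.empty(len(as_int), dtype=int)
    # Cut the time-sorted sample indices into n_folds contiguous chunks.
    for k, chunk in enumerate(np.array_split(order, n_folds)):
        fold_ids[chunk] = k
    return fold_ids


if __name__ == "__main__":
    stamps = ["2016-01-05", "2016-01-01", "2016-01-03", "2016-01-02"]
    print(time_ordered_folds(stamps, n_folds=2))
```

The resulting fold ids follow time order rather than input order, so the same train-on-the-past splitting used for regularly spaced data can be applied to them.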
I remember arguing against this addition, but I changed my mind ;)