sktime/sktime
[ENH] Hidden layer dropout uniformization in the deep learning models.
Open
#9103 opened on Nov 22, 2025
enhancement, good first issue, module:classification, module:regression
Description
Is your feature request related to a problem? Please describe.
Hidden layer dropout uniformization in the deep learning models (classifiers, regressors & networks).
Current behaviour:
- In some of the TensorFlow-based deep learning models, hidden layer dropout is hardcoded and not exposed to users. One such example, for illustration: https://github.com/sktime/sktime/blob/071551b59669a946a386617fb7706e69349e5dff/sktime/networks/rnn/_rnn_tf.py#L123
- In the deep learning models, hidden layer dropout can currently only be set globally; there is no support for setting it layer-wise. Where it is set layer-wise, it is hardcoded and not exposed to the end user. For example, in `MLPNetwork` the dropout is different in each layer because the implementation is based on the model described in section 3.2.1, "Multi Layer Perceptron", on page 14 of the 2019 paper "Deep learning for time series classification: a review": https://github.com/sktime/sktime/blob/071551b59669a946a386617fb7706e69349e5dff/sktime/networks/mlp.py#L68-L77
Describe the solution you'd like
Expected/Suggested behaviour:
- Wherever dropout is used in the hidden layers, it should be exposed to the end user via a `dropout` parameter.
- Currently, the `dropout` parameter (wherever present) is expected to be a `float`. Change it to accept either a `tuple` of floats or a single `float`, and use the tuple of floats to set layer-wise dropout whenever it is specified.
- Care needs to be taken in handling dropout specified via a tuple: enforce that the length of the passed tuple equals the number of hidden layers specified, use the correct dropout in each corresponding hidden layer, etc.
- Care needs to be taken about where setting dropout layer-wise is and is not allowed. Certain network implementations, in either TensorFlow or PyTorch, may not allow setting layer-wise dropout in the network. In such cases the `dropout` parameter should strictly be a `float` only, and the docstring should reflect this.
- The "Current behaviour" section above mentions only 2 examples, but all deep learning models should be investigated and the fix implemented everywhere it is needed.
- To maintain consistent behaviour when setting the default value of the `dropout` parameter:
  - use the same value(s) that are currently hard-coded;
  - if it is not hard-coded, use the default value from the underlying library's implementation/documentation.
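
The tuple handling suggested above could be sketched roughly as follows. This is only an illustration, not existing sktime code: the helper name `normalize_dropout`, its signature, and the assumed valid range of `[0, 1)` for each rate are all hypothetical choices made for the example.

```python
from numbers import Real


def normalize_dropout(dropout, n_layers):
    """Return one dropout rate per hidden layer as a tuple of floats.

    Hypothetical helper for illustration only; not part of sktime.
    """
    if n_layers < 1:
        raise ValueError(f"n_layers must be at least 1, got {n_layers}")

    def _is_number(value):
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, Real) and not isinstance(value, bool)

    if _is_number(dropout):
        # A single float is applied to every hidden layer.
        rates = (float(dropout),) * n_layers
    elif isinstance(dropout, (tuple, list)):
        # A tuple must supply exactly one rate per hidden layer.
        if len(dropout) != n_layers:
            raise ValueError(
                f"dropout has {len(dropout)} entries but the network has "
                f"{n_layers} hidden layers"
            )
        if not all(_is_number(rate) for rate in dropout):
            raise TypeError("every entry of dropout must be a number")
        rates = tuple(float(rate) for rate in dropout)
    else:
        raise TypeError("dropout must be a float or a tuple of floats")

    # Assumption: valid rates lie in [0, 1); adjust to the backend's rules.
    for rate in rates:
        if not 0.0 <= rate < 1.0:
            raise ValueError(f"dropout rates must be in [0, 1), got {rate}")
    return rates


# Usage: a single float is broadcast, a tuple is used layer by layer.
print(normalize_dropout(0.2, 3))              # (0.2, 0.2, 0.2)
print(normalize_dropout((0.1, 0.2, 0.3), 3))  # (0.1, 0.2, 0.3)
```

With a helper of this kind, a network whose per-layer values are currently hard-coded could keep those same values as the default tuple, so existing behaviour would be unchanged. Networks that cannot support layer-wise dropout would skip the tuple branch and accept a `float` only.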
Additional context
This issue originated from the discussion in the comments of #9042