Lightning-AI/pytorch-lightning

BatchSizeFinder throws KeyError: 'limit_eval_batches'

Open

#18985 opened on Nov 10, 2023

Labels: bug, duplicate, help wanted, repro needed, tuner, ver: 2.1.x


Description

Bug description

I am using the latest Lightning release, v2.1.

I added a vanilla BatchSizeFinder as a callback to my LightningModule. After it finishes finding the batch size, it fails in __scale_batch_restore_params in batch_size_scaling.py. The traceback is:

File ~/miniconda3/envs/dl/lib/python3.10/site-packages/lightning/pytorch/tuner/batch_size_scaling.py:155, in __scale_batch_restore_params(trainer, params)
    153     stage = trainer.state.stage
    154     assert stage is not None
--> 155     setattr(trainer, f"limit_{stage.dataloader_prefix}_batches", params["limit_eval_batches"])
    157 loop.load_state_dict(deepcopy(params["loop_state_dict"]))
    158 loop.restarting = False

While debugging, I found that the params variable contains only the following keys:

dict_keys(['loggers', 'callbacks', 'max_steps', 'limit_train_batches', 'limit_val_batches', 'loop_state_dict'])

So the limit_eval_batches key is indeed missing from params. Happy to provide more info if needed.
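To make the mismatch concrete, here is a minimal sketch in plain Python with no Lightning imports. The dictionary keys are copied from the output above; the values are placeholders, and `restore_limit` is a hypothetical helper that only mimics the lookup on line 155 of the traceback. It is not Lightning's actual code.

```python
# Keys copied from the report; the values are placeholders, not real trainer state.
params = {
    "loggers": [],
    "callbacks": [],
    "max_steps": -1,
    "limit_train_batches": 1.0,
    "limit_val_batches": 1.0,
    "loop_state_dict": {},
}


def restore_limit(saved_params, dataloader_prefix):
    """Hypothetical stand-in for the lookup on line 155 of the traceback."""
    attr_name = f"limit_{dataloader_prefix}_batches"
    # The saved dict holds limit_train_batches / limit_val_batches,
    # but this lookup always asks for limit_eval_batches.
    return attr_name, saved_params["limit_eval_batches"]


try:
    restore_limit(params, "val")
    error = None
except KeyError as exc:
    error = exc

print(repr(error))  # KeyError('limit_eval_batches')
```

The saved keys look like they were collected for a fit run, while the restore step takes the evaluation branch. That is only my reading of the traceback, not a confirmed diagnosis.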

Thanks

What version are you seeing the problem on?

v2.1

How to reproduce the bug

No response

Error messages and logs

# Error messages and logs here please

Environment

No response

More info

No response
