Lightning-AI/pytorch-lightning

trainer.test() with given checkpoint logs last epoch instead of checkpoint epoch

Open

#20052 opened on July 5, 2024

Labels: bug, help wanted, repro needed

Description

Bug description

Calling trainer.test() with an explicit checkpoint path logs the epoch number of the last checkpoint instead of that of the specified checkpoint:

from lightning.pytorch import Trainer  # or: from pytorch_lightning import Trainer

trainer = Trainer(..., max_epochs=10)
lightning_module = MyLightningModule(...)
datamodule = MyDatamodule()

trainer.fit(lightning_module, datamodule=datamodule)

# ok: logs the correct epoch and step
trainer.test(lightning_module, datamodule=datamodule, ckpt_path="last")

# incorrect: logs the last epoch and step instead of the checkpoint's
ckpt_path = "/.../checkpoints/epoch=2-step=396.ckpt"
trainer.test(lightning_module, datamodule=datamodule, ckpt_path=ckpt_path)

The second test logs epoch 10 instead of epoch 2. Similarly, the step number of the second test is incorrect.
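To help confirm the mismatch when reproducing, the expected values can be read straight from the checkpoint filename and compared against what the logger reports. The following is a minimal sketch, assuming the checkpoint uses the `epoch=<n>-step=<n>.ckpt` naming seen in the path above; `expected_progress` is a hypothetical helper, not part of Lightning.

```python
import re
from pathlib import PurePosixPath

# Hypothetical helper (not a Lightning API): recover the epoch and step
# encoded in a checkpoint filename such as "epoch=2-step=396.ckpt".
_CKPT_NAME = re.compile(r"epoch=(?P<epoch>\d+)-step=(?P<step>\d+)")


def expected_progress(ckpt_path: str) -> tuple[int, int]:
    """Return (epoch, step) parsed from the checkpoint filename."""
    name = PurePosixPath(ckpt_path).name
    match = _CKPT_NAME.search(name)
    if match is None:
        raise ValueError(f"no epoch/step found in {name!r}")
    return int(match["epoch"]), int(match["step"])


epoch, step = expected_progress("/.../checkpoints/epoch=2-step=396.ckpt")
print(epoch, step)  # 2 396
```

The checkpoint file itself also stores `epoch` and `global_step` entries, so loading it with `torch.load` is another way to cross-check the values if the filename pattern has been customized.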

What version are you seeing the problem on?

v2.2.1
