pytorch/serve
Confused about Cumulative Inference Duration vs. PredictionTime
Open
#1698, created on June 20, 2022
help wanted, question
Description
📚 The doc issue
I am running a model on TorchServe and want to measure how long inference takes.
If I enable logging and look at the logs, I can see a value called PredictionTime.

However, if I use the Metrics API, I get a value called "Cumulative Inference Duration".

The two values are very different, so I am not sure which one I should use to measure the total inference time for my requests.
By the way, the logs also contain a value called HandlerTime.

What does it mean? Where can I find documentation explaining what each of these metrics means?
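
Part of the gap may simply be that a cumulative value is summed over every request served so far, while a log line reports a single request. Below is a minimal sketch of turning the cumulative figure into a per-request mean. It assumes the Metrics API returns Prometheus text containing `ts_inference_latency_microseconds` and `ts_inference_requests_total`; please verify those names against your own output. `SAMPLE` is made-up illustrative data, not real server output, and `sum_metric` and `mean_latency_ms` are hypothetical helper names. Whether the resulting mean covers the same time span as PredictionTime or HandlerTime is still the open question here.

```python
import re

# Illustrative sample only (not real server output). Metric names are an
# assumption based on the TorchServe Metrics API docs; check your own output.
SAMPLE = """\
# HELP ts_inference_latency_microseconds Cumulative inference duration in microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{model_name="demo",model_version="default"} 2000000.0
# TYPE ts_inference_requests_total counter
ts_inference_requests_total{model_name="demo",model_version="default"} 40.0
"""

# Matches "<name>{optional labels} <value>". Assumes label values contain no "}".
_LINE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)")


def sum_metric(text, name):
    """Sum the values of every sample line for `name`, across all label sets."""
    total = 0.0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _LINE.match(line)
        if match and match.group(1) == name:
            total += float(match.group(2))
    return total


def mean_latency_ms(text):
    """Cumulative latency divided by request count, in milliseconds."""
    latency_us = sum_metric(text, "ts_inference_latency_microseconds")
    requests = sum_metric(text, "ts_inference_requests_total")
    if requests == 0:
        return None  # no requests recorded yet, so the mean is undefined
    return latency_us / requests / 1000.0


if __name__ == "__main__":
    print(mean_latency_ms(SAMPLE))
```

To run this against a live server, replace `SAMPLE` with the text fetched from the metrics endpoint, for example `urllib.request.urlopen("http://localhost:8082/metrics").read().decode()`, assuming the default metrics port has not been changed.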
Thanks,
Suggest a potential alternative/fix
No response