pytorch/serve

Confused about Cumulative Inference Duration vs. PredictionTime

Open

#1698 opened on June 20, 2022

3 comments · 0 reactions · 0 assignees · Java · 3,844 stars · 790 forks
Labels: help wanted, question

Description

📚 The doc issue

I am running a model on TorchServe and want to measure how long inference takes. With logging enabled, the logs show a value called PredictionTime: [image]

However, when I use the Metrics API, I get a value called "Cumulative Inference Duration": [image]

The two values are very different, so I am not sure which one I should use to measure the total inference time for my requests.
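One possible reason for the gap, assuming "Cumulative" means a running total over every request served so far rather than a per-request value: the two numbers would only be comparable after dividing the cumulative figure by the number of requests. Below is a minimal sketch of that comparison in Python. The sample text, the model name `mymodel`, and the values are made up for illustration, and the metric names `ts_inference_latency_microseconds` and `ts_inference_requests_total` are my assumption of what the Metrics API returns in Prometheus format; they may differ between TorchServe versions.

```python
import re

# Hypothetical sample of Metrics API output. The metric names, the model name
# "mymodel", and the values are assumptions for illustration only.
SAMPLE = '''\
# HELP ts_inference_latency_microseconds Cumulative inference duration in microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{model_name="mymodel",model_version="default",} 250000.0
# HELP ts_inference_requests_total Total number of inference requests.
# TYPE ts_inference_requests_total counter
ts_inference_requests_total{model_name="mymodel",model_version="default",} 100.0
'''

# Matches lines of the form: metric_name{labels} value
SAMPLE_LINE = re.compile(r'^(\w+)\{([^}]*)\}\s+([0-9.eE+-]+)$')


def parse_metrics(text):
    """Return {(metric_name, model_name): summed value} from Prometheus text."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if not match:
            continue
        name, labels, value = match.groups()
        model = re.search(r'model_name="([^"]*)"', labels)
        key = (name, model.group(1) if model else "")
        values[key] = values.get(key, 0.0) + float(value)
    return values


def mean_latency_ms(values, model):
    """Average per-request latency in ms: cumulative microseconds / request count."""
    total_us = values[("ts_inference_latency_microseconds", model)]
    count = values[("ts_inference_requests_total", model)]
    return total_us / count / 1000.0 if count else 0.0


if __name__ == "__main__":
    print(mean_latency_ms(parse_metrics(SAMPLE), "mymodel"))
```

If that assumption holds, the resulting average would be the number to set next to the per-request PredictionTime values in the logs, although the two may still measure different spans of the request.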

Btw, the logs also contain something called HandlerTime: [image]

What does it mean? Where can I find documentation that explains what each of these metrics means?

Thanks,

Suggest a potential alternative/fix

No response

