pytorch/serve

Confused about Cumulative Inference Duration vs. PredictionTime

Open

Opened on Jun 20, 2022

Labels: help wanted, question

Description

📚 The doc issue

I am running a model on TorchServe and trying to measure how long inference takes. With logging enabled, the logs show a value called PredictionTime (screenshot omitted).

However, if I use the Metrics API, I get a value called "Cumulative Inference Duration" (screenshot omitted).

The two values are very different, so I am not sure which one I should use to measure the total inference time for my requests.

There is also a value called HandlerTime in the logs (screenshot omitted).

What does it mean? Where can I find documentation on what each of these metrics means?
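One way to put the two numbers on the same scale is sketched below. It rests on assumptions that are not confirmed in this issue: that the Metrics API returns Prometheus-style text, that "Cumulative Inference Duration" is a running total in microseconds over all requests served so far, and that a matching request counter is exposed. The metric names `ts_inference_latency_microseconds` and `ts_inference_requests_total`, the sample text, and the helper names are illustrative and may differ between TorchServe versions.

```python
import re

# Made-up sample of Prometheus-style text; names and values are illustrative only.
SAMPLE = """\
# HELP ts_inference_latency_microseconds Cumulative inference duration in microseconds
# TYPE ts_inference_latency_microseconds counter
ts_inference_latency_microseconds{model_name="resnet",model_version="default"} 1500000.0
# HELP ts_inference_requests_total Total number of inference requests.
# TYPE ts_inference_requests_total counter
ts_inference_requests_total{model_name="resnet",model_version="default"} 10.0
"""

# Matches "name{labels} value" or "name value"; labels are skipped, not parsed.
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)')


def sum_by_metric(text):
    """Sum sample values per metric name, ignoring labels and comment lines."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, value = match.group(1), float(match.group(2))
        totals[name] = totals.get(name, 0.0) + value
    return totals


def mean_latency_ms(text,
                    latency_metric="ts_inference_latency_microseconds",
                    count_metric="ts_inference_requests_total"):
    """Average per-request latency in ms, or None if it cannot be computed."""
    totals = sum_by_metric(text)
    total_us = totals.get(latency_metric)
    count = totals.get(count_metric, 0.0)
    if total_us is None or count == 0:
        return None
    return total_us / count / 1000.0


if __name__ == "__main__":
    print(mean_latency_ms(SAMPLE))
```

If those assumptions hold, the resulting average is a per-request figure that can be compared with the per-request PredictionTime in the logs. Because the helper sums over all label sets, it mixes models together when more than one model is registered.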

Thanks,

Suggest a potential alternative/fix

No response
