how does `default_response_timeout` work? · pytorch/serve#2452

6 comments · 0 reactions · 0 assignees · Java · 790 forks

Labels: documentation, good first issue, triaged

Repository metrics

Stars: 3,844
PR merge metrics: no merged PRs in the last 30 days

Description

📚 The doc issue

I set `default_response_timeout` to 4 (that is, 4 seconds). Roughly 4 seconds after the model starts loading, this exception is thrown:

org.pytorch.serve.wlm.WorkerInitializationException: Backend worker did not respond in given time

My guess is that the worker gets killed because the model takes more than 4 seconds to load. Is there a way to set a larger initial delay, i.e. to differentiate these two scenarios:

1. Account for the initial model load with a timeout separate from `default_response_timeout`.
2. If the model doesn't respond within `default_response_timeout` after the initial load, kill the worker.
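
To make the two scenarios concrete, here is a minimal sketch of a policy that uses a different limit for each phase. It is a hypothetical illustration and not TorchServe code: the class `WorkerTimeouts`, the `Phase` enum, the method names, and the 60-second startup value are all assumptions made up for this example.

```java
import java.time.Duration;

/** Hypothetical sketch of a per-phase timeout policy; not part of TorchServe. */
public class WorkerTimeouts {
    /** The two scenarios from the question: initial load vs. normal requests. */
    public enum Phase { LOADING, SERVING }

    private final Duration startupTimeout;   // limit for the initial model load
    private final Duration responseTimeout;  // limit once the model is loaded

    public WorkerTimeouts(Duration startupTimeout, Duration responseTimeout) {
        this.startupTimeout = startupTimeout;
        this.responseTimeout = responseTimeout;
    }

    /** Picks the limit that applies to the worker's current phase. */
    public Duration timeoutFor(Phase phase) {
        return phase == Phase.LOADING ? startupTimeout : responseTimeout;
    }

    /** True when the worker has been silent longer than its phase allows. */
    public boolean shouldKill(Phase phase, Duration elapsed) {
        return elapsed.compareTo(timeoutFor(phase)) > 0;
    }

    public static void main(String[] args) {
        // Assumed values: 60 s to load, 4 s per response afterwards.
        WorkerTimeouts timeouts =
                new WorkerTimeouts(Duration.ofSeconds(60), Duration.ofSeconds(4));
        Duration silentFor = Duration.ofSeconds(10);
        System.out.println("kill while loading: "
                + timeouts.shouldKill(Phase.LOADING, silentFor));
        System.out.println("kill while serving: "
                + timeouts.shouldKill(Phase.SERVING, silentFor));
    }
}
```

With a single shared limit, as described above, a 10-second load would trip the 4-second timeout. Splitting the limits lets the load finish while still catching a stuck worker during normal serving.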

Suggest a potential alternative/fix

No response

Contributor guide

Research direction: Investigate the default response timeout configuration and how it interacts with model loading. Check the source code for WorkerInitializationException to understand when it is thrown.
Tech stack: java
Domain: backend
Issue type: Documentation
Difficulty: 2
Estimated time: 1-3 hours
Activity status: Active
Clarity: Clear
Prerequisites: TorchServe
Newbie friendliness: 75
