[Community Contributions] examples on distributed inference using 🤗 Accelerate · huggingface/accelerate#3078

6 comments · 2 reactions · 0 assignees · Python · 626 forks

Labels: contributions-welcome, good first issue, wip

Repository metrics

Stars: 5,805
PR merge metrics: average merge time 16d 17h; 23 PRs merged in 30d

Description

The inference/distributed directory houses examples on running distributed inference with accelerate:

Phi2 for language generation
Stable Diffusion for image generation

The strategy followed there is to load an entire model onto each GPU and send chunks of a batch through each GPU's model copy at a time. Synthetic data generation has become an essential tool for every ML engineer, so it would be beneficial to extend these examples to cover more use cases:

Image captioning
Speech data generation
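The strategy described above (one full model copy per GPU, each receiving its own slice of the batch) can be sketched in plain Python. This is only an illustration of the splitting step, not the code used in the existing examples: `chunk_for_rank` is a hypothetical helper, and the prompts are made up. In practice, Accelerate exposes `PartialState.split_between_processes` for this purpose, and each process would pass its chunk to its local model copy.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk_for_rank(items: Sequence[T], rank: int, world_size: int) -> List[T]:
    """Return the contiguous slice of `items` that process `rank` should handle.

    Hypothetical helper: items are split as evenly as possible, and the
    first `len(items) % world_size` ranks receive one extra item.
    """
    per_rank, extras = divmod(len(items), world_size)
    start = rank * per_rank + min(rank, extras)
    end = start + per_rank + (1 if rank < extras else 0)
    return list(items[start:end])


# Simulate two processes (one per GPU) splitting a batch of five prompts.
prompts = ["a cat", "a dog", "a bird", "a fish", "a horse"]
chunks = [chunk_for_rank(prompts, rank, world_size=2) for rank in range(2)]

for rank, chunk in enumerate(chunks):
    # In a real run, each process would call its own model copy on `chunk`.
    print(f"rank {rank} handles {chunk}")
```

Because every process holds a full copy of the model, no communication is needed during generation; the only coordination is agreeing on which slice belongs to which rank.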

Some nice-to-haves:

Include artifact serialization as done in this
Keep the artifact serialization code under a thread to not block GPU execution
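One way to keep serialization off the main loop, sketched here with only the standard library, is to hand each finished artifact to a thread pool while generation continues. The names `fake_generate`, `save_artifact`, and `run` are hypothetical, and `fake_generate` is a stand-in for a real model call; this is a minimal sketch under those assumptions, not the approach used in the referenced example.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def fake_generate(prompt: str) -> str:
    """Stand-in for a model forward pass (assumption for illustration)."""
    return prompt.upper()


def save_artifact(path: Path, record: dict) -> Path:
    """Write one record to disk as JSON; runs on a worker thread."""
    path.write_text(json.dumps(record))
    return path


def run(prompts: List[str], out_dir: Path) -> List[Path]:
    futures = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        for i, prompt in enumerate(prompts):
            output = fake_generate(prompt)
            record = {"prompt": prompt, "output": output}
            # submit() returns immediately, so the loop can start the
            # next generation step while the file is being written.
            futures.append(
                pool.submit(save_artifact, out_dir / f"sample_{i}.json", record)
            )
    # Leaving the `with` block waits for all pending writes; result()
    # re-raises any exception that occurred on a worker thread.
    return [future.result() for future in futures]


out_dir = Path(tempfile.mkdtemp())
saved = run(["a cat", "a dog"], out_dir)
print([path.name for path in saved])
```

File writes spend most of their time waiting on I/O, so a small thread pool is usually enough; in a multi-process run, each process would also need distinct file names (for example, by including its rank) to avoid overwriting another process's output.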

How can you help?

You could contribute an example for any of the use cases mentioned above, or come up with your own 🤗. Help us make the art of synthetic data generation scalable, easy, and accessible.

Contributor guide

Research direction: Study the existing distributed inference examples in the `examples/inference/distributed` directory. Understand how they load a model onto each GPU and process batches. Choose a model for image captioning (e.g., BLIP) or for speech data (e.g., Whisper, which transcribes speech; generating audio would need a text-to-speech model instead) and adapt the pattern. Optionally, add artifact serialization using threads to avoid blocking GPU execution.
Tech stack: Python, PyTorch
Domain: machine learning, backend
Issue type: Feature
Difficulty: 3
Estimated time: Half day
Activity status: Active
Clarity: Clear
Prerequisites: Python, PyTorch, Hugging Face Accelerate
Newbie friendliness: 65
