The batch method from ChatModels and all the Runnables does not really support the OpenAI batch API.
#28,508 opened on Dec 4, 2024
Description
Checked other resources
- I added a very descriptive title to this issue.
- I searched the LangChain documentation with the integrated search.
- I used the GitHub search to find a similar question and didn't find it.
- I am sure that this is a bug in LangChain rather than my code.
- The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
The chatmodel.batch method does not really use the Openai batch API. And there is not any other method which does it.
The batch API is 50% cheaper than the normal API as exchange of waiting for them to do it asynchronously. When you don't need real-time results this becomes quite handy.
Additionally, other providers are starting to implement something similar. So it will probably become a standard.
A lot of people want this actually see #21643.
Error Message and Stack Trace (if applicable)
No response
Description
When i use the chatmodel.batch method I expect to receive a batchid given by the openai API, so that later I can get the predictions when they are done though any other method.
What ends up happening is that instead of invoking the batch API, it invokes the sync API. Afeter few time I receive the results from the syc API and end up paying the same.
What I would like to happen is that I have a method which gives me the identifier, I store the identifier, and with other rutine I ask for the results and run the chain with all the runnables. So, basically, it would be in batch for all. But we make pauses during the LLM calls.
System Info
langchain==0.3.7 langchain-core==0.3.19 langchain-openai==0.2.9 langchain-text-splitters==0.3.2