marshmallow-code/marshmallow

Support streaming dump

Open

#1,696 opened on Nov 24, 2020

help wanted

Description

If you feed a generator into a many=True schema, Marshmallow reads the entire generator into memory before serializing it. Serializing a large collection therefore takes longer and uses more memory than necessary, and can even exceed the memory available on the system or the time limits of the environment. These concerns are especially common in web services, where Marshmallow is often used to serialize JSON response bodies: web workers often run in memory-constrained environments, and clients or gateways will time out if the service takes too long to start streaming a response.
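To make the cost concrete, here is a minimal stdlib-only sketch. It does not use Marshmallow: `json.dumps` stands in for the serializer, and `CountingSource`, `eager_dump`, and `lazy_dump` are hypothetical names made up for illustration. It contrasts a dump that materializes its input first with one that pulls items on demand, and counts how many items have been consumed by the time the first output is available.

```python
import json
from typing import Iterable, Iterator


class CountingSource:
    """Iterable that records how many items have been pulled from it."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.pulled = 0

    def __iter__(self) -> Iterator[dict]:
        for i in range(self.n):
            self.pulled += 1
            yield {"i": i}


def eager_dump(items: Iterable[dict]) -> str:
    # Stand-in for the eager behavior: the whole input becomes a list first.
    return json.dumps(list(items))


def lazy_dump(items: Iterable[dict]) -> Iterator[str]:
    # Emits one JSON fragment at a time; only one item is held at once.
    yield "["
    for n, item in enumerate(items):
        if n:
            yield ","
        yield json.dumps(item)
    yield "]"


eager_src = CountingSource(1000)
eager_dump(eager_src)
print(eager_src.pulled)  # every item was pulled before any output existed

lazy_src = CountingSource(1000)
stream = lazy_dump(lazy_src)
first_two = [next(stream), next(stream)]
print(first_two, lazy_src.pulled)  # output starts after pulling a single item
```

In the lazy version a client could start receiving bytes almost immediately, which is what matters for the gateway timeouts mentioned above.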

Users can currently work around this with something like the following:

from typing import Iterable
from marshmallow import Schema


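def dumps_many(obj: Iterable, schema: Schema):
    # Override many per call instead of mutating schema.many, so the schema
    # is left untouched even if this generator is never fully consumed.
    yield "["
    # Iterate directly rather than using None as an end-of-input sentinel,
    # which would stop early on an input that contains None.
    for n, item in enumerate(obj):
        if n:
            yield ","
        yield schema.dumps(item, many=False)
    yield "]"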


if __name__ == "__main__":
    import sys
    from marshmallow.fields import Int

    class MySchema(Schema):
        i = Int(required=True)

    obj = (dict(i=i) for i in range(int(sys.argv[1])))
    print(repr("".join(dumps_many(obj, MySchema(many=True)))))


# $ python3 foo.py 0
# '[]'
# $ python3 foo.py 1
# '[{"i": 0}]'
# $ python3 foo.py 2
# '[{"i": 0},{"i": 1}]'
# $ python3 foo.py 9999999999999  # you get the idea
# ...

But it would be great if Marshmallow offered first-class support for this.
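As one illustration of what such support might need to handle, here is a hedged sketch of a chunked variant: yielding one tiny string per item can be inefficient for a web server, so this groups several items into each yielded chunk. `stream_json_array`, `dump_one`, and `chunk_size` are hypothetical names, not part of Marshmallow, and `json.dumps` stands in for the per-item serializer so the example runs with the standard library alone.

```python
import json
from typing import Any, Callable, Iterable, Iterator


def stream_json_array(
    items: Iterable[Any],
    dump_one: Callable[[Any], str],
    chunk_size: int = 100,
) -> Iterator[str]:
    """Yield a JSON array as string chunks of up to chunk_size items each."""
    # Because this is a generator, the check runs on the first next() call.
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    buffer = ["["]
    count = 0
    for item in items:
        if count:
            buffer.append(",")
        buffer.append(dump_one(item))
        count += 1
        # Flush once a full chunk of items has been buffered.
        if count % chunk_size == 0:
            yield "".join(buffer)
            buffer = []
    # The final chunk carries any leftover items plus the closing bracket.
    buffer.append("]")
    yield "".join(buffer)


rows = ({"i": i} for i in range(5))
chunks = list(stream_json_array(rows, json.dumps, chunk_size=2))
print(chunks)
```

With a Marshmallow schema, `dump_one` could be something like `lambda item: schema.dumps(item, many=False)`, on the assumption that the installed version accepts a per-call `many` override.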

It looks like this was discussed briefly in https://github.com/marshmallow-code/marshmallow/pull/1164#issuecomment-473316007, where @deckar01 said:

We might want to explore streaming with generators in 3.x.

Is now a good time to add this to Marshmallow v3? Could be another really strong reason for v2 users to upgrade.

Thanks for your consideration and for the great work on Marshmallow!
