huggingface/datasets

Consider using "Sequence" instead of "List"

Open

#5,354 opened on Dec 12, 2022

View on GitHub
 (18 comments) (0 reactions) (1 assignee)Python (18,313 stars) (2,496 forks)batch import
enhancementgood first issue

Description

Feature request

Hi, please consider using Sequence type annotation instead of List in function arguments such as in Dataset.from_parquet(). It leads to type checking errors, see below.

How to reproduce

list_of_filenames = ["foo.parquet", "bar.parquet"]
ds = Dataset.from_parquet(list_of_filenames)

Expected mypy output:

Success: no issues found

Actual mypy output:

test.py:19: error: Argument 1 to "from_parquet" of "Dataset" has incompatible type "List[str]"; expected "Union[Union[str, bytes, PathLike[Any]], List[Union[str, bytes, PathLike[Any]]]]"  [arg-type]
test.py:19: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
test.py:19: note: Consider using "Sequence" instead, which is covariant

Env: mypy 0.991, Python 3.10.0, datasets 2.7.1

Contributor guide