scrapinghub/dateparser

Wrong prioritization of languages

Open

#770 建立於 2020年8月21日

在 GitHub 查看
 (6 留言) (0 反應) (1 負責人)Python (2,318 star) (443 fork)batch import
good first issue

描述

I think there is something wrong in dateparser prioritization of languages, as introducing 'en' even in the last position hurts extraction of dates that were extracted properly when English was not there.

import dateparser
dateparser.parse("11/12", languages=['en'])
Out[3]: datetime.datetime(2020, 11, 12, 0, 0)

This is right

dateparser.parse("11/12", languages=['es'])
Out[4]: datetime.datetime(2020, 12, 11, 0, 0)

This is also right, because the standard in Spain is DD/MM But now if we add English to the languages list in the last position...

dateparser.parse("11/12", languages=['es', 'en'])
Out[5]: datetime.datetime(2020, 11, 12, 0, 0)

We got it parsed like in English, even if Spanish is first in the list of languages. This is unexpected to me, I would have expected prioritizing Spanish instead.

貢獻者指南