Labels: enhancement, good first issue, help wanted
Description
The GLUE datasets are a standard benchmark for evaluating models on natural language understanding (NLU) tasks.
The General Language Understanding Evaluation (GLUE) benchmark is a tool for evaluating and analyzing model performance across a diverse range of existing NLU tasks.