chakra-core/ChakraCore

[RegExp] Unicode-mode RegExp incorrectly matches lone surrogates

Open

#98 opened on January 14, 2016

(2 comments) (0 reactions) (0 assignees) JavaScript (9,000 stars) (1,374 forks) batch import
Bug, Severity: 2, help wanted

Description

I was able to observe this in Edge 25.10586.0.0:

/[\ud800-\ud805]+/u.exec("\u{10000}\ud801\ud802") should return ["\ud801\ud802"] but instead returns ["\ud800"], which is the lead surrogate (first half) of "\u{10000}".

In Unicode mode, the spec requires the input string to be interpreted as a sequence of code points, i.e. surrogate pairs must be combined into a single code point. So matching the lead surrogate of "\u{10000}" on its own is incorrect.

A bit more reduced would be: /\ud800+/u.exec("\u{10000}") should return null, but returns ["\ud800"] instead.
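
For anyone who wants to check an engine quickly, here is a small repro sketch of the two cases above. The helper name surrogateMatchResults is made up for this example, and the non-Unicode case is added only for contrast (it is not part of the original report); the expected values in the comments follow the reading of the spec described in this issue.

```javascript
// Hypothetical helper: collects the exec() results for the cases in this issue.
function surrogateMatchResults() {
  // U+10000 is stored as the surrogate pair \ud800\udc00.
  const astral = "\u{10000}";

  return {
    // Reduced case: the lead surrogate is part of a pair, so /u should not match it.
    // Expected: null
    reduced: /\ud800+/u.exec(astral),

    // Original case: only the two trailing lone surrogates should match.
    // Expected: a match of "\ud801\ud802" at index 2
    ranged: /[\ud800-\ud805]+/u.exec(astral + "\ud801\ud802"),

    // Contrast (assumption, not from the report): without the u flag the string
    // is treated as code units, so matching the lead surrogate is expected here.
    legacy: /\ud800+/.exec(astral),
  };
}

const results = surrogateMatchResults();
console.log("reduced:", results.reduced);
console.log("ranged index:", results.ranged && results.ranged.index);
console.log("legacy index:", results.legacy && results.legacy.index);
```

An engine with the bug described here would report a non-null result for the reduced case and a match at index 0 for the ranged case.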
