sparklemotion/nokogiri

[feature request] HTML5 parser for JRuby implementation

Open

#2227 opened on Apr 29, 2021

Labels: help wanted, platform/jruby

Description

This issue is a placeholder for collaboration with the JRuby community to find a way to provide HTML5-compliant parsing for Nokogiri's JRuby implementation.

#2204 provides an HTML5 parser for the CRuby implementation by leveraging the Gumbo parser, implemented in C, and a C extension that is tightly coupled to libxml2. As a result, the Nokogiri::HTML5 module will not be immediately available on JRuby, which uses Xerces in place of libxml2.

The Nokogiri maintainers consider this important and hope to work on it in the future. If you're interested in helping with HTML5 support on JRuby, please comment on this issue or ping the maintainers on the mailing list or the Discord channel.
