whatwg/html

Escaped characters in source

Open

#3683 opened on May 14, 2018

good first issue

Description

The current source file contains a large number of numeric character references (such as &#x231B;), which make it rather hard to read and edit. Now that UTF-8 is supported everywhere, is it time to replace them with the literal Unicode characters?

For example:

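  <li value="9"><cite lang="sh">&#x0426;&#x0440;&#x043D;&#x0430; &#x043C;&#x0430;&#x0447;&#x043A;&#x0430;, &#x0431;&#x0435;&#x043B;&#x0438; &#x043C;&#x0430;&#x0447;&#x043E;&#x0440;</cite>, 1998</li>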

Becomes:

  <li value="9"><cite lang="sh">Црна мачка, бели мачор</cite>, 1998</li>

And:

<p w-nodev>In an algorithm, steps in <span data-x="synchronous section">synchronous
  sections</span> are marked with &#x231B;.</p>

Could be changed to:

<p w-nodev>In an algorithm, steps in <span data-x="synchronous section">synchronous
  sections</span> are marked with ⌛.</p>

There is one obvious exception: invisible or non-printing characters, which should stay encoded so that they remain visible in the source.

Would you be interested in a pull request that transforms all the &#x... references into their decoded equivalents?

This builds upon the HTML5.3 work done in https://github.com/w3c/html/pull/1280
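
To make the proposal concrete, here is a minimal sketch of what such a transformation might look like in Python. It is an illustration under stated assumptions, not an existing tool: the function name decode_visible_refs is hypothetical, and the set of Unicode categories treated as "invisible / non-printing" is my own guess at a reasonable cutoff. It also leaves references to HTML syntax characters (such as &#x26;) untouched, since decoding those would change the markup.

```python
import re
import unicodedata

# Matches hexadecimal numeric character references such as &#x231B;
HEX_REF = re.compile(r"&#[xX]([0-9A-Fa-f]+);")

# Characters that are HTML syntax; decoding them would change the markup.
SYNTAX_CHARS = {"&", "<", ">", '"', "'"}

# Assumption: Unicode general categories treated as invisible or
# non-printing (controls, format characters, surrogates, private use,
# unassigned, separators, and combining marks). These stay encoded.
KEEP_ENCODED_CATEGORIES = {
    "Cc", "Cf", "Cs", "Co", "Cn", "Zs", "Zl", "Zp", "Mn", "Me",
}


def decode_visible_refs(source: str) -> str:
    """Replace hex character references with literal characters,
    leaving invisible characters and HTML syntax characters encoded."""

    def replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            # Not a valid Unicode code point; leave the reference alone.
            return match.group(0)
        char = chr(code_point)
        if char in SYNTAX_CHARS:
            return match.group(0)
        if unicodedata.category(char) in KEEP_ENCODED_CATEGORIES:
            return match.group(0)
        return char

    return HEX_REF.sub(replace, source)
```

Running it over the second example above would turn "marked with &#x231B;." into "marked with ⌛.", while a reference such as &#x00A0; (no-break space) or &#x200B; (zero-width space) would be left as written.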
