akka/akka-http

'relaxed' parsing of URIs with UTF-8 characters

Open

#3722 opened on Jan 6, 2021

Labels: discuss, help wanted, nice-to-have (low-prio)

Description

In 'relaxed' mode, the Uri.apply method is documented as follows:

/**
   * Parses a valid URI string into a normalized URI reference as defined
   * by http://tools.ietf.org/html/rfc3986#section-4.1.
   * Percent-encoded octets are UTF-8 decoded.
   * Accepts unencoded visible 7-bit ASCII characters in addition to the RFC.
   * If the given string is not a valid URI the method throws an `IllegalUriException`.
   */

As specified, characters outside 7-bit ASCII, such as é, are not accepted. Browsers are more lenient and convert such characters to percent-encoding.

The behavior of Uri might be reasonable, but it would be nice to have a utility function that also accepts (invalid) URLs containing unencoded UTF-8 characters.
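
As a rough illustration of what such a utility could do, here is a minimal sketch that percent-encodes every non-ASCII code point as its UTF-8 octets and leaves ASCII untouched, roughly mirroring what browsers do. The names LenientUri and percentEncodeNonAscii are hypothetical and not part of akka-http; the sketch uses only the JDK, and it does not treat malformed input such as lone surrogates specially.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of akka-http.
object LenientUri {

  /** Percent-encodes each non-ASCII code point as its UTF-8 octets.
    * ASCII characters (including existing %XX escapes) pass through unchanged.
    */
  def percentEncodeNonAscii(input: String): String = {
    val sb = new StringBuilder(input.length)
    var i = 0
    while (i < input.length) {
      val cp = input.codePointAt(i)
      if (cp < 0x80) {
        sb.append(cp.toChar)
      } else {
        // Encode the full code point (possibly a surrogate pair) as UTF-8.
        val bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8)
        bytes.foreach(b => sb.append('%').append(f"${b & 0xff}%02X"))
      }
      i += Character.charCount(cp)
    }
    sb.toString
  }
}

// LenientUri.percentEncodeNonAscii("http://example.com/café")
//   returns "http://example.com/caf%C3%A9"
```

The pre-encoded string could then be handed to Uri(...) in relaxed mode, so the parser itself stays unchanged and only sees ASCII input.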

Somewhat related to #86
