sphinx-doc/sphinx

lang attribute set to source language for untranslated text

Open

#13,841 opened on Aug 16, 2025

Labels: easy, help wanted, i18n, type:proposal

Description

Is your feature request related to a problem? Please describe.

When working with mixed-language web content, an HTML element should use the lang attribute to declare its language whenever that language differs from the lang attribute of the page's <html> element. Without it, documentation produced with Sphinx fails WCAG SC 3.1.2 Language of Parts.

For example, I have documentation written in English, and I use Sphinx to build the HTML of an in-progress Finnish translation. I set translation_progress_classes = True in the Sphinx config, build the docs, and get:

<!DOCTYPE html>
<html lang="fi">
<!-- […] -->
<h1 class="translated">Tervetuloa ”Sphinx Wagtail teema” -dokumentaatioon!</h1>
<!-- ❌ Untranslated text should use lang="en" -->
<p class="untranslated">This is the Sphinx theme used for the official Wagtail docs.</p>

Describe the solution you'd like

All untranslated text should have its lang attribute set to the source language. For the example above:

-<p class="untranslated">This is the Sphinx theme used for the official Wagtail docs.</p>
+<p lang="en" class="untranslated">This is the Sphinx theme used for the official Wagtail docs.</p>

Describe alternatives you've considered

A poor workaround would be to do this at the theme level, by using translation_progress_classes = True and JavaScript to add the lang attribute to the relevant elements.

Additional context

See Accessibility of multilingual content with mixed translation in the Python forum.

I have attempted to implement this myself alongside the AddTranslationClasses transform. Setting the attribute is a simple node['lang'] = "en", but there are two issues. First, docutils doesn’t seem to support setting the lang attribute, so we need to override starttag in HTML5Translator:

    def starttag(self, node: Element, tagname: str, *args: Any, **atts: Any) -> str:
        # Respect lang already decided by Sphinx (e.g., on <html>).
        if 'lang' not in atts:
            if lang := node.attributes.get('lang'):
                atts['lang'] = lang
        return super().starttag(node, tagname, *args, **atts)

Second, and more problematic, I can’t see a way to fetch the language of the source document. The existing language configuration option is for the target language. Adding a source_language configuration option that also defaults to en would solve this, but I’m not sure if there is a better way without introducing the extra option.
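
To make the proposal concrete, here is a minimal sketch of the tagging step, assuming a source_language value is available from somewhere. That option is hypothetical and does not exist in Sphinx today, and tag_untranslated is an illustrative name, not an existing function. Plain dicts stand in for docutils elements so the snippet runs on its own; a real implementation would walk the document tree inside a transform.

```python
from collections.abc import Iterable, MutableMapping
from typing import Any


def tag_untranslated(
    nodes: Iterable[MutableMapping[str, Any]],
    source_language: str = "en",  # hypothetical option, not an existing Sphinx setting
) -> int:
    """Set ``lang`` on every node carrying the ``untranslated`` class.

    Returns the number of nodes that were tagged.
    """
    tagged = 0
    for node in nodes:
        if "untranslated" not in node.get("classes", ()):
            continue
        # Keep any lang that was already set explicitly on the node.
        if "lang" not in node:
            node["lang"] = source_language
            tagged += 1
    return tagged


# Plain dicts stand in for docutils elements in this demonstration.
heading = {"classes": ["translated"]}
paragraph = {"classes": ["untranslated"]}
tag_untranslated([heading, paragraph], source_language="en")
```

After the call, only the untranslated paragraph carries a lang value. The starttag override shown earlier would then be what copies it into the rendered HTML.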
