Redocly/redoc

Unexpected slug when header contains both parentheses and Japanese characters

Open

#1754 opened on September 20, 2021

batch import
help wanted

Description

Describe the bug

When a Markdown header contains both Japanese characters and parentheses, the Japanese characters disappear from the generated slug. For example, for each header below, the slug that is currently generated is shown after the arrow.

# (a) → (a)

# (あ) → ()

# (い)  → ()

As we can see, the same slug will be generated even if the values inside the parentheses are different.

Expected behavior

# (a) → (a)

# (あ) → (あ)

# (い)  → (い)

We can reproduce the bug by adding the test below to https://github.com/Redocly/redoc/blob/master/src/utils/__tests__/helpers.test.ts (the assertions describe the current, unexpected behavior):

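test('safeSlugify drops Japanese characters inside parentheses', () => {
  // ASCII content inside the parentheses is kept.
  expect(safeSlugify('(a)')).toEqual('(a)');
  // Japanese content is removed, so both headers collapse to the same slug.
  expect(safeSlugify('(あ)')).toEqual('()');
  expect(safeSlugify('(い)')).toEqual('()');
});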

Possible solutions

This happens because the slugify package removes Japanese characters. The slugify function accepts an optional remove regex that specifies which characters to strip, so we can solve the problem by extending the default regex so that the Japanese character ranges are also kept, as below:

import slugify from 'slugify';

export function safeSlugify(value: string): string {
  // Default regex: https://github.com/simov/slugify/blob/1142e000f2b99552afb13d4118acbc25177df140/slugify.js#L38
  // Japanese Unicode ranges: https://stackoverflow.com/questions/19899554/unicode-range-for-japanese
  return (
    slugify(value, {
      // Default character class extended with the Japanese ranges so they are not removed
      remove: /[^\w\s$*_+~.()'"!\-:@\u3000-\u303f\u3040-\u309f\u30a0-\u30ff\uff00-\uffef\u4e00-\u9faf]+/g,
    }) ||
    // Fallback used when slugify returns an empty string
    value
      .toString()
      .toLowerCase()
      .replace(/\s+/g, '-') // Replace whitespace with -
      .replace(/&/g, '-and-') // Replace & with -and-
      .replace(/--+/g, '-') // Replace multiple - with single -
      .replace(/^-+/, '') // Trim - from start of text
      .replace(/-+$/, '') // Trim - from end of text
  );
}
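
To check the effect of the extra ranges without depending on the slugify package, here is a minimal sketch that applies only the removal step with a plain string replace. The names stripRemoved, defaultRemove and japaneseAwareRemove are hypothetical and exist only for this illustration. The default pattern is written as I understand it from the slugify source line linked above, so treat it as an assumption. slugify itself does more than this (character mapping, trimming, replacing spaces), so the sketch only approximates its behavior for these inputs.

```typescript
// Assumption: this mirrors slugify's default removal pattern
// (the proposed regex without the Japanese ranges).
const defaultRemove = /[^\w\s$*_+~.()'"!\-:@]+/g;

// Pattern proposed in this issue: the same character class plus
// CJK punctuation, hiragana, katakana, full-width forms and common kanji.
const japaneseAwareRemove =
  /[^\w\s$*_+~.()'"!\-:@\u3000-\u303f\u3040-\u309f\u30a0-\u30ff\uff00-\uffef\u4e00-\u9faf]+/g;

// Hypothetical helper: deletes every run of characters matched by `remove`.
function stripRemoved(value: string, remove: RegExp): string {
  return value.replace(remove, '');
}

// Print each header with the result of both patterns side by side.
for (const header of ['(a)', '(あ)', '(い)', '(カナ)', '(漢字)']) {
  console.log(
    header,
    '| default:',
    stripRemoved(header, defaultRemove),
    '| proposed:',
    stripRemoved(header, japaneseAwareRemove),
  );
}
```

With the default pattern the Japanese headers all reduce to the same string, while the proposed pattern leaves them distinct.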

I would like to send a pull request containing these changes, so I would like to ask the maintainers whether this approach looks good.
