dotnet/runtime

String comparisons with the CompareOptions.StringSort value produce incorrect results under .NET 5 and later

Open

#102579 opened on May 22, 2024

Labels: area-System.Globalization, documentation, help wanted

Description

When comparing strings, CompareOptions.StringSort should sort hyphens, apostrophes, and other non-alphanumeric characters ahead of alphanumeric characters, whereas the default word sort (CompareOptions.None) gives those characters a very low weight so that words such as "coop" and "co-op" stay next to each other. This distinction works in .NET Framework projects. In .NET 5 and later, however, it is lost: sorting with CompareOptions.StringSort gives the same results as sorting with CompareOptions.None.

Note: I am using the default ICU Unicode processing for .NET 5+ testing.
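To confirm which globalization backend a given run is actually using, a check along the following lines may help. This is a minimal sketch modeled on the approach described in the .NET globalization documentation, as far as I recall it; the class and method names (IcuCheck, IsIcuMode) are my own and not part of any API.

```csharp
using System;
using System.Globalization;

public static class IcuCheck
{
    // Heuristic: under ICU, the first four bytes of the sort ID encode the
    // same value as SortVersion.FullVersion; under NLS they do not.
    public static bool IsIcuMode()
    {
        SortVersion sortVersion = CultureInfo.InvariantCulture.CompareInfo.Version;
        byte[] bytes = sortVersion.SortId.ToByteArray();
        int version = bytes[3] << 24 | bytes[2] << 16 | bytes[1] << 8 | bytes[0];
        return version != 0 && version == sortVersion.FullVersion;
    }

    public static void Main()
    {
        // The result depends on the runtime, OS, and configuration switches.
        Console.WriteLine($"ICU mode: {IsIcuMode()}");
    }
}
```

Printing this value next to the sort output makes it easier to tell whether a difference between two environments comes from the collation backend.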

Reproduction Steps

This code is adapted from the example on the CompareOptions Enum documentation page. The word list is copied verbatim.

using System;
using System.Collections.Generic;
using System.Globalization;

public class SamplesCompareOptions
{
	public static void Main()
	{
		var wordList = new List<string> { "cant", "bill's", "coop", "cannot", "billet", "can't", "con", "bills", "co-op" };

		wordList.Sort((x, y) => string.Compare(x, y, CultureInfo.CurrentCulture, CompareOptions.None));
		Console.WriteLine("\nAfter default sort (CompareOptions.None):");
		foreach (string word in wordList)
		{
			Console.WriteLine(word);
		}

		wordList.Sort((x, y) => string.Compare(x, y, CultureInfo.CurrentCulture, CompareOptions.StringSort));
		Console.WriteLine("\nAfter sorting with CompareOptions.StringSort:");
		foreach (string word in wordList)
		{
			Console.WriteLine(word);
		}
	}
}

The same code can be run on DotNetFiddle.

Expected behavior

CompareOptions.StringSort should produce an ordering distinct from CompareOptions.None, with words containing an apostrophe or hyphen sorted ahead of their plain counterparts. The results are correct in .NET Framework 4.7.2 and Roslyn 4.8:

After default sort (CompareOptions.None):
billet
bills
bill's
cannot
cant
can't
con
coop
co-op

After sorting with CompareOptions.StringSort:
bill's
billet
bills
can't
cannot
cant
co-op
con
coop

Actual behavior

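In .NET 5 and later, the two options produce identical output, so CompareOptions.StringSort no longer changes the ordering. Note that both lists below match the .NET Framework StringSort output; it is the CompareOptions.None output that differs from .NET Framework: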

After default sort (CompareOptions.None):
bill's
billet
bills
can't
cannot
cant
co-op
con
coop

After sorting with CompareOptions.StringSort:
bill's
billet
bills
can't
cannot
cant
co-op
con
coop

Regression?

According to testing on dotnetfiddle.net, the correct results were produced in .NET Framework 4.7.2 and Roslyn 4.8. .NET 5 and later produce the incorrect sort order.

Known Workarounds

A possible workaround is to switch from ICU to NLS, but I have not tested this.
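For anyone who wants to try it, the NLS switch is documented as a runtime configuration setting on Windows. A sketch of the runtimeconfig.json form is shown here; it is untested against this issue, and I am assuming the app is configured through that file rather than through the project file or an environment variable.

```json
{
  "runtimeOptions": {
    "configProperties": {
      "System.Globalization.UseNls": true
    }
  }
}
```

NLS is only available on Windows, so this would not help on Linux or macOS.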

Configuration

My system:

  • .NET 8
  • Windows 11 latest
  • x64

I don't think the issue is specific to my OS or architecture, as the same problem can be seen via dotnetfiddle.

Other information

No response
