Clean code & ReverseString

In some of his latest video, Nick Chapsas is reviewing various C# advises. The one we will be talking is about Keep it simple stupid, for which the following code is considered to be bad:

public string ReverseString(string input)
{
    char[] charArray = input.ToCharArray();
    Array.Reverse(charArray);
    return new string(charArray);
}

As opposed to the following, considered to be good:

public string ReverseString(string input)
{
    return new string(input.Reverse().ToArray());
}

It then started a discussion about which version is simpler.

Personally, I can read both as easily, and anyone who has been reading C# for a little while should.

So my point is that I do not care, as what makes the code simple is the fact that it is the implementation of a function for which the intent is clear.

The interesting part is when Nick Chapsas started to benchmark both versions and shows that the first one is faster and allocates less memory. In fact, before Linq optimizations in dotnet 6.0, the difference would have been even greater. The second version is slower because it relies on the IEnumerable abstract. There are more tests and more allocation because of this.
Nick Chapsas also showed an even faster implementation using spans.

So, whenever I can, I am explicit in my implementation. For instance, when I declare a private list field I generally declare it as List<T>, instead of IList<T>, or worse, IEnumerable<T>, because I then have access to functions specific to this data structure.
So here, I would have written the first version.

Also, someone in the comments pointed out that the code is incorrect because of the surrogates pair, and suggested to try the algorithm on a string with smileys.

So, this code is clean because it is a function, and therefore:

  • we can easily understand what the code does by simply reading the function name
  • we can easily test it
  • we can easily benchmark it
  • we can easily change the code for the faster version
  • we can easily fix the implementation when the code is proven faulty.

The implementation of the reverse that handle grapheme clusters (text elements), such as diacritic characters and emoji is:

public static string Reverse(this string input)
{
    return string.Create(input.Length, input, CopyRunesBackwards);

    static void CopyRunesBackwards(Span<char> span, string state)
    {
        var offset = span.Length;
        var enumerator = StringInfo.GetTextElementEnumerator(state);
        while (enumerator.MoveNext())
        {
            var t = enumerator.GetTextElement().AsSpan();
            offset -= t.Length;
            t.CopyTo(span[offset..]);
        }
        Debug.Assert(0 == offset, "String not completly initialized");
    }
}

Unfortunately, this implementation is slower and allocates more:

  • the enumerator is a class
  • GetTextElement returns a substring

. But you known how it is:

Make it work, make it right, make it fast.

So, getting inspiration from StringRuneEnumerator and TextElementEnumerator, I implemented a GraphemeClusterEnumerator. Coding it took more time than anticipated because it requires System.Text.Unicode.TextSegmentationUtility, which is an internal class, so I had to grab some code from the dotnet/runtime repo in github. The shape of the implementation is the same as above but using structs reduce the number of allocations.

Benchmark

The different implementations benchmark results are:

BenchmarkDotNet v0.13.8, Windows 11 (10.0.22000.2416/21H2/SunValley)
Intel Core i7-8700K CPU 3.70GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET SDK 7.0.401
  [Host]     : .NET 7.0.11 (7.0.1123.42427), X64 RyuJIT AVX2
  DefaultJob : .NET 7.0.11 (7.0.1123.42427), X64 RyuJIT AVX2
Method Input Mean Error StdDev Median Ratio RatioSD Allocated Alloc Ratio
Reverse_Array **** 4.602 ns 0.0906 ns 0.0756 ns 4.596 ns 1.00 0.00 - NA
Reverse_Linq 41.157 ns 0.5132 ns 0.4549 ns 41.166 ns 8.94 0.16 80 B NA
Reverse_Span 3.306 ns 0.1261 ns 0.1808 ns 3.209 ns 0.74 0.04 - NA
Reverse_Text 3.209 ns 0.0914 ns 0.0855 ns 3.186 ns 0.70 0.02 - NA
Reverse_Optimized 3.307 ns 0.0622 ns 0.0552 ns 3.286 ns 0.72 0.02 - NA
Reverse_Array 😀😟 20.203 ns 0.4121 ns 0.3441 ns 20.070 ns 1.00 0.00 64 B 1.00
Reverse_Linq 😀😟 85.967 ns 0.7060 ns 0.6258 ns 86.001 ns 4.26 0.08 144 B 2.25
Reverse_Span 😀😟 14.995 ns 0.3527 ns 0.3464 ns 14.932 ns 0.74 0.02 32 B 0.50
Reverse_Text 😀😟 103.386 ns 2.0087 ns 1.9728 ns 103.014 ns 5.11 0.12 144 B 2.25
Reverse_Optimized 😀😟 63.314 ns 0.6747 ns 0.6311 ns 63.150 ns 3.14 0.06 32 B 0.50
Reverse_Array 0123456789 24.129 ns 0.4867 ns 0.5794 ns 23.844 ns 1.00 0.00 96 B 1.00
Reverse_Linq 0123456789 163.535 ns 2.7770 ns 2.5976 ns 163.151 ns 6.75 0.23 288 B 3.00
Reverse_Span 0123456789 17.467 ns 0.2907 ns 0.2720 ns 17.372 ns 0.72 0.02 48 B 0.50
Reverse_Text 0123456789 401.356 ns 4.9837 ns 4.1616 ns 399.765 ns 16.52 0.47 336 B 3.50
Reverse_Optimized 0123456789 261.415 ns 2.5037 ns 2.2194 ns 260.985 ns 10.78 0.32 48 B 0.50
Reverse_Array abcde(...)vwxyz [26] 26.849 ns 0.5541 ns 0.4912 ns 26.583 ns 1.00 0.00 160 B 1.00
Reverse_Linq abcde(...)vwxyz [26] 278.826 ns 4.8125 ns 4.0186 ns 278.790 ns 10.38 0.27 464 B 2.90
Reverse_Span abcde(...)vwxyz [26] 19.804 ns 0.1992 ns 0.1663 ns 19.815 ns 0.74 0.02 80 B 0.50
Reverse_Text abcde(...)vwxyz [26] 992.318 ns 10.9564 ns 9.7126 ns 991.909 ns 36.97 0.65 752 B 4.70
Reverse_Optimized abcde(...)vwxyz [26] 660.109 ns 8.5522 ns 6.6770 ns 659.187 ns 24.55 0.57 80 B 0.50
Reverse_Array grinn(...)ce 😟 [34] 27.702 ns 0.5980 ns 0.6887 ns 27.426 ns 1.00 0.00 192 B 1.00
Reverse_Linq grinn(...)ce 😟 [34] 340.099 ns 2.1584 ns 1.8024 ns 340.646 ns 12.21 0.39 584 B 3.04
Reverse_Span grinn(...)ce 😟 [34] 19.777 ns 0.3133 ns 0.2931 ns 19.653 ns 0.71 0.02 96 B 0.50
Reverse_Text grinn(...)ce 😟 [34] 1,257.443 ns 20.1528 ns 23.2079 ns 1,255.964 ns 45.41 1.24 928 B 4.83
Reverse_Optimized grinn(...)ce 😟 [34] 882.679 ns 11.5303 ns 10.7855 ns 878.964 ns 31.77 0.92 96 B 0.50

References & Useful Links

Leave a comment

Please note that we won't show your email to others, or use it for sending unwanted emails. We will only use it to render your Gravatar image and to validate you as a real person.