Managing culture issues in .NET programming, a challenge for programmers: String comparison

ASP.NET software companies and developers find it challenging to manage string operations such as sorting and comparison when they are culture dependent. Such issues are difficult to troubleshoot and have the potential to break the system.

What is StringComparison?

StringComparison is an enumeration that specifies which rules a string comparison should use. Its values combine the following choices:

  • Current culture or the invariant culture
  • Word or ordinal sort rules
  • Case-sensitive or case-insensitive

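A developer asks for the invariant culture either by passing the CultureInfo.InvariantCulture object to a method that has a CultureInfo parameter, or by passing a StringComparison value to a method that accepts one, such as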

  • Compare(string strA, string strB, bool ignoreCase, CultureInfo culture)

    This function compares two specified String objects, ignoring or honoring their case, and using culture-specific information to influence the comparison.

    It returns an integer that indicates their relative position in the sort order: negative if strA sorts first, zero if the strings are equal, positive if strA sorts last.

    The example below shows a case-insensitive comparison under the invariant culture.

    string stringname1 = "sister";
    string stringname2 = "Sister";

    // Cultural (linguistic) comparison; 'true' means ignore case.
    int result = String.Compare(stringname1, stringname2, true, System.Globalization.CultureInfo.InvariantCulture);
    MessageBox.Show(result.ToString());

    //output: 0

  • Equals(string a, string b, StringComparison comparisonType)

    This function tests two specified String objects for equality, using the culture, case, and sort rules selected by the StringComparison value.

    It returns the Boolean value true if both values are the same under those rules; otherwise it returns false.

    The example below shows how to use StringComparison for this.

    string stringname1 = "case";
    string stringname2 = "Case";

    // Case-sensitive, then case-insensitive, both under the invariant culture.
    bool result = String.Equals(stringname1, stringname2, StringComparison.InvariantCulture);
    bool result2 = String.Equals(stringname1, stringname2, StringComparison.InvariantCultureIgnoreCase);
    MessageBox.Show(result.ToString());
    MessageBox.Show(result2.ToString());
    //output: False
    //output: True

Here, ‘InvariantCultureIgnoreCase’ compares strings using culture-sensitive sort rules and the invariant culture, ignoring the case of the strings being compared.

‘InvariantCulture’ applies the same culture-sensitive sort rules and the invariant culture, but honors case.
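To see the StringComparison values side by side, the following minimal sketch runs the same pair of strings through every member of the enumeration. The class name ComparisonModes and the helper AreEqual are hypothetical names chosen for illustration; only String.Equals and Enum.GetValues come from the framework.

```csharp
using System;

public static class ComparisonModes
{
    // Hypothetical helper: true when a and b are equal under the given mode.
    public static bool AreEqual(string a, string b, StringComparison mode)
    {
        return string.Equals(a, b, mode);
    }

    public static void Main()
    {
        // Walk every member of the enumeration and apply it to one pair of strings.
        foreach (StringComparison mode in Enum.GetValues(typeof(StringComparison)))
        {
            Console.WriteLine("{0,-28} {1}", mode, AreEqual("case", "Case", mode));
        }
    }
}
```

For this pair, the three case-sensitive values should report False and the three IgnoreCase values True. The culture-sensitive and ordinal variants only start to disagree on input where a culture's rules differ from the raw character values.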

Why is it difficult to manage?

In normal cases, string comparison is not an issue. The issue arises when culturally sensitive routines are used in places where the result must not depend on culture. String comparison is especially difficult to manage when multiple cultures are used in one project.

In most Latin alphabets, the letter i (U+0069) is the lowercase version of I (U+0049). The Turkish alphabet, however, has two versions of the letter I, one with a dot and one without, so its case mappings differ:

English – i (U+0069) <-> English – I (U+0049)

Turkish – i (U+0069) <-> Turkish – İ (U+0130)

Turkish – ı (U+0131) <-> Turkish – I (U+0049)
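The mappings above can be checked directly with culture-specific casing. This is a minimal sketch, assuming the tr-TR culture data is available on the machine (it would not be in globalization-invariant mode). TurkishCasing, Upper, Lower, and CodePoints are hypothetical names, and the results are printed as code points to avoid console encoding issues.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class TurkishCasing
{
    // Hypothetical helpers: change case using the rules of the named culture.
    public static string Upper(string s, string cultureName)
    {
        return s.ToUpper(new CultureInfo(cultureName));
    }

    public static string Lower(string s, string cultureName)
    {
        return s.ToLower(new CultureInfo(cultureName));
    }

    // Formats each character as U+XXXX.
    public static string CodePoints(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        Console.WriteLine(CodePoints(Upper("i", "en-US"))); // U+0049
        Console.WriteLine(CodePoints(Upper("i", "tr-TR"))); // U+0130
        Console.WriteLine(CodePoints(Lower("I", "en-US"))); // U+0069
        Console.WriteLine(CodePoints(Lower("I", "tr-TR"))); // U+0131
    }
}
```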

Here is some code that demonstrates what this means. It needs the System.Globalization and System.Threading namespaces, and the three-argument String.Compare overload it calls uses the thread's current culture:

Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");
Console.WriteLine("Culture = {0}", Thread.CurrentThread.CurrentCulture.DisplayName);
Console.WriteLine("(file == FILE) = {0}", (string.Compare("file", "FILE", true) == 0));

Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
Console.WriteLine("Culture = {0}", Thread.CurrentThread.CurrentCulture.DisplayName);
Console.WriteLine("(file == FILE) = {0}", (string.Compare("file", "FILE", true) == 0));

//Output:
//Culture = English (United States)
//(file == FILE) = True
//Culture = Turkish (Turkey)
//(file == FILE) = False

The solution and recommendation here is to stop relying on the thread's current culture and to name the comparison explicitly, for example ‘InvariantCulture’, once the developer knows whether the application has to run under different cultures.
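As a rough sketch of that recommendation, the comparison below names its culture explicitly, so the result no longer changes when the thread's culture is Turkish. ExplicitComparison and SameName are hypothetical names used only for illustration, and the sketch assumes the tr-TR culture data is installed.

```csharp
using System;
using System.Globalization;
using System.Threading;

public static class ExplicitComparison
{
    // Hypothetical helper: case-insensitive match that ignores the thread's culture.
    public static bool SameName(string a, string b)
    {
        return string.Compare(a, b, true, CultureInfo.InvariantCulture) == 0;
    }

    public static void Main()
    {
        // Switch to the culture that broke the implicit comparison.
        Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");

        // Invariant culture, ignoring case.
        Console.WriteLine(SameName("file", "FILE")); // True

        // Ordinal alternative: compares character values, ignoring case.
        Console.WriteLine(string.Equals("file", "FILE", StringComparison.OrdinalIgnoreCase)); // True
    }
}
```

Both calls give the same answer here. The ordinal form is the usual choice for identifiers such as file names or protocol names, where no linguistic rules are wanted at all.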

Common mistakes by programmers

  • Code that relies on default comparison behavior does not state its intent. It is difficult to know whether the developer actually meant an ordinal comparison of the two strings, or whether the case-sensitive difference was an oversight, as in the example below.

    string strprotocol = "HTTP";
    if(!String.Equals(strprotocol, "http"))
    {
            throw new InvalidOperationException();
    }

    The version below makes it clear which comparison is intended: here the invariant culture, with differences in case ignored. An application that does not target multiple cultures could name a current-culture or ordinal comparison in the same way.

    string strprotocol = "HTTP";
    if(!String.Equals(strprotocol, "http", StringComparison.InvariantCultureIgnoreCase))
    {
            throw new InvalidOperationException();
    }

Conclusion

If an ASP.NET software development company or developer is writing an application that will run in multiple locales, they should know about these issues even if the software contains no globalization or localization logic, so that the code functions properly in all scenarios.

The following points are worth considering when using StringComparison.

  • One should use an overload of the String.Equals method to test the equality of two strings.
  • One should use the CompareTo and String.Compare methods to sort strings, not to check equality.
  • One should state the culture explicitly when comparing strings instead of relying on the default.
  • StringComparison.Ordinal and StringComparison.OrdinalIgnoreCase compare the binary values directly and are best suited for matching. One can use one of these when unsure about the comparison settings.
  • One should not use culture-sensitive formatting to persist numeric data, symbolic data, or date and time data in string form.
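  • One should reserve string operations based on InvariantCulture for the cases that need them. Microsoft's guidance treats persisting linguistically meaningful but culturally agnostic data as one of the few such cases, and prefers ordinal comparisons for identifiers such as protocol or file names.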
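The point about persisting numeric data can be illustrated with a small round trip. This is a sketch under stated assumptions: InvariantPersistence, Save, and Load are hypothetical names, and de-DE stands in for any culture that uses a comma as its decimal separator.

```csharp
using System;
using System.Globalization;
using System.Threading;

public static class InvariantPersistence
{
    // Hypothetical helper: writes a number in a culture-independent form.
    public static string Save(double value)
    {
        return value.ToString("R", CultureInfo.InvariantCulture);
    }

    // Hypothetical helper: reads back a number written by Save.
    public static double Load(string text)
    {
        return double.Parse(text, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Thread.CurrentThread.CurrentCulture = new CultureInfo("de-DE");

        Console.WriteLine(1234.5.ToString()); // 1234,5 (culture-sensitive, unsafe to persist)
        Console.WriteLine(Save(1234.5));      // 1234.5 (same text under every culture)
        Console.WriteLine(Load("1234.5") == 1234.5); // True
    }
}
```

The same idea applies to dates: formatting with an explicit pattern and CultureInfo.InvariantCulture keeps the stored text readable by a process running under any other culture.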