Convert from a two-letter language code to a four-letter Language Tag with PowerShell

Introduction

This post explains how to convert a two-letter language code into the four-letter language tag format (the Language Tag defined by Microsoft).

The Windows Language Tag is based on IETF BCP 47, the best practice that defines Tags for Identifying Languages. All the values, along with the Windows version in which each one was released, are documented in the Windows Language Code Identifier (LCID) Reference.

Language Tag history

In summary, a Language Tag like en-GB is almost always a language-COUNTRY combination, where language is an ISO 639-1 language code and COUNTRY is an ISO 3166-1 country code. There are some exceptions where additional subtags are needed.

For example:

  • “de-AT” represents German (‘de’) as used in Austria (‘AT’).
  • “sr-Latn-RS” represents Serbian (‘sr’) written using Latin script (‘Latn’) as used in Serbia (‘RS’).
  • “es-419” represents Spanish (‘es’) appropriate to the UN-defined Latin America and Caribbean region (‘419’).

Another interesting note is that language tags and their subtags, including private use and extensions, are to be treated as case insensitive: there exist conventions for the capitalization of some of the subtags, but these MUST NOT be taken to carry meaning. Thus, the tag “mn-Cyrl-MN” is not distinct from “MN-cYRL-mn” or “mN-cYrL-Mn” (or any other combination), and each of these variations conveys the same meaning: Mongolian written in the Cyrillic script as used in Mongolia.

There are also some conventions to keep in mind:

  • [ISO639-1] recommends that language codes be written in lowercase (‘mn’ Mongolian).
  • [ISO15924] recommends that script codes use lowercase with the initial letter capitalized (‘Cyrl’ Cyrillic).
  • [ISO3166-1] recommends that country codes be capitalized (‘MN’ Mongolia).
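
The casing conventions above can be sketched as a small PowerShell helper. This is only an illustration: Format-LanguageTag is a hypothetical name, and the logic is a simplification that treats the first subtag as the language, any later four-character subtag as a script, and any later two-character subtag as a country. It does not validate the tag or handle extensions and private-use subtags specially.

```powershell
# Hypothetical helper (not part of any library): applies the conventional
# casing to each subtag of a language tag. Simplified on purpose.
function Format-LanguageTag {
    param([Parameter(Mandatory)][string]$Tag)

    $subtags = $Tag -split '-'
    $result = for ($i = 0; $i -lt $subtags.Count; $i++) {
        $s = $subtags[$i]
        if ($i -eq 0) {
            # Language code: lowercase
            $s.ToLowerInvariant()
        }
        elseif ($s.Length -eq 4) {
            # Script code: initial capital, rest lowercase
            $s.Substring(0, 1).ToUpperInvariant() + $s.Substring(1).ToLowerInvariant()
        }
        elseif ($s.Length -eq 2) {
            # Country code: uppercase
            $s.ToUpperInvariant()
        }
        else {
            # Anything else (for example the region '419') is left unchanged
            $s
        }
    }
    $result -join '-'
}

Format-LanguageTag 'MN-cYRL-mn'   # mn-Cyrl-MN
Format-LanguageTag 'ES-es'        # es-ES
```

Because tags are case insensitive, this formatting is purely cosmetic: the input and the output identify the same language.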

Why convert to Language Tag?

Sometimes we work with functions or methods that require a Language Tag like “es-ES” instead of a two-letter language code. For example, the SetSingleValueProfileProperty method requires a valid four-letter Microsoft Language Tag when it is used to save changes to the “SPS-MUILanguages” property.

How to convert?

We will use the CultureInfo class from the System.Globalization namespace.
Its constructor initializes a new instance of the CultureInfo class.
We will use the constructor that takes a locale name, CultureInfo Constructor (String), rather than the one that takes an integer code.

The string parameter called “name” is:
Type: System.String
A predefined CultureInfo name, Name of an existing CultureInfo, or Windows-only culture name. name is not case-sensitive.

For a list of predefined culture names, see the National Language Support (NLS) API Reference. The constructor accepts any of them, regardless of case.

For example, any of these constructor arguments: “es”, “eS”, “Es”, “ES”, “es-es”, “eS-ES”, “ES-es”, “ES-ES”, or any other casing, produces a CultureInfo whose TextInfo resolves to Spanish (Spain), with LCID = 3082 and Language Tag (Culture Name) = es-ES.
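
To check the case-insensitivity claim on your own machine, a minimal sketch like the following builds a CultureInfo from several spellings and prints both the culture's own Name and its TextInfo.CultureName. The variable names are illustrative. The exact values printed can depend on the PowerShell and .NET version in use, so no output is shown here.

```powershell
# Several spellings of the same language code / language tag
$names = 'es', 'eS', 'ES', 'es-es', 'ES-es', 'ES-ES'

$rows = foreach ($name in $names) {
    $ci = New-Object System.Globalization.CultureInfo($name)
    # Name is the culture's own name; TextInfo.CultureName is the name of
    # the culture associated with its TextInfo.
    '{0,-6} Name={1,-6} TextInfo.CultureName={2}' -f $name, $ci.Name, $ci.TextInfo.CultureName
}
$rows
```

Comparing the two columns shows whether a bare language code such as “es” is a neutral culture whose TextInfo points at a specific language-COUNTRY culture.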

PowerShell code

Here is the PowerShell code that converts a case-insensitive two-letter language code into a Language Tag (four-letter language-COUNTRY format):

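# Two-letter language code to convert (case-insensitive)
$langToConvert = "es"
try {
    # Build a CultureInfo from the culture name
    $cultureInfo = New-Object System.Globalization.CultureInfo($langToConvert)
    # TextInfo.CultureName is the name of the culture associated with the TextInfo
    $languageCulture = $cultureInfo.TextInfo.CultureName
    $languageCulture
}
catch {
    # Report any error (for example, an unrecognized culture name) in red
    Write-Host "`n[Main] Errors found:`n$_" -ForegroundColor Red
}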
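
As a possible alternative, the static CultureInfo.CreateSpecificCulture method returns the specific culture associated with a culture name, so its Name property can also serve as the Language Tag. The sketch below wraps it in a function. ConvertTo-LanguageTag is a hypothetical name chosen for illustration, and which country is picked as the default for a given language is decided by the platform, not by this code.

```powershell
# Hypothetical helper (the name is illustrative, not from any library).
function ConvertTo-LanguageTag {
    param([Parameter(Mandatory)][string]$LanguageCode)

    try {
        # CreateSpecificCulture maps a neutral culture such as "es" to a
        # specific culture associated with it and returns a CultureInfo.
        [System.Globalization.CultureInfo]::CreateSpecificCulture($LanguageCode).Name
    }
    catch [System.ArgumentException] {
        # Unrecognized culture names surface as an ArgumentException
        # (CultureNotFoundException derives from it).
        Write-Warning "Unknown language code: $LanguageCode"
        $null
    }
}

ConvertTo-LanguageTag 'es'   # expected to be es-ES on Windows
ConvertTo-LanguageTag 'DE'   # input casing does not matter
```

Returning $null on failure lets the caller decide whether to fall back to a default tag or stop.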

As an interesting note, the Windows Language Code Identifier Reference contains a table that lists each Language Tag together with its Language ID. That Language ID is the same value as the LCID exposed by the CultureInfo object: for es-ES, $cultureInfo.TextInfo.LCID is 3082, which is 0x0C0A in hexadecimal.
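
The decimal and hexadecimal forms can be converted into each other with standard .NET formatting, which helps when matching an LCID against the Language ID column of the reference table. This is a small sketch with illustrative variable names.

```powershell
$lcid = 3082

# Decimal LCID to the hexadecimal notation used in the reference table
$hex = '0x{0:X4}' -f $lcid               # 0x0C0A

# Hexadecimal Language ID back to the decimal LCID
$back = [Convert]::ToInt32('0C0A', 16)   # 3082

# The LCID can also be used to look the culture up again
# (expected to be es-ES on Windows)
$name = [System.Globalization.CultureInfo]::GetCultureInfo($lcid).Name

$hex
$back
$name
```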

 

Author: José Quinto
Link: https://blog.josequinto.com/2016/04/21/convert-from-two-letters-language-code-to-four-letters-language-tag-with-powershell/
Copyright Notice: All articles in this blog are licensed under CC BY-SA 4.0 unless stating additionally.