My national language (Maltese) is described in Maltese language - Wikipedia as “the only Semitic language written in the Latin alphabet in its standard form”.  I was recently asked whether SQL Server can store Maltese characters, or more specifically the characters which fall outside the basic Latin character set.

The Maltese alphabet - Wikipedia contains 4 letters which extend the basic Latin alphabet.  Including their lower-case forms, these are Ċċ, Ġġ, Ħħ, and Żż.  Words containing these letters can be stored reliably in an SQL Server database only if the character data type can store Unicode characters; i.e. the nchar and nvarchar (Transact-SQL) data types (SQL Server 2019 and later can also store them in char and varchar columns that use a UTF-8 collation).  Of course the consuming application should also be able to display the data retrieved - for example, when outputting to HTML or XML, the charset and encoding properties respectively should be set to a Unicode encoding such as UTF-8.
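
As a minimal sketch of the difference between the two families of data types, the following stores the same Maltese words in a varchar and an nvarchar column.  The table variable and column names (@Words, AsVarchar, AsNvarchar) are made up for illustration, and the multi-row VALUES clause assumes SQL Server 2008 or later.  Note the N prefix on the string literals: without it the literal is treated as non-Unicode before it ever reaches the column.

```sql
-- Hypothetical table variable used only for this comparison.
DECLARE @Words TABLE (
    AsVarchar  varchar(20),   -- non-Unicode (unless a UTF-8 collation is in effect)
    AsNvarchar nvarchar(20)   -- Unicode
);

-- The N prefix marks the literals as Unicode strings.
INSERT INTO @Words (AsVarchar, AsNvarchar)
VALUES (N'Żebbuġ', N'Żebbuġ'),
       (N'ħobż',   N'ħobż');

-- The nvarchar column returns the words intact; what the varchar column
-- returns depends on the code page of the collation in effect.
SELECT AsVarchar, AsNvarchar
FROM @Words;
```

With a typical non-UTF-8 collation, the Maltese letters in the varchar column would be expected to come back as substitute characters (a similar-looking plain letter or a question mark), while the nvarchar column preserves them.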

Another item to watch out for is the use of functions.  Sometimes when developing an application you’ll want to find the ASCII value of a character in a string, or convert an ASCII numeric value to a character.  SQL Server has two built-in functions for this, namely ASCII (Transact-SQL) and CHAR (Transact-SQL).  However, these two functions work only with the first 256 characters of the Latin character set and with numeric values between 0 and 255 respectively.  For Unicode characters SQL Server has two other functions which can be used as replacements: UNICODE (Transact-SQL) (replaces ASCII) and NCHAR (Transact-SQL) (replaces CHAR).
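
The following sketch puts the four functions side by side, using the letter Ħ (code point 294) as the test value.  The column aliases are arbitrary.

```sql
SELECT
    UNICODE(N'Ħ') AS UnicodeValue,  -- 294
    NCHAR(294)    AS NcharValue,    -- Ħ
    CHAR(294)     AS CharValue,     -- NULL, because 294 is outside the 0-255 range
    ASCII(N'Ħ')   AS AsciiValue;    -- not 294; the result depends on the code page
```

ASCII first converts its argument to a non-Unicode string, so whatever it returns for Ħ describes the substituted character rather than the original letter.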

Thus, if for example you want to list the 200 characters which follow the first 256 (code points 256 to 455, a range which includes the Maltese characters) you can use the following sample:

-- Print code points 256 to 455: the 200 characters after the first 256.
DECLARE @i int = 256;
WHILE (@i < 456)
BEGIN
    -- Output: code point, character, and the UNICODE() round-trip value
    PRINT CAST(@i AS nvarchar(10)) + N' ' + NCHAR(@i) + N' (' +
        CAST(UNICODE(NCHAR(@i)) AS nvarchar(10)) + N')';
    SET @i += 1;
END;

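Note that the above code sample will work only with SQL Server 2008 and later versions, because it initialises the variable within the DECLARE statement and uses the += compound assignment operator.  For SQL Server 2000 and 2005 the syntax has to be modified slightly.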
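
As a rough sketch of that modification, the variable can be declared and assigned in separate statements, and the increment written out in full.  The range shown here (code points 256 to 455) is chosen so that it covers the Maltese characters.

```sql
-- Variant for SQL Server 2000 and 2005: no inline initialisation, no += operator.
DECLARE @i int;
SET @i = 256;

WHILE (@i < 456)
BEGIN
    PRINT CAST(@i AS nvarchar(10)) + N' ' + NCHAR(@i) + N' (' +
        CAST(UNICODE(NCHAR(@i)) AS nvarchar(10)) + N')';
    SET @i = @i + 1;
END;
```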

The Unicode values (decimal code points) for the Maltese characters mentioned above are:

Character | Unicode Value
--------- | -------------
Ċ | 266
ċ | 267
Ġ | 288
ġ | 289
Ħ | 294
ħ | 295
Ż | 379
ż | 380
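
These values can be checked with a short query such as the sketch below.  The derived table alias and column name (M, Letter) are placeholders, and the VALUES row constructor assumes SQL Server 2008 or later.

```sql
-- Return each Maltese letter alongside its decimal Unicode code point.
SELECT M.Letter, UNICODE(M.Letter) AS CodePoint
FROM (VALUES (N'Ċ'), (N'ċ'), (N'Ġ'), (N'ġ'),
             (N'Ħ'), (N'ħ'), (N'Ż'), (N'ż')) AS M(Letter)
ORDER BY CodePoint;
```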

When working with Unicode characters we have to bear in mind these little details as well as collation sequences, the fonts used to display text in applications and reports, the encoding of exported or output files, and more.  But that’s another story.