Chapter 3. Localization

1. Multibyte, UNICODE and Locales

The pgExpress Driver can handle Multibyte/UNICODE/Locales in the following ways:

  1. Using the built-in dbExpress™ Locale support.

    Basically, you just have to set the following in your dbxconnections file:

    LocaleCode = XXXX

    where XXXX is the TLocaleCode type value for your locale. For instance:

    LocaleCode = 1041

    would set the current locale to 1041 (= Japanese).

    Check the Delphi help for the TLocaleCode type, TSQLConnection.LocaleCode, and Driver parameters (you can use the help's 'Find' feature if you cannot locate these easily).

  2. Using automatic PostgreSQL server-client conversion. The pgExpress Driver implements this through the ServerCharset parameter in the dbxconnections(.ini) file (it can also be set as a value in the TSQLConnection.Params property).

    Since dbExpress™ does not support custom parameters, and has no ClientCharset parameter, we have to use the following hack: providing both the server and client encodings in the ServerCharset parameter (the client encoding is optional). The format used is:

    ServerCharset = ServerEncoding[/ClientEncoding]

    Examples (use only one of these lines at a time):

    ServerCharset = EUC_JP
    ServerCharset = EUC_TW/UNICODE

    If no client or server encoding is set, defaults are used depending on the database encoding.

    The valid encodings, from the PostgreSQL documentation, are:

    Table 3.1. PostgreSQL Encodings

    Encoding       Description
    SQL_ASCII      ASCII
    EUC_JP         Japanese EUC
    EUC_CN         Chinese EUC
    EUC_KR         Korean EUC
    EUC_TW         Taiwan EUC
    BIG5           Chinese BIG5
    UNICODE        Unicode (UTF-8)
    MULE_INTERNAL  Mule internal code
    LATIN1         ISO 8859-1 ECMA-94 Latin Alphabet No.1
    LATIN2         ISO 8859-2 ECMA-94 Latin Alphabet No.2
    LATIN3         ISO 8859-3 ECMA-94 Latin Alphabet No.3
    LATIN4         ISO 8859-4 ECMA-94 Latin Alphabet No.4
    LATIN5         ISO 8859-9 ECMA-128 Latin Alphabet No.5
    LATIN6         ISO 8859-10 ECMA-144 Latin Alphabet No.6
    LATIN7         ISO 8859-13 Latin Alphabet No.7
    LATIN8         ISO 8859-14 Latin Alphabet No.8
    LATIN9         ISO 8859-15 Latin Alphabet No.9
    LATIN10        ISO 8859-16 ASRO SR 14111 Latin Alphabet No.10
    ISO-8859-5     ECMA-113 Latin/Cyrillic
    ISO-8859-6     ECMA-114 Latin/Arabic
    ISO-8859-7     ECMA-118 Latin/Greek
    ISO-8859-8     ECMA-121 Latin/Hebrew
    KOI8           KOI8-R(U)
    WIN            Windows CP1251
    ALT            Windows CP866

    The valid server and client encoding values, with their detailed descriptions, are listed in the Multibyte section of the PostgreSQL documentation. Internally, pgExpress interprets the values in the following way:

    1. If you provide only a ServerEncoding, pgExpress will try to set up a default ClientEncoding for it. libpq sets the default ClientEncoding to the same value as the ServerEncoding, except for the UNICODE and MULE_INTERNAL ServerEncoding values, which have no default.
    2. If you provide a ClientEncoding parameter, it will be set regardless of what is defined in the ServerEncoding parameter. If you want to set only a ClientEncoding value, just omit the ServerEncoding value (but include the / separator), like this:
      ServerCharset = /latin2

      This will set the ClientEncoding to latin2 regardless of the ServerEncoding.

      Other examples:

      ServerCharset = latin2

      libpq will set the ClientEncoding to latin2 because it is the default client encoding for the latin2 ServerEncoding.

      ServerCharset = latin2/latin3

      This will set the ClientEncoding to latin3.

    3. As of BDS 2006 and pgExpress 4.x, there is also UNICODE support through the use of TWideStringField fields. To use UNICODE, the following requirements must be met:

      • Use the pgExpress Driver for dbExpress™ protocol 3.0
      • Set the application's client charset to UNICODE. If the database is already UNICODE, and protocol 3.0 driver is used, this will be automatically set.
      • Set the UnicodeAsWideChar special param so that the string fields are mapped as TWideStringField fields; otherwise, they would be mapped as TStringField fields.

      Note

      Unfortunately, as of BDS 2006 there is no TWideMemo support in dbExpress™. The VCL/CLX will remap any fields defined as such to ftBlob/ftBinary.
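
The interpretation rules for the ServerCharset value given in items 1 and 2 above can be illustrated with a short sketch. It is written in Python only to make the rules concrete; it is not the driver's actual implementation. The function name parse_server_charset, the Charsets type, and the case-insensitive comparison of encoding names are assumptions made for this illustration.

```python
from typing import NamedTuple, Optional

# Server encodings for which, per the rules above, there is no default
# client encoding.
NO_DEFAULT_CLIENT = {"UNICODE", "MULE_INTERNAL"}


class Charsets(NamedTuple):
    server: Optional[str]  # None when the ServerEncoding part is omitted
    client: Optional[str]  # None when no client encoding can be determined


def parse_server_charset(value: str) -> Charsets:
    """Split 'ServerEncoding[/ClientEncoding]' and resolve the client default.

    Hypothetical helper: it mirrors the documented rules, not driver code.
    """
    server_part, _, client_part = value.strip().partition("/")
    server = server_part.strip() or None
    client = client_part.strip() or None
    if client is None and server is not None:
        # Only a ServerEncoding was given: the client encoding defaults to
        # the same value, except for UNICODE and MULE_INTERNAL.
        # (Comparing names case-insensitively is an assumption.)
        if server.upper() not in NO_DEFAULT_CLIENT:
            client = server
    return Charsets(server, client)


if __name__ == "__main__":
    for example in ("latin2", "/latin2", "latin2/latin3", "UNICODE"):
        print(example, "->", parse_server_charset(example))
```

In this sketch an explicit ClientEncoding always wins, and a result with no client encoding (as for a bare UNICODE value) corresponds to the case where one has to be supplied explicitly.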

If you have problems opening databases with UNICODE encoding ("Invalid encoding" errors), add a client encoding to your dbxconnections(.ini) entry. It will convert from the multibyte encoding to a single-byte encoding appropriate for your locale, such as:

ServerCharset = /latin1