Delphi, dbExpress and InterBase: steps and risks of migrating to UTF8?

Our databases currently use WIN1252 as their only character set. We will soon have to support Unicode in the database tables, which means we must carry out this migration for four databases and about 80 Delphi applications that run in-house in a 24/7 environment. Are there any recommendations for migrating databases to UTF8 (or UNICODE_FSS) for Delphi applications? Some of our questions are listed below. Many thanks for your answers!

  • Are there tools that help with migrating existing databases (250 MB to 2 GB in size, without BLOB fields) by dumping the data, re-creating the database with UNICODE_FSS or UTF8, and loading the data back?
  • Are there any known issues with Delphi 2009, dbExpress, and InterBase 7.5 related to Unicode character sets?
  • Would you recommend upgrading the databases to InterBase 2009 first? (This upgrade is planned, but does not have high priority.)
  • Can we do just the database migration and let Delphi handle the Unicode character sets automatically, or will we also have to change all character field types in every datamodule (dfm and source code)?
  • What strategy would you recommend for running the migration in parallel with the normal development and maintenance of the existing applications? The applications are used only inside the company, so development and database administration are both done in-house.

Update: from the InterBase discussion forum: "Unicode Databases in InterBase - really?" (this thread is not mine, but it shows that some problems still exist in InterBase XE).

Here are some reports that I submitted: QC #92867 - String fields come back empty from views, but only if the view includes a UNION and a ClientDataSet is used. I noticed this as missing data in several of my reports, which no longer work.

QC #91494 - IB character column data: character fields (for example CHAR(1)) are padded with spaces when retrieved through a stored procedure. Tests then fail, for example: if Active = 'Y'. I use stored procedures with forms heavily, and they no longer work.

QC #91355 - IBSqlMonitor error. IBSqlMonitor's output is garbled, making the tool useless. (So even my spade is broken!)

Unconfirmed - error with persistent TWideString fields in TClientDataSet.

Other related QC entries:

QC #94455 - SQL Unicode Char Type Error (InterBase XE)

8 answers

Both Database Workbench and IBExpert can perform data migration.

I will get back to you on the other questions when I am at the Entwickler Tage.

- Jeroen


Problem: an update of a record with an empty string field no longer finds the record. If a UTF8 character field is empty, the DataSetProvider generates the wrong SELECT for the update action.

Symptom: the message "Record not found or changed by another user"

Solution: upgrade to Delphi 2010 Update 4, or use the workaround described in the QC report


Problem: CHAR fields no longer work and must be replaced with VARCHAR.

Symptom: SELECT queries on a column that now uses UTF8, and that was imported from WIN1252 with ASCII-only values, no longer return any rows. This may be a bug that I should report to QC.

Solution: replace all occurrences of CHAR( in the database's metadata DDL script with VARCHAR(
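
The replacement has to skip declarations that already read VARCHAR(, otherwise they turn into VARVARCHAR(. A minimal Python sketch of such a substitution follows. The function name char_to_varchar is hypothetical, and the pattern assumes the DDL script contains no string literals or comments with the text CHAR( in them, since those would be rewritten as well:

```python
import re

# Match CHAR( only when it is not the tail of a longer identifier such as
# VARCHAR( or NCHAR(. Optional whitespace before the parenthesis is kept.
_CHAR_DECL = re.compile(r"(?<![A-Za-z0-9_$])CHAR(\s*)\(", re.IGNORECASE)


def char_to_varchar(ddl: str) -> str:
    """Return the DDL text with every CHAR(n) declaration turned into VARCHAR(n)."""
    return _CHAR_DECL.sub(r"VARCHAR\1(", ddl)


if __name__ == "__main__":
    sample = "CREATE TABLE T (ACTIVE CHAR(1), NAME VARCHAR(40));"
    print(char_to_varchar(sample))
```

It is worth diffing the converted script against the original before running it, because CAST(... AS CHAR(n)) expressions in views and procedures are changed too.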


Problem: persistent string fields need a Size property that is the logical size of the field multiplied by four (see also: Is it possible to configure TStringField to work like TWideStringField in Delphi?)

Symptom: access violations

Solution: delete the persistent field and add it again so that its Size property is updated. (Side effect: DisplayWidth grows as well, which causes problems in the user interface.)
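
The factor of four presumably comes from UTF-8 itself, which encodes one character in one to four bytes, so a buffer sized for the worst case is four times the logical length. The Python sketch below only illustrates that arithmetic; the helper names are hypothetical, and that dbExpress sizes its buffers for exactly this reason is an assumption:

```python
def utf8_worst_case_bytes(logical_chars: int) -> int:
    """Bytes needed if every character took the UTF-8 maximum of 4 bytes."""
    return logical_chars * 4


def utf8_length(text: str) -> int:
    """Actual number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    # A column declared with 20 characters needs room for 80 bytes.
    print(utf8_worst_case_bytes(20))
    # ASCII, Latin-1, BMP and supplementary characters take 1, 2, 3 and 4 bytes.
    for ch in ("A", "\u00e4", "\u20ac", "\U0001F600"):
        print(repr(ch), utf8_length(ch))
```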


Problem: UDFs (user-defined functions) with string parameters may break because of size limits.

Symptom:

 Dynamic SQL Error. SQL error code = -204. Data type unknown. Implementation limit exceeded. COLUMN DSQL internal.

for this UDF:

 DECLARE EXTERNAL FUNCTION STRLEN
   CSTRING(32767)
   RETURNS INTEGER BY VALUE
   ENTRY_POINT 'IB_UDF_strlen' MODULE_NAME 'ib_udf';

Solution: fix the UDF parameter sizes in the declaration.


Problem: dbExpress uses WideString internally as the data type, so existing .AsString calls that read or set field and parameter values no longer work.

Symptom: special characters are not saved or read correctly

Solution: replace all occurrences of .AsString with .AsWideString, but take care not to change places where AsString is called on something other than a field or parameter.
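
Because a blind search and replace is risky here, it may help to list every candidate first and review each one by hand. Below is a minimal Python sketch of such a listing. The function name find_as_string_calls is hypothetical, and the script cannot tell whether the receiver is a field or a parameter, so that decision stays manual:

```python
import re
from typing import List, Tuple

# ".AsString" as a whole word; Pascal identifiers are case-insensitive.
_AS_STRING = re.compile(r"\.\s*AsString\b", re.IGNORECASE)


def find_as_string_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for each line containing .AsString."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if _AS_STRING.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    unit = (
        "Name := Qry.FieldByName('NAME').AsString;\n"
        "Count := Count + 1;\n"
    )
    for number, text in find_as_string_calls(unit):
        print(number, text)
```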


Problem: dbExpress requires TStringField objects for WIN1252 fields, but TWideStringField objects for UTF8 fields.

Symptom: the error message 'expected: WideString found: string'

Solution: replace all occurrences of TStringField with TWideStringField. This requires all form files (dfm) to be stored as text, not binary. The modified forms and datamodules will not be backward compatible.
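
With about 80 applications, this replacement is easier to script than to do in the IDE. A minimal Python sketch follows, with hypothetical function names. The text-format check is only a heuristic, based on the assumption that text dfm files start with object, inherited or inline:

```python
import re
from typing import Tuple

# Whole-word match, so an existing TWideStringField is never touched.
_FIELD_CLASS = re.compile(r"\bTStringField\b", re.IGNORECASE)


def looks_like_text_dfm(data: bytes) -> bool:
    """Heuristic: text dfm files begin with one of these keywords."""
    head = data.lstrip()[:16].lower()
    return head.startswith((b"object", b"inherited", b"inline"))


def widen_string_fields(source: str) -> Tuple[str, int]:
    """Replace TStringField with TWideStringField; return new text and count."""
    return _FIELD_CLASS.subn("TWideStringField", source)


if __name__ == "__main__":
    dfm = "object QryNAME: TStringField\n  FieldName = 'NAME'\nend\n"
    if looks_like_text_dfm(dfm.encode("ascii")):
        new_text, count = widen_string_fields(dfm)
        print(count, "replacement(s)")
        print(new_text)
```

The same function can be run over the matching .pas files so that the field declarations in the class agree with the dfm.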


Problem: exporting the metadata and table data of the WIN1252 database produces a CP1252-encoded file, but the import requires a UTF8 file (tested with IBExpert)

Symptom: script import errors in InterBase

Solution: use iconv to convert the script file to UTF8
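
Where iconv is not at hand, the same conversion can be sketched in a few lines of Python. The function name and the file names below are hypothetical. The strict decode is deliberate: bytes that are undefined in CP1252 raise an error instead of being silently replaced:

```python
from pathlib import Path


def convert_cp1252_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a CP1252 script file as UTF-8, keeping line endings as they are."""
    # Work on raw bytes so that no newline translation takes place.
    text = src.read_bytes().decode("cp1252")
    dst.write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    source = Path("export_win1252.sql")  # hypothetical export file
    if source.exists():
        convert_cp1252_to_utf8(source, Path("export_utf8.sql"))
```

For multi-gigabyte exports the file would have to be processed in chunks rather than read in one piece, which this sketch does not do.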
