In the first example, only the collation needs to be changed. The data should not change character sets. To resolve the collation issue, you need to rebuild the database with new collation settings using the old data.
In an English environment, consider an old database using the 850LATIN1 collation. If the database contains data inserted from a Windows 'windowed' application, it is likely that the data is actually from the CP1252 character set, which does not match CP850 used by the 850LATIN1 collation. This situation will often be discovered when an ORDER BY clause seems to sort accented characters incorrectly. To correct this problem, you would create a new database using the 1252LATIN1 collation, and move the data from the old database to the new database without translation, since the data is already in the character set (CP1252) that matches the new database's collation.
The simplest way to ensure that translation does not occur is to start the server with the -ct- option. Then, rebuild the database normally.
For more information about rebuilding a database, see Rebuilding databases.
For more information about specifying collations when creating databases, see Creating a database with a named collation.
In the second situation, both the collation and the character set need to be changed. To resolve the collation and character set issues, you need to rebuild the database with the new collation settings, and change the character set of the data.
Suppose that the 850LATIN1 database had been used properly such that it contains characters from the CP850 character set. However, you want to update both the collation and the character set, perhaps to avoid character set translation. You would create a new database using 1252LATIN1, and move the data from the old database to the new database with translation, thus converting the CP850 characters to CP1252.
The translation of the database data from one character set to another occurs using the client-server translation feature of the server, which translates the character data during the communication between the client application and the server. The database's collation determines the character set for the database server side of the communication. The locale of the operating system determines the client's default character set, however, the client's character set can be overridden by the CharSet (CS) connection parameter.
For more information about the CharSet (CS) connection parameter, see CharSet connection parameter [CS].
Since character set translation takes place during the communication between the client application and the server, an external unload or reload is necessary. An internal unload and reload does not do character set translation. Similarly, if character set translation occurs in both the unload and the reload steps, you perform the translation and then immediately undo the translation and still end up where you began. Character set translation can occur in either the unload or the reload steps, but not in both.
To convert a database from one collation to another, and translate the data's character set (using translation on reload)
Unload the data from the source database.
You can use the Unload utility to produce a reload.SQL file and a set of data files in the character set of the source database. Since we do not want any translation during this phase, ensure that character set translation is not enabled on the server running the source database. For servers before version 8, ensure that -ct is not specified. If you are using a server that is version 8 or higher, ensure that -ct- (no character set translation) is specified when the server is started.
If the unload/reload is occurring on a single machine, use the -ix option to do an internal unload and an external reload. If the unload/reload occurs across machines, use the Unload utility with the -xx option to force an external unload and an external reload.
Remember that an "external" unload or reload means that an application (the dbunload and dbisql utilities) opens a cursor on the database and either reads or writes the data to disk. Character set translation occurs. An "internal" unload or reload means that an UNLOAD TABLE or LOAD TABLE is used so that the server reads or writes the data itself. Character set translation does not occur.
If you want to unload data from specific tables, use the -t option, or the Interactive SQL OUTPUT statement.
For more information on the Unload utility, see The Unload utility.
Create a target database with the appropriate collation using the Initialization utility including the -z option to specify the collation sequence for the target database.
The database should be created and reloaded using the version of the server and tools corresponding to the server that they will use to run it.
For more information on specifying collations when creating databases, see Creating a