Copyright © TIBCO Software Inc. All Rights Reserved



Configuring Unicode Processing
You can configure the conversion, collation, and case processing of Unicode data in TIBCO Object Service Broker. The supplied configuration files give you the following configuration choices:
http://source.icu-project.org/repos/icu/data/trunk/charset/data/ucm/
These source configuration files are read by the UNIGEN utility, which produces assembler source code. The source code is then assembled and the resulting object code is processed by SMP/E to replace the original configuration data in the system. The original data corresponds to the IBM-037 code page. There are no External User Syntaxes defined by default.
Five types of configuration data are used for the following:
File Formats
Each of the first four source configuration files consists of lines no longer than 80 characters:
Data lines can include comments (which follow an asterisk) after the required fields. The formats of data lines for the four types of files are shown below. The names in parentheses are the names of the files used to configure the system.
Unicode to EBCDIC Mapping (UNITOEBC)
Data mapping lines contain two significant fields (separated by white space):
Example:

 
* TIBCO Object Service Broker Unicode to EBCDIC conversion file
* Based on EBCDIC code page IBM-037.
0030 F0 *The character '0'
0031 F1 *The character '1'

 
A Unicode character can be mapped only once. You can map more than one Unicode character to the same EBCDIC character.
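The mapping rule above can be checked before the file is passed to UNIGEN. The following Python sketch is not part of the product. It assumes, based only on the example, that the first field is a hexadecimal Unicode code point and the second a hexadecimal EBCDIC byte; the function name `parse_unitoebc` is hypothetical.

```python
def parse_unitoebc(lines):
    """Parse UNITOEBC-style lines into {unicode_code_point: ebcdic_byte}.

    Assumed layout (inferred from the example, not a specification):
    field 1 = hex Unicode code point, field 2 = hex EBCDIC byte,
    and an asterisk starts a comment.
    """
    mapping = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if len(line) > 80:
            raise ValueError(f"line {number}: longer than 80 characters")
        fields = line.split("*", 1)[0].split()
        if not fields:
            continue  # blank line or comment-only line
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected two fields")
        code_point, ebcdic = int(fields[0], 16), int(fields[1], 16)
        if code_point in mapping:
            # A Unicode character can be mapped only once.
            raise ValueError(f"line {number}: U+{code_point:04X} mapped twice")
        # Several Unicode characters may share one EBCDIC character.
        mapping[code_point] = ebcdic
    return mapping


sample = [
    "* Based on EBCDIC code page IBM-037.",
    "0030 F0 *The character '0'",
    "0031 F1 *The character '1'",
]
table = parse_unitoebc(sample)
```

Raising on the first duplicate keeps the sketch short; a fuller checker might collect all offending lines before reporting.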
EBCDIC to Unicode Mapping (EBCTOUNI)
Data mapping lines contain two significant fields (separated by white space):
Example:

 
* TIBCO Object Service Broker EBCDIC to Unicode conversion file
* Based on EBCDIC code page IBM-037.
F0  0030    *The character '0'
F1  0031    *The character '1'

 
An EBCDIC character can be mapped only once.
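As an optional sanity check, the two conversion files can be compared with each other. The Python sketch below is an illustration only: the source does not require the tables to round-trip, the field layout is inferred from the examples, and the names `parse_pairs` and `check_round_trip` are hypothetical.

```python
def parse_pairs(lines):
    """Return (first_field, second_field) pairs, both read as hexadecimal."""
    pairs = []
    for line in lines:
        fields = line.split("*", 1)[0].split()  # drop the trailing comment
        if len(fields) == 2:
            pairs.append((int(fields[0], 16), int(fields[1], 16)))
    return pairs


def check_round_trip(ebctouni_lines, unitoebc_lines):
    """List EBCDIC bytes whose Unicode target does not map back to them."""
    to_unicode = {}
    for ebcdic, code_point in parse_pairs(ebctouni_lines):
        if ebcdic in to_unicode:
            # An EBCDIC character can be mapped only once.
            raise ValueError(f"EBCDIC {ebcdic:02X} is mapped more than once")
        to_unicode[ebcdic] = code_point
    to_ebcdic = dict(parse_pairs(unitoebc_lines))
    return sorted(e for e, u in to_unicode.items() if to_ebcdic.get(u) != e)


ebctouni = [
    "F0  0030    *The character '0'",
    "F1  0031    *The character '1'",
]
unitoebc = [
    "0030 F0 *The character '0'",
    "0031 F1 *The character '1'",
]
mismatches = check_round_trip(ebctouni, unitoebc)
```

An empty result means every EBCDIC character listed converts to Unicode and back to itself under the two tables.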
Unicode Case Mapping (UNICASE)
Case mapping lines contain three significant fields (separated by white space):
A case indicator: either U or u if the character is uppercase, or L or l if the character is lowercase.
Example:

 
* TIBCO Object Service Broker Unicode Case Mapping File
* Based on Unicode locale en_US.
0041 U 0061   * A
FF22 U FF42   * B

 
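A case mapping file can be checked in a similar way. This Python sketch assumes, from the example only, that the first field is the code point being described and the third field is its opposite-case counterpart; the function name `parse_unicase` is hypothetical.

```python
def parse_unicase(lines):
    """Parse UNICASE-style lines into (code_point, indicator, counterpart).

    Assumed layout (inferred from the example): hex code point,
    case indicator (U/u or L/l), hex code point of the other case.
    """
    entries = []
    for number, line in enumerate(lines, start=1):
        fields = line.split("*", 1)[0].split()
        if not fields:
            continue  # blank line or comment-only line
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected three fields")
        indicator = fields[1].upper()  # U and u, L and l are equivalent
        if indicator not in ("U", "L"):
            raise ValueError(f"line {number}: bad case indicator {fields[1]!r}")
        entries.append((int(fields[0], 16), indicator, int(fields[2], 16)))
    return entries


sample = [
    "* Based on Unicode locale en_US.",
    "0041 U 0061   * A",
    "FF22 U FF42   * B",
]
entries = parse_unicase(sample)
```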
Unicode Collation (UNICOLL)
Data lines contain a single significant field:
The data lines list the code points in order of their collation. The file must contain 65,536 unique data lines to specify all possible code points.
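The completeness rule for the collation file is easy to verify mechanically. The Python sketch below assumes that the single field is a hexadecimal code point, as in the other files; the function name `check_unicoll` is hypothetical.

```python
def check_unicoll(lines):
    """Return code points in collation order, or raise if the file is incomplete."""
    order = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        fields = line.split("*", 1)[0].split()
        if not fields:
            continue  # blank line or comment-only line
        code_point = int(fields[0], 16)  # assumed: hex code point
        if not 0 <= code_point <= 0xFFFF:
            raise ValueError(f"line {number}: code point out of range")
        if code_point in seen:
            raise ValueError(f"line {number}: {code_point:04X} listed twice")
        seen.add(code_point)
        order.append(code_point)
    if len(order) != 65536:
        raise ValueError(f"expected 65536 data lines, found {len(order)}")
    return order


# Illustrative input: code points collated in plain numeric order.
identity = [f"{cp:04X}" for cp in range(0x10000)]
order = check_unicoll(identity)
```

The position of a code point in the returned list is its collation rank under the assumed layout.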
Unicode to/from External User Syntax Mapping (UNIXC01-UNIXC16)
The fifth source configuration file type is a ucm (UniCode Mapping) file, which specifies a mapping between Unicode and a user-defined external syntax. You can have up to 16 files of this type to map up to 16 different external user syntaxes.
The format is described at the following site:
http://icu.sourceforge.net/userguide/conversion-data.html
Data lines contain three significant fields (separated by white space):
A value (|0 or |1 or |2 or |3) indicating the fallback code to be used for this mapping. Only codes 0 and 1 are honored by TIBCO Object Service Broker.
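To see which lines of a ucm file would take effect, the mapping lines can be filtered by fallback code. The Python sketch below assumes the usual ICU ucm line shape (`<UXXXX>`, one or more `\xHH` bytes, then `|n`); the sample mappings are made up for illustration and the function name `parse_ucm_mappings` is hypothetical.

```python
import re

# <U0041> \x41 |0  ->  code point, external byte sequence, fallback code
MAPPING = re.compile(
    r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|([0-3])"
)


def parse_ucm_mappings(lines):
    """Return (code_point, external_bytes, fallback) for codes 0 and 1 only."""
    honored = []
    for line in lines:
        match = MAPPING.match(line.strip())
        if not match:
            continue  # header, comment, CHARMAP marker, or blank line
        code_point = int(match.group(1), 16)
        external = bytes.fromhex(match.group(2).replace("\\x", ""))
        fallback = int(match.group(3))
        if fallback in (0, 1):  # codes 2 and 3 are not honored
            honored.append((code_point, external, fallback))
    return honored


sample = [
    "CHARMAP",
    r"<U0041> \x41 |0",
    r"<U00E9> \x65 |1",
    r"<U001A> \x7F |2",
    "END CHARMAP",
]
mappings = parse_ucm_mappings(sample)
```

Lines that do not look like mappings are skipped rather than rejected, because a ucm file also carries header and marker lines.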
Sample Unicode Configuration Files
The following sample configuration files are shipped with TIBCO Object Service Broker. These names are the member names in the UNICODE data set provided. The 3- or 4-digit numbers in the filenames refer to the IBM-xxx EBCDIC code page they are based on. You can use the files as they are, or you can modify copies of these files to create the desired configuration specification.
 
Specifying Unicode Configuration (Optional)
 
You can choose the configuration files (if any) to modify. The UNIGEN in-stream PROC that appears in the JCL takes three parameters:
IN – the member name of the input configuration file. For correct performance, you should choose files that correspond to the NLS code page of your system.
OUT – the member name of the output assembler source that the UNIGEN program generates. The suggested values for OUT are UNITOEBC, EBCTOUNI, UNICASE, and UNICOLL, which correspond to P values 1 through 4, respectively.
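The pairing of P values with suggested OUT member names can be written as a small lookup table. This Python sketch only restates the correspondence given above; the helper name `suggested_out` is hypothetical and nothing here is invoked by the UNIGEN PROC.

```python
# Suggested OUT member name for each value of P (1 through 4).
SUGGESTED_OUT = {
    1: "UNITOEBC",  # Unicode to EBCDIC mapping
    2: "EBCTOUNI",  # EBCDIC to Unicode mapping
    3: "UNICASE",   # Unicode case mapping
    4: "UNICOLL",   # Unicode collation
}


def suggested_out(p):
    """Return the suggested OUT member name for a P value of 1 to 4."""
    if p not in SUGGESTED_OUT:
        raise ValueError(f"P must be 1, 2, 3, or 4, not {p!r}")
    return SUGGESTED_OUT[p]


names = [suggested_out(p) for p in (1, 2, 3, 4)]
```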
The job generates assembler source code in <HLQNONV>.<SMP>.SRCSAMP and should end with RC=0.
Modify USERMODA to match ++SRC statements with OUTPUT members generated in the previous step.
Pay close attention to DD statements that are commented out, because they may relate to TIBCO Service Gateways that are also installed at your site.
The job should end with RC=4. This return code is expected: the binder warning messages that appear (IEW2609W, IEW2646W, and IEW2651W) are normal and can be ignored.
Specifying External User Syntaxes (Optional)
 
This step invokes the UNIGEN utility to define mappings between Unicode and external user syntaxes. The UNIGENXC in-stream PROC takes four parameters:
IN – the member name of the input ucm file describing the mapping.
FB – a TRUE or FALSE value indicating whether or not fallback codes present in the input file are to be used.
Customize UNIGENXC as shown above. You can run the PROC up to 16 times to define up to 16 external user syntaxes.
The job generates assembler source code in <HLQNONV>.<SMP>.SRCSAMP and should end with RC=0.
Modify USERMODC to match ++SRC statements with OUTPUT members generated in the previous step.
Pay close attention to DD statements that are commented out, because they may relate to TIBCO Service Gateways that are also installed at your site.
The job should end with RC=4. This return code is expected: the binder warning messages that appear (IEW2609W, IEW2646W, and IEW2651W) are normal and can be ignored.
 
