A multi-dimensional index (MDI) enables you to efficiently and flexibly retrieve information you need for business analysis. It looks at data differently from transaction processing systems in which the goal is to retrieve records based on a key. (WebFOCUS uses a B-tree index for this type of retrieval). The MDI is for retrieval only. It is not used for MODIFY or Maintain Data requests.
Business analysts may be interested in specific facts (data values, also called measures) about multiple categories of data in the data source. Categories of data, such as region or department, are referred to as dimensions. A multi-dimensional index uses dimensions and all of their hierarchical relationships to point to specific facts.
The MDI is a multi-field index that contains at least two dimensions. This index behaves like a virtual cube of values that intersect at measures of interest. The more dimensions used in a query, the better the retrieval performance.
For example, suppose that the CENTORD data source has an MDI with dimensions STATE, REGION, and PRODCAT. The MDI is used to retrieve the facts (LINEPRICE and QUANTITY data) that lie at the intersection of the dimension values specified in the following request:
TABLE FILE CENTORD SUM QUANTITY LINEPRICE WHERE REGION EQ 'EAST' WHERE STATE EQ 'DC' WHERE PRODCAT EQ 'Cameras' END
The MDI also provides the following other retrieval enhancing features: MDI JOIN, Dimensional JOIN, MDI WITHIN, MAXVALUES, MDI Encoding, and AUTOINDEX for MDI.
How to: |
All MDI attributes are specified in the Access File for the data source. The only attribute needed in the Master File is the ACCESSFILE attribute to point to the Access File containing the MDI specifications.
An MDI can be partitioned into multiple MDI files. However, even if the data source on which the MDI is built is partitioned, each MDI partition spans all data source partitions.
MASTER = masterfile,$ DATA = database_filename1,$ . . . DATA = database_filenamen ,$ MDI = mdiname, TARGET_OF = segname ,$ DIM = field1 [MAXVALUES = n1] [WITHIN = dimname1],$ . . . DIM = fieldn [MAXVALUES = nn] [WITHIN = dimnamen],$ MDIDATA = mdifile1 ,$ . . . MDIDATA = mdifilen,$
where:
Is the Master File name.
Are fully qualified physical file names in the syntax native to your operating environment. If the name contains blanks or special characters, it must be enclosed in single quotation marks. Multiple DATA declarations describe concatenated partitions.
Is the logical name of the MDI, up to 8 characters.
Is the segment that contains the facts pointed to by the MDI. If the target data is distributed among several segments, the target should be the top segment that contains MDI data in order to avoid the multiplicative effect.
Are the fields to use as dimensions. At least two dimensions are required for an MDI.
Are fully qualified physical file names for the MDI in the syntax native to your operating environment. If the name contains blanks or special characters, it must be enclosed in single quotation marks. Multiple MDIDATA declarations describe concatenated partitions.
Is the number of distinct values the field can have. This number must be a positive integer.
Defines a hierarchy of dimensions. This dimension is defined within the dimname dimension. For example, CITY WITHIN STATE.
This example shows an MDI with two partitions:
MASTERNAME = CAR,$ DATA = /user1/car.foc,$ MDI = carmdi, TARGET_OF = ORIGIN,$ DIM = CAR,$ DIM = COUNTRY,$ DIM = MODEL,$ MDIDATA = /user1/car1.mdi,$ MDIDATA = /user1/car2.mdi,$
This example shows an MDI with two partitions:
MASTERNAME = CAR,$ DATA = c:\user1\car.foc,$ MDI = carmdi, TARGET_OF = ORIGIN,$ DIM = CAR,$ DIM = COUNTRY,$ DIM = MODEL,$ MDIDATA = c:\user1\car1.mdi,$ MDIDATA = c:\user1\car2.mdi,$
This example shows an MDI with two partitions:
MASTER = CAR,$ DATA = USER1.CAR.FOCUS,$ MDI = CARMDI, TARGET_OF = ORIGIN,$ DIM = CAR,$ DIM = COUNTRY,$ DIM = MODEL,$ MDIDATA = USER1.CAR1.MDI,$ MDIDATA = USER1.CAR2.MDI,$
Each MDI is specified in an Access File for the data source. WebFOCUS uses the Access File in its retrieval analysis for each TABLE request.
You then use the REBUILD MDINDEX command to build the MDI. The MDI has the following DCB attributes: RECFM=F,LRECL=4096,BLKSIZE=4096.
A multi-dimensional index gives complex queries high-speed access to combinations of dimensions across data sources. If you know what information users want to retrieve and why, you can make intelligent choices about index dimensions.
An Access File can define more than one MDI. If the Access File defines multiple MDIs, the AUTOINDEX facility chooses the best index to use for each query.
The first step in designing an MDI is to find out what kind of information users need from the data source. You can get advice about your MDIs directly from WebFOCUS.
Reference: |
The choice of index dimensions depends on knowing the data and on analyzing what is needed from it. Examine the record selection (IF and WHERE) tests in your queries to see how many index dimensions each application actually uses to select records. If different applications need different subsets of dimensions, consider separate MDIs for the separate subsets. Although WebFOCUS can produce high-speed reporting performance with indexes of up to 30 dimensions, smaller indexes usually generate less retrieval overhead. You can create an unlimited number of MDIs.
The following are good candidates for dimensions in an MDI:
Including a field that is updated frequently (such as an AUTODATE field) in the MDI, requires frequent rebuilding of the MDI in order to keep it current. WebFOCUS can advise you on selecting MDI dimensions.
DEFINE fields described in the Master File can be used as dimensions. Dynamic DEFINE fields cannot be dimensions.
An MDI is for retrieval only. FIND and LOOKUP are not supported on an MDI.
The following guidelines apply to each MDI:
MAXVALUES |
Number of Bytes Required |
---|---|
1 through 253 |
1 |
254 through 65,533 |
2 |
Greater than 65,533 |
4 |
To allow for expansion, if the maximum number of values is close to a limit, make MAXVALUES big enough to use a larger number of bytes. For example, if you have 250 values, specify 254 for MAXVALUES, and reserve 2 bytes for each dimension value.
The REBUILD command is used to create or maintain a multi-dimensional index. This command can be issued in a FOCEXEC.
The best MDI is built by specifying the dimensions in order of best cardinality (most distinct values).
To issue the REBUILD command in a FOCEXEC, you place the REBUILD command and the user-supplied information needed for REBUILD processing in the FOCEXEC.
If the MDI file might be larger than two gigabytes or if you plan to add more data partitions to it, the MDI index file must be partitioned from the initial REBUILD phase. After the index has been created, you can use it in a retrieval request. You cannot use an MDI for modifying the data source. If you update the data source without rebuilding the MDI and then attempt to retrieve data with it, WebFOCUS displays a message indicating that the MDI is out of date. You must then rebuild the MDI.
The following FOCEXEC creates the CARMDI MDI and contains each user-supplied value needed for REBUILD processing on a separate line of the FOCEXEC, entered in uppercase:
REBUILD MDINDEX NEW CAR CARMDI NO
WebFOCUS allows you to use an MDI in a TABLE or SQL query. The performance is best when all of the dimensions in the MDI are used in selection criteria for the query.
There are two ways to use an MDI with a TABLE query:
IF (or WHERE) field EQ value_1 OR value_2 OR value_3. . .
IF (or WHERE) field EQ A*
where field is a dimension in an MDI.
You can use an MDI with an SQL query by issuing an SQL SELECT statement with a WHERE test using a field of an MDI. For example,
SELECT field_1, field_2 FROM table WHERE field_3 = value;
where field_3 is a dimension in an MDI.
Note: AUTOINDEX must be turned on for this feature to be operational.
How to: |
WebFOCUS provides a query command that generates statistics and descriptions for your MDIs. The command ? MDI allows you to display information about MDIs for a given FOCUS/XFOCUS Master File that hosts the target of your MDI.
? MDI mastername {mdiname|*} [HOLD [AS holdfile]]
where:
Is the logical name of the Master File. If you do not include any other parameters, a list of all MDI names specified is displayed with the command TARGET_OF in the Access File for this mastername. If the Access File for the mastername does not have any MDI information, a message will display.
Is the logical name of an MDI. Specifying this parameter displays all the dimensions that are part of this MDI.
mdiname must be specified as TARGET_OF in the Access File for this mastername, or a message will display. If any of the dimensions are involved in a parent-child structure, a tree-like picture will display.
Displays a list of all dimensions, by MDI, whose targets are specified inside the Access File for this mastername.
Saves the output in a text file.
Is the file in which the output is saved. If this is not included with the AS phrase, the file is named HOLD.
How to: |
When an Access File defines multiple MDIs, retrieval efficiency for a query may become a function of the index it uses to access the data source. The AUTOINDEX facility analyzes each retrieval request and chooses the MDI or B-tree index that provides the best access to the data. You can issue the AUTOINDEX command in a FOCEXEC or in a profile. The AUTOINDEX facility can be enabled or disabled. The default is to disable AUTOINDEX on reverse byte platforms and to enable it on forward byte platforms.
In its analysis, AUTOINDEX considers the following factors:
If the selection criteria in a request do not involve any MDI fields, WebFOCUS looks for an appropriate B-tree index to use for retrieval. If a field is both a B-tree index and a dimension in an MDI, the MDI is used for retrieval if two-thirds of the fields in selection tests are dimensions in the MDI. If it is less than two-thirds, the B-tree index is used. If there are multiple B-tree indexes, the one highest in the hierarchy is used.
If everything else is equal, WebFOCUS uses the first MDI it finds in the Access File.
SET AUTOINDEX = {ON|OFF}
where:
Optimizes MDI retrieval. AUTOINDEX can only be set to ON when the ALL parameter is set to OFF.
Disables the AUTOINDEX facility. No MDI will be used for retrieval.
Note: AUTOINDEX defaults to ON on reverse byte platforms and to OFF on forward byte platforms.
TABLE FILE filename.mdiname
You can also assure access with a specific MDI by creating an Access File that describes only that index.
How to: |
Reference: |
Joining to an MDI uses the power of the MDI to produce a fast, efficient join. Instead of joining one field in the source file to an indexed field in the target file, you can join to multiple dimensions of the MDI.
When the join is implemented, the answer set from the source file is created, and the values retrieved for the source fields serve as index values for the corresponding dimensions in the target MDI.
You can join to an MDI in two ways:
The source fields must form a one-to-one correspondence with the target dimensions. The MDI engine uses the source field values to get pointers to the target segment of the MDI, expediting data retrieval from the target file.
You can think of the source fields as mirror dimensions. If you put tighter constraints on the mirror dimensions, a smaller answer set is retrieved from the source file, and fewer index I/Os are required to locate the records from the target file. The speed of the join improves dramatically with the speed of retrieval of the source file answer set. Therefore, you can expedite any TABLE request against the source file by incorporating selection criteria on fields that participate in either a B-tree or MDI.
The following formula computes the time for a TABLE request that uses an MDI Join:
Total Time = Time to Retrieve the answer set from the source file (Ts) + Time to retrieve the MD index pointers (Tp) + Time to retrieve data from the target file (Tt)
Using a B-tree index or MDI in data retrieval reduces all types of retrieval time, reducing the total retrieval time.
JOIN field_1 [AND field_2 ...] IN sfile [TAG tag_1] TO ALL mdiname IN tfile [TAG tag_2] [AS joinname] [END]
where:
Are the join fields from the source file.
Is the source Master File.
Are one-character to eight-character names that can serve as unique qualifiers for field names in a request.
Is the logical name of the MDI, built on tfile, to use in the join.
Is the target Master File.
Is a one-character to eight-character join name. A unique join name prevents a subsequent join from overwriting the existing join, allows you to selectively clear the join, and serves as a prefix for duplicate field names in a recursive join.
Is required to terminate the JOIN command if it is longer than one line.
JOIN field_1 [AND field_2 ...] IN sfile [TAG tag_1] TO ALL dim_1 [AND dim_2 ...] IN tfile [TAG tag_2] [AS joinname] [END]
where:
Are the join fields from the source file.
Is the source Master File.
Are one-character to eight-character names that can serve as unique qualifiers for field names in a request.
Are dimensions in tfile.
Is the target Master File.
Is a one-character to eight-character join name. A unique join name prevents a subsequent join from overwriting the existing join, allows you to selectively clear the join, and serves as a prefix for duplicate field names in a recursive join.
Is required to terminate the JOIN command if it is longer than one line.
How to: |
Reference: |
WebFOCUS encodes indexed values any time a field or dimension of an MDI has a MAXVALUES attribute specified or is involved in a parent-child relationship. Encoded values are stored in the MDI file at rebuild time and can be retrieved and decoded with a TABLE request that specifies the MDIENCODING command. The MDIENCODING command allows the user to get output from the MDI file itself without having to read the data source.
The following two rules apply to fields in a TABLE request that uses MDIENCODING:
SET MDIENCODING = {ON|OFF}
where:
Enables retrieval of output from the MDI file without reading the data source.
Requires access of the data source to allow retrieval of MDI values.
TABLE FILE mastername.mdiname request ON TABLE SET MDIENCODING ON END
where:
Is the Master File.
Is the logical name of the MDI.
Is the TABLE request that decodes the MDI.
The following examples show correct MDI encoding:
TABLE FILE COMPANY.I DATA1 PRINT CITY BY STATE ON TABLE SET MDIENCODING ON END
TABLE FILE COMPANY.I DATA1 COUNT CITY IF STATE EQ NY ON TABLE SET MDIENCODING ON END
TABLE FILE COMPANY.I DATA1 PRINT CATEGORY ON TABLE SET MDIENCODING ON END
The following example is incorrect because CATEGORY is not part of the CITY-STATE hierarchy.
TABLE FILE COMPANY.I DATA1 PRINT CITY BY STATE IF STATE EQ NY IF CATEGORY EQ RESTAURANT ON TABLE SET MDIENCODING ON END
The following TABLE request accesses the CAR data source. It will use the CARMDI index for retrieval because CARMDI is the only MDI described in the Master File:
TABLE FILE CAR SUM RETAIL_COST DEALER_COST BY BODYTYPE -* WHERE Condition utilizing MDI fields: WHERE (COUNTRY EQ 'JAPAN' OR 'ENGLAND') AND (CAR EQ 'TOYOTA' OR 'JENSEN' OR 'TRIUMPH') AND (MODEL EQ 'COROLLA 4 DOOR DIX AUTO' OR 'INTERCEPTOR III' OR 'TR7') END
If the data source has grown due to the addition of new data partitions, and these partitions need to be added to the MDI, you must perform the following steps:
The following FOCEXEC contains commands that add a partition to a multi-dimensional index named CARMDI defined on the CAR data source:
REBUILD MDINDEX ADD CAR CARMDI NO
After the MDI is rebuilt to include the new data partitions, any retrieval query that uses the MDI will use the newly added data partitions within that MDI.
How to: |
Use the SET MDIPROGRESS command to view messages about the progress of your MDI build. The messages will show the number of data records accumulated for every n records inserted into the MDI as it is processed.
SET MDIPROGRESS = {0|n}
where:
Is an integer greater than 1000, which displays a progress message for every n records accumulated in the MDI build. 100,000 is the default value.
Disables progress messages.
How to: |
The SET MDICARDWARN command displays a warning message every time a dimension cardinality exceeds a specified value, offering you the chance to study the MDI build. When the number of equal values of a dimension data reaches a specified percent, a warning message will be issued. In order for MDICARDWARN to be reliable, the data source should contain at least 100,000 records.
Note: In addition to the warning message, a number displays in brackets. This number is the least number of equal values for the dimension mentioned in the warning message text.
SET MDICARDWARN = n
where:
Is a percentage value from 0 to 50.