Bayesian Reliability Optimization for Continuous Response Example

Overview

The Bayesian Reliability Optimization for Continuous Response node allows you to obtain the optimized Bayesian reliability (posterior probability of obtaining a pre-defined desirability) over the input space. The key advantage of the Bayesian formulation is that it explicitly takes into account the correlation structure of the data, the variability of the process distribution, and the model parameter uncertainty. The use of a non-informative prior distribution enables a fast, closed form computation of the posterior distribution.

In the example used by Derringer and Suich (1980), the problem is to find the most desirable tire tread compound. In this example, the four Y variables are PICO Abrasion Index, 200 percent modulus, elongation at break, and hardness. The characteristics of the product in terms of the response variables depend on the ingredients, which are the X variables: hydrated silica level, silane coupling level, and sulfur. The goal of the analysis is to find the levels of the ingredients that produce the most desirable tire tread compound in terms of the four outcome measures.

Specifying the Analysis

Derringer and Suich (1980) used a central composite (response surface) design to investigate the effects of the tire tread ingredients on the desirability of the product. The data for the completed experiment of 20 runs (as listed on p. 217 of Derringer Suich, 1980) are contained in the example data file Tiretre.sta.

First, open the Tiretre.sta data file and the Bayesian Reliability Optimization for Continuous Response node:    

  1. On the Datamining tab, in the Tools group, select Workspaces/ Example Procedures.
  2. When the Select Data Source dialog box displays in a new Workspace, select the Files button, navigate to the file and select it (Program Files>Statistica>Statistica 13>Examples>Datasets> Tiretre.sta).  The Dataset icon will display on the Workspace
  3. The Example Custom Procedures tab now displays on the Statistica toolbar under the Workspace tab.
  4. Ensure that the Dataset icon is still selected in the Workspace, and double click Bayesian Reliability Optimization for a Continuous Response.
  5. When the node displays in the Workspace, connected to the Dataset, close the selection box. If the two nodes are not connected, right-click on the dataset and select New Connection from This Node.. and drag the connection to the analytic node.
  6. On the node workspace, double click on the Input Data icon to display a Select dependent variables and predictors dialog box.
  7. Select the Variables button.
  8. Specify Abrasion, Modulus, Elong, and Hardness, as the continuous dependent variables in the first selection column.
  9. Specify Silica, Silane, and Sulfur as the continuous predictor variables in the third selection column.
  10. Select OK, then OK again to exit the selection dialog box.

The node allows up to second-order designs. Main effects, quadratic terms and two-way interactions can be included in the design.
  
    1.  
    1. To specify the design, double click on the Bayesian Reliability Optimization for a Continuous Response node.
    1.  On the Quick tab, select the Second-order terms Edit button to display the Edit Text dialog box.

  
NOTE: In this example, all first and second order terms will be included in the model.  The node automatically includes the main effects in the design. Only the higher order terms need to be specified as a comma separated list.
  
    1.  
    1. For the purpose of this example, enter the following text in the Edit Text dialog box:

 v5*v5,v6*v6,v7*v7,v5*v6,v5*v7,v6*v7
    1. Click OK.



  
The main purpose of this node is to find the point in the input space that maximizes the posterior probability of the form Pr(D(Y)≥D^* |x), where D^* is the minimum desirability threshold. The node uses a Monte-Carlo estimate of this probability since direct computation will be comparatively slow.
  
    1.  
    1. For this example. leave the box Number of simulations determined by node checked (default) to let the node use a smart default for approximation of the posterior probability. To use your own custom number of simulations, un-check this box and change the number in the Number of simulations numeric input box.
    1. This node provides three different methods to optimize the posterior probability. They are Numerical search, Grid search, and Random search. For this example, the use Numerical search.
      • Numerical search: Uses the L-BFGS-B optimization technique in the Optim package in R.  
      • Random search: Finds the optimal point by randomly selecting points (by default 3,000) in the factor space.  The number of random searches can be specified on the Random search tab.  
      • Grid search: The node will divide each factor into a grid of equally spaced intervals (by default 4 grid points are used) and will perform the optimization search over the grid.  
    1. The Bayesian reliability is based on the desirability function.  Click the Desirability tab to input parameters for the desirability function.   Lower, Upper, and Target values must be specified for each response.
    1. Click on the Lower bound values for desirability function Edit box, to specify the lower bounds for each response as a comma separated list. In this example, specify the Lower bounds as 80, 500, 300, 55.  
    1. Click OK.


    1. Similarly, specify the upper and target values.

 For the Upper values, enter 200, 2000,800,85.

      For the Target values, enter 129.5,1300,465,68.
  

You can set the S Curvature and T Curvature values of the desirability function.  These values specify the degree of desired accuracy around the target values.  They determine how quickly or slowly the desirability falls off as a response moves away from the target value.  
  
  • A value of 1 means that the fall off is linear.
  • A value greater than 1 means the falloff is quicker.
  •  A value less than 1 means the falloff is slower.
  
  •  

     

    1. In this example, set both of the Curvature parameters to 1 for each response, thus inducing a linear curvature (1,1,1,1).

  

      
  

    1.  
    1. For this example, leave the default value of 0.5 as the Desirability threshold.

  
The purpose of the node is to find the design point that maximizes the desirability function being greater than some specified threshold, that is, Pr(D(Y)≥D^* |x), where D^* is the minimum desirability threshold.  
  
    1.  
    1. Click to move to the Graph Options tab of the dialog box.
    1. In the Factor for X axis field, select Silica, and click OK to exit the edit box.
    1. In the Factor for Y axis field, select Sulphur, and click OK to exit the edit box.

  
The node produces graphical output that can help visualize the probability surface as a function of the input factors.   

 Graphical options can be specified on the Graph Options tab.

 For this example, you will plot the probability as a function of Silica and Sulphur while holding Silane to its maximum value of 1.633.    
  
    1.    
    1. To set Silane to 1.633, click on the Fixed factors for graph and enter 1.633 into the text box.


    1. Click OK to close the dialog box.

      Output

Run the node to generate the results.  You can monitor the process by watching the bottom status bar. It may take several minutes. When finished, Reporting document will display in the workspace.

The reporting documents will contain data results displaying the optimal design point found based on the numerical search method along with the value of the maximum probability.

The output also produces the Bayesian reliability graphs for Silica and Sulphur where the third factor Silane is set to maximum value.

Summary

This example illustrates the simultaneous optimization of several response variables by means of Bayesian reliability, which, unlike standard frequentist approaches, captures the covariance structure between the responses and addresses parameter uncertainty. It optimizes the posterior probability through a moderately fast computation and also allows you to graphically interpret the design over different factors.