Alternating Least Squares Method
Use the Alternating Least Squares model (A, B and D matrices) for incomplete matrices to predict and recommend over a subset of cases:
You can supply a spreadsheet that lists a subset of cases in a single column and generate a spreadsheet of predictions for every entry of those cases, including the entries that were observed.
You can generate the top or bottom 'k' column variable recommendations for the subset of cases into a spreadsheet.
For each case, the number of recommendations is equal to min(k, the number of column variables).
You can specify whether to generate the top or bottom k recommendations. The default is top k.
You can specify the value of k. The default is k = 5.
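The top or bottom k selection described above can be sketched as follows. This is a minimal illustration, not the node's implementation: it assumes that the prediction for case i and column variable j is the dot product of row i of A and row j of B, and the function name `recommend` and its parameters are hypothetical.

```python
def recommend(a, b, cases, k=5, top=True):
    """Return, for each case, the indices of its top or bottom k column variables.

    a: rows of the A matrix (one row per case), b: rows of the B matrix
    (one row per column variable). Assumption: prediction = dot(a[i], b[j]).
    """
    n_cols = len(b)
    result = {}
    for i in cases:
        scores = [sum(x * y for x, y in zip(a[i], b[j])) for j in range(n_cols)]
        # Sort column indices by predicted value, descending for top k.
        order = sorted(range(n_cols), key=lambda j: scores[j], reverse=top)
        # Never return more recommendations than there are column variables.
        result[i] = order[:min(k, n_cols)]
    return result


a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [[3.0, 1.0], [1.0, 2.0], [2.0, 5.0]]
top_two = recommend(a, b, cases=[0, 2], k=2)
bottom_two = recommend(a, b, cases=[0], k=2, top=False)
default_k = recommend(a, b, cases=[2])
```

With the default k = 5 and only three column variables, each case receives three recommendations.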
How the node works:
When you input a data set, the node checks that it is valid. The node then generates a compressed matrix according to your specifications and identifies whether a warm start has been selected. Next, it computes the singular values and either the model matrices (A and B) or the singular vectors, based on your specifications.
- Fixed lambda: Singular values and the model matrices or singular vectors will be generated based on a fixed lambda specified by the user.
- Grid of lambda: Singular values and the model matrices or singular vectors will be generated based on a lambda optimized over a user-specified grid of lambda values.
- No Bisection: The optimized lambda will be obtained strictly over the user-specified lambda grid.
- Bisection: The optimized lambda will be obtained using the user-specified lambda grid and also by interpolating between grid points using the bisection method.
Output
- Display model output on user specified single or multiple spreadsheets
- Plot for the singular values
- An optional spreadsheet with prediction for the full matrix
Incomplete matrix
The algorithm will consider the missing entries of your input matrix as missing and will fit the model by iteratively imputing the missing entries.
Sparse matrix
The algorithm will consider the missing entries of your input matrix as zeros and will try to fit the model without imputing the missing entries.
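The difference between the two matrix types above can be illustrated with a small rank-1 example. This is a sketch under stated assumptions, not the node's algorithm: `fit_rank1` is a hypothetical helper that runs a generic ridge-penalized alternating least squares loop, re-imputing entries marked `None` with the current prediction before each update. Treating the matrix as sparse is emulated by replacing the missing entries with zeros, so nothing is imputed.

```python
def fit_rank1(x, lam=0.0, iters=500):
    """Rank-1 alternating least squares on a list-of-rows matrix.

    None marks a missing entry; it is re-imputed with the current
    prediction a[i] * b[j] before every update (incomplete-matrix style).
    """
    n, m = len(x), len(x[0])
    a, b = [1.0] * n, [1.0] * m

    def filled():
        # Observed values are kept; missing ones use the current model.
        return [[x[i][j] if x[i][j] is not None else a[i] * b[j]
                 for j in range(m)] for i in range(n)]

    for _ in range(iters):
        z = filled()
        bb = sum(v * v for v in b) + lam
        a = [sum(z[i][j] * b[j] for j in range(m)) / bb for i in range(n)]
        z = filled()
        aa = sum(v * v for v in a) + lam
        b = [sum(z[i][j] * a[i] for i in range(n)) / aa for j in range(m)]
    return a, b


# The observed entries come from the rank-1 matrix [1, 2, 3] x [1, 2, 3].
observed = [[1.0, 2.0, None], [2.0, None, 6.0], [3.0, 6.0, 9.0]]

a, b = fit_rank1(observed)                       # incomplete: impute missing
incomplete_pred = a[0] * b[2]                    # close to the true value 3

as_zeros = [[0.0 if v is None else v for v in row] for row in observed]
a0, b0 = fit_rank1(as_zeros)                     # sparse: missing means zero
sparse_pred = a0[0] * b0[2]                      # pulled toward zero
```

In this toy case the incomplete treatment recovers the hidden value, while the sparse treatment fits the zeros as if they were data.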
Row indices
Specify the column in the input spreadsheet storing the row indices.
NOTE: Only one column from the input spreadsheet should be selected for the row indices.
Column indices
Specify the column in the input spreadsheet storing column indices.
NOTE: Only one column from the input spreadsheet should be selected for the column indices.
Values
Specify the column in the input spreadsheet storing the matrix values.
NOTE: Only one column from the input spreadsheet should be selected for the matrix values.
Row Sorted
Specify if the input matrix is row sorted in ascending order.
NOTE: Duplicate cell entries will be ignored only if the data is row sorted.
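A quick way to check the Row Sorted condition before running the node is sketched below. The helper names are hypothetical, and the duplicate handling (keeping the first occurrence of a cell) is an assumption made for illustration only; the documentation does not state which duplicate the node keeps.

```python
def is_row_sorted(row_indices):
    """True if the row indices are in ascending (non-decreasing) order."""
    return all(p <= q for p, q in zip(row_indices, row_indices[1:]))


def drop_duplicate_cells(rows, cols, values):
    """Keep the first entry seen for each (row, column) cell.

    Assumption: "first occurrence wins" is illustrative, not documented.
    """
    seen = set()
    kept = []
    for r, c, v in zip(rows, cols, values):
        if (r, c) not in seen:
            seen.add((r, c))
            kept.append((r, c, v))
    return kept


rows = [1, 1, 1, 2, 3]
cols = [1, 2, 2, 1, 3]
values = [5.0, 1.0, 9.0, 4.0, 2.0]

sorted_ok = is_row_sorted(rows)
cells = drop_duplicate_cells(rows, cols, values)
```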
Rank
Specify the rank of the model to fit.
Lambda
Enter any value >=0.
NOT APPLICABLE IF LAMBDA GRID IS SELECTED.
Specify the penalizing parameter lambda.
Lambda Grid
Select whether or not to consider a grid of lambda values to fit the model.
APPLICABLE ONLY IF THE MATRIX TYPE IS INCOMPLETE.
NOTE: The node will produce the output model for the optimal lambda.
Grid Search with Bisection
Select whether to run the node over the grid and move to bisection method if necessary.
APPLICABLE ONLY IF YES IS SELECTED FOR THE LAMBDA GRID AND THE MATRIX TYPE IS INCOMPLETE.
The specified grid can skip the optimal lambda for the specified rank. Selecting bisection will make sure the node finds the right model with the rank you specified.
If bisection is set to NO, the node can end up with a model whose rank is lower than the one you specified.
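The following sketch shows why a grid alone can skip the lambda that yields the requested rank, and how bisection between two grid points can find it. It rests on simplifying assumptions that are not taken from the node: the fitted rank is modeled as the number of fixed singular values that exceed lambda (a real fit would be recomputed for each lambda), and `fitted_rank` and `pick_lambda` are hypothetical names.

```python
def fitted_rank(singular_values, lam):
    """Stand-in for a model fit: singular values above lam survive."""
    return sum(1 for s in singular_values if s > lam)


def pick_lambda(singular_values, grid, target_rank, bisection=False, steps=60):
    grid = sorted(grid)

    def rank(lam):
        return fitted_rank(singular_values, lam)

    # Grid only: smallest grid lambda whose rank does not exceed the target.
    at_or_below = [lam for lam in grid if rank(lam) <= target_rank]
    if not at_or_below:
        return grid[-1]          # every grid point still has too high a rank
    hi = at_or_below[0]
    too_high = [lam for lam in grid if lam < hi]
    if not bisection or rank(hi) == target_rank or not too_high:
        return hi
    lo = too_high[-1]            # largest grid lambda whose rank is too high
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        r = rank(mid)
        if r == target_rank:
            return mid
        if r > target_rank:
            lo = mid
        else:
            hi = mid
    return hi


sv = [5.0, 3.0, 2.5, 1.0]
grid = [0.5, 2.0, 4.0]

grid_only = pick_lambda(sv, grid, target_rank=2)
with_bisection = pick_lambda(sv, grid, target_rank=2, bisection=True)
```

Here the grid jumps from rank 3 at lambda = 2.0 to rank 1 at lambda = 4.0, so the grid-only choice ends below the requested rank of 2, while bisection finds a lambda in between that gives rank 2.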
Rule of Convergence
Choose one of the following to use as a stopping rule for the computation:
- Default Frobenius norm convergence
- Relative MSE to check
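The two stopping rules can be sketched as follows. The exact formulas used by the node are not given in this document, so both definitions are assumptions: the Frobenius rule is taken as the squared Frobenius norm of the change in the fitted matrix relative to the previous fit, and the relative MSE rule as the relative change in mean squared error between iterations. The tolerance value is hypothetical.

```python
def frobenius_change(old, new):
    """Squared Frobenius norm of (new - old), relative to that of old."""
    num = sum((n - o) ** 2 for ro, rn in zip(old, new) for o, n in zip(ro, rn))
    den = sum(o ** 2 for ro in old for o in ro)
    return num / max(den, 1e-12)   # guard against an all-zero previous fit


def relative_mse_change(old_mse, new_mse):
    """Relative change in mean squared error between two iterations."""
    return abs(old_mse - new_mse) / max(old_mse, 1e-12)


def converged(change, tol=1e-5):
    """Stop when the monitored change drops below the tolerance."""
    return change < tol


previous_fit = [[1.0, 0.0], [0.0, 1.0]]
current_fit = [[1.0, 0.0], [0.0, 1.1]]

fro = frobenius_change(previous_fit, current_fit)
rel = relative_mse_change(2.0, 1.5)
```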
Number of Data Chunks for multithreading
- If the matrix is row-sorted, enter a perfect square integer >=1 and <=1024
- If the matrix is not row-sorted, enter any integer >=1 and <=1024
Specify the number of chunks to split the data into for the multithreaded operations. For a row-sorted matrix, it must be a number whose square root is an integer.
NOTE: The node uses the input matrix geometry for splitting the matrix for multithreading. Switching options for Compressed R-Like Matrix reshuffles the rows and columns of the matrix and creates a different matrix geometry without changing any of the matrix properties. Also, too many chunks can slow down the computation.
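The limits on the number of chunks can be checked with a small helper before configuring the node. The function name `valid_chunk_count` is hypothetical; the rules it encodes are the ones listed above.

```python
import math


def valid_chunk_count(chunks, row_sorted):
    """Check the chunk count against the documented limits.

    Any integer from 1 to 1024 is accepted; for row-sorted input the
    value must also be a perfect square (1, 4, 9, ..., 1024).
    """
    if not isinstance(chunks, int) or not 1 <= chunks <= 1024:
        return False
    if row_sorted:
        root = math.isqrt(chunks)
        return root * root == chunks
    return True


checks = {
    "16 row-sorted": valid_chunk_count(16, row_sorted=True),
    "12 row-sorted": valid_chunk_count(12, row_sorted=True),
    "12 unsorted": valid_chunk_count(12, row_sorted=False),
    "1025 unsorted": valid_chunk_count(1025, row_sorted=False),
}
```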
Maximum Number of Iterations
Specify the maximum number of internal iterations to run. Enter an integer >=1 and <=1000.
NOTE: Entering a very low value can stop the computation before convergence. Entering a very high value can cause the computation to take a long time to finish if the specified lambda is low relative to the rank.
Compressed R-Like Matrix
Standard compressed R-like form (the sparse or incomplete matrix form in R) is a three-column format in which the first column stores the observed numeric row indices, the second column stores the observed numeric column indices, and the third column stores the corresponding observed values.
- Select 0 based if the input matrix is in standard compressed R-like form with numeric indices starting from 0.
- Select 1 based if the input matrix is in standard compressed R-like form with numeric indices starting from 1.
- Select False for any other input format.
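To make the three-column form and the 0 based and 1 based options concrete, the sketch below expands a set of triplets into a full matrix, with `None` marking unobserved cells. The helper `triplets_to_matrix` and its `index_base` parameter are hypothetical and only illustrate the layout; they are not part of the node.

```python
def triplets_to_matrix(rows, cols, values, index_base=1):
    """Expand (row index, column index, value) triplets into a list of rows.

    index_base is 0 or 1, matching the 0 based and 1 based options.
    Cells that do not appear in the triplets are returned as None.
    """
    if index_base not in (0, 1):
        raise ValueError("index_base must be 0 or 1")
    r = [i - index_base for i in rows]
    c = [j - index_base for j in cols]
    if min(r) < 0 or min(c) < 0:
        raise ValueError("index below the declared index base")
    n_rows, n_cols = max(r) + 1, max(c) + 1
    matrix = [[None] * n_cols for _ in range(n_rows)]
    for i, j, v in zip(r, c, values):
        matrix[i][j] = v
    return matrix


one_based = triplets_to_matrix([1, 1, 2], [1, 3, 2], [5.0, 1.0, 4.0], index_base=1)
zero_based = triplets_to_matrix([0, 0, 1], [0, 2, 1], [5.0, 1.0, 4.0], index_base=0)
```

Both calls describe the same 2 x 3 matrix; only the index convention differs.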
Warm Start
Select whether or not to use the previously fitted model to initiate the model parameters for the current run of the computation.
APPLICABLE ONLY IF A MODEL NODE FROM A PREVIOUS RUN IS CONNECTED TO THIS NODE. Warm Start is not applicable if the Lambda Grid option has been selected.
Model Matrix type
Setting this option to Singular Vectors displays the singular vectors (U, V) instead of A = UD and B = VD, where D is the diagonal matrix of the square roots of the singular values.
Set the option to A and B matrices for model deployment in the Alternating Least Squares Deployment node.
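The relationship between the two output forms can be written out as follows. The definitions of A, B, and D are the ones given above; the final line assumes that predictions are formed as the product of A and the transpose of B, as in a standard low-rank factorization, which this document does not state explicitly. The symbols sigma_1, ..., sigma_r (the singular values for a rank-r model) and X-hat (the predicted matrix) are introduced here for illustration.

```latex
\begin{aligned}
D &= \operatorname{diag}\!\left(\sqrt{\sigma_1}, \dots, \sqrt{\sigma_r}\right), \\
A &= U D, \qquad B = V D, \\
\hat{X} &= A B^{\top} = U D^{2} V^{\top}
         = U \operatorname{diag}(\sigma_1, \dots, \sigma_r)\, V^{\top}.
\end{aligned}
```

Under that assumption, both forms describe the same fitted matrix; A and B simply absorb the square roots of the singular values.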
Complete Missing Entries
Set this option to True if a complete matrix should be displayed with missing entries replaced by predictions from the model.
Only applicable for incomplete matrices.
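The effect of this option can be sketched as follows: observed entries are kept as they are and only the missing ones are replaced by model predictions. The helper `complete_matrix` is hypothetical, and it assumes that the prediction for cell (i, j) is the dot product of row i of A and row j of B.

```python
def complete_matrix(x, a, b):
    """Fill missing entries (None) of x with predictions from A and B.

    Assumption: prediction for cell (i, j) = dot(a[i], b[j]).
    Observed entries are left unchanged.
    """
    completed = []
    for i, row in enumerate(x):
        new_row = []
        for j, value in enumerate(row):
            if value is None:
                value = sum(p * q for p, q in zip(a[i], b[j]))
            new_row.append(value)
        completed.append(new_row)
    return completed


x = [[1.0, None], [None, 4.0]]
a = [[1.0], [2.0]]
b = [[1.0], [2.0]]
full = complete_matrix(x, a, b)
```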