Macro (SVB) Programs Example - Creating a Self-Updating Dataset
The functionality for a spreadsheet to run Statistical analyses on itself whenever its data is changed can be programmed into it by attaching an SVB macro to certain spreadsheet events as demonstrated in this example. Note that practically all spreadsheet (and other) events can be customized, thus providing the tools to build very sophisticated and highly customized automated data operations right "into" the spreadsheet.
How is the macro created? First, create a data file and set up the necessary cells. Data file Statistics Updating Demo.sta (available in the Examples\Macros\Document Event Examples directory, located in the directory in which you installed Statistica) demonstrates this functionality. In this particular data file, a cat clinic's records are kept within the spreadsheet, and its Statistical results are kept in another linked spreadsheet window. Whenever the data is altered in the data file, the corresponding results spreadsheet are updated.
When the spreadsheet is first opened, it appears as:
Then, its corresponding results spreadsheet is displayed as in the following image:
Now, while both of these spreadsheets are open, let's update the data file and watch it automatically update the results spreadsheet. For example, let's say that Isabelle comes in for a checkup. Her weight has increased by .9 pounds and she has also had a birthday since her last visit. We will update the data file by changing Isabelle's weight to 13.5, her age to 5, and her number of previous visits to 4. Instead of having to manually run descriptive statistics on this spreadsheet to update the results, the data file automatically does it for you. After altering the data file, the results spreadsheet now appears as:
' Document Events Option Base 1 Dim StatsSpreadsheet As New Spreadsheet Dim this As Spreadsheet Const NumOfStats As Integer = 10
Select Document in the Object box of the SVB editor (Document Events window); select the DataChanged event in the Proc box, and type the following program into the SVB editor:
Private Sub Document_DataChanged(ByVal Flags As Long, _ ByVal FirstCase As Long, ByVal FirstVar As Long, _ ByVal LastCase As Long, ByVal LastVar As Long, _ ByVal bLast As Boolean) CreateStatSheet ProcessStats End Sub
Select Document in the Object box of the SVB editor (Document Events window); select the Open event in the Proc box, and type the following program into the SVB editor:
Private Sub Document_Open() CreateStatSheet ProcessStats End Sub
Set the cursor below the Document_Open function (so that you are outside of any function) and type (or copy and paste) the following additional functions into the SVB editor:
Private Sub ProcessStats() Set this = EventSource Dim SumArray() As Variant ReDim SumArray(this.NumberOfVariables) Dim ValidObs() As Variant ReDim ValidObs(this.NumberOfVariables) Dim MinVals() As Variant ReDim MinVals(this.NumberOfVariables) Dim MaxVals() As Variant ReDim MaxVals(this.NumberOfVariables) Dim Medians() As Variant ReDim Medians(this.NumberOfVariables) Dim LowerQuartiles() As Variant ReDim LowerQuartiles(this.NumberOfVariables) Dim UpperQuartiles() As Variant ReDim UpperQuartiles(this.NumberOfVariables) Dim Variances() As Variant ReDim Variances(this.NumberOfVariables) Dim StandardDev() As Variant ReDim StandardDev(this.NumberOfVariables) For i = 1 To this.NumberOfVariables SumArray(i) = GetSum(i) ValidObs(i) = GetValidObs(i) MinVals(i) = GetMin(i) MaxVals(i) = GetMax(i) Medians(i) = GetMedian(i) LowerQuartiles(i) = GetLowerQuartile(i) UpperQuartiles(i) = GetUpperQuartiles(i) Variances(i) = GetVariance(i) StandardDev(i) = GetStandardDev(i) Next i For i = 1 To this.NumberOfVariables 'MEANS 'prevent division by zero If (ValidObs(i)) Then StatsSpreadsheet.Value(1,i) = SumArray(i)/ValidObs(i) Else 'zero number of observations for the variable StatsSpreadsheet.Value(1,i) = "----" End If 'MEDIANS StatsSpreadsheet.Value(2,i) = Medians(i) 'STANDARD DEVIATION StatsSpreadsheet.Value(3,i) = StandardDev(i) 'VALID NUMBER OF OBSERVATIONS StatsSpreadsheet.Value(4,i) = ValidObs(i) 'SUMS If SumArray(i) Then StatsSpreadsheet.Value(5,i) = SumArray(i) Else StatsSpreadsheet.Value(5,i) = "----" End If 'MINS StatsSpreadsheet.Value(6,i) = MinVals(i) 'MINS StatsSpreadsheet.Value(7,i) = MaxVals(i) 'LOWER QUARTILES StatsSpreadsheet.Value(8,i) = LowerQuartiles(i) 'HIGHER QUARTILES StatsSpreadsheet.Value(9,i) = UpperQuartiles(i) 'VARIANCE StatsSpreadsheet.Value(10,i) = Variances(i) Next i End Sub Function GetSum(VariableNum) As Double For i = 1 To this.NumberOfCases 'ignore missing data If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then TotalVal = TotalVal + this.Value(i,VariableNum) End If Next i GetSum = TotalVal End Function Function GetValidObs(VariableNum) As Double Dim TotalValid As Long TotalValid = 0 For i = 1 To this.NumberOfCases 'ignore missing data and make sure case selections 'allow its inclusion If Not(this.MissingData(i, VariableNum)) And _ this.SelectionCondition.Evaluate(i) Then TotalValid = TotalValid + 1 End If Next i GetValidObs = TotalValid End Function Function GetMin(VariableNum) As Variant 'Missing Data is the lowest value Dim CurrentMin As Double Dim FirstValidCase As Long FirstValidCase = 1 If GetValidObs(VariableNum) = 0 Then GetMin = "------" Exit Function End If While Not this.SelectionCondition.Evaluate(FirstValidCase) And _ FirstValidCase <> this.NumberOfCases FirstValidCase = FirstValidCase + 1 Wend 'all of the cases are being ignored If FirstValidCase > this.NumberOfCases Then GetMin = "------" Exit Function End If CurrentMin = this.Value(FirstValidCase,VariableNum) For i = (FirstValidCase + 1) To this.NumberOfCases If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then If CurrentMin > this.Value(i,VariableNum) Then CurrentMin = this.Value(i,VariableNum) End If End If Next i GetMin = CurrentMin End Function Function GetMax(VariableNum) As Variant 'Missing Data is the lowest value Dim CurrentMin As Double Dim FirstValidCase As Long FirstValidCase = 1 If GetValidObs(VariableNum) = 0 Then GetMax = "------" Exit Function End If While Not this.SelectionCondition.Evaluate(FirstValidCase) And _ FirstValidCase <> this.NumberOfCases FirstValidCase = FirstValidCase + 1 Wend 'all of the cases are being ignored If FirstValidCase > this.NumberOfCases Then GetMax = "----" Exit Function End If CurrentMax = this.Value(FirstValidCase,VariableNum) For i = (FirstValidCase + 1) To this.NumberOfCases If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then If CurrentMax < this.Value(i,VariableNum) Then CurrentMax = this.Value(i,VariableNum) End If End If Next i GetMax = CurrentMax End Function Function GetMedian(VariableNum) As Variant Dim Temp As Double Dim MedianPos As Integer MedianPos = 1 If GetValidObs(VariableNum) = 0 Then GetMedian = "----" Exit Function End If Dim CaseArray() As Double ReDim CaseArray(GetValidObs(VariableNum)) 'move case values to temporary array for sorting Dim CurrentCasePos As Long CurrentCasePos = 1 For i = 1 To this.NumberOfCases If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then CaseArray(CurrentCasePos) = this.Value(i,VariableNum) CurrentCasePos = CurrentCasePos + 1 End If Next i For Outer = GetValidObs(VariableNum) - 1 To 1 Step -1 For i = 1 To Outer 'ascending order (low-high) If CaseArray(i) > CaseArray(i + 1) Then Temp = CaseArray(i + 1) CaseArray(i + 1) = CaseArray(i) CaseArray(i) = Temp End If Next i Next Outer 'is the number of cases odd? If GetValidObs(VariableNum) = 1 Then GetMedian = CaseArray(1) ElseIf (GetValidObs(VariableNum) Mod 2) Then MedianPos = Fix((GetValidObs(VariableNum)/2)) + 1 GetMedian = CaseArray(MedianPos) Else MedianPos = (GetValidObs(VariableNum)/2) GetMedian = (CaseArray(MedianPos) + CaseArray(MedianPos + 1))/2 End If End Function Function GetLowerQuartile(VariableNum) As Variant Dim Temp As Double Dim LowerPos As Integer LowerPos = 1 Dim CaseArray() As Double If GetValidObs(VariableNum) = 0 Then GetLowerQuartile = "----" Exit Function End If ReDim CaseArray(GetValidObs(VariableNum)) 'move case values to temporary array for sorting Dim CurrentCaseArray As Long CurrentCaseArray = 1 For i = 1 To this.NumberOfCases If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then CaseArray(CurrentCaseArray) = this.Value(i,VariableNum) CurrentCaseArray = CurrentCaseArray + 1 End If Next i For Outer = GetValidObs(VariableNum) - 1 To 1 Step -1 For i = 1 To Outer 'ascending order (low-high) If CaseArray(i) > CaseArray(i + 1) Then Temp = CaseArray(i + 1) CaseArray(i + 1) = CaseArray(i) CaseArray(i) = Temp End If Next i Next Outer LowerPos = Fix((GetValidObs(VariableNum)/4)) + 1 GetLowerQuartile = CaseArray(LowerPos) End Function Function GetUpperQuartiles(VariableNum) As Variant Dim Temp As Double Dim LowerPos As Integer Dim UpperPos As Integer LowerPos = 1 Dim CaseArray() As Double If GetValidObs(VariableNum) = 0 Then GetUpperQuartiles = "----" Exit Function End If ReDim CaseArray(GetValidObs(VariableNum)) 'move case values to temporary array for sorting Dim CurrentCaseArray As Long CurrentCaseArray = 1 For i = 1 To this.NumberOfCases If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then CaseArray(CurrentCaseArray) = this.Value(i,VariableNum) CurrentCaseArray = CurrentCaseArray + 1 End If Next i For Outer = GetValidObs(VariableNum) - 1 To 1 Step -1 For i = 1 To Outer 'ascending order (low-high) If CaseArray(i) > CaseArray(i + 1) Then Temp = CaseArray(i + 1) CaseArray(i + 1) = CaseArray(i) CaseArray(i) = Temp End If Next i Next Outer LowerPos = Fix((GetValidObs(VariableNum)/4)) UpperPos =GetValidObs(VariableNum) - LowerPos GetUpperQuartiles = CaseArray(UpperPos) End Function Function GetVariance(VariableNum) As Variant Dim Sum As Double Dim N As Long Dim Mean As Double Dim SumOfSquares As Double SumOfSquares = 0 Sum = GetSum(VariableNum) N = GetValidObs(VariableNum) 'if there aren't any valid observations then return 0 If Sum = 0 Or N < 2 Then GetVariance = "----" Exit Function End If Mean = Sum/N For i = 1 To this.NumberOfCases If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then 'calculate squares of differences 'iterate through each case, subtract the means from it, 'and then square it. Add all of these up and you will 'have the sum of squares. SumOfSquares = SumOfSquares + _ ((this.Value(i,VariableNum) - Mean) ^ 2) End If Next i GetVariance = SumOfSquares / (N - 1) End Function Function GetStandardDev(VariableNum) 'it there aren't any valid cases then don't calculate std. dev. If GetVariance(VariableNum) = "----" Then GetStandardDev = "----" Else GetStandardDev = Sqr(GetVariance(VariableNum)) End If End Function Sub CreateStatSheet Set this = EventSource 'are there any extra rows at the bottom? Delete them If (StatsSpreadsheet.Cases.Count > NumOfStats) Then StatsSpreadsheet.DeleteCases _ (NumOfStats + 1, StatsSpreadsheet.Cases.Count) End If If (StatsSpreadsheet.Variables.Count > this.NumberOfVariables) Then StatsSpreadsheet.DeleteVariables(this.NumberOfVariables + 1, _ StatsSpreadsheet.Variables.Count) ElseIf (StatsSpreadsheet.Variables.Count < this.NumberOfVariables) Then StatsSpreadsheet.AddVariables("",StatsSpreadsheet.NumberOfVariables, _ this.NumberOfVariables - StatsSpreadsheet.NumberOfVariables) End If 'set it to output spreadsheet StatsSpreadsheet.InputSpreadsheet = False 'set the results spreadsheet header StatsSpreadsheet.Header.Value = this.Name & "'s current statistics" 'move the variable names from here to the stats spreadsheet For i = 1 To this.NumberOfVariables StatsSpreadsheet.Variable(i).ColumnName = _ this.Variable(i).ColumnName StatsSpreadsheet.Variable(i).AutoFit StatsSpreadsheet.Variable(i).HorizontalAlignment = 2 Next i 'set up all of the stats names StatsSpreadsheet.CaseName(1) = "Means" StatsSpreadsheet.CaseName(2) = "Medians" StatsSpreadsheet.CaseName(3) = "Standard Deviation" StatsSpreadsheet.CaseName(4) = "Valid Number of Observations" StatsSpreadsheet.CaseName(5) = "Sums" StatsSpreadsheet.CaseName(6) = "Minimum Values" StatsSpreadsheet.CaseName(7) = "Maximum Values" StatsSpreadsheet.CaseName(8) = "25th Percentiles" StatsSpreadsheet.CaseName(9) = "75th Percentiles" StatsSpreadsheet.CaseName(10) = "Variance" StatsSpreadsheet.CaseNameWidth = 2.1 'add a warning message to the results spreadsheet to help 'prevent it from being closed StatsSpreadsheet.EventCode = _ "Private Sub Document_BeforeClose(Cancel As Boolean)" & vbCrLf & _ "If (MsgBox(""Closing this spreadsheet will prevent the statistics " & _ "from the attatched dataset from updating. Are you sure you wish to " & _ "close this document"",vbExclamation Or vbYesNo) = vbNo) Then" & _ vbCrLf & "Cancel = True" & vbCrLf & "End if" & vbCrLf & "End Sub" StatsSpreadsheet.RunEvent 'make sure that the stats Spreadsheet can be seen If Not(StatsSpreadsheet.Visible) Then StatsSpreadsheet.Visible = True End If End Sub
Saving the spreadsheet and AutoRun. Finally, before saving the macro and the data file, click on the data spreadsheet once more, and select Autorun from the View - Events menu.
This causes the new macro to run automatically every time you open the data spreadsheet. Next, save the spreadsheet and run the macro.
This simple example illustrates how you can build data files that can process their own Statistical and mathematical analyses automatically, without requiring you to run the descriptive statistics manually to review your altered data's results.