Macro (SVB) Programs Example - Creating a Self-Updating Dataset
The functionality for a spreadsheet to run Statistical analyses on itself whenever its data is changed can be programmed into it by attaching an SVB macro to certain spreadsheet events as demonstrated in this example. Note that practically all spreadsheet (and other) events can be customized, thus providing the tools to build very sophisticated and highly customized automated data operations right "into" the spreadsheet.
How is the macro created? First, create a data file and set up the necessary cells. Data file Statistics Updating Demo.sta (available in the Examples\Macros\Document Event Examples directory, located in the directory in which you installed Statistica) demonstrates this functionality. In this particular data file, a cat clinic's records are kept within the spreadsheet, and its Statistical results are kept in another linked spreadsheet window. Whenever the data is altered in the data file, the corresponding results spreadsheet are updated.
When the spreadsheet is first opened, it appears as:
Then, its corresponding results spreadsheet is displayed as in the following image:
Now, while both of these spreadsheets are open, let's update the data file and watch it automatically update the results spreadsheet. For example, let's say that Isabelle comes in for a checkup. Her weight has increased by .9 pounds and she has also had a birthday since her last visit. We will update the data file by changing Isabelle's weight to 13.5, her age to 5, and her number of previous visits to 4. Instead of having to manually run descriptive statistics on this spreadsheet to update the results, the data file automatically does it for you. After altering the data file, the results spreadsheet now appears as:
' Document Events Option Base 1 Dim StatsSpreadsheet As New Spreadsheet Dim this As Spreadsheet Const NumOfStats As Integer = 10
Select Document in the Object box of the SVB editor (Document Events window); select the DataChanged event in the Proc box, and type the following program into the SVB editor:
Private Sub Document_DataChanged(ByVal Flags As Long, _ ByVal FirstCase As Long, ByVal FirstVar As Long, _ ByVal LastCase As Long, ByVal LastVar As Long, _ ByVal bLast As Boolean) CreateStatSheet ProcessStats End Sub
Select Document in the Object box of the SVB editor (Document Events window); select the Open event in the Proc box, and type the following program into the SVB editor:
Private Sub Document_Open() CreateStatSheet ProcessStats End Sub
Set the cursor below the Document_Open function (so that you are outside of any function) and type (or copy and paste) the following additional functions into the SVB editor:
Private Sub ProcessStats()
Set this = EventSource
Dim SumArray() As Variant
ReDim SumArray(this.NumberOfVariables)
Dim ValidObs() As Variant
ReDim ValidObs(this.NumberOfVariables)
Dim MinVals() As Variant
ReDim MinVals(this.NumberOfVariables)
Dim MaxVals() As Variant
ReDim MaxVals(this.NumberOfVariables)
Dim Medians() As Variant
ReDim Medians(this.NumberOfVariables)
Dim LowerQuartiles() As Variant
ReDim LowerQuartiles(this.NumberOfVariables)
Dim UpperQuartiles() As Variant
ReDim UpperQuartiles(this.NumberOfVariables)
Dim Variances() As Variant
ReDim Variances(this.NumberOfVariables)
Dim StandardDev() As Variant
ReDim StandardDev(this.NumberOfVariables)
For i = 1 To this.NumberOfVariables
SumArray(i) = GetSum(i)
ValidObs(i) = GetValidObs(i)
MinVals(i) = GetMin(i)
MaxVals(i) = GetMax(i)
Medians(i) = GetMedian(i)
LowerQuartiles(i) = GetLowerQuartile(i)
UpperQuartiles(i) = GetUpperQuartiles(i)
Variances(i) = GetVariance(i)
StandardDev(i) = GetStandardDev(i)
Next i
For i = 1 To this.NumberOfVariables
'MEANS
'prevent division by zero If (ValidObs(i)) Then
StatsSpreadsheet.Value(1,i) = SumArray(i)/ValidObs(i)
Else
'zero number of observations for the variable
StatsSpreadsheet.Value(1,i) = "----"
End If
'MEDIANS
StatsSpreadsheet.Value(2,i) = Medians(i)
'STANDARD DEVIATION
StatsSpreadsheet.Value(3,i) = StandardDev(i)
'VALID NUMBER OF OBSERVATIONS
StatsSpreadsheet.Value(4,i) = ValidObs(i)
'SUMS If SumArray(i) Then
StatsSpreadsheet.Value(5,i) = SumArray(i)
Else
StatsSpreadsheet.Value(5,i) = "----"
End If
'MINS
StatsSpreadsheet.Value(6,i) = MinVals(i)
'MINS
StatsSpreadsheet.Value(7,i) = MaxVals(i)
'LOWER QUARTILES
StatsSpreadsheet.Value(8,i) = LowerQuartiles(i)
'HIGHER QUARTILES
StatsSpreadsheet.Value(9,i) = UpperQuartiles(i)
'VARIANCE StatsSpreadsheet.Value(10,i) = Variances(i)
Next i
End Sub
Function GetSum(VariableNum) As Double
For i = 1 To this.NumberOfCases
'ignore missing data
If this.SelectionCondition.Evaluate(i)
And _ Not(this.MissingData(i,VariableNum)) Then
TotalVal = TotalVal + this.Value(i,VariableNum)
End If
Next i
GetSum = TotalVal
End Function
Function GetValidObs(VariableNum) As Double
Dim TotalValid As Long
TotalValid = 0
For i = 1 To this.NumberOfCases
'ignore missing data and make sure case selections
'allow its inclusion
If Not(this.MissingData(i, VariableNum)) And _ this.SelectionCondition.Evaluate(i)
Then
TotalValid = TotalValid + 1
End If
Next i
GetValidObs = TotalValid
End Function
Function GetMin(VariableNum) As Variant
'Missing Data is the lowest value Dim CurrentMin As Double Dim FirstValidCase As Long
FirstValidCase = 1
If GetValidObs(VariableNum) = 0 Then
GetMin = "------" Exit Function
End If
While Not this.SelectionCondition.Evaluate(FirstValidCase) And _ FirstValidCase <> this.NumberOfCases
FirstValidCase = FirstValidCase + 1
Wend
'all of the cases are being ignored
If FirstValidCase > this.NumberOfCases Then
GetMin = "------" Exit Function
End If
CurrentMin = this.Value(FirstValidCase,VariableNum)
For i = (FirstValidCase + 1) To this.NumberOfCases
If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then
If CurrentMin > this.Value(i,VariableNum) Then
CurrentMin = this.Value(i,VariableNum)
End If
End If
Next i
GetMin = CurrentMin
End Function
Function GetMax(VariableNum) As Variant
'Missing Data is the lowest value Dim CurrentMin As Double Dim FirstValidCase As Long
FirstValidCase = 1
If GetValidObs(VariableNum) = 0 Then
GetMax = "------" Exit Function
End If
While Not this.SelectionCondition.Evaluate(FirstValidCase) And _ FirstValidCase <> this.NumberOfCases
FirstValidCase = FirstValidCase + 1
Wend
'all of the cases are being ignored If FirstValidCase > this.NumberOfCases Then
GetMax = "----" Exit Function
End If
CurrentMax = this.Value(FirstValidCase,VariableNum)
For i = (FirstValidCase + 1) To this.NumberOfCases
If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then
If CurrentMax < this.Value(i,VariableNum) Then
CurrentMax = this.Value(i,VariableNum)
End If
End If
Next i
GetMax = CurrentMax
End Function
Function GetMedian(VariableNum) As Variant
Dim Temp As Double Dim MedianPos As Integer
MedianPos = 1
If GetValidObs(VariableNum) = 0 Then
GetMedian = "----" Exit Function
End If
Dim CaseArray() As Double ReDim CaseArray(GetValidObs(VariableNum))
'move case values to temporary array for sorting Dim CurrentCasePos As Long
CurrentCasePos = 1
For i = 1 To this.NumberOfCases
If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then
CaseArray(CurrentCasePos) = this.Value(i,VariableNum) CurrentCasePos = CurrentCasePos + 1
End If
Next i
For Outer = GetValidObs(VariableNum) - 1 To 1 Step -1
For i = 1 To Outer
'ascending order (low-high) If CaseArray(i) > CaseArray(i + 1) Then
Temp = CaseArray(i + 1) CaseArray(i + 1) = CaseArray(i) CaseArray(i) = Temp
End If
Next i
Next Outer
'is the number of cases odd? If GetValidObs(VariableNum) = 1 Then
GetMedian = CaseArray(1)
ElseIf (GetValidObs(VariableNum) Mod 2) Then
MedianPos = Fix((GetValidObs(VariableNum)/2)) + 1 GetMedian = CaseArray(MedianPos)
Else
MedianPos = (GetValidObs(VariableNum)/2) GetMedian = (CaseArray(MedianPos) + CaseArray(MedianPos + 1))/2
End If
End Function
Function GetLowerQuartile(VariableNum) As Variant
Dim Temp As Double Dim LowerPos As Integer LowerPos = 1 Dim CaseArray() As Double
If GetValidObs(VariableNum) = 0 Then
GetLowerQuartile = "----" Exit Function
End If
ReDim CaseArray(GetValidObs(VariableNum))
'move case values to temporary array for sorting Dim CurrentCaseArray As Long CurrentCaseArray = 1
For i = 1 To this.NumberOfCases
If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then
CaseArray(CurrentCaseArray) = this.Value(i,VariableNum) CurrentCaseArray = CurrentCaseArray + 1
End If
Next i
For Outer = GetValidObs(VariableNum) - 1 To 1 Step -1
For i = 1 To Outer
'ascending order (low-high) If CaseArray(i) > CaseArray(i + 1) Then
Temp = CaseArray(i + 1) CaseArray(i + 1) = CaseArray(i) CaseArray(i) = Temp
End If
Next i
Next Outer
LowerPos = Fix((GetValidObs(VariableNum)/4)) + 1 GetLowerQuartile = CaseArray(LowerPos)
End Function
Function GetUpperQuartiles(VariableNum) As Variant
Dim Temp As Double Dim LowerPos As Integer Dim UpperPos As Integer LowerPos = 1 Dim CaseArray() As Double
If GetValidObs(VariableNum) = 0 Then
GetUpperQuartiles = "----" Exit Function
End If
ReDim CaseArray(GetValidObs(VariableNum))
'move case values to temporary array for sorting Dim CurrentCaseArray As Long CurrentCaseArray = 1
For i = 1 To this.NumberOfCases
If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then
CaseArray(CurrentCaseArray) = this.Value(i,VariableNum) CurrentCaseArray = CurrentCaseArray + 1
End If
Next i
For Outer = GetValidObs(VariableNum) - 1 To 1 Step -1
For i = 1 To Outer
'ascending order (low-high) If CaseArray(i) > CaseArray(i + 1) Then
Temp = CaseArray(i + 1) CaseArray(i + 1) = CaseArray(i) CaseArray(i) = Temp
End If
Next i
Next Outer
LowerPos = Fix((GetValidObs(VariableNum)/4)) UpperPos =GetValidObs(VariableNum) - LowerPos GetUpperQuartiles = CaseArray(UpperPos)
End Function
Function GetVariance(VariableNum) As Variant
Dim Sum As Double Dim N As Long Dim Mean As Double Dim SumOfSquares As Double SumOfSquares = 0 Sum = GetSum(VariableNum) N = GetValidObs(VariableNum)
'if there aren't any valid observations then return 0 If Sum = 0 Or N < 2 Then
GetVariance = "----" Exit Function
End If
Mean = Sum/N
For i = 1 To this.NumberOfCases
If this.SelectionCondition.Evaluate(i) And _ Not(this.MissingData(i,VariableNum)) Then
'calculate squares of differences 'iterate through each case, subtract the means from it, 'and then square it. Add all of these up and you will 'have the sum of squares.
SumOfSquares = SumOfSquares + _ ((this.Value(i,VariableNum) - Mean) ^ 2)
End If
Next i
GetVariance = SumOfSquares / (N - 1)
End Function
Function GetStandardDev(VariableNum)
'it there aren't any valid cases then don't calculate std. dev. If GetVariance(VariableNum) = "----" Then
GetStandardDev = "----"
Else
GetStandardDev = Sqr(GetVariance(VariableNum))
End If
End Function
Sub CreateStatSheet
Set this = EventSource
'are there any extra rows at the bottom? Delete them If (StatsSpreadsheet.Cases.Count > NumOfStats) Then
StatsSpreadsheet.DeleteCases _ (NumOfStats + 1, StatsSpreadsheet.Cases.Count)
End If
If (StatsSpreadsheet.Variables.Count > this.NumberOfVariables) Then
StatsSpreadsheet.DeleteVariables(this.NumberOfVariables + 1, _ StatsSpreadsheet.Variables.Count)
ElseIf (StatsSpreadsheet.Variables.Count < this.NumberOfVariables) Then
StatsSpreadsheet.AddVariables("",StatsSpreadsheet.NumberOfVariables, _ this.NumberOfVariables - StatsSpreadsheet.NumberOfVariables)
End If
'set it to output spreadsheet StatsSpreadsheet.InputSpreadsheet = False 'set the results spreadsheet header StatsSpreadsheet.Header.Value = this.Name & "'s current statistics" 'move the variable names from here to the stats spreadsheet
For i = 1 To this.NumberOfVariables
StatsSpreadsheet.Variable(i).ColumnName = _ this.Variable(i).ColumnName StatsSpreadsheet.Variable(i).AutoFit StatsSpreadsheet.Variable(i).HorizontalAlignment = 2
Next i
'set up all of the stats names StatsSpreadsheet.CaseName(1) = "Means" StatsSpreadsheet.CaseName(2) = "Medians" StatsSpreadsheet.CaseName(3) = "Standard Deviation" StatsSpreadsheet.CaseName(4) = "Valid Number of Observations" StatsSpreadsheet.CaseName(5) = "Sums" StatsSpreadsheet.CaseName(6) = "Minimum Values" StatsSpreadsheet.CaseName(7) = "Maximum Values" StatsSpreadsheet.CaseName(8) = "25th Percentiles" StatsSpreadsheet.CaseName(9) = "75th Percentiles" StatsSpreadsheet.CaseName(10) = "Variance" StatsSpreadsheet.CaseNameWidth = 2.1
'add a warning message to the results spreadsheet to help 'prevent it from being closed
StatsSpreadsheet.EventCode = _ "Private Sub Document_BeforeClose(Cancel As Boolean)" & vbCrLf & _ "If (MsgBox(""Closing this spreadsheet will prevent the statistics " & _ "from the attatched dataset from updating. Are you sure you wish to " & _ "close this document"",vbExclamation Or vbYesNo) = vbNo) Then" & _ vbCrLf & "Cancel = True" & vbCrLf & "End if" & vbCrLf & "End Sub"
StatsSpreadsheet.RunEvent
'make sure that the stats Spreadsheet can be seen If Not(StatsSpreadsheet.Visible) Then
StatsSpreadsheet.Visible = True
End If
End Sub
Saving the spreadsheet and AutoRun. Finally, before saving the macro and the data file, click on the data spreadsheet once more, and select Autorun from the View - Events menu.
This causes the new macro to run automatically every time you open the data spreadsheet. Next, save the spreadsheet and run the macro.
This simple example illustrates how you can build data files that can process their own Statistical and mathematical analyses automatically, without requiring you to run the descriptive statistics manually to review your altered data's results.
