RSS Feeds

VBA vs OOo Basic Performance

I am currently working on a HOWTO on VBA to OOB migration. As part of this I have been doing a benchmark comparison on their relative performance. The preliminary results are:

Run Time in Secs On 1.2Ghz P3M Laptop VBA OOB Ratio
ShellSort(Intensive Basic) 0.29 19.24 1:66
Test Harness (Lots of UNO / COM Calls) 1.11 19.66 1:18

The sample one-sigma was typically 3% or so over a batch of runs giving a broad comparison 1 Sec VBA = 1 Min OOB for pure code. On my Laptop OOB runs at about 50 KIPS which is fine for non-looping algo's, but it could take a VBA migrator by nasty surprise when migrating a heavy one.

This was run under XP/SP2 agains MSOffice 2003. Also note that from PERFMON I could see that the Calc and Basic RTS are running one-task in SOFFICE.BIN. This was all user compute. It just looks like the design/performance of the Basic interpreter/RTS means that it runs like a total dog. FYI VBscript runs at about a 1:2 ratio -- that is 30 times faster then OOB on the sort routine at least. I haven't done the timing on using VBscript over the Automation Bridge yet.

BTW With the exception of toggling one True/False this runs unchanged on both VBA and OOB.

The Benchmark routine — A Shell Sort

   
Option Explicit  
'
' Module ShellSort  
'  
Public Function ShellSort(ByRef A() As Variant) As String  
'  
' Complements to Donald Knuth, "Art of Computer Programming  
'                               Vol 3 - Sorting and Searching"  
'  
' There are various ShellSort variants.  All use a process of sorting		 
' a set of collapsing subsequences.  If you think of a bubble sort, the worse  
' runtime is O(N*Nb) where Nb is the largest distanc of an entry in an
' unsorted position from its sorted position.  The trick that the Shell
' algortihm exploits is to do the initial passes to do large swap bounds,  
' decreasing these as the sort progresses, so the journey of an individual  
' element from its start position look quite like a binary chop.  
'  
' The exchange sort variant below runs in about half the time of a bubble sort  
' variant.  You can also play bounds with the nesting strategy. This  
' implementation uses a Sedgewick Sequence. RTFM for more info.  
'  
' The average sort time is O(N^(7/6)) and worst case O(N^(4/3)). For comparison  
' The average sort time is O(N*Log N) and worst case O(N^2) for Quicksort.  
   Dim Alb As Variant, Aub As Variant, Aspan As Variant  
   Dim nSedgewick As Variant, nShell As Long, inc As Long  
   Dim i As Long, j As Long, tmp As Variant  
   Dim nMoves As Long, sDebug As String  
   sDebug = ""  
   Alb = LBound(A)  
   Aub = UBound(A)  
   Aspan = Aub - Alb + 1  
   nSedgewick = Array(1, 5, 19, 41, 109, 209, 505, 929, 2161, 3905, 8929, 16001, _  
                       36289, 64769, 146305, 260609, 587521, 1045505, 2354689, _  
                       4188161, 9427969, 16764929, 37730305, 67084289, _
                       150958081, 268386305, 999999999999#)  
   nShell = 0  
   Do While nSedgewick(nShell + 1) < Aspan: nShell = nShell + 1: Loop
   Do While nShell >= 0:  
      ' sort by insertion in increments of h  
      inc = nSedgewick(nShell)  
      For i = Alb + inc To Aub  
         tmp = A(i)  
         For j = i - inc To Alb Step -inc  
            If A(j) <= tmp Then Exit For  
            A(j + inc) = A(j)  
         Next j  
         A(j + inc) = tmp  
      Next i  
      sDebug = sDebug & Chr(10) & inc & ", " & nMoves & ", " & (Aub - Alb - inc)  
      nShell = nShell - 1  
   Loop  
 ' The real reason that this is a function is to avoid the variant syntax  
 ' of sub calls in VBA vs. OOB  
   ShellSort = sDebug  
End Function

Module TestHarness

Option Explicit  
#Const OOB = True     ' Set this True or False depending on whether  
                      ' your code is in Calc or Excel  
Const nArray = 20000  
Sub test()  
Dim x(), timer1, timer2, sDebug As String  
ReDim x(nArray - 1)  
' Disable screen updating and AutoCalculate  
#If OOB Then  
Msgbox "def OOB"  
   ThisComponent.addActionLock  
   ThisComponent.lockControllers  
   ThisComponent.enableAutomaticCalculation (False)  
#Else  
   Application.ScreenUpdating = False  
   Application.Calculation = xlCalculationManual  
#End If  
' timer1 times the load and fetch overhad (UNO/COM intensive)  
timer1 = getTimeMS()  
LoadArray x, 1  
' timer1 times the sort (pure Basic runtime no UNO/COM)  
timer2 = getTimeMS()  
sDebug = ShellSort(x())  
timer2 = getTimeMS() - timer2  
StoreArray x, 2  
timer1 = getTimeMS() - timer1 - timer2  
' Now post the debug back to the sheet  
StoreArray Array(sDebug, _  
                 "Sort Time = " & (timer2 / 1000!) & " sec" & Chr(10) & _
                 "Load/Store Time = " & (timer1 / 1000!) & " sec" _  
                 ), 3  
' Put Screen Updating and AutoCalculate back to defaults  
#If OOB Then  
   ThisComponent.enableAutomaticCalculation (True)  
   ThisComponent.unLockControllers  
   ThisComponent.removeActionLock  
#Else  
   Application.ScreenUpdating = True  
   Application.Calculation = xlCalculationManual  
#End If  
End Sub  
Sub FillSheetColumn()  
   Dim vArray(nArray), i, y  
   For i = 0 To UBound(vArray)  
      vArray(i) = Chr(Int(25 * Rnd() + Asc("A"))) & CStr(Int(10000000 * 
Rnd()))  
   Next i  
   StoreArray vArray, 1  
End Sub  
Private Sub LoadArray(x, ByVal nCol)  
   Dim i  
   #If OOB Then  
      With ThisComponent.Sheets(0)  
         For i = 0 To UBound(x)  
            x(i) = .getCellbyPosition(nCol - 1, i).String  
         Next i  
      End With  
   #Else  
      With ActiveWorkbook.Sheets(1)  
         For i = 0 To UBound(x)  
            x(i) = .Cells(i + 1, 1).Value  
         Next i  
      End With  
   #End If  
End Sub  
Private Sub StoreArray(x, ByVal nCol)  
   Dim i  
   #If OOB Then  
      With ThisComponent.Sheets(0)  
         For i = 0 To UBound(x)  
            .getCellbyPosition(nCol - 1, i).String = x(i)  
         Next i  
      End With  
   #Else  
      With ActiveWorkbook.Sheets(1)  
         For i = 0 To UBound(x)  
            .Cells(i + 1, nCol).Value = x(i)  
         Next i  
      End With  
   #End If  
End Sub  
Function getTimeMS() As Long  
   #If OOB Then  
      getTimeMS = CLng(CSng(GetSystemTicks) / 0.98)  
   #Else  
      getTimeMS = CLng(Timer() * 1000)  
   #End If  
End Function

Out of interest, I also recoded this routine in Perl to see how the Perl RTS copes with this algorithm. I just focussed on the sort timings and didn't bother to get the automation bridge bit going, but here are the timings:

RTS       Time (sec)   Ratio
OOB 19.24
VBA 0.29 1:66
Perl 0.86 1:22

So Perl is of the same order as VBscript and these set the benchmark for what this sorts of byte-code interpreter could achieve. I spend a couple of hours going through the 66K lines of compiler and RTS code so now I understand why OOB runs like such a dog. To really bottom the fine details would need to do an execution code walk, but that involves a lot of work to set up a test environment.

The nub of the problem is that compiler has a very simple code generation strategy and just about leaves everything to the byte-code interpreter in the RTS. The storing of variables, execution of functions all work off the stack and the code seems reasonable lean. The problem comes when you want to access the variables. So a = b+c gets translated to

FIND “b”, type1  
GET  
FIND “c”, type2  
GET  
PLUS  
FIND “d”, type3  
SET

The problem is that the FIND has to do a lot of work, and this really hits you hard when you have something like:

For i = 0 To 20000  
  ThisComponent.Sheets(0).CellbyPosition(nCol - 1, i).String = x(i)  
Next i

This will call this find function (or one of its variants) 160,000 times. The work involved for a local variant which binds to primitive data type is reasonable lightweight (60,000 calls), array element access slightly worse (another 20,000), but the killer is the object and method accesses (the other 80,000) where the RTS does all the necessary reflection / inspection on the objects and methods / properties. 99.99% of this is duplicated and entirely redundant. What is worse here is that you think you might help by using the With construct, but is just syntactic sugar which is substituted at compile-time, so the following is equivalent to the above:

With ThisComponent.Sheets(0)     
  For i = 0 To 20000  
      .CellbyPosition(nCol - 1, i).String = x(i)  
  Next i  
End With

However recoding this as:

k = ThisComponent.Sheets(0)  
nm1 = nCol - 1  
For i = 0 To j  
  k.getCellbyPosition(nm1, i).setString(x(i))  
Next i  

manually hoists unnecessary work out the loop reducing the total runtime of this piece of code by by nearly 60%!

What is very clear is that the Basic compiler and RTS ignore two essential tenants of good environment design

Now the first of these would require a major piece of re-architecture so couldn't be taken lightly, but I reckon that you could implement a look-aside cache to bypass 90% of this repeated find work with perhaps a few weeks work. Isn't this a worthy project if we have a 10x speed up of the Basic performance as a result?