VBA vs OOo Basic Performance II

This article follows on from VBA vs OOo Basic Performance where I discussed OOo Basic performance and coding efficiency, and one of the specific issues was “how long does it take to load a reasonably large Calc column into an array for processing in Basic.  So I set myself a little challenge: what is the fastest way to load the values from an worksheet range Sheet1.D1:D2000 (which happen to contain random numbers from 0 to 1,000,000,000 into a vector a(0 to 19999)?  Part of my own reasons is that I am characterising the OOo Basic (OOB) compiler and RTS, but I decided to post this because it might give those readers you are interested in speeding up slow running code.

I lay out seven different variants below and their timing on my Laptop.  The run-times for what might superficially seem to be stuff that the compiler would all compile down to the same code anyway are in fact spread by a factor of four.  But the bottom line is that there are some simple rules of thumb for getting your code to run fast:

  • Don’t even bother trying if you are only falling through it once. Just focus on Keeping It Simple Stupid here.
  • But once your response times are getting into seconds and this is an inconvenience then you might want to think about optimisation
  • Minimise the number of UNO calls that you make by using local variables to cache UNO method and property chains e.g. if you are going to refer to compound objects frequently then use a variant. For example, I have got into the habit of always having oDoc, oCtlr, oSheets, oThisSheet in scope.
  • Avoid the With … End With construct in OOB. Using them actually makes your run-times slightly worse. Better and faster to use a simple local variant and refer to it as you can see from imrovement that you get going from tests 1 then 3 to 5.
  • Where you do need to refer to objects by fixed name, then if the object supports the getByName it actually works out slightly faster as well as being easier the read using the pseudo-property oSheets.Sheet1 then the explicit method call oSheets.getByName(“Sheet1”) as you can see from test 2.
  • Always use an UNO aggregate method in preference to looping around and elemental one — for example use CellRange.DataArray rather than looping around Cell.Value or Cell.String
  • My last example is really a bit of extreme programming to force a spreadsheet callback to a Basic macro. Why? Because macros pass a range as a two dimensional array rather than a List of Lists (LoLs), and Basic can access the former far more efficiently

Even though this last approach is by far the fastest, I don’t really recommend it unless you are extremely confident in your programming skill, but what this does demonstrate is that Basic currently handles LoLs inefficiently which is a pity given that the are a core mechanism for UNO data bundling.  Ideally we should extend the OOB syntax in line with VBA and allow programmatic constructs such as a(i) (0) to manipulate LoLs.

Perhaps a simpler interim might be to add a couple of helper routines. For example if we had a ColumnSlice(<array> or <LoL>, <index,…) to return by value a single column from an Matrix or LoL; then the code for Testt 6 could be simplified to the following which would then be the fasted method whilst still retaining an acceptable level of simplicity:

y = ThisComponent.Sheets.Sheet1.getCellRangeByPosition(3,0,3,n).DataArray
z = ColumnSlice(y,0) 
For i = 0 To n : x(i) = z(i) : Next i

(And yes, this is another one for an enhancement request)

The Test Routines

I have the full test harness if anyone wants it, but to shorten this post, I trimmed down to the key extract.  The timings were on my laptop averaged over ten iterations.

Test 1 — 10.68 Secs

Sub Test1(x, n) : Dim i  ' Full routine shown for the 
1st Test 
With ThisComponent.Sheets.getByName("Sheet1") 
For i = 0 To n : x(i) = .getCellByPosition(3,i).getString() : Next i 
End With 
End Sub

Test 2 — 10.81 Secs

For i = 0 To n : x(i) = ThisComponent.Sheets.getByName("Sheet1").getCellByPosition(3,i).getString() : Next i

Test 3 — 10.16 Secs

For i = 0 To n : x(i) = ThisComponent.Sheets.Sheet1.getCellByPosition(3,i).String : Next i

Test 4 — 8.09 Secs

s = ThisComponent.Sheets 
For i = 0 To n : x(i) = s.Sheet1.getCellByPosition(3,i).String : Next i

Test 5 — 5.17 Secs

s = ThisComponent.Sheets.Sheet1 
For i = 0 To n : x(i) = s.getCellByPosition(3,i).String : Next i

Test 6 — 3.52 Secs

y = ThisComponent.Sheets.Sheet1.getCellRangeByPosition(3,0,3,n).DataArray	
For i = 0 To n : r = y(i) : x(i) = r(0) : Next i

Test 7 — 2.79 Secs

c = ThisComponent.Sheets.Sheet1.getCellByPosition(3,n+2) 
zGbl = x 
c.Formula = "=TEST7CALLBACK(" & n & ";D1:D" & (n+1) & ")" 
ThisComponent.calculate() 
c.clearContents(-1) 
Function TEST7CALLBACK(n, a)  ' a is Variant(1 to n+1, 1 to 1) 
  Dim i :   For i = 0 To n : zGbl(i) = a(i+1,1) : Next i 
  TEST7CALLBACK = zGbl(1) 
End Function

Leave a Reply