Understanding memory leaks in OOo and avoiding their impact
- Mon 26th February 2007, 10:17 pm
I've answered a couple of recent topics which related to a general slowdown of systems running OOo, and a common thread here is the impact of memory leaks. So I thought that it might help to give a brief overview of what a memory leak is, why they occur in OOo, and what you can do to mitigate the effect.
What is a memory leak?
In essence, a memory leak is the consequence of one or more bugs in a program: memory is requested from the OS, but some or all of it is not properly returned when the need for it has passed, and it is therefore 'put beyond use' for the rest of the life of the program. (The Wiki Memory Leak article gives a good overview.) Computer languages such as C++ allow the programmer to request extra memory dynamically for specific purposes: for example, in Calc to hold the contents of a newly entered cell, or in OOo Basic to hold the contents of any Basic variable. Yet all such freedoms come at a price: to avoid running out of memory, the programmer must return the memory to the OS when done with it. Getting this wrong will result in memory leaks.
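To make this concrete, here is a minimal sketch of a leak in C++. The names (CellContents, store_and_release, store_and_leak, leaked_after) are hypothetical and are not taken from the OOo source; the live counter exists only so that the leak can be observed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a piece of dynamically allocated document data.
struct CellContents {
    static std::size_t live;   // number of instances currently allocated
    double value;
    explicit CellContents(double v) : value(v) { ++live; }
    ~CellContents() { --live; }
};
std::size_t CellContents::live = 0;

// Correct: the allocation is handed back when we are done with it.
void store_and_release(double v) {
    CellContents* cell = new CellContents(v);
    // ... use the cell ...
    delete cell;
}

// Buggy: the only pointer to the allocation is lost on return,
// so the memory can never be freed for the rest of the program's life.
void store_and_leak(double v) {
    CellContents* cell = new CellContents(v);
    (void)cell;  // ... use the cell, then forget to delete it ...
}

// Runs both variants `calls` times and reports how many objects were left behind.
std::size_t leaked_after(int calls) {
    const std::size_t before = CellContents::live;
    for (int i = 0; i < calls; ++i) {
        store_and_release(i);
        store_and_leak(i);
    }
    return CellContents::live - before;
}
```

Every call to store_and_leak leaves one object behind, so the loss grows in step with the number of calls, which is exactly the cumulative behaviour discussed later in this post.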
Why do they occur in OOo?
To help with all this, one of the OOo development team developed an excellent SvRefBase framework to simplify this burden for other OOo developers coding up Writer, Calc, etc. This framework is:
Functional. In OOo storage structures, entities such as variables or cell contents can be referred to by many other entities. Such data storage can only be safely freed and returned to the OS when it is no longer being used, that is, when no other references remain. The SvRefBase memory management framework implements a usage-counting system for referring to such dynamic resources. Each reference is tracked automatically, and when the reference count goes to zero the resource is freed and returned to the OS.
Simple to use. C++ implements a concept called overloading which enables the complexities of all this to be hidden from the programmer. Objects are described by constructs called classes, which can be built in hierarchies. For all classes based on the SvRefBase class hierarchy, if the programmer follows some simple programming rules, the rest of the detail is hidden at source level and handled by the compiler.
Efficient. When I look at the machine code that the compiler generates for such manipulation, it is pretty hard to implement this scheme more efficiently.
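The general idea of a usage-counting base class can be sketched roughly as follows. This is not the SvRefBase code or its API: RefCounted, Ref and Variable are hypothetical names, and the sketch assumes single-threaded use.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class: every object carries its own usage count.
class RefCounted {
public:
    void add_ref() { ++count_; }
    void release() { if (--count_ == 0) delete this; }  // last user frees it
    unsigned long ref_count() const { return count_; }
    RefCounted(const RefCounted&) = delete;
    RefCounted& operator=(const RefCounted&) = delete;
protected:
    RefCounted() = default;
    virtual ~RefCounted() = default;
private:
    unsigned long count_ = 0;
};

// Hypothetical handle: constructors, destructor and overloaded operators
// do the counting, so calling code never touches the count directly.
template <class T>
class Ref {
public:
    Ref() = default;
    explicit Ref(T* p) : p_(p) { if (p_) p_->add_ref(); }
    Ref(const Ref& other) : p_(other.p_) { if (p_) p_->add_ref(); }
    Ref& operator=(Ref other) { std::swap(p_, other.p_); return *this; }
    ~Ref() { if (p_) p_->release(); }
    T* operator->() const { return p_; }
    T& operator*() const { return *p_; }
private:
    T* p_ = nullptr;
};

// Example payload; the live counter is only there to observe clean-up.
struct Variable : RefCounted {
    static int live;
    int value;
    explicit Variable(int v) : value(v) { ++live; }
    ~Variable() override { --live; }
};
int Variable::live = 0;

// Returns the number of Variable objects still alive once all handles are gone.
int demo_refcount() {
    Ref<Variable> a(new Variable(42));
    {
        Ref<Variable> b = a;             // second reference to the same object
        assert(a->ref_count() == 2u);
    }                                    // b goes out of scope, count drops
    assert(a->ref_count() == 1u);
    a = Ref<Variable>();                 // last reference dropped, object freed
    return Variable::live;
}
```

The point of the design is that the calling code in demo_refcount contains no explicit delete at all; the handle class does the bookkeeping.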
So an overall 10 out of 10 for the designer here! So what can possibly go wrong, if it is so good? One answer is that to use this system you still have to follow some rules, and because the system is so good, programmers (being but human) tend to forget this. One of the key programming rules is that if you create circular references in your structures then you must break such circles when you are finished with the resources, otherwise the system will not be able to clean up properly for you. An example of this is a bug that I identified in the topic SDK and Memory - Garbage Collection. This is a memory leak within the OOo Basic runtime system (RTS), in the code used to implement function and subroutine calls. In this case:
A method variable is used to call a function (or an UNO call) and to hold the return value.
The method can have an argument list and so refers to a variableArray to hold this.
A variableArray is in fact a vector of references to the variables or constants which hold param1, param2, ... However, the system also adopts the convention that param0 is used to refer to the return value.
And since this return value is actually stored in the original method, the calling code sets up param0 to refer back to it.
Hence we have a circle: method -> variableArray -> parameter -> the method again. The RTS code does break this circle in the act of overwriting param0, so no leak occurs for a function which returns a value. However, Basic subroutines and UNO calls such as cell.setValue(2) don't return a value, so in these cases the circular reference is left intact. When the RTS has finished executing the call it moves on, assuming that the garbage collection system will gather up the memory objects used in its execution. But in this case the method still holds a reference to the parameter array, and the param0 placeholder in that array still holds a reference to the method, so this memory can't be freed and about 2 Kbytes leak on each call. Some OOo applications invoke LOTS of calls, so these 2 Kbytes soon add up. On a quick scan, I've picked up at least a few dozen topics posted in the oooforum which discuss various manifestations of this specific bug.
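The shape of this circle can be reproduced in a few lines. The sketch below uses std::shared_ptr from the standard library as a stand-in for the OOo reference counting, and the types Variable and VariableArray and the function simulate_call are hypothetical simplifications, not the RTS code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct VariableArray;

// Hypothetical variable; a "method" variable also owns an argument array.
struct Variable {
    static int live;                       // instances currently alive
    std::shared_ptr<VariableArray> args;   // only set for method variables
    Variable() { ++live; }
    ~Variable() { --live; }
};
int Variable::live = 0;

// Slot 0 is the return-value placeholder; slots 1..n are the parameters.
struct VariableArray {
    std::vector<std::shared_ptr<Variable>> slots;
};

// Simulates one call and returns how many Variable objects it left behind.
int simulate_call(bool break_cycle) {
    const int before = Variable::live;
    {
        auto method = std::make_shared<Variable>();
        method->args = std::make_shared<VariableArray>();
        method->args->slots.push_back(method);                        // param0 -> method
        method->args->slots.push_back(std::make_shared<Variable>());  // param1
        if (break_cycle) {
            method->args->slots[0].reset();   // break the circle before leaving
        }
    }   // the local reference to the method is dropped here
    return Variable::live - before;
}
```

With break_cycle set to false the method and its parameter are both stranded, because each side of the circle keeps the other's count above zero; with it set to true everything is released as soon as the local reference goes away.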
What can you do to mitigate the effect?
A modern OS such as Linux or Windows XP uses a Virtual Memory Management (VMM) system within its kernel to handle the demands for memory from individual processes. These VMM systems have developed over successive OS generations and today's are pretty sophisticated. At a process level, most memory is properly ring-fenced, and when a process is terminated all of its memory is returned to a pool for reuse.
Two of the key techniques used in VMM are (i) to allow processes which use common memory areas (such as the code within runtime libraries) to share the same physical memory in chunks called pages, and (ii) to load only the active pages into the physical RAM on the PC and move any inactive modified pages into an overflow pagefile on disk. During healthy running the VMM system will cycle logical memory to and from physical RAM using a process called page faulting. Problems start to occur when the amount of memory needed is greater than can be physically fitted into RAM at any one time, and this page faulting then dominates the effective work being carried out on the PC. At this point, as far as the user is concerned, the PC basically slows to a crawl. This is known as thrashing.
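Page faulting can also be observed from inside a process. The following is a rough sketch for Linux and other POSIX systems using the getrusage call; the struct FaultCounts and the helper names are hypothetical, and the 4096-byte stride is an assumed page size.

```cpp
#include <sys/resource.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Snapshot of the fault counters the kernel keeps for this process.
struct FaultCounts {
    long minor;    // faults satisfied without disk I/O
    long major;    // faults that needed a page to be read from disk
    long max_rss;  // peak resident set size (kilobytes on Linux)
};

FaultCounts snapshot() {
    struct rusage ru{};
    const int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;
    return {ru.ru_minflt, ru.ru_majflt, ru.ru_maxrss};
}

// Allocates `bytes`, writes to every page, and returns the number of
// page faults (minor plus major) that this caused.
long faults_from_touching(std::size_t bytes) {
    const FaultCounts before = snapshot();
    std::vector<unsigned char> buf(bytes);
    volatile unsigned char* p = buf.data();
    for (std::size_t i = 0; i < bytes; i += 4096) {  // assumed page size
        p[i] = 1;   // the write forces the page into physical RAM
    }
    const FaultCounts after = snapshot();
    return (after.minor - before.minor) + (after.major - before.major);
}
```

On a healthy system nearly all of these will be minor faults. A steadily climbing major fault count is the signature of a process that is being paged to and from disk.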
Both OSes provide simple management tools (Task Manager and top) to help diagnose when this is happening. In WinXP you can start Task Manager from the task bar, and use View -> Select Columns to choose Mem Usage, VM Size and PF Delta; you can see the heavy hitters by clicking on the relevant column heading to sort the processes in descending order. In Linux you can enter top in a terminal window to do the equivalent.
My first recommendation is to always be aware that if your PC slows down inexplicably then a process may be thrashing, so use Task Manager or top to see if any are, and specifically if Soffice.bin is. Closing your documents and exiting Soffice.bin will nearly always bring your PC back to normal. Also be aware that on WinXP Soffice.bin does not exit by default, even when you close all documents. To prevent this happening click on Tools -> Options -> OpenOffice.org - Memory and uncheck "Load Open Office during system start-up". This will result in the OOo process only being active when you are working on one or more documents. OK, this might slow down start-up by a few seconds, but this is worth it for the general improvement in system stability.
My second recommendation is to remember that many leaks are cumulative, so if you start noticing a slowdown use Task Manager, top or ps -e v to check whether Soffice.bin is growing in size and starting to fault. If it is, then save and close your documents, then close down and restart OOo. (The cycle of death and birth is there for a reason.) You need to get a handle on how long you can go before OOo starts to die, and bounce it on a routine basis before this happens for your pattern of use on this document. If there is a specific combination of circumstances which causes OOo to start to leak like a sieve, then file an issue on the OOo site.
The third aspect is really up to the OOo developers: to hunt down and kill those leaks. One of the beauties of open source is that any proficient C++ programmer can download the source, do this and then post the fix. Such work is genuinely pro bono, and this one fix could end up benefiting more users than any other code that you've written in your entire career. This is what I did for the bug that I described above.
My only criticism of the whole SvRefBase system is that when I've implemented similar systems in the past myself, I have always included a diagnostic layer to enable the developers to list, trace and locate all memory leaks. I have also had regular blitzes on these to keep them under control. At the moment SvRefBase doesn't include such diagnostics, and tracking down these leaks doesn't seem to be a priority for the OOo development team.
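By a diagnostic layer I mean something along the following lines. This is only an illustrative sketch of the idea, not code from any system I have shipped or from OOo: Tracked, Variable and demo_leak_report are hypothetical names, and the sketch assumes single-threaded use.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical diagnostic base class: every live object registers itself,
// so whatever is still registered at a checkpoint is a leak candidate.
class Tracked {
public:
    static std::size_t live_count() { return registry().size(); }
    static std::vector<std::string> live_report() {
        std::vector<std::string> out;
        for (const Tracked* t : registry()) out.push_back(t->label_);
        return out;
    }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
protected:
    explicit Tracked(std::string label) : label_(std::move(label)) {
        registry().insert(this);
    }
    virtual ~Tracked() { registry().erase(this); }
private:
    static std::set<const Tracked*>& registry() {
        static std::set<const Tracked*> r;   // created on first use
        return r;
    }
    std::string label_;
};

// Example tracked type; the label records what the object was for.
struct Variable : Tracked {
    explicit Variable(const std::string& name) : Tracked("Variable " + name) {}
};

// Leaks one object on purpose and returns the report of what is left.
std::vector<std::string> demo_leak_report() {
    Variable* kept = new Variable("param0");    // deliberately never deleted
    Variable* freed = new Variable("param1");
    delete freed;
    (void)kept;
    return Tracked::live_report();
}
```

In a real build such a layer would normally be compiled in only for debug builds, and the label would record where the object was allocated, so that a report taken after closing all documents points straight at the code that leaked.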