Ken> Unfortunately, Python has some problems in this area. In
Ken> particular, since ubiquitous lists and dictionaries are dynamically
Ken> resized as needed, memory fragmentation seems inevitable.
That's not necessarily true. Also, I would say that Python has made
tradeoffs in this area, not that it necessarily "has problems". There is a
significant time/space tradeoff to be made. By hanging onto free lists and
structuring its allocator to group objects of similar sizes together,
allocation time of small objects (ints, floats, short strings, small dicts,
tuples and lists, etc.) is greatly improved. (Small object allocation
dominates the dynamic memory allocation profile of Python.)
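
As a rough illustration of the resizing Ken mentions, here is a small sketch.
It assumes CPython, since the sizes sys.getsizeof reports are
implementation-specific, and the helper name resize_points is made up for
this example. It appends to a list and records each point at which the
list's allocated size changes. Because lists over-allocate when they grow,
the size changes far less often than once per append:

```python
import sys

def resize_points(n):
    """Return (length, size_in_bytes) each time the list's allocation changes."""
    items = []
    last = sys.getsizeof(items)
    points = [(0, last)]
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last:
            # The list object was resized to make room for more items.
            points.append((len(items), size))
            last = size
    return points

points = resize_points(1000)
print(len(points) - 1, "resizes for 1000 appends")
```

The exact resize points vary between Python versions, so treat the numbers
as indicative only.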
Ken> Also, memory freed by python apparently is not returned to the OS
Ken> according to this article:
Correct. There is a hierarchy of memory allocators in any system. In Unix
systems the system calls brk and sbrk are used to allocate and free large
chunks of memory from the operating system. These are generally called by
malloc, realloc and (sometimes) free. They are rarely, if ever, called
directly by applications. Applications call malloc, free, realloc, etc.
Python has its own allocator, obmalloc, which sits on top of malloc. Only
brk and sbrk can actually free up memory pages (truly return them to the
operating system). Some malloc implementations will do that (and then only
if an entire page of memory is free), but many won't. You might find the
large comments sprinkled throughout obmalloc.c helpful:
http://svn.python.org/view/python/trunk/Objects/obmalloc.c
Note the big row of equal signs in the first comment block. Memory retained
by any layer above that row is charged to the application (it is the size
you see in top(1), for example). This is why the malloc library's behavior
is important. Python's allocator can return arenas to the malloc library by
calling free(), but there's no guarantee that the malloc library will in
turn return that memory to the operating system.
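
The gap between what Python has freed and what the process still holds can
be observed from inside Python. A minimal sketch, assuming Python 3.4 or
later for the standard tracemalloc module (which is newer than the
obmalloc.c revision linked above); the function name traced_after_free is
made up for this example. tracemalloc counts only the blocks Python itself
has allocated, so its figure drops after the del. It says nothing about
whether the malloc library gave any pages back to the OS; for that you would
compare against the process size in top(1):

```python
import tracemalloc

def traced_after_free(n=100000):
    """Return Python-level traced bytes while holding data, after freeing it,
    and at the peak."""
    tracemalloc.start()
    data = [float(i) for i in range(n)]
    held, _ = tracemalloc.get_traced_memory()
    del data  # Python releases the list and its float objects
    released, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return held, released, peak

held, released, peak = traced_after_free()
print("held:", held, "released:", released, "peak:", peak)
```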
Ken> Are there any good general countermeasures to make an application
Ken> use less memory? Any good overview articles on this subject?
If your Python application is dealing with large lists of numbers you might
investigate the array module or the external numpy module. In theory you
could also try to find a different malloc library which has better policies
about returning memory to the system.
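
To get a feel for the difference the array module can make, here is a rough
comparison. It assumes CPython, where sys.getsizeof reports
implementation-specific sizes; the helper list_bytes is made up for this
example and simply adds the size of the list to the sizes of the float
objects it refers to:

```python
import sys
from array import array

def list_bytes(values):
    # The list object (an array of pointers) plus every float object it holds.
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

n = 100000
as_list = [float(i) for i in range(n)]
as_array = array("d", as_list)  # "d" stores raw C doubles, 8 bytes each

print("list :", list_bytes(as_list), "bytes")
print("array:", sys.getsizeof(as_array), "bytes")
```

The array stores the numbers inline instead of as separate Python objects,
which also means far fewer small allocations for the allocator to manage.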