[Note: proposed change to mmap.mmap at the end of this message.]
Nicholas FitzRoy-Dale wrote:
> I'm doing regular expression search on an MMAPed file without issues.
> From code to index an mbox-style file:
>
>     mboxMap = mmap.mmap (handle.fileno(), getFileLength (self.sourceFilename),
>                          mmap.MAP_SHARED, mmap.PROT_READ)
Well, there's my problem. I've nearly no clue on how to use
the mmap module, so I assumed I could use the defaults.
Just tried out what you did, and it works.
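For the archives, here is a minimal sketch of the combination that works: a read-only open paired with an explicit PROT_READ mapping, then a regular expression search over the map. It assumes a Unix system and a current Python 3, where the pattern has to be a bytes pattern; the temporary file and its contents are made-up stand-ins for /etc/passwd.

```python
import mmap
import os
import re
import tempfile

# Hypothetical sample data standing in for /etc/passwd.
sample = (b"root:x:0:0:root:/root:/bin/sh\n"
          b"dalke:x:500:500::/home/dalke:/bin/sh\n")

fd, path = tempfile.mkstemp()
os.write(fd, sample)
os.close(fd)

handle = open(path, "rb")
# Pass flags and prot explicitly so the protection matches the read-only open.
mapped = mmap.mmap(handle.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)

# The mapping exposes bytes, so the pattern is a bytes pattern as well.
pat = re.compile(rb"^(.*(?=dalke).*)$", re.MULTILINE)
m = pat.search(mapped)
matched_line = m.group(1) if m else None

mapped.close()
handle.close()
os.remove(path)
```

A length of 0 maps the whole file here, which avoids a separate getFileLength-style helper.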
The following doesn't work:
$ ./python
Python 2.2a3+ (#7, Sep 19 2001, 03:31:19)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> handle = open("/etc/passwd", "r")
>>> import mmap
>>> map = mmap.mmap(handle.fileno(), 0)
>>> import re
>>> pat = re.compile("^(.*(?=dalke).*)$", re.MULTILINE)
>>> m = pat.search(map)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: expected string or buffer
>>>
In this case, the mmap call is missing the MAP_SHARED flag
(which is the default, so should be fine). It's also missing
the PROT_READ flag, so the default of PROT_READ | PROT_WRITE
is used.
I guess that's the problem, but you can see why the error message
threw me off.
Would it be useful if mmap.mmap were changed to something like this?
def proposed_mmap(file, size = 0, flags = mmap.MAP_SHARED,
                  prot = None):
    # Roughly like PyObject_AsFileDescriptor in Objects/fileobject.c
    if hasattr(file, "fileno"):
        fileno = file.fileno()
    else:
        fileno = file

    # See if we need to figure out the default for "prot"
    if prot is None:
        # File-like object may have a "mode" attribute defined
        # If so, use it, otherwise default to "rw"
        mode = getattr(file, "mode", "rw")
        prot = 0
        if "r" in mode:
            prot |= mmap.PROT_READ
        if "w" in mode:
            prot |= mmap.PROT_WRITE
        if prot == 0:
            prot = mmap.PROT_NONE

    return mmap.mmap(fileno, size, flags, prot)
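One gap in the mode check above: modes such as "r+" and "a" also permit writing, but contain no "w". A hedged variant of just the mode-to-prot step is sketched below. The helper name prot_from_mode is hypothetical, the treatment of "+" and "a" is my assumption about what the default should be, and the PROT_* constants assume a Unix build of the mmap module.

```python
import mmap

def prot_from_mode(mode="rw"):
    """Guess mmap protection bits from an open() mode string.

    Hypothetical helper; not part of the mmap module.
    """
    prot = 0
    # "+" opens the file for both reading and writing.
    if "r" in mode or "+" in mode:
        prot |= mmap.PROT_READ
    # "w" and "a" open for writing; "+" adds writing to "r".
    if "w" in mode or "a" in mode or "+" in mode:
        prot |= mmap.PROT_WRITE
    return prot

read_only = prot_from_mode("r")
read_write = prot_from_mode("r+")
fallback = prot_from_mode()  # no mode attribute available: assume "rw"
```

With this, a handle opened as "r" yields PROT_READ only, while "r+" and the "rw" fallback yield PROT_READ | PROT_WRITE.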
This would allow people like me to do
    handle = open("filename")
    map = mmap.mmap(handle)
and have it just work. (Unless I do 'mmap.mmap(open("filename"))'
since then I'll lose a reference count and the file handle gets
closed from underneath me. I think.)
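On that last worry: as far as I understand POSIX, closing a file descriptor does not remove a mapping that already exists, so losing the file object may be harmless. A small check, assuming a Unix system and a current Python 3 (the temporary file and variable names are made up for illustration):

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"hello, mmap\n")
os.close(fd)

handle = open(path, "rb")
mapped = mmap.mmap(handle.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
handle.close()  # drop the file object right after creating the mapping

# The mapping is read after the file object has been closed.
still_readable = mapped[:5]

mapped.close()
os.remove(path)
```

This only checks the read-only case on one platform, so treat it as a sanity test and not as a guarantee.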
Comments?
Andrew
dalke at dalkescientific.com