I've run into an issue with glob and matching filenames with brackets '[]'
in them. The problem comes when I'm using part of such a filename as the
path I'm passing to glob. Here's a trimmed down dumb example. Let's say I
have a directory with the following files in it.

foo.par2
foo.vol0+1.par2
foo.vol1+1.par2
zzz [foo].par2
zzz [foo].vol0+1.par2
zzz [foo].vol1+1.par2

While processing one of the files I want to do certain things in batch, so
I've been using glob to get all of the files in a set. The following code
prints the filenames of the parity volumes in each set while working with
the set's base checksum file, but it prints nothing when there are brackets
in the name.


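import glob, os, re

# Matches the 'vol' marker that separates volume files from the base file.
re2 = re.compile(r'vol', re.IGNORECASE)

# For each base .par2 file, print the volume files that share its stem.
for nuke in glob.glob('*.par2'):
    if not re2.search(nuke):
        list = glob.glob(nuke[:-5]+'*vol*')
        for name in list: print os.path.join(os.getcwd(),name)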



I'm sure there is something obvious I'm missing. I figured I could use
something like re.escape on the trimmed filename for matching but that
hasn't worked either. Using win32api.FindFiles instead of glob works but
I'd obviously rather do it the _right_ way and have it work properly in
*nix too.
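
For reference, the behaviour described above can be reproduced with a small, self-contained sketch. It is written for Python 3 (the code in the post is Python 2), creates the six example files in a temporary directory, and runs the same two glob calls. The helper name volumes_by_base and the use of a temporary directory are illustrative assumptions, not part of the original post.

```python
import glob
import os
import tempfile

NAMES = [
    "foo.par2", "foo.vol0+1.par2", "foo.vol1+1.par2",
    "zzz [foo].par2", "zzz [foo].vol0+1.par2", "zzz [foo].vol1+1.par2",
]


def volumes_by_base(directory):
    """Map each base .par2 file to the volume files glob finds for it."""
    found = {}
    old_cwd = os.getcwd()
    os.chdir(directory)
    try:
        for name in glob.glob("*.par2"):
            if "vol" in name.lower():
                continue  # skip the volume files themselves
            # Same pattern as the post: stem of the base file plus '*vol*'.
            found[name] = sorted(glob.glob(name[:-5] + "*vol*"))
    finally:
        os.chdir(old_cwd)  # leave the directory before it is cleaned up
    return found


with tempfile.TemporaryDirectory() as tmp:
    for name in NAMES:
        open(os.path.join(tmp, name), "w").close()
    result = volumes_by_base(tmp)

for base, vols in sorted(result.items()):
    print(base, "->", vols)
```

For 'foo.par2' both volume files come back; for 'zzz [foo].par2' the list is empty.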


  • Wittempj at Feb 7, 2005 at 10:00 am
    Code like the below will print all files ending in 'par2', except those
    not containing 'vol' from the 5th position onward. Is that what you need?

    import glob
    for nuke in glob.glob(r"""c:\temp\*.par2"""):
        try:
            nuke.index('vol', 5)
            print nuke
        except ValueError, e:
            print e
  • Python Dunce at Feb 8, 2005 at 1:06 am

    "wittempj at hotmail.com" <wittempj at hotmail.com> wrote in comp.lang.python:

    Code like the below will print all files ending in 'par2', except those
    not containing 'vol' from the 5th position onward. Is that what you need?

    import glob
    for nuke in glob.glob(r"""c:\temp\*.par2"""):
        try:
            nuke.index('vol', 5)
            print nuke
        except ValueError, e:
            print e

    Not quite. I'm sorry my example wasn't very clear. While working with any
    single file I need to be able to build a list of all the other files in a
    particular set. Basically I just need globbing of the base filename.

    glob.glob(basename+'.*some_extension')

    So if I was working with 'foo.par2' at the moment...

    glob.glob(filename[:-5]+'.*par2')

    would catch all of the files belonging to the set including 'foo.par2'
    'foo.vol0+1.par2' 'foo.vol1+1.par2' etc.

    This works great (as expected) until you are working with a filename with
    brackets '[]' in it. Then glob just returns an empty list. So if I happen
    to be processing 'foo [bar].par2'

    glob.glob(filename[:-5]+'.*par2')

    doesn't return anything. Using win32api.FindFiles(filename[:-5]+'.*par2')
    works perfectly, but I don't want to rely on win32api functions. I hope
    that made more sense :).
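
One way to see why the list comes back empty, sketched here in Python 3 with the standard fnmatch module (glob uses the same pattern syntax): brackets in a pattern are read as a character class, so '[bar]' stands for a single 'b', 'a' or 'r' rather than the literal text. The file name 'foo b.vol0+1.par2' is a made-up example that shows what the pattern does match.

```python
import fnmatch

filename = "foo [bar].par2"
pattern = filename[:-5] + ".*par2"  # 'foo [bar].*par2', as built in the post

# The literal name fails: '[' is not one of the characters b, a, r.
literal_match = fnmatch.fnmatchcase("foo [bar].par2", pattern)

# A name with a single 'b' where the brackets were does match.
class_match = fnmatch.fnmatchcase("foo b.vol0+1.par2", pattern)

print(literal_match, class_match)

# The regular expression fnmatch builds from the pattern (exact text varies
# between Python versions, so no output is shown here).
print(fnmatch.translate(pattern))
```

The first check is False and the second is True, which is consistent with glob returning nothing for a file whose name really contains the brackets.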
  • Michael Hoffman at Feb 8, 2005 at 1:49 am

    Python Dunce wrote:

    So if I happen
    to be processing 'foo [bar].par2'

    glob.glob(filename[:-5]+'.*par2')

    doesn't return anything. Using win32api.FindFiles(filename[:-5]+'.*par2')
    works perfectly, but I don't want to rely on win32api functions. I hope
    that made more sense :).
    If you look in the source for glob.py, you will find that it calls the
    fnmatch module, and this is the docstring for fnmatch.translate():

    """Translate a shell PATTERN to a regular expression.

    There is no way to quote meta-characters.
    """

    So you cannot do what you want with glob.

    You can replace [] with ? in your glob string, if you are sure that
    there won't be other characters there. That's a bit of a hack, and I
    wouldn't do it.

    In my mind it would probably be best to do:

    re_vol = re.compile(re.escape(startpart) + ".*vol.*")
    lst = [filename for filename in os.listdir(".") if re_vol.match(filename)]

    I changed "list" to "lst" because the former shadows a built-in.
    --
    Michael Hoffman
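
Python versions released after this thread (3.4 and newer) include glob.escape(), which rewrites '[', '?' and '*' so that glob treats them literally. A minimal sketch under that assumption, in Python 3; the function name find_set_with_glob and the temporary test directory are illustrative only.

```python
import glob
import os
import tempfile


def find_set_with_glob(base_file, directory="."):
    """List the files in base_file's set, treating its name literally."""
    stem = base_file[:-5]  # strip '.par2', as the thread does
    # glob.escape() turns '[' into '[[]' so it no longer opens a character class.
    pattern = os.path.join(glob.escape(directory), glob.escape(stem) + ".*par2")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


names = [
    "foo.par2", "foo.vol0+1.par2", "foo.vol1+1.par2",
    "zzz [foo].par2", "zzz [foo].vol0+1.par2", "zzz [foo].vol1+1.par2",
]

with tempfile.TemporaryDirectory() as tmp:
    for name in names:
        open(os.path.join(tmp, name), "w").close()
    bracket_set = find_set_with_glob("zzz [foo].par2", tmp)
    plain_set = find_set_with_glob("foo.par2", tmp)

print(bracket_set)
print(plain_set)
```

Both calls return the base file plus its two volumes. On interpreters older than 3.4 glob.escape() does not exist, and the regular-expression approach from the post above remains the portable choice.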
  • Python Dunce at Feb 9, 2005 at 1:18 am

    Michael Hoffman <cam.ac.uk at mh391.invalid> wrote in comp.lang.python:

    Python Dunce wrote:
    So if I happen
    to be processing 'foo [bar].par2'

    glob.glob(filename[:-5]+'.*par2')

    doesn't return anything. Using
    win32api.FindFiles(filename[:-5]+'.*par2') works perfectly, but I don't
    want to rely on win32api functions. I hope that made more sense :).
    If you look in the source for glob.py, you will find that it calls the
    fnmatch module, and this is the docstring for fnmatch.translate():

    """Translate a shell PATTERN to a regular expression.

    There is no way to quote meta-characters.
    """

    So you cannot do what you want with glob.

    You can replace [] with ? in your glob string, if you are sure that
    there won't be other characters there. That's a bit of a hack, and I
    wouldn't do it.

    In my mind it would probably be best to do:

    re_vol = re.compile(re.escape(startpart) + ".*vol.*")
    lst = [filename for filename in os.listdir(".") if
    re_vol.match(filename)]

    I changed "list" to "lst" because the former shadows a built-in.
    Thanks, that should do the trick! I had tried basically the same thing
    once but I was getting back empty lists. I think it was just a brain fart
    involving a case sensitive regex that didn't match the files I was testing
    it on :/.
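
The case-sensitivity pitfall mentioned here can be sketched as follows, in Python 3, using the re.escape plus os.listdir approach suggested earlier in the thread. The function name find_set_with_regex, the ignore_case parameter and the upper-case file name are assumptions made for illustration.

```python
import os
import re
import tempfile


def find_set_with_regex(startpart, directory=".", ignore_case=False):
    """Return the volume files whose names begin with the literal startpart."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape() makes the brackets literal; match() anchors at the start.
    re_vol = re.compile(re.escape(startpart) + r".*vol.*", flags)
    return sorted(name for name in os.listdir(directory) if re_vol.match(name))


names = ["zzz [foo].par2", "zzz [foo].vol0+1.par2", "zzz [foo].VOL1+1.PAR2"]

with tempfile.TemporaryDirectory() as tmp:
    for name in names:
        open(os.path.join(tmp, name), "w").close()
    strict = find_set_with_regex("zzz [foo]", tmp)
    relaxed = find_set_with_regex("zzz [foo]", tmp, ignore_case=True)

print(strict)
print(relaxed)
```

With the default, case-sensitive pattern only the lower-case volume is found; passing ignore_case=True picks up the upper-case one as well.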

Discussion Overview
group: python-list @
categories: python
posted: Feb 7, '05 at 7:22a
active: Feb 9, '05 at 1:18a
posts: 5
users: 3
website: python.org
