After converting a text file containing doctests to use Windows line
endings, I'm getting spurious errors:

ValueError: line 19 of the docstring for examples.txt has inconsistent
leading whitespace: '\r'


I don't believe that doctest.testfile is documented as requiring Unix
line endings, and the line endings in the file are okay. I've checked in
a hex editor, and they are valid \r\n line endings.

In doctest._load_testfile, I find this comment and code:

# get_data() opens files as 'rb', so one must do the equivalent
# conversion as universal newlines would do.
return file_contents.replace(os.linesep, '\n'), filename

which I read as an attempt to normalise line endings in the file to \n.

(But surely this will fail? If you're running, say, Linux or MacOS,
linesep will already be '\n' not '\r\n', and consequently the replace
does nothing, any Windows line endings aren't normalised, and doctest
will choke on the \r characters. It's only useful if running on Windows.)
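
To make that concrete, here is a minimal sketch that simulates the package-loader branch with the platform separator passed in explicitly, so both cases can be tried on one machine. The helper name `normalise_with_linesep` is made up for illustration and is not part of doctest.

```python
# Hypothetical helper, not doctest code: it mimics
# file_contents.replace(os.linesep, '\n') but takes linesep as a
# parameter so that either platform can be simulated.
def normalise_with_linesep(file_contents, linesep):
    return file_contents.replace(linesep, '\n')


windows_text = ">>> 1 + 1\r\n2\r\n"

# Simulating Windows, where os.linesep is '\r\n': the text is normalised.
on_windows = normalise_with_linesep(windows_text, '\r\n')

# Simulating Linux or macOS, where os.linesep is '\n': the replace is a
# no-op, so every '\r' survives.
on_posix = normalise_with_linesep(windows_text, '\n')

print(repr(on_windows))
print(repr(on_posix))
```

Under the POSIX setting the string comes back unchanged, which matches the behaviour described above.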

But the above only occurs when using a package loader. Otherwise,
_load_testfile executes:

return open(filename).read(), filename

which doesn't do any line ending normalisation at all.

To my mind, this is a bug in doctest. Does anyone disagree? I think the
simplest fix is to change it to:

return open(filename, 'rU').read(), filename
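
For anyone trying this on a current Python 3, a hedged sketch of the same idea: as far as I know the 'U' mode flag was deprecated and later removed (in Python 3.11), and text-mode open() already translates line endings on read when newline=None, which is the default. The function name `load_testfile_text` and the UTF-8 default are assumptions for illustration, not doctest's actual code.

```python
import os
import tempfile


def load_testfile_text(filename, encoding="utf-8"):
    # Hypothetical stand-in for the non-package-loader branch.
    # newline=None (the default) enables universal newlines on read:
    # '\r\n' and lone '\r' are both translated to '\n'.
    with open(filename, encoding=encoding, newline=None) as f:
        return f.read(), filename


# Write the sample in binary mode so the '\r\n' endings reach the disk
# untouched, whatever platform this runs on.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b">>> 1 + 1\r\n2\r\n")

text, name = load_testfile_text(path)
os.remove(path)
print(repr(text))
```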


Comments?



--
Steven


  • Patrick Maupin at Apr 11, 2010 at 4:01 am

    On Apr 10, 10:16 pm, Steven D'Aprano <st... at REMOVE-THIS- cybersource.com.au> wrote:
    After converting a text file containing doctests to use Windows line
    endings, I'm getting spurious errors:

    ValueError: line 19 of the docstring for examples.txt has inconsistent
    leading whitespace: '\r'

    I don't believe that doctest.testfile is documented as requiring Unix
    line endings, and the line endings in the file are okay. I've checked in
    a hex editor, and they are valid \r\n line endings.

    In doctest._load_testfile, I find this comment and code:

        # get_data() opens files as 'rb', so one must do the equivalent
        # conversion as universal newlines would do.
        return file_contents.replace(os.linesep, '\n'), filename

    which I read as an attempt to normalise line endings in the file to \n.

    (But surely this will fail? If you're running, say, Linux or MacOS,
    linesep will already be '\n' not '\r\n', and consequently the replace
    does nothing, any Windows line endings aren't normalised, and doctest
    will choke on the \r characters. It's only useful if running on Windows.)

    But the above only occurs when using a package loader. Otherwise,
    _load_testfile executes:

        return open(filename).read(), filename

    which doesn't do any line ending normalisation at all.

    To my mind, this is a bug in doctest. Does anyone disagree? I think the
    simplest fix is to change it to:

        return open(filename, 'rU').read(), filename

    Comments?

    --
    Steven

    Seems like a bug to me. I often assume that I don't know where a
    string is coming from, so one of the first steps I usually take when
    parsing a string is:

    s = s.replace('\r\n', '\n').replace('\r', '\n')

    And, out of long-standing pre-Python habit, I always open files in
    binary mode and then have my way with them. I know universal mode is
    available, but honestly, I don't care for all the bookkeeping on what
    kinds of line endings have been seen -- I just want to normalize the
    data.
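
A small sketch of that normalisation, mainly to show why the order of the two replace calls matters. The names `normalise_newlines`, `mixed` and `wrong_order` are made up for this example.

```python
def normalise_newlines(s):
    # Collapse '\r\n' first. If lone '\r' were replaced first, each
    # '\r\n' would turn into '\n\n' and add a spurious blank line.
    return s.replace('\r\n', '\n').replace('\r', '\n')


# One string mixing Unix, Windows and old Mac line endings.
mixed = "unix\nwindows\r\nold mac\rend"

normalised = normalise_newlines(mixed)

# The same two replacements in the opposite order, for comparison.
wrong_order = mixed.replace('\r', '\n').replace('\r\n', '\n')

print(repr(normalised))
print(repr(wrong_order))
```

The same function works on data read in binary mode if the arguments are bytes literals (b'\r\n' and so on) instead of str.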

    Regards,
    Pat

Discussion Overview
group: python-list @
categories: python
posted: Apr 11, '10 at 3:16a
active: Apr 11, '10 at 4:01a
posts: 2
users: 2
website: python.org