FAQ
Hi,


I'm running pep8 across my code, and getting warnings about my long lines (> 80 characters).


I'm wonder what's the recommended way to handle the below cases, and fit under 80 characters.


First example - multiple context handlers:


             with open(self.full_path, 'r') as input, open(self.output_csv, 'ab') as output:


and in my case, with indents, the 80-character marks is just before the ending "as output".


What's the standard recognised way to split this across multiple lines, so that I'm under 80 characters?


I can't just split after the "as input," as that isn't valid syntax, and there's no convenient parentheses for me to split over.


Is there a standard Pythonic way?


Second example - long error messages:


             self.logger.error('Unable to open input or output file - %s. Please check you have sufficient permissions and the file and parent directory exist.' % e)


I can use triple quotes:


             self.logger.error(
                 """Unable to open input or output file - %s. Please check you
                 have sufficient permissions and the file and parent directory
                 exist.""" % e)


However, that will introduce newlines in the message, which I don't want.


I can use backslashes:


             self.logger.error(
                 'Unable to open input or output file - %s. Please check you\
                 have sufficient permissions and the file and parent directory\
                 exist.' % e)


which won't introduce newlines.


Or I can put them all as separate strings, and trust Python to glue them together:


             self.logger.error(
                 'Unable to open input or output file - %s. Please check you'
                 'have sufficient permissions and the file and parent directory'
                 'exist.' % e)


Which way is the recommended Pythonic way?


Third example - long comments:


             """ NB - We can't use Psycopg2's parametised statements here, as
             that automatically wraps everything in single quotes.
             So s3://my_bucket/my_file.csv.gz would become s3://'my_bucket'/'my_file.csv.gz'.
             Hence, we use Python's normal string formating - this could
             potentially exposes us to SQL injection attacks via the config.yaml
             file.
             I'm not aware of any easy ways around this currently though - I'm
             open to suggestions though.
             See
             http://stackoverflow.com/questions/9354392/psycopg2-cursor-execute-with-sql-query-parameter-causes-syntax-error
             for further information. """


In this case, I'm guessing a using triple quotes (""") is a better idea with multi-line comments, right?


However, I've noticed that I can't seem to put in line-breaks inside the comment without triggering a warning. For example, trying to put in another empty line in between lines 6 and 7 above causes a warning.


Also, how would I split up the long URLs? Breaking it up makes it annoying to use the URL. Thoughts?


Cheers,
Victor

Search Discussions

  • Victor Hooi at Nov 28, 2013 at 2:03 am
    Hi,


    Also, forgot two other examples that are causing me grief:


         cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
                         [[os.path.relpath(filename, foobar_input_folder)] for filename in filenames])


    I've already broken it up using the parentheses, not sure what's the tidy way to break it up again to fit under 80? In this case, the 80-character mark is hitting me around the "for filename" towards the end.


    and:


                         if os.path.join(root, file) not in previously_processed_files and os.path.join(root, file)[:-3] not in previously_processed_files:


    In this case, the 80-character mark is actually partway through "previously processed files" (the first occurrence)...


    Cheers,
    Victor

    On Thursday, 28 November 2013 12:57:13 UTC+11, Victor Hooi wrote:
    Hi,



    I'm running pep8 across my code, and getting warnings about my long lines (> 80 characters).



    I'm wonder what's the recommended way to handle the below cases, and fit under 80 characters.



    First example - multiple context handlers:



    with open(self.full_path, 'r') as input, open(self.output_csv, 'ab') as output:



    and in my case, with indents, the 80-character marks is just before the ending "as output".



    What's the standard recognised way to split this across multiple lines, so that I'm under 80 characters?



    I can't just split after the "as input," as that isn't valid syntax, and there's no convenient parentheses for me to split over.



    Is there a standard Pythonic way?



    Second example - long error messages:



    self.logger.error('Unable to open input or output file - %s. Please check you have sufficient permissions and the file and parent directory exist.' % e)



    I can use triple quotes:



    self.logger.error(

    """Unable to open input or output file - %s. Please check you

    have sufficient permissions and the file and parent directory

    exist.""" % e)



    However, that will introduce newlines in the message, which I don't want.



    I can use backslashes:



    self.logger.error(

    'Unable to open input or output file - %s. Please check you\

    have sufficient permissions and the file and parent directory\

    exist.' % e)



    which won't introduce newlines.



    Or I can put them all as separate strings, and trust Python to glue them together:



    self.logger.error(

    'Unable to open input or output file - %s. Please check you'

    'have sufficient permissions and the file and parent directory'

    'exist.' % e)



    Which way is the recommended Pythonic way?



    Third example - long comments:



    """ NB - We can't use Psycopg2's parametised statements here, as

    that automatically wraps everything in single quotes.

    So s3://my_bucket/my_file.csv.gz would become s3://'my_bucket'/'my_file.csv.gz'.

    Hence, we use Python's normal string formating - this could

    potentially exposes us to SQL injection attacks via the config.yaml

    file.

    I'm not aware of any easy ways around this currently though - I'm

    open to suggestions though.

    See

    http://stackoverflow.com/questions/9354392/psycopg2-cursor-execute-with-sql-query-parameter-causes-syntax-error

    for further information. """



    In this case, I'm guessing a using triple quotes (""") is a better idea with multi-line comments, right?



    However, I've noticed that I can't seem to put in line-breaks inside the comment without triggering a warning. For example, trying to put in another empty line in between lines 6 and 7 above causes a warning.



    Also, how would I split up the long URLs? Breaking it up makes it annoying to use the URL. Thoughts?



    Cheers,

    Victor
  • Ben Finney at Nov 28, 2013 at 2:55 am

    Victor Hooi <victorhooi@gmail.com> writes:


    cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
    [[os.path.relpath(filename, foobar_input_folder)] for filename in filenames])

    I've already broken it up using the parentheses

    But now the continuation line indentation is needlessly dependent on the
    content of the previous line.


    Better to just use a standard additional indentation (I prefer 8 columns
    to easily distinguish from block indentation) for any continuation
    line::


         cur.executemany(
                 "INSERT INTO foobar_foobar_files VALUES (?)", [
                     [os.path.relpath(filename, foobar_input_folder)]
                     for filename in filenames])


    Notice that further continuation within an existing continuation doesn't
    get another 8, but only another 4 columns. The whole statement is
    already distinguished, I don't need to add huge amounts of further
    indentation for that purpose.

    if os.path.join(root, file) not in previously_processed_files and os.path.join(root, file)[:-3] not in previously_processed_files:

    In cases where the line is long because of a long expression, enclosing
    that expression in parens for clarification will then give you an easy
    way to break for continuation lines::


         if (
                 os.path.join(root, file) not in previously_processed_files
                 and os.path.join(root, file)[:-3] not in previously_processed_files):


    But that's still some long lines, and your repeated use of a computed
    value is an ideal opportunity to bind it to a convenience name for local
    use::


         file_path = os.path.join(root, file)
         if (
                 file_path not in previously_processed_files
                 and file_path[:-3] not in previously_processed_files):


    Not least because you're very likely going to use the value of
    ?file_path? yet again in the body of that ?if? block.


    --
      \ ?Two paradoxes are better than one; they may even suggest a |
       `\ solution.? ?Edward Teller |
    _o__) |
    Ben Finney
  • Terry Reedy at Nov 28, 2013 at 3:03 am

    On 11/27/2013 9:03 PM, Victor Hooi wrote:
    Hi,

    Also, forgot two other examples that are causing me grief:

    cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
    [[os.path.relpath(filename, foobar_input_folder)] for filename in filenames])

    I've already broken it up using the parentheses, not sure what's the tidy way to break it up again to fit under 80? In this case, the 80-character mark is hitting me around the "for filename" towards the end.

    add another linebreak, you are still inside the parens.

    and:

    if os.path.join(root, file) not in previously_processed_files and os.path.join(root, file)[:-3] not in previously_processed_files:

    In this case, the 80-character mark is actually partway through "previously processed files" (the first occurrence)...

    You can add parens. In this case, factoring out the common subexpression
    and replaced the repeated long name does the job.
                          p = os.path.join(root, file)
                          ppf = previously_processed_files
                          if p not in ppf and p[:-3] not in ppf:


    [snip previous post]


    --
    Terry Jan Reedy
  • Ned Batchelder at Nov 28, 2013 at 3:05 am

    On 11/27/13 9:03 PM, Victor Hooi wrote:
    Hi,

    Also, forgot two other examples that are causing me grief:

    cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
    [[os.path.relpath(filename, foobar_input_folder)] for filename in filenames])

    I've already broken it up using the parentheses, not sure what's the tidy way to break it up again to fit under 80? In this case, the 80-character mark is hitting me around the "for filename" towards the end.

          file_values = [
              (os.path.relpath(filename, foobar_input_folder),)
              for filename in filenames
          ]
          cur.executemany(
              "INSERT INTO foobar_foobar_files VALUES (?)",
              file_values
          )

    and:

    if os.path.join(root, file) not in previously_processed_files and os.path.join(root, file)[:-3] not in previously_processed_files:

    In this case, the 80-character mark is actually partway through "previously processed files" (the first occurrence)...

          full_file = os.path.join(root, file)
          full_not_in = full_file not in previously_processed_files
          tail_not_in = full_file[:-3] not in previously_processed_files
          if full_not_in and tail_not_in:
              ...


    This has the advantage of naming these complex intermediate conditions,
    and giving you natural places to write comments explaining them. Why
    [:-3] for example?


    More, shorter, statements.


    --Ned.

    Cheers,
    Victor
  • Jussi Piitulainen at Nov 28, 2013 at 6:12 am

    Victor Hooi <victorhooi@gmail.com> writes:


    Hi,

    Also, forgot two other examples that are causing me grief:

          cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
                          [[os.path.relpath(filename, foobar_input_folder)]
                           for filename in filenames])

    I've already broken it up using the parentheses, not sure what's the
    tidy way to break it up again to fit under 80? In this case, the
    80-character mark is hitting me around the "for filename" towards
    the end.

    That's a natural break. I did it to your code above. Another is the
    condition, if it's there:


          cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
                          [[os.path.relpath(filename, foobar_input_folder)]
                           for filename in filenames
                           if not filename.startswith('tmp')])


    There's much freedom of indentation inside the brackets, but these
    points are natural.

    and:

    Put the long expression in parentheses and you are again free to break
    and indent; I tend to have extra spaces inside these parentheses, but
    I have no idea about any standards:


                          if ( os.path.join(root, file)
                               not in previously_processed_files and
                               os.path.join(root, file)[:-3]
                               not in previously_processed_files ):

    In this case, the 80-character mark is actually partway through
    "previously processed files" (the first occurrence)...

    Try to find natural breaks, between phrases, and maybe highlight
    operators by putting them in the beginning of a line like.


    Or even this:
                          pre = previously_processed_files
                          if ( os.path.join(root, file) not in pre and
                               os.path.join(root, file)[:-3] not in pre ):
  • Chris Angelico at Nov 28, 2013 at 6:22 am

    On Thu, Nov 28, 2013 at 5:12 PM, Jussi Piitulainen wrote:
    if ( os.path.join(root, file)
    not in previously_processed_files and
    os.path.join(root, file)[:-3]
    not in previously_processed_files ):

    Eww, don't like this. The indentation puts the two halves of the
    condition on equal footing, which they're not.


    ChrisA
  • Roy Smith at Nov 28, 2013 at 3:08 pm
    In article <34479463-b8a8-4417-9989-cd29369461c2@googlegroups.com>,
      Victor Hooi wrote:

    cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
    [[os.path.relpath(filename, foobar_input_folder)] for
    filename in filenames])

    I don't often write raw SQL embedded in Python. I'm much more likely to
    use some sort of ORM layer. But, if I were doing this, I would break it
    up something like:






    There's a few different strategies I employed there. My first thought
    was a logical break of computing the list of pathnames vs. inserting
    them into the database. That got me here:


         paths = [[os.path.relpath(filename, foobar_input_folder)] \
           for filename in filenames]
         cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
                         paths)


    I wouldn't have actually broken the first line with the backslash, but
    I'm doing that to appease my news posting software.


    My next step would be some simple textual changes; I'd get rid of the
    overly-line variable names. In general, I don't like very short
    variable names, but I'm OK with them as long as the scope is very small,
    as it is here:


         paths = [[os.path.relpath(fn, folder)] for fn in filenames]
         cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
                         paths)


    I'd probably factor out the double lookup of os.path.relpath. I think
    this is easier to read:


         relpath = os.path.relpath
         paths = [[relpath(fn, folder)] for fn in filenames]
         cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
                         paths)


    If filenames was a very long list, it would also be a little bit faster
    to execute, but that's such a small factor as to probably be
    unmeasurable.


    And, finally, I'd probably move one set of square brackets down into the
    SQL statement. It really makes more sense there anyway; the bundling up
    of the arguments into a sequence is more a part of the database API than
    it is inherent to the data.


         relpath = os.path.relpath
         paths = [relpath(fn, foobar_input_folder) for fn in filenames]
         cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
                         [paths])
  • Walter Hurry at Nov 28, 2013 at 5:38 pm

    On Thu, 28 Nov 2013 10:08:30 -0500, Roy Smith wrote:


    In article <34479463-b8a8-4417-9989-cd29369461c2@googlegroups.com>,
    Victor Hooi wrote:
    cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
    [[os.path.relpath(filename, foobar_input_folder)]
    for filename in filenames])
    I don't often write raw SQL embedded in Python. I'm much more likely to
    use some sort of ORM layer. But, if I were doing this, I would break it
    up something like:



    There's a few different strategies I employed there. My first thought
    was a logical break of computing the list of pathnames vs. inserting
    them into the database. That got me here:

    paths = [[os.path.relpath(filename, foobar_input_folder)] \
    for filename in filenames]
    cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
    paths)

    I wouldn't have actually broken the first line with the backslash, but
    I'm doing that to appease my news posting software.

    My next step would be some simple textual changes; I'd get rid of the
    overly-line variable names. In general, I don't like very short
    variable names, but I'm OK with them as long as the scope is very small,
    as it is here:

    paths = [[os.path.relpath(fn, folder)] for fn in filenames]
    cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
    paths)

    I'd probably factor out the double lookup of os.path.relpath. I think
    this is easier to read:

    relpath = os.path.relpath paths = [[relpath(fn, folder)] for fn in
    filenames] cur.executemany("INSERT INTO foobar_foobar_files VALUES
    (?)",
    paths)

    If filenames was a very long list, it would also be a little bit faster
    to execute, but that's such a small factor as to probably be
    unmeasurable.

    And, finally, I'd probably move one set of square brackets down into the
    SQL statement. It really makes more sense there anyway; the bundling up
    of the arguments into a sequence is more a part of the database API than
    it is inherent to the data.

    relpath = os.path.relpath paths = [relpath(fn, foobar_input_folder)
    for fn in filenames]
    cur.executemany("INSERT INTO foobar_foobar_files VALUES (?)",
    [paths])

    I had a problem similar to this some time ago. Someone (here) suggested
    textwrap.dedent, which for me was just the ticket.
  • Ben Finney at Nov 28, 2013 at 2:47 am

    Victor Hooi <victorhooi@gmail.com> writes:


    I'm running pep8 across my code, and getting warnings about my long
    lines (> 80 characters).

    Great! Thank you for working to make your code readable by keeping lines
    reasonably short.

    I'm wonder what's the recommended way to handle the below cases, and
    fit under 80 characters.

    In general, I advise:


    * Avoid line-end backslashes like the plague. Sometimes they're
       necessary, but try very hard to write code that doesn't need them.


    * Break a long line on existing open-parenthesis syntax (whether parens,
       brackets, braces, triple-quotes); early in the line is better, so the
       reader doesn't need to scan a long way back to the next line.


    * Ignore the code on the first line for aligning, and instead indent
       continuation lines to a consistent level (i.e. don't line up with
       some arbitrary character on the first line).


    * Use eight-column indentation for continuation lines (to clearly
       distinguish from four-column block indentation).

    First example - multiple context handlers:

    I haven't used multiples in the same statement yet, and I'm annoyed that
    they simultaneously encourage long statements, and have no obvious way
    to break on an open parenthesis syntax.

    with open(self.full_path, 'r') as input, open(self.output_csv, 'ab') as output:

    So in this case I don't see a way to avoid the ugly line-end backslash::


         with \
                 open(self.full_path, 'r') as input, \
                 open(self.output_csv, 'ab') as output:


    Or::


         with open(self.full_path, 'r') as input, \
                 open(self.output_csv, 'ab') as output:


    Both look terrible to me. I'd love to know of a way to avoid
    backslashes, while still breaking the line at the comma.

    Second example - long error messages:

    self.logger.error('Unable to open input or output file - %s. Please check you have sufficient permissions and the file and parent directory exist.' % e)

    Break at the first open-paren. Use implicit string literal continuation
    to break that literal across multiple lines inside the parens. Like so::


         self.logger.error(
                 'Unable to open input or output file - %s.'
                 ' Please check you have sufficient permissions'
                 ' and the file and parent directory exist.'
                 % e)


    Also, with such a large literal this is an ideal example of why named
    parameters are preferable::


         self.logger.error(
                 'Unable to open input or output file - {error}.'
                 ' Please check you have sufficient permissions'
                 ' and the file and parent directory exist.'.format(
                     error=e))

    Third example - long comments:

    """ NB - We can't use Psycopg2's parametised statements here, as
    that automatically wraps everything in single quotes.
    So s3://my_bucket/my_file.csv.gz would become s3://'my_bucket'/'my_file.csv.gz'.
    Hence, we use Python's normal string formating - this could
    potentially exposes us to SQL injection attacks via the config.yaml
    file.
    I'm not aware of any easy ways around this currently though - I'm
    open to suggestions though.
    See
    http://stackoverflow.com/questions/9354392/psycopg2-cursor-execute-with-sql-query-parameter-causes-syntax-error
    for further information. """

    That's not syntactically a comment, and I don't think pretending
    triple-quoted strings are comments is good practice. If nothing else,
    you'll need a special case if you want to enclose something with
    existing triple-quotes.


    Better to use the standard Python comment style::


         # NB - We can't use Psycopg2's parametised statements here, as
         # that automatically wraps everything in single quotes.
         # So s3://my_bucket/my_file.csv.gz would become s3://'my_bucket'/'my_file.csv.gz'.
         # Hence, we use Python's normal string formating - this could
         # potentially exposes us to SQL injection attacks via the config.yaml
         # file.
         # I'm not aware of any easy ways around this currently though - I'm
         # open to suggestions though.
         # See
         # http://stackoverflow.com/questions/9354392/psycopg2-cursor-execute-with-sql-query-parameter-causes-syntax-error
         # for further information.


    which most programmer's text editors can do for you easily by applying
    or un-applying comments to a whole selected region of lines with a
    single command.


    The comments then are correctly identified as comments by all your
    programming tools, and you don't have to think about using a different
    comment style depending on the content ? the same style is used for all
    comments.

    Also, how would I split up the long URLs? Breaking it up makes it
    annoying to use the URL. Thoughts?

    I'd try very hard to find an equivalent URL that isn't so long :-) but
    URLs in comments are a good example of a PEP 8 exception: if the line is
    over 80 characters because it contains a long URL in a comment, that's
    fine, as I'm not expecting the human reader to be scanning it carefully
    like other text in the code.


    --
      \ ?I fly Air Bizarre. You buy a combination one-way round-trip |
       `\ ticket. Leave any Monday, and they bring you back the previous |
    _o__) Friday. That way you still have the weekend.? ?Steven Wright |
    Ben Finney
  • Terry Reedy at Nov 28, 2013 at 2:55 am
    On 11/27/2013 8:57 PM, Victor Hooi wrote:


    [sorry if the re-wrapping mis-formats anything]

    I'm running pep8

    Ah yes, the module that turns PEP8 into the straightjacket it explicitly
    says it is not meant to be.

    across my code,

    We mostly to not change existing stdlib modules to conform unless they
    are being edited anyway for fixes and features.

    and getting warnings about my long lines (> 80 characters).

    You are free to ignore such



    I'm wonder what's the recommended way to handle the below cases, and
    fit under 80 characters.

    First example - multiple context handlers:

    with open(self.full_path, 'r') as input, open(self.output_csv, 'ab')
    as output:

    and in my case, with indents, the 80-character marks is just before
    the ending "as output".

    What's the standard recognised way to split this across multiple
    lines, so that I'm under 80 characters?

    I believe () does not work, so put \ after 'input,'.

    I can't just split after the "as input," as that isn't valid syntax,
    and there's no convenient parentheses for me to split over.

    Is there a standard Pythonic way?

    Second example - long error messages:

    self.logger.error('Unable to open input or output file - %s. Please
    check you have sufficient permissions and the file and parent
    directory exist.' % e)

    I can use triple quotes:

    self.logger.error( """Unable to open input or output file - %s.
    Please check you have sufficient permissions and the file and parent
    directory exist.""" % e)

    However, that will introduce newlines in the message, which I don't
    want.

    I can use backslashes:

    self.logger.error( 'Unable to open input or output file - %s. Please
    check you\ have sufficient permissions and the file and parent
    directory\ exist.' % e)

    which won't introduce newlines.

    Or I can put them all as separate strings, and trust Python to glue
    them together:

    Such gluing in a documented feature and I think you can trust it to not
    disappear (for at least a decade ;-).

    self.logger.error( 'Unable to open input or output file - %s. Please
    check you' 'have sufficient permissions and the file and parent
    directory' 'exist.' % e)

    Which way is the recommended Pythonic way?

    Third example - long comments:

    """ NB - We can't use Psycopg2's parametised statements here, as that
    automatically wraps everything in single quotes. So
    s3://my_bucket/my_file.csv.gz would become
    s3://'my_bucket'/'my_file.csv.gz'. Hence, we use Python's normal
    string formating - this could potentially exposes us to SQL injection
    attacks via the config.yaml file. I'm not aware of any easy ways
    around this currently though - I'm open to suggestions though. See
    http://stackoverflow.com/questions/9354392/psycopg2-cursor-execute-with-sql-query-parameter-causes-syntax-error

    for further information. """

    In this case, I'm guessing a using triple quotes (""") is a better
    idea with multi-line comments, right?

    Either are ok.

    However, I've noticed that I can't seem to put in line-breaks inside
    the comment without triggering a warning. For example, trying to put
    in another empty line in between lines 6 and 7 above causes a
    warning.

    Blank lines are normal in triple-quoted strings and recommended for
    docstrings longer than a line. So I am puzzled as to your problem.

    Also, how would I split up the long URLs?

    Don't if you can possibly avoid it.

    Breaking it up makes it annoying to use the URL.

    Yep.

    Thoughts?

    Some urls, especially long ones, have extra tracking/query junk that can
    be removed while still pointing to the page.




    --
    Terry Jan Reedy
  • Ned Batchelder at Nov 28, 2013 at 2:59 am

    On 11/27/13 8:57 PM, Victor Hooi wrote:
    Hi,

    I'm running pep8 across my code, and getting warnings about my long lines (> 80 characters).

    I'm wonder what's the recommended way to handle the below cases, and fit under 80 characters.

    My recommendations usually amount to: write more statements, each
    shorter than what you have.

    First example - multiple context handlers:

    with open(self.full_path, 'r') as input, open(self.output_csv, 'ab') as output:

    and in my case, with indents, the 80-character marks is just before the ending "as output".

    What's the standard recognised way to split this across multiple lines, so that I'm under 80 characters?

    I can't just split after the "as input," as that isn't valid syntax, and there's no convenient parentheses for me to split over.

    Is there a standard Pythonic way?

    The important thing in a with statement is that the assigned name will
    be closed (or otherwise exited) automatically. The open call is just
    the expression used to assign the name. The expression there isn't
    really important. This looks odd, but works the same as what you have:


          input = open(self.full_path)
          output = open(self.output_csv, 'ab')
          with input as input, output as output:
              ...


    (Use different names for the two parts of the "as" clauses if you like.)

    Second example - long error messages:

    self.logger.error('Unable to open input or output file - %s. Please check you have sufficient permissions and the file and parent directory exist.' % e)

    I can use triple quotes:

    self.logger.error(
    """Unable to open input or output file - %s. Please check you
    have sufficient permissions and the file and parent directory
    exist.""" % e)

    However, that will introduce newlines in the message, which I don't want.

    I can use backslashes:

    self.logger.error(
    'Unable to open input or output file - %s. Please check you\
    have sufficient permissions and the file and parent directory\
    exist.' % e)

    which won't introduce newlines.

    Or I can put them all as separate strings, and trust Python to glue them together:

    self.logger.error(
    'Unable to open input or output file - %s. Please check you'
    'have sufficient permissions and the file and parent directory'
    'exist.' % e)

    Which way is the recommended Pythonic way?

    Use the separate strings, but don't forget the spaces:


          self.logger.error(
              "Unable to open input or output file - %s. Please check you "
              "have sufficient permissions and the file and parent directory "
              "exist." % e
          )

    Third example - long comments:

    """ NB - We can't use Psycopg2's parametised statements here, as
    that automatically wraps everything in single quotes.
    So s3://my_bucket/my_file.csv.gz would become s3://'my_bucket'/'my_file.csv.gz'.
    Hence, we use Python's normal string formating - this could
    potentially exposes us to SQL injection attacks via the config.yaml
    file.
    I'm not aware of any easy ways around this currently though - I'm
    open to suggestions though.
    See
    http://stackoverflow.com/questions/9354392/psycopg2-cursor-execute-with-sql-query-parameter-causes-syntax-error
    for further information. """

    In this case, I'm guessing a using triple quotes (""") is a better idea with multi-line comments, right?

    As Ben pointed out, using an actual comment is best.

    However, I've noticed that I can't seem to put in line-breaks inside the comment without triggering a warning. For example, trying to put in another empty line in between lines 6 and 7 above causes a warning.

    I don't know what you mean about line-breaks causing warnings.


    --Ned.

    Also, how would I split up the long URLs? Breaking it up makes it annoying to use the URL. Thoughts?

    Cheers,
    Victor
  • Ben Finney at Nov 28, 2013 at 3:06 am

    Ned Batchelder <ned@nedbatchelder.com> writes:


    The important thing in a with statement is that the assigned name will
    be closed (or otherwise exited) automatically. The open call is just
    the expression used to assign the name. The expression there isn't
    really important. This looks odd, but works the same as what you
    have:

    input = open(self.full_path)
    output = open(self.output_csv, 'ab')
    with input as input, output as output:
    ...

    Excellent! That solves for me a significant irritation of the syntax for
    multiple context managers. Thank you.


    --
      \ ?The most common way people give up their power is by thinking |
       `\ they don't have any.? ?Alice Walker |
    _o__) |
    Ben Finney
  • MRAB at Nov 28, 2013 at 12:43 pm

    On 28/11/2013 03:06, Ben Finney wrote:
    Ned Batchelder <ned@nedbatchelder.com> writes:
    The important thing in a with statement is that the assigned name will
    be closed (or otherwise exited) automatically. The open call is just
    the expression used to assign the name. The expression there isn't
    really important. This looks odd, but works the same as what you
    have:

    input = open(self.full_path)
    output = open(self.output_csv, 'ab')
    with input as input, output as output:
    ...
    Excellent! That solves for me a significant irritation of the syntax for
    multiple context managers. Thank you.
    What if the second 'open' raises an exception?
  • Ethan Furman at Nov 28, 2013 at 3:15 am

    On 11/27/2013 06:59 PM, Ned Batchelder wrote:
    The important thing in a with statement is that the assigned name will be closed (or otherwise exited) automatically.
    The open call is just the expression used to assign the name. The expression there isn't really important. This looks
    odd, but works the same as what you have:

    input = open(self.full_path)
    output = open(self.output_csv, 'ab')
    with input as input, output as output:
    ...

    (Use different names for the two parts of the "as" clauses if you like.)

    Or skip the `as` clauses all together:


        input = ...
        output = ...
        with input, output:
          ...


    works just fine. (At least on 2.7 where I tested it. ;)


    --
    ~Ethan~
  • Steven D'Aprano at Nov 28, 2013 at 3:02 am

    On Thu, 28 Nov 2013 13:47:22 +1100, Ben Finney wrote:




    Third example - long comments:

    """ NB - We can't use Psycopg2's parametised statements here, as
    that automatically wraps everything in single quotes. So
    s3://my_bucket/my_file.csv.gz would become
    s3://'my_bucket'/'my_file.csv.gz'. Hence, we use Python's normal
    string formating - this could potentially exposes us to SQL
    injection attacks via the config.yaml file. I'm not aware of any
    easy ways around this currently though - I'm open to suggestions
    though.
    See
    http://stackoverflow.com/questions/9354392/psycopg2-cursor-execute-
    with-sql-query-parameter-causes-syntax-error
    for further information. """
    That's not syntactically a comment, and I don't think pretending
    triple-quoted strings are comments is good practice. If nothing else,
    you'll need a special case if you want to enclose something with
    existing triple-quotes.

    The CPython core devs disagree with you. Using bare strings for comments
    is supported by Python, and the compiler strips them out at compile-time:




    steve at runes:~$ python3.3
    Python 3.3.0rc3 (default, Sep 27 2012, 18:31:58)
    [GCC 4.4.5] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    py> import dis
    py> dis.dis('x = 1; """Comment"""; y = 1')
       1 0 LOAD_CONST 0 (1)
                   3 STORE_NAME 0 (x)
                   6 LOAD_CONST 0 (1)
                   9 STORE_NAME 1 (y)
                  12 LOAD_CONST 1 (None)
                  15 RETURN_VALUE
    py>




    I'm on the fence on this one. I can it's useful; but it's also
    inconsistent with docstrings. And I wonder why other "dead objects"
    aren't also dropped:


    py> dis.dis('x = 1; {23: "a", "b": 42}; y = 1')
       1 0 LOAD_CONST 0 (1)
                   3 STORE_NAME 0 (x)
                   6 BUILD_MAP 2
                   9 LOAD_CONST 1 ('a')
                  12 LOAD_CONST 2 (23)
                  15 STORE_MAP
                  16 LOAD_CONST 3 (42)
                  19 LOAD_CONST 4 ('b')
                  22 STORE_MAP
                  23 POP_TOP
                  24 LOAD_CONST 0 (1)
                  27 STORE_NAME 1 (y)
                  30 LOAD_CONST 5 (None)
                  33 RETURN_VALUE




    --
    Steven
  • Ben Finney at Nov 28, 2013 at 3:14 am

    Steven D'Aprano <steve@pearwood.info> writes:

    On Thu, 28 Nov 2013 13:47:22 +1100, Ben Finney wrote:
    [A triple-quoted string is] not syntactically a comment, and I don't
    think pretending triple-quoted strings are comments is good
    practice. If nothing else, you'll need a special case if you want to
    enclose something with existing triple-quotes.
    The CPython core devs disagree with you. Using bare strings for comments
    is supported by Python, and the compiler strips them out at
    compile-time:

    That's not the same as disagreeing with what I said above. Triple-quoted
    strings are not syntactically comments, they're still syntactically
    treated as string literals. The implementation detail of them being
    stripped out doesn't change that.


    Also, ?is supported by Python? doesn't mean the developers agree with
    the practice.


    --
      \ ?Be careless in your dress if you must, but keep a tidy soul.? |
       `\ ?Mark Twain, _Following the Equator_ |
    _o__) |
    Ben Finney
  • Neil Cerutti at Nov 28, 2013 at 3:09 am

    tOn Wed, Nov 27, 2013 at 8:57 PM, Victor Hooi wrote:
    with open(self.full_path, 'r') as input, open(self.output_csv, 'ab') as output:

    There are two nice and clean solutions for this.


    I usually nest them.


    with open(self.full_path, 'r') as input:
       with open(self.ouptut_csv, 'ab') as output:


    I use a non-standard (shortened) indentation for nested with
    statements, to save space. I don't feel bad about this at all; the
    with statement's syntax makes me do it!


    Otherwise, contextlib.ExitStack is another way to separate them.


    with contextlib.ExitStack() as stack:
         input = stack.enter_context(open(self.full_path, 'r'))
         writer = csv.writer(stack.enter_context(open(self.output_csv)))


    When working with a csv file I like how it removes the output
    temporary file object variable, though if you needed it for some
    reason you could keep it.


    --
    Neil Cerutti
  • Steven D'Aprano at Nov 28, 2013 at 3:57 am

    On Wed, 27 Nov 2013 17:57:13 -0800, Victor Hooi wrote:


    Hi,

    I'm running pep8 across my code, and getting warnings about my long
    lines (> 80 characters).

    I'm wonder what's the recommended way to handle the below cases, and fit
    under 80 characters.

    First example - multiple context handlers:

    with open(self.full_path, 'r') as input,
    open(self.output_csv, 'ab') as output:



    if True: # add indentation, just for the example's sake.
         if True:
             if True:
                 with (open(self.full_path, 'r') as input,
                           open(self.output_csv, 'ab') as output):
                     do_this()
                     do_that()





    Second example - long error messages:

    self.logger.error('Unable to open input or output file - %s.
    Please check you have sufficient permissions and the file
    and parent directory exist.' % e)

    Long strings are always ugly :-(


    But you can use implicit concatenation to make them a little less so.






    if True: # add indentation, just for the example's sake.
         if True:
             if True:
                 self.logger.error(
                     'Unable to open input or output file - %s.'
                     ' Please check you have sufficient permissions and
                     ' the file and parent directory exist.'
                     % e)


    Notice that my convention is to use a leading space in the strings when
    doing implicit concatenation, if possible. E.g.:


    ("Hello"
      " World!")


    rather than:


    ("Hello "
      "World!")



    Third example - long comments:

    """ NB - We can't use Psycopg2's parametised statements
    here, as that automatically wraps everything in single
    quotes. So s3://my_bucket/my_file.csv.gz would become
    s3://'my_bucket'/'my_file.csv.gz'. Hence, we use Python's
    normal string formating - this could potentially exposes us
    to SQL injection attacks via the config.yaml file.
    I'm not aware of any easy ways around this currently though
    - I'm open to suggestions though.
    See
    http://stackoverflow.com/questions/9354392/psycopg2-cursor-
    execute-with-sql-query-parameter-causes-syntax-error
    for further information. """

    In this case, I'm guessing a using triple quotes (""") is a better idea
    with multi-line comments, right?

    *shrug* That's a matter of personal preference.



    However, I've noticed that I can't seem to put in line-breaks inside the
    comment without triggering a warning. For example, trying to put in
    another empty line in between lines 6 and 7 above causes a warning.

    Also, how would I split up the long URLs? Breaking it up makes it
    annoying to use the URL. Thoughts?

    I hate long URLs.


    There's no good solution to long URLs. Just leave them as they come, and
    if your style checker allows you to flag an exception to the long-line
    rules, use it.




    --
    Steven
  • Steven D'Aprano at Nov 28, 2013 at 4:03 am

    On Thu, 28 Nov 2013 03:57:03 +0000, Steven D'Aprano wrote:




    if True: # add indentation, just for the example's sake.
    if True:
    if True:
    with (open(self.full_path, 'r') as input,
    open(self.output_csv, 'ab') as output):
    do_this()
    do_that()



    Bah! Apparently you can't do that.


    py> with (open(self.full_path, 'r') as input,
       File "<stdin>", line 1
         with (open(self.full_path, 'r') as input,
                                          ^
    SyntaxError: invalid syntax








    --
    Steven
  • Steven D'Aprano at Nov 28, 2013 at 3:58 am

    On Wed, 27 Nov 2013 21:59:56 -0500, Ned Batchelder wrote:


    The important thing in a with statement is that the assigned name will
    be closed (or otherwise exited) automatically. The open call is just
    the expression used to assign the name. The expression there isn't
    really important. This looks odd, but works the same as what you have:

    input = open(self.full_path)
    output = open(self.output_csv, 'ab')
    with input as input, output as output:
    ...

    (Use different names for the two parts of the "as" clauses if you like.)



    That's really clever! Why didn't I think of that?




    --
    Steven
  • Tim Chase at Nov 28, 2013 at 2:04 pm

    On 2013-11-28 03:58, Steven D'Aprano wrote:
    input = open(self.full_path)
    output = open(self.output_csv, 'ab')
    with input as input, output as output:
    ...
    That's really clever! Why didn't I think of that?

    Because if the 2nd output fails, the input doesn't get closed as
    it {w,sh}ould in a with-statement? :-)


    You could work around this with:


       from functools import partial
       in_ = partial(open, self.full_path)
       out_ = partial(open, self.output_csv, 'ab')
       with in_() as input, out_() as output:
         do_stuff()


    There's still room for programmer error if you don't have an
    "output_csv" property on self and you get an AttributeError, but
    that's more of a "dumb programmer error" rather than an actual
    runtime exception.


    -tkc
  • Chris Angelico at Nov 28, 2013 at 2:21 pm

    On Fri, Nov 29, 2013 at 1:04 AM, Tim Chase wrote:
    You could work around this with:

    from functools import partial
    in_ = partial(open, self.full_path)
    out_ = partial(open, self.output_csv, 'ab')
    with in_() as input, out_() as output:
    do_stuff()

    Yeah, I think I'd prefer a backslash continuation to that!


    ChrisA
  • Ned Batchelder at Nov 28, 2013 at 5:26 pm

    On 11/28/13 9:04 AM, Tim Chase wrote:
    On 2013-11-28 03:58, Steven D'Aprano wrote:
    input = open(self.full_path)
    output = open(self.output_csv, 'ab')
    with input as input, output as output:
    ...
    That's really clever! Why didn't I think of that?
    Because if the 2nd output fails, the input doesn't get closed as
    it {w,sh}ould in a with-statement? :-)

    D'oh, sorry. A doubly-nested with statement?


          input = open(self.full_path)
          with input:
              output = open(self.output_csv, 'ab')
              with output:
                  ...


    Of course, if you were willing to use a doubly-nested with statement,
    then the original open-in-the-with form would have fit in 80 chars...
    Also, I have a sneaking suspicion there's still an error case not
    handled well, someone will point it out if so!


    --Ned.

    -tkc
  • Roel Schroeven at Nov 28, 2013 at 6:37 pm

    Victor Hooi schreef:
    with open(self.full_path, 'r') as input, open(self.output_csv, 'ab') as output:

    I think I would write that as


          with open(self.full_path, 'r') as input:
              with open(self.output_csv, 'ab') as output:


    That introduces an extra indentation level, but that doesn't really
    bother me. Does this seem like a good idea?




    Best regards,
    Roel


    --
    "Met een spitsvondig citaat bewijs je niets."
              -- Voltaire
  • Nick Mellor at Dec 5, 2013 at 1:47 am
    Hi Victor,


    I use PyCharm which is set up by default to warn when line length exceeds 120 chars, not 80. Perhaps times have changed?


    I often break comprehensions at the for, in and else clauses. It's often not for line length reasons but because it's easier to follow the code that way. I have heard this is how Haskell programmers tend to use comprehensions (comprehensions are from Haskell originally):


    location=random.choice([loc['pk']
                             for loc
                             in locations.whole_register()
                             if loc['fields']['provider_id'] == provider_id])))


    The other suggestion I have is to put the with clauses in a generator function. This saves you a level or more of indentation and modularises the code usefully. Here's an example:


    def spreadsheet(csv_filename):
         with open(csv_filename) as csv_file:
             for csv_row in list(csv.DictReader(csv_file, delimiter='\t')):
                 yield csv_row


    then invoked using:


    for row in spreadsheet("...")
         # your processing code here


    Cheers,


    Nick

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppython-list @
categoriespython
postedNov 28, '13 at 1:57a
activeDec 5, '13 at 1:47a
posts26
users15
websitepython.org

People

Translate

site design / logo © 2022 Grokbase