I use Python a lot, but not well. I usually start by writing a small
script, no classes or modules. Then I add more content to the loops,
and repeat. It's a bit of a trial and error learning phase, making
sure I'm using the third party modules correctly, and so on. I end up
with a working script, but by the end it looks messy, unorganized, and
feels hacked together. I feel like in order to reuse it or expand it
in the future, I need to take what I learned and rewrite it from
scratch.

If I peeked over a Python expert's shoulder while they developed
something new, how would their habits differ? Do they start with
classes from the start?

I guess I'm looking for something similar to "Large Scale C++ Software
Design" for Python. Or even just a walkthrough of someone competent
writing something from scratch. I'm not necessarily looking for a
finished product that is well written. I'm more interested in, "I have
an idea for a script/program, and here is how I get from point A to
point B."

Or maybe what I'm looking for is best practices for how to organize the
structure of a Python program. I love Python and I just want to be
able to use it well.

  • Kurt Smith at Feb 16, 2011 at 7:09 pm

    On Wed, Feb 16, 2011 at 12:35 PM, snorble wrote:
    I use Python a lot, but not well. I usually start by writing a small
    script, no classes or modules. Then I add more content to the loops,
    and repeat. It's a bit of a trial and error learning phase, making
    sure I'm using the third party modules correctly, and so on. I end up
    with a working script, but by the end it looks messy, unorganized, and
    feels hacked together. I feel like in order to reuse it or expand it
    in the future, I need to take what I learned and rewrite it from
    scratch.

    If I peeked over a Python expert's shoulder while they developed
    something new, how would their habits differ? Do they start with
    classes from the start?

    I guess I'm looking for something similar to "Large Scale C++ Software
    Design" for Python. Or even just a walkthrough of someone competent
    writing something from scratch. I'm not necessarily looking for a
    finished product that is well written. I'm more interested in, "I have
    an idea for a script/program, and here is how I get from point A to
    point B."

    Or maybe what I'm looking for is best practices for how to organize the
    structure of a Python program. I love Python and I just want to be
    able to use it well.
    Try this:

    http://www.refactoring.com/

    Not a silver bullet, but a good place to start.
  • Dan Stromberg at Feb 16, 2011 at 10:09 pm

    On Wed, Feb 16, 2011 at 10:35 AM, snorble wrote:
    I use Python a lot, but not well. I usually start by writing a small
    script, no classes or modules. Then I add more content to the loops,
    and repeat. It's a bit of a trial and error learning phase, making
    sure I'm using the third party modules correctly, and so on. I end up
    with a working script, but by the end it looks messy, unorganized, and
    feels hacked together. I feel like in order to reuse it or expand it
    in the future, I need to take what I learned and rewrite it from
    scratch.
    To some extent, writing code well just comes from practice with
    programming, and practice with a language.

    Exploratory programming is pretty normal (though some still insist on
    having a complete design before starting on coding), but I find that
    having lots of automated tests helps make exploratory programming more
    practical. You may or may not want to read a bit about agile
    programming: http://en.wikipedia.org/wiki/Agile_software_development

    A rewrite once in a while is not the end of the world, unless your
    management decides it is (then it's just a pain to live without the
    rewrite). ^_^ Oh, and if you use modules and classes and even just
    functions to limit the impact of one detail on another (each design
    decision probably should be wrapped up into its own scope somehow),
    you'll find that rewrites of portions of your code are pretty viable,
    without changes having to cascade through your codebase.

    Just saying "I want this to read clearly, not run marginally faster"
    helps, as does using tools like pylint, pychecker, pyflakes and/or
    pep8 (pylint probably obviates the pep8 script, but pychecker or
    pyflakes likely work well in combination with pep8 - so far I've only
    used pylint).

    Also, pymetrics is nice for its McCabe Complexity statistic - if the #
    gets too high, simplify - this often means subdividing large functions
    or methods into a larger number of smaller functions or methods. But
    pylint and perhaps others have a warning if your code blocks get too
    long - that almost gives the same benefit as McCabe.

    You may find that http://rope.sourceforge.net/ helps with your
    refactoring, though I have yet to try rope - I just use vim and n.n.n.
    I hear that some IDEs support refactoring well - PyCharm might be a
    good example.

    IOW, using some automated tools should give you lots of nearly
    immediate feedback and assistance in your goal.

    Yes, some people do start with classes at the outset. Just think of a
    class as a jack in the box - something with an external view, and a
    different, hidden, internal view. When you see a need for something
    like a jack in a box in your code (two different views of what's going
    on, to limit detail getting scattered more broadly than necessary),
    consider using a class or perhaps a generator. And yeah, sometimes
    functions are enough - hey, some functional programming languages have
    no other means of limiting the impact of details, and there is still
    some really good functional code out there.
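
A rough sketch of the jack-in-the-box idea (Python 3; the names RunningMean and running_means are invented for this example, not taken from the thread). The class exposes only add() and mean() and keeps its totals as a hidden detail; the generator does a similar job with its state tucked away in local variables.

```python
class RunningMean:
    """External view: add() and mean(). Internal view: two private totals."""

    def __init__(self):
        self._total = 0.0   # hidden detail; callers never touch it
        self._count = 0

    def add(self, value):
        self._total += value
        self._count += 1

    def mean(self):
        if self._count == 0:
            raise ValueError("no values added yet")
        return self._total / self._count


def running_means(values):
    """Generator version: the totals live in local variables."""
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
        yield total / count


rm = RunningMean()
for v in (2, 4, 6):
    rm.add(v)
print(rm.mean())                       # 4.0
print(list(running_means([2, 4, 6])))  # [2.0, 3.0, 4.0]
```

If the way the mean is computed changes later, only the inside of the class or generator has to change; callers keep working.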

    Finally, look over someone else's code now and then for ideas; that's
    a great way to learn. You don't necessarily have to hover over
    someone's shoulder to learn from them - fortunately we live in a world
    with symbolic language :). Make sure the copyright on the code won't
    bite you though.

    HTH
  • Terry Reedy at Feb 16, 2011 at 10:33 pm

    On 2/16/2011 1:35 PM, snorble wrote:
    I use Python a lot, but not well. I usually start by writing a small
    script, no classes or modules. Then I add more content to the loops,
    and repeat. It's a bit of a trial and error learning phase, making
    sure I'm using the third party modules correctly, and so on. I end up
    with a working script, but by the end it looks messy, unorganized, and
    feels hacked together. I feel like in order to reuse it or expand it
    in the future, I need to take what I learned and rewrite it from
    scratch.
    Not a completely bad idea, except for the 'from scratch' part. Parts of
    code that work may just need reorganizing.

    The most important thing is automated tests. They should grow with the
    code. Tests are like having a safety net.
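
A minimal sketch of such a safety net, using the standard unittest module. The function slugify() is a made-up stand-in for whatever the script actually does; the point is only that each new behaviour gets a small test alongside it.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: 'Hello World' -> 'hello-world'."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Hello   World "), "hello-world")

    def test_empty_string(self):
        self.assertEqual(slugify(""), "")
```

Saved as, say, test_slugify.py (a hypothetical file name), this can be run with "python -m unittest test_slugify".
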
    If I peeked over a Python expert's shoulder while they developed
    something new, how would their habits differ? Do they start with
    classes from the start?
    Depends on whether the particular purpose needs user-defined classes or
    is fine with functions using built-in classes.

    --
    Terry Jan Reedy
  • Steven D'Aprano at Feb 16, 2011 at 11:00 pm

    On Wed, 16 Feb 2011 10:35:28 -0800, snorble wrote:

    I use Python a lot, but not well. I usually start by writing a small
    script, no classes or modules. Then I add more content to the loops, and
    repeat. It's a bit of a trial and error learning phase, making sure I'm
    using the third party modules correctly, and so on. I end up with a
    working script, but by the end it looks messy, unorganized, and feels
    hacked together. I feel like in order to reuse it or expand it in the
    future, I need to take what I learned and rewrite it from scratch. [...]
    Or maybe what I'm looking for is best practices for how to organize the
    structure of a Python program. I love Python and I just want to be able
    to use it well.
    I don't think best practice for writing Python is that much different
    from best practice for other languages. The main difference is that
    Python is multi-paradigm: you can mix procedural, functional and object-
    oriented code in the one program.

    You should read about bottom-up and top-down programming. You'll probably
    end up doing some of both, but mostly top-down.

    The most important thing is structured programming and modularization.
    The debate over structured programming was won so decisively that people
    have forgotten that there was ever an argument to be made for spaghetti
    code!

    You can learn a lot (and lose a lot of time!) reading the c2.com wiki at
    http://c2.com/cgi/wiki. These may be useful:

    http://c2.com/cgi/wiki?StructuredProgramming
    http://c2.com/cgi/wiki?WhatIsRefactoring


    Break your code up into small pieces, whether you use functions or
    classes doesn't really matter, although for small scripts functions are
    simpler and have less mental overhead. Instead of writing one giant
    monolithic block of code that does twenty things, write one function for
    each thing, and then one extra main function to call them. This
    encourages code reuse, ease of testing, and simplifies maintenance.
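
A minimal sketch of that layout, with invented function names (none of them come from the thread): each function does one thing, and main() only strings them together.

```python
def read_numbers(text):
    """Parse whitespace-separated numbers from a string."""
    return [float(token) for token in text.split()]


def summarize(numbers):
    """Return (count, total, mean) for a non-empty list of numbers."""
    total = sum(numbers)
    return len(numbers), total, total / len(numbers)


def format_report(count, total, mean):
    """Turn the summary into a line of text."""
    return "n=%d total=%.1f mean=%.2f" % (count, total, mean)


def main():
    """The one extra function that calls the others in order."""
    numbers = read_numbers("3 4 5.5")
    print(format_report(*summarize(numbers)))


if __name__ == "__main__":
    main()
```

Each piece can now be tested or reused on its own, which is hard to do with one monolithic block.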


    I find that I've learned more from "things to avoid" than from "things to
    do". Something about reading about disasters makes it really clear why
    you shouldn't do it that way :)

    Avoid:

    * GOTO, but Python doesn't have that :)

    * Copy-and-paste programming. If you want to do almost the same thing
    twice, don't copy the relevant code, paste it, and make a small
    modification to it. Write one or more functions that handle the common
    code, and call the function.

    * Functions that do unrelated things. Functions should do one thing,
    although that "thing" can get quite complicated.

    * Don't sweep errors under the rug. Don't catch exceptions unless you can
    do something about them. "Just ignore it, I'm sure it's not important" is
    rarely appropriate.
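
To illustrate the last point with a small sketch (parse_port() and its default value are invented for this example): catch only the exception you can do something useful with, and let everything else propagate.

```python
def parse_port(text, default=8080):
    """Return the port number in `text`, or `default` if text is blank."""
    if not text.strip():
        return default          # a case we know how to handle
    try:
        port = int(text)
    except ValueError:
        # We can add context here, so catching is justified; re-raise
        # rather than silently returning a guess.
        raise ValueError("not a valid port number: %r" % text)
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return port


# What to avoid: a bare "except: pass" hides every error, including typos
# in the code itself.
#
#     try:
#         port = int(text)
#     except:
#         pass
```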


    See also:
    http://c2.com/cgi/fullSearch?search=CategoryDevelopmentAntiPattern



    --
    Steven
  • Michael Torrie at Feb 17, 2011 at 10:55 pm

    On 02/16/2011 04:00 PM, Steven D'Aprano wrote:
    You should read about bottom-up and top-down programming. You'll probably
    end up doing some of both, but mostly top-down.
    Most of my development is done by designing top-down and then coding
    bottom-up. Coding top down is fine, but I'd expect to refactor the code
    frequently as I try to spin code off into standalone modules.
  • Ben Finney at Feb 16, 2011 at 11:12 pm

    Terry Reedy <tjreedy at udel.edu> writes:

    The most important thing is automated tests.
    Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
    The most important thing is structured programming and modularization.

    Steel-cage death match. FIGHT!

    --
    \ "[W]e are still the first generation of users, and for all that |
    `\ we may have invented the net, we still don't really get it." |
    _o__) -- Douglas Adams |
    Ben Finney
  • Steven D'Aprano at Feb 17, 2011 at 12:43 am

    On Thu, 17 Feb 2011 10:12:52 +1100, Ben Finney wrote:

    Terry Reedy <tjreedy at udel.edu> writes:
    The most important thing is automated tests.
    Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
    The most important thing is structured programming and modularization.

    Steel-cage death match. FIGHT!
    To the death? No, to the pain!

    http://en.wikiquote.org/wiki/The_Princess_Bride



    --
    Steven
  • Flebber at Feb 17, 2011 at 9:53 am

    On Feb 17, 11:43 am, Steven D'Aprano <steve+comp.lang.pyt... at pearwood.info> wrote:
    On Thu, 17 Feb 2011 10:12:52 +1100, Ben Finney wrote:
    Terry Reedy <tjre... at udel.edu> writes:
    The most important thing is automated tests.
    Steven D'Aprano <steve+comp.lang.pyt... at pearwood.info> writes:
    The most important thing is structured programming and modularization.
    Steel-cage death match. FIGHT!
    To the death? No, to the pain!

    http://en.wikiquote.org/wiki/The_Princess_Bride

    --
    Steven
    I really liked the Abstraction chapters (6 and 7) in Magnus Lie Hetland's
    book "Beginning Python: From Novice to Professional". They really show how
    to "think it out", which seems to be what you're after. The first
    subheading in Chapter 6 is "Laziness Is a Virtue"; can't beat that.
  • Chris Rebert at Feb 17, 2011 at 10:05 am

    On Thu, Feb 17, 2011 at 1:53 AM, flebber wrote:
    On Feb 17, 11:43 am, Steven D'Aprano <steve
    +comp.lang.pyt... at pearwood.info> wrote:
    On Thu, 17 Feb 2011 10:12:52 +1100, Ben Finney wrote:
    Terry Reedy <tjre... at udel.edu> writes:
    The most important thing is automated tests.
    Steven D'Aprano <steve+comp.lang.pyt... at pearwood.info> writes:
    The most important thing is structured programming and modularization.
    Steel-cage death match. FIGHT!
    <snip>
    I really liked the Abstraction chapters (6 and 7) in Magnus Lie Hetland's
    book "Beginning Python: From Novice to Professional". They really show how
    to "think it out", which seems to be what you're after. The first
    subheading in Chapter 6 is "Laziness Is a Virtue"; can't beat that.
    Related:
    http://c2.com/cgi/wiki?LazinessImpatienceHubris

    Cheers,
    Chris
  • Roy Smith at Feb 16, 2011 at 11:21 pm
    In article
    <cfdfce2a-a7cd-4174-a3a5-da021e80abff at x21g2000vbn.googlegroups.com>,
    snorble wrote:
    I use Python a lot, but not well. I usually start by writing a small
    script, no classes or modules.
    One anti-pattern that I see in my own code is starting out thinking,
    "this is just a little script, it doesn't need any real structure". That
    almost always turns out to be wrong, but by the time I start to realize
    I'm writing spaghetti code, it's so tempting to say, "I don't have the
    time to refactor this now, I'll just hack on it a bit more". Which, of
    course, makes it even harder to unravel later.

    The first step is to break up a monolithic script into a few functions.
    I encourage myself to do that from the get-go by keeping a template
    around:

    #!/usr/bin/env python

    def main():
        pass

    if __name__ == '__main__':
        main()

    and I use that whenever I start a new script. That at least gets me off
    on the right foot.

    The next step is to turn my collection of functions (with the inevitable
    collection of global variables that lets them communicate) into a class
    with methods and instance variables. I can't tell you how many times
    I've started out saying, "this isn't going to be complicated enough to
    justify making it a class". Inevitably, I'm wrong.
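
A small sketch of that step (Python 3; CacheStats and its methods are names invented for this example). The commented-out "before" version passes state through module-level globals; in the "after" version the same state lives in instance variables.

```python
# Before: functions communicating through module-level globals.
#
#     hits = 0
#     misses = 0
#
#     def record(found):
#         global hits, misses
#         ...

# After: the former globals become instance variables.
class CacheStats:
    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, found):
        """Count one lookup as a hit or a miss."""
        if found:
            self.hits += 1
        else:
            self.misses += 1

    def hit_rate(self):
        lookups = self.hits + self.misses
        return self.hits / lookups if lookups else 0.0


stats = CacheStats()
for found in (True, True, False, True):
    stats.record(found)
print(stats.hit_rate())   # 0.75
```

One side benefit: two independent CacheStats objects can coexist, which the global version cannot offer.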

    Finally, the next layer of stupid mistake I often make is to start out
    saying, "This isn't going to be complicated enough to justify writing
    unit tests". Inevitably, I'm wrong about that too.

    So far, none of the above is at all specific to Python; it's equally
    true in any language.

    Now, for some Python-specific advice; you can write Fortran in any
    language. What that means is it's one thing to translate some existing
    script into Python and make it work, but it's another to actually take
    advantage of some of Python's biggest strengths. Learn to be
    comfortable with list comprehensions, generator expressions, and
    iterators in general. Learn about Python's advanced data structures
    such as sets, defaultdicts, and named tuples.
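
A short tour of the features named above, as a sketch only (Python 3.7 or later for the dict ordering shown; the Entry record and the sample log are made up for illustration):

```python
from collections import defaultdict, namedtuple

Entry = namedtuple("Entry", ["user", "action"])   # hypothetical log record

log = [
    Entry("ann", "login"),
    Entry("bob", "login"),
    Entry("ann", "upload"),
    Entry("ann", "logout"),
]

# List comprehension: build a new list in one expression.
actions = [e.action for e in log if e.user == "ann"]

# Set: unique users without manual duplicate checks.
users = {e.user for e in log}

# defaultdict: group items without testing whether the key exists yet.
by_user = defaultdict(list)
for e in log:
    by_user[e.user].append(e.action)

# Generator expression: feed sum() without building an intermediate list.
login_count = sum(1 for e in log if e.action == "login")

print(actions)            # ['login', 'upload', 'logout']
print(sorted(users))      # ['ann', 'bob']
print(dict(by_user))      # {'ann': ['login', 'upload', 'logout'], 'bob': ['login']}
print(login_count)        # 2
```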
  • Jean-Michel Pichavant at Feb 17, 2011 at 11:29 am

    snorble wrote:
    I use Python a lot, but not well. I usually start by writing a small
    script, no classes or modules. Then I add more content to the loops,
    and repeat. It's a bit of a trial and error learning phase, making
    sure I'm using the third party modules correctly, and so on. I end up
    with a working script, but by the end it looks messy, unorganized, and
    feels hacked together. I feel like in order to reuse it or expand it
    in the future, I need to take what I learned and rewrite it from
    scratch.

    If I peeked over a Python expert's shoulder while they developed
    something new, how would their habits differ? Do they start with
    classes from the start?

    I guess I'm looking for something similar to "Large Scale C++ Software
    Design" for Python. Or even just a walkthrough of someone competent
    writing something from scratch. I'm not necessarily looking for a
    finished product that is well written. I'm more interested in, "I have
    an idea for a script/program, and here is how I get from point A to
    point B."

    Or maybe what I'm looking for is best practices for how to organize the
    structure of a Python program. I love Python and I just want to be
    able to use it well.
    I guess this happens to anyone who uses Python for everyday
    scripting, writing tools to make life easier.
    You start by writing some quick and dirty code just to do the trick ASAP.
    Then, after adding some features, you realize it has become a little more
    than a script, but you don't want to share it because the code is just a
    freaking mess that no one but you can understand.

    Here are the things I do to avoid getting to that point:
    1 - accept the fact that I will lose time trying to write it correctly
    from the beginning
    2 - the script has to be "importable" and usable by a 3rd Python program.
    3 - regularly test my code from the Python shell by importing the script
    4 - use a linter (pylint) with all warnings activated.
    5 - document the public classes/functions. This "forces" me to expose the
    minimum interface because, like everyone, I dislike writing docs
    6 - always include an argument parser.
    7 - always use the logging module.
    8 - if the project starts to become medium-sized, write unit tests.

    And maybe the most important one: I always start from the upper-level
    interface and work down to primitives (top-down programming?). This helps
    me make stuff really easy to use, with a clean interface. Sometimes the
    implementation can become tricky that way; I guess it's a matter of
    preference. In practice, that just means that I write the method call
    before writing the method definition.
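
A possible starting skeleton covering points 2, 5, 6 and 7 of the list above: importable, documented, with an argument parser and the logging module. This is a sketch only; the line-counting task and every name in it are invented for the example.

```python
#!/usr/bin/env python
"""Count lines in files (hypothetical example script)."""
import argparse
import logging

log = logging.getLogger(__name__)


def count_lines(path):
    """Return the number of lines in the file at `path`."""
    with open(path) as handle:
        return sum(1 for _ in handle)


def parse_args(argv=None):
    """Parse command-line arguments; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("paths", nargs="*", help="files to count")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="enable debug logging")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    logging.basicConfig(level=logging.DEBUG if args.verbose else logging.INFO)
    for path in args.paths:
        log.debug("reading %s", path)
        print("%s: %d" % (path, count_lines(path)))


if __name__ == "__main__":
    main()
```

Because nothing runs at import time, the module can be imported from the Python shell (point 3) and count_lines() called directly.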

    JM
  • Jorgen Grahn at Feb 17, 2011 at 9:44 pm

    On Wed, 2011-02-16, snorble wrote:
    I use Python a lot, but not well. I usually start by writing a small
    script, no classes or modules. Then I add more content to the loops,
    and repeat. It's a bit of a trial and error learning phase, making
    sure I'm using the third party modules correctly, and so on. I end up
    with a working script, but by the end it looks messy, unorganized, and
    feels hacked together. I feel like in order to reuse it or expand it
    in the future, I need to take what I learned and rewrite it from
    scratch.

    If I peeked over a Python expert's shoulder while they developed
    something new, how would their habits differ? Do they start with
    classes from the start?

    I guess I'm looking for something similar to "Large Scale C++ Software
    Design" for Python. Or even just a walkthrough of someone competent
    writing something from scratch. I'm not necessarily looking for a
    finished product that is well written. I'm more interested in, "I have
    an idea for a script/program, and here is how I get from point A to
    point B."

    Or maybe what I'm looking for is best practices for how to organize the
    structure of a Python program. I love Python and I just want to be
    able to use it well.
    Good questions -- and you got some really good answers already!

    What I always do when starting a program is:

    - Split it into a 'if __name__ == "__main__":' which does the
    command-line parsing, usage message and so on; and a function
    which contains the logic, i.e. works like the program would have
    if the OS had fed it its arguments as Python types

    - Document functions and classes.

    - Avoid having functions use 'print' and 'sys.std*', in case I need to
    use them with other files. I pass file-like objects as arguments
    instead.

    - Write user documentation and build/installation scripts. Since I'm
    on Unix, that means man pages and a Makefile.

    And that's all in the normal case. No need to do anything more fancy
    if it turns out I'll never have to touch that program again.
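
A sketch of the "pass file-like objects instead of using print and sys.std*" habit (Python 3; number_lines() is a made-up example function). The logic only sees a source and a sink, so it can be driven by real files, the standard streams, or in-memory buffers.

```python
import io


def number_lines(source, sink):
    """Copy lines from `source` to `sink`, prefixing each with its number.

    Both arguments are file-like objects. Returns the number of lines copied.
    """
    count = 0
    for count, line in enumerate(source, 1):
        sink.write("%4d  %s" % (count, line))
    return count


# In a test or an interactive session, in-memory buffers stand in for files.
source = io.StringIO("alpha\nbeta\n")
sink = io.StringIO()
number_lines(source, sink)
print(sink.getvalue(), end="")

# In the real script, the 'if __name__ == "__main__":' part would be the
# only place that mentions the standard streams, e.g.:
#
#     number_lines(sys.stdin, sys.stdout)
```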

    I use classes when I see a use for them. The "see" part comes from
    quite a few years' worth of experience with object-oriented design in
    Python and C++ ... not sure how to learn that without getting lost in
    Design with a capital 'D' for a few years ...

    Anyway, I don't feel bad if I don't find any classes at first.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
  • Roy Smith at Feb 17, 2011 at 10:24 pm
    In article <slrnilr5lj.15e.grahn+nntp at frailea.sa.invalid>,
    Jorgen Grahn wrote:
    - Write user documentation and build/installation scripts. Since I'm
    on Unix, that means man pages and a Makefile.
    Wow, I haven't built a man page in eons. These days, user documentation
    for me means good help text for argparse to use. If I need something
    more than that, I'll write it up in our wiki.
    Anyway, I don't feel bad if I don't find any classes at first.
    Same here. I don't usually find a reason to refactor things into
    classes until I've written the second or third line of code :-)

    Somewhat more seriously, the big clue for me that I've got a class
    hiding in there is when I start having all sorts of globals. That's
    usually a sign you've done something wrong.
  • Jorgen Grahn at Feb 19, 2011 at 12:07 am

    On Thu, 2011-02-17, Roy Smith wrote:
    In article <slrnilr5lj.15e.grahn+nntp at frailea.sa.invalid>,
    Jorgen Grahn wrote:
    - Write user documentation and build/installation scripts. Since I'm
    on Unix, that means man pages and a Makefile.
    Wow, I haven't built a man page in eons. These days, user documentation
    for me means good help text for argparse to use.
    Perhaps I'm old-fashioned, but all other software I use (on Unix) has
    man pages. I /expect/ there to be one. (It's not hard to write a man
    page either, if you have a decent one as a template.)

    Help texts are better than nothing though (and unlike man pages they
    work on Windows too).
    If I need something
    more than that, I'll write it up in our wiki.
    I guess you're working within an organization? Local rules override
    personal preferences -- if everyone else is using the wiki, I guess
    you must do too.

    I have to say though that *not* handling the documentation together
    with the source code is harmful. If source code and documentation
    aren't in version control together, they *will* go out of sync.
    Anyway, I don't feel bad if I don't find any classes at first.
    Same here. I don't usually find a reason to refactor things into
    classes until I've written the second or third line of code :-)

    Somewhat more seriously, the big clue for me that I've got a class
    hiding in there is when I start having all sorts of globals. That's
    usually a sign you've done something wrong.
    Or a whole bunch of related arguments to a function, and/or the same
    arguments being passed to many functions.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
  • Roy Smith at Feb 19, 2011 at 12:20 am
    In article <slrnilu2e0.15e.grahn+nntp at frailea.sa.invalid>,
    Jorgen Grahn wrote:
    On Thu, 2011-02-17, Roy Smith wrote:
    In article <slrnilr5lj.15e.grahn+nntp at frailea.sa.invalid>,
    Jorgen Grahn wrote:
    - Write user documentation and build/installation scripts. Since I'm
    on Unix, that means man pages and a Makefile.
    Wow, I haven't built a man page in eons. These days, user documentation
    for me means good help text for argparse to use.
    Perhaps I'm old-fashioned, but all other software I use (on Unix) has
    man pages. I /expect/ there to be one. (It's not hard to write a man
    page either, if you have a decent one as a template.)
    The nice thing about help text is that it keeps the documentation and
    the code in one place, which makes it a little more likely people will
    actually update the docs as they update the code.
    I guess you're working within an organization?
    FSVO "organization", but yes.
    Local rules override personal preferences -- if everyone else is
    using the wiki, I guess you must do too.
    I've become very enamored of wikis because of the low activation energy
    barrier and instant feedback. To update a man page (info node, etc),
    you need to find the source document, perhaps check it out, edit it,
    submit it back to version control, install the new version in /usr/man,
    and so on. People tend not to bother. Wikis are so much more
    lightweight, they're that much more likely to get kept current.
    I have to say though that *not* handling the documentation together
    with the source code is harmful. If source code and documentation
    aren't in version control together, they *will* go out of sync.
    That is a valid argument against wikis.
  • Ben Finney at Feb 19, 2011 at 12:39 am

    Roy Smith <roy at panix.com> writes:

    In article <slrnilu2e0.15e.grahn+nntp at frailea.sa.invalid>,
    Jorgen Grahn wrote:
    On Thu, 2011-02-17, Roy Smith wrote:
    These days, user documentation for me means good help text for
    argparse to use.
    Perhaps I'm old-fashioned, but all other software I use (on Unix)
    has man pages. I /expect/ there to be one. (It's not hard to write a
    man page either, if you have a decent one as a template.)
    The nice thing about help text is that it keeps the documentation and
    the code in one place, which makes it a little more likely people will
    actually update the docs as they update the code.
    Yes, that's nice for the programmer. But isn't the point of the man page
    to be nice for the users? The man pages document many more things than
    help text output from the program.

    --
    \ "Very few things happen at the right time, and the rest do not |
    `\ happen at all. The conscientious historian will correct these |
    _o__) defects." -- Mark Twain, _A Horse's Tale_ |
    Ben Finney
  • Westley Martínez at Feb 19, 2011 at 2:20 am

    On Sat, 2011-02-19 at 11:39 +1100, Ben Finney wrote:
    Roy Smith <roy at panix.com> writes:
    In article <slrnilu2e0.15e.grahn+nntp at frailea.sa.invalid>,
    Jorgen Grahn wrote:
    On Thu, 2011-02-17, Roy Smith wrote:
    These days, user documentation for me means good help text for
    argparse to use.
    Perhaps I'm old-fashioned, but all other software I use (on Unix)
    has man pages. I /expect/ there to be one. (It's not hard to write a
    man page either, if you have a decent one as a template.)
    The nice thing about help text is that it keeps the documentation and
    the code in one place, which makes it a little more likely people will
    actually update the docs as they update the code.
    Yes, that's nice for the programmer. But isn't the point of the man page
    to be nice for the users? The man pages document many more things than
    help text output from the program.

    --
    \ "Very few things happen at the right time, and the rest do not |
    `\ happen at all. The conscientious historian will correct these |
    _o__) defects." -- Mark Twain, _A Horse's Tale_ |
    Ben Finney
    From what I've seen, the man pages are supposed to be in depth
    information that covers every nook and cranny of every option while the
    --help option is supposed to simply print a summary in case one forgets
    the syntax, but nowadays they've kind of been blended together.
  • Ben Finney at Feb 19, 2011 at 3:01 am

    Westley Martínez <anikom15 at gmail.com> writes:

    From what I've seen, the man pages are supposed to be in depth
    information that covers every nook and cranny of every option
    Those are man pages that document commands. There are many more man pages
    on a typical Unix-like system, which do not document commands.

    This collection of a great deal of documentation for the operating
    system into a single "manual" is one reason why users like man pages so
    much: we want to find anything installed on the system documented in
    that one place.

    --
    \ "The trouble with eating Italian food is that five or six days |
    `\ later you're hungry again." -- George Miller |
    _o__) |
    Ben Finney
  • Roy Smith at Feb 19, 2011 at 3:28 am
    In article <878vxcbudn.fsf at benfinney.id.au>,
    Ben Finney wrote:
    This collection of a great deal of documentation for the operating
    system into a single "manual" is one reason why users like man pages so
    much: we want to find anything installed on the system documented in
    that one place.
    What made man pages such a great technology back in the 70's was exactly
    what Ben is saying. Everything was on-line and instantly available for
    quick reference. Not to mention that you could use man as just another
    cog in the unix toolset and do things like grep all of /usr/man for a
    term (or an error message which appeared and you didn't know what had
    produced it). These were astonishing advances in usability vs. having
    printed manuals (which may or may not have been available to you).

    But, today we have such better tools available. HTML, for example.
    Whether it's a wiki or the generated output of sphinx/doxygen/etc, HTML
    provides for a much richer presentation. Which is more convenient:
    having the signal(3) man page reference "sigaction(2)" textually, or
    having it be a clickable link that can take me right there? HTML also
    gives you much greater formatting flexibility than what's still
    basically 35-year old nroff.

    If, for whatever reason, you're still wed to plain text, even info gives
    you much better capabilities than man. At least you get basic stuff
    like menus, document hierarchy, cross-linking, and browsing history.

    I'm not saying that help text is the be-all and end-all for
    documentation. I'm just saying that if you're going to do more than
    help text, it's hard to imagine putting any effort into producing man
    pages. Except possibly as the automated output of some multi-target
    documentation system which produces them as a by-product of producing
    other, richer, formats.
  • Ben Finney at Feb 19, 2011 at 3:37 am

    Roy Smith <roy at panix.com> writes:

    Whether it's a wiki or the generated output of sphinx/doxygen/etc, HTML
    provides for a much richer presentation. Which is more convenient:
    having the signal(3) man page reference "sigaction(2)" textually, or
    having it be a clickable link that can take me right there?
    My man page browser does exactly that: the reference to another man page
    is clickable, bringing me directly to the appropriate page. If yours
    doesn't, perhaps that's what needs to be addressed?
    HTML also gives you much greater formatting flexibility than what's
    still basically 35-year old nroff.
    Full agreement there.
    If, for whatever reason, you're still wed to plain text, even info
    gives you much better capabilities than man.
    Yet the manual system is reliably installed on any Unix system I'm
    likely to encounter. The "info" system is not. Which is my point:
    there's one place the documentation is expected to be.
    I'm not saying that help text is the be-all and end-all for
    documentation.
    Nor am I making any similar claim for the Unix manual system. But it is
    the common baseline which can be relied upon, unless vendors fail to
    provide it. Hence the appeal not to do that.

    --
    \ “Every man would like to be God, if it were possible; some few |
    `\ find it difficult to admit the impossibility.” - Bertrand |
    _o__) Russell, _Power: A New Social Analysis_, 1938 |
    Ben Finney
  • Jorgen Grahn at Feb 19, 2011 at 8:43 am

    On Sat, 2011-02-19, Ben Finney wrote:
    Roy Smith <roy at panix.com> writes:
    ...
    HTML also gives you much greater formatting flexibility than what's
    still basically 35-year old nroff.
    Full agreement there.
    Some disagreement here. There are typographical features in
    nroff/troff today which you don't get in web browsers: ligatures and
    hyphenation for example.

    Then of course there's the argument that "formatting flexibility"
    isn't a good thing for reference manuals -- you want them to look
    similar no matter who wrote them. (Not that all man pages look similar
    in reality, but there are some pretty decent conventions which most
    follow).

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
  • Roy Smith at Feb 19, 2011 at 12:40 pm
    In article <slrnilv0ls.15e.grahn+nntp at frailea.sa.invalid>,
    Jorgen Grahn wrote:
    Some disagreement here. There are typographical features in
    nroff/troff today which you don't get in web browsers: ligatures and
    hyphenation for example.
    Saying that HTML doesn't have ligatures and hyphenation is kind of like
    saying Python is a bad programming language because it doesn't come in
    purple.

    Yes, n/troff does ligatures and hyphenation. Are such things really
    essential for an on-line reference manual? The ligatures, clearly not,
    since what most of us are talking about here are the plain-text
    renditions you get with the on-line "man" output.

    Hyphenation? I suppose it has some value for this kind of stuff, but it
    can also be a pain. What happens when you're grepping the man page for
    "egregious" and can't find it because it got hyphenated?

    No, those are things you want for typesetting documents, not for
    browsing on-line reference material. And I can't imagine anybody using
    troff for typesetting today. I'm sure there are a few people who will
    pop out of the woodwork insisting they do, but when it comes to
    typesetting, "I want a tool, not a hobby".
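
    The grep problem described above can be worked around by undoing
    line-end hyphenation before searching. A minimal sketch, assuming the
    formatted page is available as plain text; the helper name
    dehyphenate and the sample text are made up for illustration:

```python
import re


def dehyphenate(text: str) -> str:
    """Join words that were split with a hyphen at the end of a line."""
    # A hyphen between letters, followed by a newline and optional
    # indentation, is treated as a soft break and removed.
    return re.sub(r"(?<=[A-Za-z])-\n[ \t]*(?=[a-z])", "", text)


# Hypothetical fragment of a formatted page.
page = "This is an egre-\n    gious example of hyphenation.\n"

print("egregious" in page)               # False
print("egregious" in dehyphenate(page))  # True
```

    Note that this also joins genuinely hyphenated compounds that happen
    to break at a line end, so it suits searching better than
    reformatting.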
  • Westley Martínez at Feb 19, 2011 at 5:10 pm

    On Sat, 2011-02-19 at 07:40 -0500, Roy Smith wrote:
    In article <slrnilv0ls.15e.grahn+nntp at frailea.sa.invalid>,
    Jorgen Grahn wrote:
    Some disagreement here. There are typographical features in
    nroff/troff today which you don't get in web browsers: ligatures and
    hyphenation for example.
    Saying that HTML doesn't have ligatures and hyphenation is kind of like
    saying Python is a bad programming language because it doesn't come in
    purple.

    Yes, n/troff does ligatures and hyphenation. Are such things really
    essential for an on-line reference manual? The ligatures, clearly not,
    since what most of us are talking about here are the plain-text
    renditions you get with the on-line "man" output.

    Hyphenation? I suppose it has some value for this kind of stuff, but it
    can also be a pain. What happens when you're grepping the man page for
    "egregious" and can't find it because it got hyphenated?

    No, those are things you want for typesetting documents, not for
    browsing on-line reference material. And I can't imagine anybody using
    troff for typesetting today. I'm sure there are a few people who will
    pop out of the woodwork insisting they do, but when it comes to
    typesetting, "I want a tool, not a hobby".
    But you can't seriously say that authoring HTML is effective. Sure,
    outputting HTML is fine, but as for writing the source, troff, docbook,
    sphinx, even TeX, etc, is superior to HTML simply because HTML was
    designed for web pages and those others were designed specifically for
    documentation (not TeX, but that's another story). I hate writing HTML,
    it's a pain in the neck.

    But anyways, I find it easier to simply type "man sigaction" than to
    search around on google all day for a handful of facts and a truckload
    of opinions, as do I find it easier to type "help(os.path)" than have to
    open my browser, click on the python doc bookmark, search for os.path,
    wait for the search results, and click on it. This is just what I find
    easier though, and I think using tools that output HTML, groff, LaTeX,
    etc. should continue to be used.
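
    The text behind help(os.path) can also be fetched as a string and
    searched, much like grepping a man page. A minimal sketch using the
    standard pydoc module; the search term "join" is only an example:

```python
import os.path
import pydoc

# render_doc returns the text that help() would page, as a string;
# the plaintext renderer avoids terminal bold/overstrike sequences.
text = pydoc.render_doc(os.path, renderer=pydoc.plaintext)

# Print only the lines that mention the term of interest.
for line in text.splitlines():
    if "join" in line:
        print(line)
```

    From a shell, "python -m pydoc os.path" shows the same documentation
    in a pager.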
  • Cameron Simpson at Feb 19, 2011 at 10:32 pm

    On 19Feb2011 09:10, Westley Martínez wrote:
    But you can't seriously say that authoring HTML is effective. Sure,
    outputting HTML is fine, but as for writing the source, troff, docbook,
    sphinx, even TeX, etc, is superior to HTML simply because HTML was
    designed for web pages and those others were designed specifically for
    documentation (not TeX, but that's another story). I hate writing HTML,
    it's a pain in the neck.
    Chuckle.

    The basics of HTML (H1, H2, P, I etc) are Very Very closely based on the
    [nt]roff -mm macro set. Quite useable, actually, for the basics. The tag
    closing etc is a PITA, I agree. Of course, in nroff you'd be going:

    .H1 A Level One Heading
    paragraph blah blah ...

    No tedious closing tags there!

    Cheers,
    --
    Cameron Simpson <cs at zip.com.au> DoD#743
    http://www.cskk.ezoshosting.com/cs/

    (about SSSCA) I don't say this lightly. However, I really think that the U.S.
    no longer is classifiable as a democracy, but rather as a plutocracy.
    - H. Peter Anvin <hpa at hera.kernel.org>
  • Roy Smith at Feb 19, 2011 at 5:46 pm
    In article <mailman.218.1298135413.1189.python-list at python.org>,
    Westley Martínez wrote:
    But you can't seriously say that authoring HTML is effective.
    By hand? No of course not. That's why we have things like wikis and
    CMS's, markup languages like ReST, TeX-to-HTML converters, and so on.

    But, we're getting way off topic for a Python forum. The original
    question was along the lines of "How do I write good Python?". I think
    we're all in agreement that somewhere in the answer to that has to be,
    "Provide documentation which is useful to your users, and keep it
    updated as the code changes". Beyond that, I think we need to move this
    to comp.text.religion.
  • Ben Finney at Feb 19, 2011 at 11:42 pm

    Westley Martínez <anikom15 at gmail.com> writes:

    I hate writing HTML, it's a pain in the neck.
    So do I. But I hate writing *roff markup even more. So I don't write
    either of those formats directly if I can avoid it.

    I write my man pages in either Docbook (using an XML editor) or reST (my
    default markup format these days), then convert to *roff.
    But anyways, I find it easier to simply type "man sigaction" than to
    search around on google all day for a handful of facts and a truckload
    of opinions
    Right. The Unix manual system allows a consistent place for operating
    system documentation. That includes, but is not limited to, the commands
    to be typed at the command line. Please preserve that consistency by
    maintaining man pages for commands.

    --
    \ “I installed a skylight in my apartment. The people who live |
    `\ above me are furious!” - Steven Wright |
    _o__) |
    Ben Finney
  • Cameron Simpson at Feb 19, 2011 at 10:27 pm

    On 18Feb2011 22:28, Roy Smith wrote:
    In article <878vxcbudn.fsf at benfinney.id.au>,
    Ben Finney wrote:
    This collection of a great deal of documentation for the operating
    system into a single “manual” is one reason why users like man pages so
    much: we want to find anything installed on the system documented in
    that one place.
    What made man pages such a great technology back in the 70's was exactly
    what Ben is saying. Everything was on-line and instantly available for
    quick reference. Not to mention that you could use man as just another
    cog in the unix toolset and do things like grep all of /usr/man for a
    term (or an error message which appeared and you didn't know what had
    produced it). These were astonishing advances in usability vs. having
    printed manuals (which may or may not have been available to you).

    But, today we have such better tools available. HTML, for example.
    Whether it's a wiki or the generated output of sphinx/doxygen/etc, HTML
    provides for a much richer presentation. Which is more convenient:
    having the signal(3) man page reference "sigaction(2)" textually, or
    having it be a clickable link that can take me right there? HTML also
    gives you much greater formatting flexibility than what's still
    basically 35-year old nroff.
    But HTML is just presentation. There are _plenty_ of manual page
    renderers that write HTML. (Example: http://www.FreeBSD.org/cgi/man.cgi)
    Complete with clickable links to other manual pages etc. That can all
    be done automatically. And has _nothing_ to do with the source being in
    nroff format. And the source needn't be in nroff format, either. I have
    a bunch of man pages in POD format, which renders to an assortment of
    formats including nroff output.
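
    The automatic cross-linking mentioned above can be sketched in a few
    lines of Python. This is only an illustration, not how any particular
    renderer works; the function name link_man_refs and the URL layout
    "man/<section>/<name>.html" are assumptions:

```python
import html
import re

# Matches references such as sigaction(2) or signal(3).
MAN_REF = re.compile(r"\b([A-Za-z_][\w.+-]*)\((\d[a-z]*)\)")


def link_man_refs(text: str, url_template: str = "man/{section}/{name}.html") -> str:
    """Escape text for HTML and turn name(section) references into links."""
    escaped = html.escape(text)

    def repl(match: re.Match) -> str:
        name, section = match.group(1), match.group(2)
        # The URL layout is hypothetical; adjust it to the real site.
        href = url_template.format(name=name, section=section)
        return f'<a href="{href}">{match.group(0)}</a>'

    return MAN_REF.sub(repl, escaped)


print(link_man_refs("See sigaction(2) for details."))
# See <a href="man/2/sigaction.html">sigaction(2)</a> for details.
```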

    Your argument above is a fine argument for saying that HTML is a very
    valuable presentation format, especially if well cross referenced.
    But it is irrelevant to the usefulness of man pages.
    If, for whatever reason, you're still wed to plain text, even info gives
    you much better capabilities than man. At least you get basic stuff
    like menus, document hierarchy, cross-linking, and browsing history.
    And yet I (and others, based on stuff I've seen) find info to be a
    disaster. Why?

    - it forces the reader to use a non-standard pager to look
    at info, typically the utterly weird one that comes with the info
    command. The user using a terminal _should_ get to use their own pager
    because their fingers know how to drive it. Info, in its tiny pieces
    of text linked to other tiny pieces of text form, does not lend itself
    to this and the browser it does offer on a terminal is arcane.

    But see below (*).

    - the info pages end up as a scattering of tiny cross linked (if
    you're lucky) pieces with little information on one place/page.
    So you can't, for example, stand at the top of the doco page and
    search for a term.

    Frankly, info is usually a step backward, speaking as a reader.

    * I grew enraged at the prevalence of "GNU" unix tools with only info
    for doco, and no manual pages or manual pages that said "we don't put
    anything useful here, go read the info pages, the stuff here may not
    even be maintained" (I'm serious - see the bottom of a lot of the
    rather trite manual pages that ship with GNU this/that/the-other).

    So enraged that I wrote a couple of tools called info2pod and
    info2man that read postcompiled info output (the
    binary-mixed-with-text stuff info files ship as, post install)
    and join it all up again into a single flat text output that _can_ be
    paged and searched. And a modified "man" command that can include info
    dirs in the $MANPATH and thus present info as a man page. It is a
    little ugly, but at least it clubs info into usability.
    Example:

    % man screen
    1: /usr/share/man/man1/screen.1.bz2
    2: /usr/share/info/screen.info-2.bz2
    3: /usr/share/info/screen.info-4.bz2
    4: /usr/share/info/screen.info-5.bz2
    5: /usr/share/info/screen.info-1.bz2
    6: /usr/share/info/screen.info-3.bz2
    7: /usr/share/info/screen.info.bz2
    which entry?

    Choosing (1) gets you "man screen" as usual, choosing (7) gets you the
    whole screen info stuff flattened and presented as a single page, where
    you can _search_ for what you want.

    URL: http://www.cskk.ezoshosting.com/cs/css/#key-doc
    I'm not saying that help text is the be-all and end-all for
    documentation. I'm just saying that if you're going to do more than
    help text, it's hard to imagine putting any effort into producing man
    pages.
    Hard for you, maybe. As someone who consistently finds well written
    (terse yet complete) man pages _much_ more useful than many other supposed
    documentation, I find it hard to imagine lack of man pages as other than
    a failure.

    There are exceptions of course. The python doco at python.org is pretty
    good. Wikipedia is often very good. But many wikis and other "rich and
    easy to author" systems are awful. Incomplete and badly fragmented.
    A lot of that can be laid to "documentation as an afterthought"
    mentality, but I also feel that having a manual page as a _single_ item
    contributes a lot to getting it all down.

    Writing man pages in nroff is a bit tedious (though actually not all
    that hard). Generating man pages from POD or some other similarly friendly
    format is easy and desirable.
    Except possibly as the automated output of some multi-target
    documentation system which produces them as a by-product of producing
    other, richer, formats.
    Tick.

    Cheers,
    --
    Cameron Simpson <cs at zip.com.au> DoD#743
    http://www.cskk.ezoshosting.com/cs/

    It looked like documentation, so I threw it out. - unknown luser
  • Chris Jones at Feb 20, 2011 at 6:10 pm
    On Sat, Feb 19, 2011 at 05:27:24PM EST, Cameron Simpson wrote:

    [..]
    And yet I (and others, based on stuff I've seen) find info to be a
    disaster. Why?

    - it forces the reader to use a non-standard pager to look
    at info, typically the utterly weird one that comes with the info
    command.
    On the rare occasions I used it, navigation was such an uphill battle
    that I often forgot what I was looking for in the first place.
    The user using a terminal _should_ get to use their own pager
    because their fingers know how to drive it.
    I stumbled into this some time ago and never looked back:

    https://alioth.debian.org/projects/pinfo/

    It was love at first sight since it actually has the good taste to use
    by default the same vi-like navigation key bindings I have set up for
    myself in the ELinks web browser, which I tend to favor over GUI
    browsers when I'm reading html docs.

    When you need to do brute-force searches, you could also take a look at
    the vim “info” plugin. On Debian distributions, it is part of the
    “vim-scripts” package and can be invoked by the “:Info” Ex-mode command.
    You can then use the “:helpgrep” command to create a list of matches
    that you can navigate in the same user-friendly way as you would use for
    the Vim help files. In a nutshell, instead of getting cross-eyed trying
    to locate the highlighted area on the screen to find the current match
    and hit some “find next” button (or use any functionally similar
    mechanism) repeatedly, you are presented with a list of all your matches
    in their context. It is then just a matter of navigating to the one(s)
    that looks more promising and just hit enter to open the corresponding
    doc page in another Vim sub-window.
    Info, in its tiny pieces of text linked to other tiny pieces of
    text form, does not lend itself to this and the browser it does
    offer on a terminal is arcane.
    That also happens with html docs, with the single page vs. chunked
    formats. I have been rather enraged myself when researching something or
    other and felt I'd hit the jackpot when I found the perfect document
    online, only to have to read through the whole thing anyway because only
    the chunked format was available, and short of downloading all the bits
    and pieces and somehow recreating the single page version, there was no
    way I could run a global search.

    My main criticism of the man format is that it does not provide both.

    Here's an example. Since I don't write bash scripts on a regular basis,
    I often have to refer to the bash documentation. If I use man, I can
    search for instance for “SHELL BUILTIN” alright, but the trouble is that
    there are about a dozen matches in this giant man page before I actually
    get to the “SHELL BUILTINS” section. The info format, on the other hand,
    provides an index of the builtins, where I quickly find precisely what
    I am looking for.

    Generally speaking, I find that man pages are fine for anything that's,
    well.. about one page and that I can display on one screen (that's 92
    lines on my display) but have major limitations for anything much
    longer.
    But see below (*).

    - the info pages end up as a scattering of tiny cross linked (if
    you're lucky) pieces with little information on one place/page.
    So you can't, for example, stand at the top of the doco page and
    search for a term.
    Not sure which particular info manual(s) you are referring to.

    There are also info documents that are nicely structured.. with a table
    of contents, an index, and sections of manageable proportions that
    provided you don't use the ridiculous “info” viewer, make on-screen
    reading a pleasure, especially when you have decided to read the manual
    cover to cover. GNU/screen is a good example. The gdb manual is another.

    Perhaps it's also a matter of who wrote the doc, how good he is at
    writing doc, and how much effort he put in designing and writing the
    doc. And tools that automate the conversion from man to info and back
    may also have something to do with this sorry state of affairs.
    Frankly, info is usually a step backward, speaking as a reader.
    I am also speaking as a reader and I find that both the man and the info
    format (and html as well, for that matter) have their merits, and it's
    a question of choosing the right format, depending on the circumstances
    and what you are trying to do.
    * I grew enraged at the prevalence of "GNU" unix tools with only info
    for doco, and no manual pages or manual pages that said "we don't put
    anything useful here, go read the info pages, the stuff here may not
    even be maintained" (I'm serious - see the bottom of a lot of the
    rather trite manual pages that ship with GNU this/that/the-other).
    Same here... Especially when adding insult to injury, your favorite
    distribution ships a man page that directs you to the info manual, but
    does not ship the info version due to licensing disagreements, and you
    have to download the info version from gnu.org, create your own debian
    package.. etc. etc. Depending on the particular info manual, this can be
    quite tricky, since the procedure is not well-documented :) and rather
    buggy.
    So enraged that I wrote a couple of tools called info2pod and
    info2man that read postcompiled info output (the
    binary-mixed-with-text stuff info files ship as, post install) and
    join it all up again into a single flat text output that _can_ be
    paged and searched. And a modified "man" command that can include
    info dirs in the $MANPATH and thus present info as a man page. It is
    a little ugly, but at least it clubs info into usability. Example:
    % man screen
    1: /usr/share/man/man1/screen.1.bz2
    2: /usr/share/info/screen.info-2.bz2
    3: /usr/share/info/screen.info-4.bz2
    4: /usr/share/info/screen.info-5.bz2
    5: /usr/share/info/screen.info-1.bz2
    6: /usr/share/info/screen.info-3.bz2
    7: /usr/share/info/screen.info.bz2
    which entry?

    Choosing (1) gets you "man screen" as usual, choosing (7) gets you the
    whole screen info stuff flattened and presented as a single page, where
    you can _search_ for what you want.

    URL: http://www.cskk.ezoshosting.com/cs/css/#key-doc
    Sounds more mature than my own messy “solutions” to this problem. :-)
    I'm not saying that help text is the be-all and end-all for
    documentation. I'm just saying that if you're going to do more than
    help text, it's hard to imagine putting any effort into producing man
    pages.
    Hard for you, maybe. As someone who consistently finds well written
    (terse yet complete) man pages _much_ more useful than many other
    supposed documentation, I find it hard to imagine lack of man pages as
    other than a failure.
    In an ideal doc world and where “program” is a non-trivial piece of
    software, I would like to be able to think of “program --help” as the
    condensed reference card, “man program” as the detailed reference card,
    and.. something like info, html, etc. as the user guide, and depending
    on what I am doing, the bases would be covered.
    There are exceptions of course. The python doco at python.org is
    pretty good. Wikipedia is often very good. But many wikis and other
    "rich and easy to author" systems are awful. Incomplete and badly
    fragmented. A lot of that can be laid to "documentation as an
    afterthought" mentality, but I also feel that having a manual page as
    a _single_ item contributes a lot to getting it all down.
    Apples and oranges.. In the same spirit as Westley Martínez nicely put
    it a few posts back in this thread, my personal experience has led me to
    regard wikis as just a tiny step up from having to google mailing
    list archives and.. many steps backward from man or info.
    Writing man pages in nroff is a bit tedious (though actually not all
    that hard). Generating man pages from POD or some other similarly
    friendly format is easy and desirable.
    Nothing as nice as man pages written from scratch, but I've had good
    results with “help2man”.
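
    help2man builds a man page from a program's --help and --version
    output, so a script that describes its options carefully gets most of
    the way there. A minimal sketch using argparse; the program name
    "frobnicate", its options, and the version number are all made up:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --help output doubles as a reference card."""
    parser = argparse.ArgumentParser(
        prog="frobnicate",  # hypothetical program name
        description="Frobnicate input files and write a summary.",
    )
    parser.add_argument("files", nargs="+", help="input files to process")
    parser.add_argument("-o", "--output", default="-",
                        help="output file (default: stdout)")
    # --version prints the version string and exits.
    parser.add_argument("--version", action="version", version="%(prog)s 0.1")
    return parser


# format_help() returns the same text that --help would print.
help_text = build_parser().format_help()
print(help_text)
```

    Assuming the script is installed as an executable named frobnicate,
    running "help2man ./frobnicate" would then use that output as its
    input.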

    cj
  • Grant Edwards at Feb 22, 2011 at 2:57 pm

    On 2011-02-20, Chris Jones wrote:
    On Sat, Feb 19, 2011 at 05:27:24PM EST, Cameron Simpson wrote:

    [..]
    And yet I (and others, based on stuff I've seen) find info to be a
    disaster. Why?

    - it forces the reader to use a non-standard pager to look
    at info, typically the utterly weird one that comes with the info
    command.
    On the rare occasions I used it, navigation was such an uphill battle
    that I often forgot what I was looking for in the first place.
    I've lost track of how many times I've tried to learn to use the Gnu
    "info" command and gave up in frustration. I've never seen a program
    with a more difficult to use UI. Breaking up the manuals into
    uselessly small chunks is just the icing on top of the mud cake.

    --
    Grant Edwards <grant.b.edwards at gmail.com>
    Yow! The entire CHINESE WOMEN'S VOLLEYBALL TEAM all share ONE
    personality -- and have since BIRTH!!
  • Anssi Saari at Feb 23, 2011 at 8:43 am

    Grant Edwards <invalid at invalid.invalid> writes:

    I've lost track of how many times I've tried to learn to use the Gnu
    "info" command and gave up in frustration. I've never seen a program
    with a more difficult to use UI.
    As I recall, there are other info viewers like tkinfo for example. But
    really, how hard is the basic info? Enter to go down a topic, u or l
    to go back, s for search? Seems simple enough for me.

    What's really annoying is when the man page for a program finishes
    with a boilerplate announcement that the info version of the
    documentation is more complete. And yet, all the info version is, is
    the same manual page, including the same boilerplate...
  • Ben Finney at Feb 23, 2011 at 12:08 pm

    Anssi Saari <as at sci.fi> writes:

    Grant Edwards <invalid at invalid.invalid> writes:
    I've lost track of how many times I've tried to learn to use the Gnu
    "info" command and gave up in frustration. I've never seen a program
    with a more difficult to use UI.
    As I recall, there are other info viewers like tkinfo for example. But
    really, how hard is the basic info? Enter to go down a topic, u or l
    to go back, s for search? Seems simple enough for me.
    Yes, and non-standard; it ignores the finger-memory of both Vim *and*
    Emacs users.
    What's really annoying is when the man page for a program finishes
    with a boilerplate announcement that the info version of the
    documentation is more complete.
    With the caveat “If the info and ls programs are properly installed at
    your site”, which they might not be; see below.
    And yet, all the info version is, is the same manual page, including
    the same boilerplate...
    That's what “info” does if there's not actually any info documentation
    installed for the entry, but there is a manual page.

    Some of the GNU documentation is only licensed under FDL, which is
    (despite its name) a license that doesn't meet the FSF's own definition
    of software freedom. For that reason, it tends not to be included in
    some major free-software operating systems.

    A damned shame, but a result of some members in the FSF trying to apply
    different freedoms to different purposes.

    --
    \ “I don't care to belong to a club that accepts people like me |
    `\ as members.” - Groucho Marx |
    _o__) |
    Ben Finney
