
[CouchDB-user] Appending "\n" and pretty printing output from the command line using curl (with a smidge of python)

Jason Huggins
Aug 20, 2008 at 6:05 pm
Hello all!

Briefly, introducing myself... and please forgive the blatant
self-promotion -- I'm a recent former Googler, now working on my own
startup in Chicago (with some others still in San Francisco). Django
and CouchDB weigh heavily in our server setup and future API... I'm
*loving* CouchDB... And I've pretty much devoured every online
presentation or tech talk video that exists on CouchDB so far. :-)
Keep it up!

With that out of the way.... I just wanted to send along a quick tip
for folks using curl. If this is useful, maybe some version of this
should probably graduate to a "GettingStartedWithCurl" wiki page:

I'm frequently annoyed that curl doesn't end the data stream with a
"\n" at the end... so a curl request usually looks like this:

----------------------------
ubuntu: ~ hugs$ curl http://mycouchdb
{"couchdb":"Welcome","version":"0.8.0-incubating"}ubuntu: ~ hugs$
----------------------------

I was thinking about creating a patch to add a "\n" at the end of
every request, but I figured that would be a request of last resort.

The stupid quick solution is to append an empty "echo" command like so:
--------------------------------------------------------
ubuntu: ~ hugs$ curl http://mycouchdb ; echo ''
{"couchdb":"Welcome","version":"0.8.0-incubating"}
ubuntu: ~ hugs$
--------------------------------------------------------

I could have stopped there... but the urge to bikeshed this further
was just too great... So, I whipped up a quick Python script that I
now pipe the output through for "post-processing". I remembered that the
GettingStartedWithPython wiki page simply does a pretty print of the
content... So I heavily streamlined that script to make it play nicer
with curl. So this is the result now:

--------------------------------------------------------
ubuntu: ~ hugs$ curl -s http://mycouchdb | ./pprint-json
{
    "couchdb": "Welcome",
    "version": "0.8.0-incubating"
}
ubuntu: ~ hugs$
--------------------------------------------------------


.... Ah... much better. :-)


Here's the script:
--------------------------------------------------------
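#! /usr/bin/python

import sys
import simplejson

# Read the whole response body from stdin (piped in from curl).
raw_text = sys.stdin.read()
json_data = simplejson.loads(raw_text)

# Make pretty! (The parenthesized print works on Python 2 and 3.)
print(simplejson.dumps(json_data, sort_keys=True, indent=4))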
--------------------------------------------------------

To get this to work correctly, I also had to add the silent "-s" flag
to curl so that it didn't print a progress meter with the results as
well. (Try it without "-s" to see what I mean.)
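
For anyone without simplejson installed, roughly the same filter can be sketched with the json module from the standard library. This is a sketch under that assumption rather than the script above; the function name pretty_print_json and the decision to print nothing for an empty body are illustrative choices.

```python
#!/usr/bin/env python3
"""Pretty-print JSON read from stdin, e.g. `curl -s URL | ./pprint-json`."""
import json
import sys


def pretty_print_json(text):
    """Return `text` re-serialized with sorted keys, a 4-space indent, and a trailing newline."""
    data = json.loads(text)
    return json.dumps(data, sort_keys=True, indent=4) + "\n"


if __name__ == "__main__":
    raw = sys.stdin.read()
    # An empty body (for example, when curl could not connect) produces no output.
    if raw.strip():
        sys.stdout.write(pretty_print_json(raw))
```

It is used the same way as the original: make the file executable and pipe "curl -s" into it.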

I would love it if there were a "pretty print all output" configuration
option for Couch... but until then, I'll just use my script. :-)

Cheers and thanks for all the JSON!

- Jason Huggins

20 responses

  • Noah Slater at Aug 21, 2008 at 1:13 am

    On Wed, Aug 20, 2008 at 01:04:49PM -0500, Jason Huggins wrote:
    > I was thinking about creating a patch to add a "\n" at the end of
    > every request, but I figured that would be a request of last resort.

    I did this very patch about 6 months ago but it was rejected and I can't quite
    remember why. Now that we're at the ASF I would like to call an informal vote.

    Fellow committers, what say ye to adding a trailing \n to JSON output?

    Best,
  • Ed Finkler at Aug 21, 2008 at 1:21 am
    I am in no way a committer, but I would really love an option to get
    *formatted* output similar to what JSONlint.com spits out. It's so
    much easier to parse with the eyes.

    --
    Ed Finkler
    http://funkatron.com
    AIM: funka7ron
    ICQ: 3922133
    Skype: funka7ron

  • Chris Anderson at Aug 21, 2008 at 1:51 am

    On Wed, Aug 20, 2008 at 6:20 PM, Ed Finkler wrote:
    > I am in no way a committer, but I would really love an option to get
    > *formatted* output similar to what JSONlint.com spits out. It's so
    > much easier to parse with the eyes.

    I'd like to have pretty-print JSON also. It might be an easy flag to
    add to the json:encode function.



    --
    Chris Anderson
    http://jchris.mfdz.com
  • Noah Slater at Aug 21, 2008 at 2:05 am

    On Wed, Aug 20, 2008 at 06:50:36PM -0700, Chris Anderson wrote:
    > I'd like to have pretty-print JSON also. It might be an easy flag to
    > add to the json:encode function.

    Please note there are two separate issues here:

    * Ending the output with a newline character, which according to POSIX is the
    required way to signal an End of File (for non-binary formats) and should be
    the default

    * Pretty printing the data, which is nice but should not be the default
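
The two issues can be kept apart in code as well. A minimal Python sketch, where the helper names ensure_trailing_newline and pretty are made up for illustration and nothing here is CouchDB's own code:

```python
import json


def ensure_trailing_newline(body):
    """Issue 1: terminate the text with a newline if it does not already end with one."""
    return body if body.endswith("\n") else body + "\n"


def pretty(body):
    """Issue 2: re-indent the JSON for human readers (optional, changes the layout)."""
    return json.dumps(json.loads(body), sort_keys=True, indent=4)


compact = '{"couchdb":"Welcome","version":"0.8.0-incubating"}'

# Newline termination alone leaves the JSON text itself untouched.
terminated = ensure_trailing_newline(compact)

# Pretty printing is a separate, opt-in step that can be layered on top.
readable = ensure_trailing_newline(pretty(compact))
```

Only the first helper would ever need to be a default; the second changes the layout and is the part that can stay optional.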

    Best,
  • Ed Finkler at Aug 21, 2008 at 2:13 am
    Agreed on both.

    --
    Ed Finkler
    http://funkatron.com
    AIM: funka7ron
    ICQ: 3922133
    Skype: funka7ron

  • Jason Huggins at Aug 22, 2008 at 5:25 am

    On Wed, Aug 20, 2008 at 9:04 PM, Noah Slater wrote:
    > Please note there are two separate issues here:
    >
    > * Ending the output with a newline character, which according to POSIX is the
    > required way to signal an End of File (for non-binary formats) and should be
    > the default

    Hi, Noah... out of curiosity (and to further help the cause), is there
    an online reference for that rule? Perhaps I've read too many
    "citation needed" comments in Wikipedia to let this one go. :-) (I did
    some searching, but couldn't find a definitive, explicit mention.)

    Also, is your patch for this available online, too? I suspect it
    hasn't made the switch into Jira (I couldn't find it there.) If not,
    this might be the motivation I need to start reading some Erlang code
    and try to figure out what to edit on my own. :-)

    > * Pretty printing the data, which is nice but should not be the default

    I agree. :-) I'm more than fine letting pretty printing be a client
    issue. It was a small annoyance not having it in there, but the
    work-around solution is easy enough with Python, Ruby, or Bash... but
    I figured I'd post anyway... making the easy solution
    search-engine-findable now. :-)

    Lastly, not everything is "unpretty" by default right now. For
    example, when viewing "/_all_docs", the current one-doc-per-line
    printing *is* the pretty solution in that case. :-)

    - Jason
  • Noah Slater at Aug 22, 2008 at 11:45 am

    On Fri, Aug 22, 2008 at 12:24:53AM -0500, Jason Huggins wrote:
    > Hi, Noah... out of curiosity (and to further help the cause), is there an
    > online reference for that rule? Perhaps I've read too many "citation needed"
    > comments in Wikipedia to let this one go. :-) (I did some searching, but
    > couldn't find a definitive, explicit mention.)

    3.392 Text File

    A file that contains characters organized into one or more lines. The lines do
    not contain NUL characters and none can exceed {LINE_MAX} bytes in length,
    including the <newline>.

    - http://www.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap03.html

    3.205 Line

    A sequence of zero or more non- <newline>s plus a terminating <newline>.

    - http://www.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap03.html

    Canonical Mode Input Processing

    In canonical mode input processing, terminal input is processed in units of
    lines. A line is delimited by a newline character (NL), an end-of-file character
    (EOF), or an end-of-line (EOL) character. See Special Characters for more
    information on EOF and EOL. This means that a read request will not return until
    an entire line has been typed or a signal has been received. Also, no matter how
    many bytes are requested in the read() call, at most one line will be
    returned. It is not, however, necessary to read a whole line at once; any number
    of bytes, even one, may be requested in a read() without losing information.

    - http://www.opengroup.org/onlinepubs/007908775/xbd/termios.html
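
Read together, the Text File and Line definitions quoted above suggest a simple check. A rough Python sketch, assuming the hypothetical helper name looks_like_posix_text and ignoring the {LINE_MAX} limit for brevity:

```python
def looks_like_posix_text(data: bytes) -> bool:
    """Simplified check: one or more lines, each ending in <newline>, and no NUL bytes.

    The {LINE_MAX} length limit from the definition is deliberately not checked.
    """
    if b"\x00" in data:
        return False
    # If the data ends with a newline, every line in it (including the last) is terminated.
    return data.endswith(b"\n")


response_body = b'{"couchdb":"Welcome","version":"0.8.0-incubating"}'
unterminated_ok = looks_like_posix_text(response_body)
terminated_ok = looks_like_posix_text(response_body + b"\n")
```

Under this reading, the response body as currently emitted fails the check and the same body with a trailing newline passes.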

    Many applications choke when processing text files if the last line is not
    properly terminated, as your experience with the shell demonstrates.

    > Also, is your patch for this available online, too? I suspect it hasn't made
    > the switch into Jira (I couldn't find it there.) If not, this might be the
    > motivation I need to start reading some Erlang code and try to figure out what
    > to edit on my own. :-)

    Nope, sorry, it's been lost to the sands of time.

    The patch should be quite simple though, even for a non-Erlang programmer.
  • Jan Lehnardt at Aug 22, 2008 at 3:26 pm
    Hi, thanks for the research.

    On Aug 22, 2008, at 13:42, Noah Slater wrote:
    > Many applications choke when processing text files if the last line is not
    > properly terminated, as your experience with the shell demonstrates.

    Thing is that CouchDB is talking HTTP to curl (or any other HTTP client)
    and not through POSIX interfaces with the OS and other programs. Curl
    (or your favourite HTTP client) is doing that. So maybe curl should add a
    newline there to be POSIX compliant. But I can see curl saying "I don't
    touch the payload" for a good reason, too.

    I don't really know what's practical now, but I don't think your argumentation
    is tight. Maybe we can bring that before the resident ASF HTTP experts?

    Cheers
    Jan
    --
  • Damien Katz at Aug 22, 2008 at 3:35 pm
    I'm leery about having the server emit newlines; when payloads are
    signed, little things like that can cause headaches.

    My preference is a command line script or client that wraps curl and
    pretty prints json and emits newlines when appropriate.
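
A wrapper along those lines might look like the following Python sketch. It is an illustration only: the function names tidy and main are invented here, it assumes curl is on the PATH, and it assumes text responses (binary attachments would need to bypass it).

```python
#!/usr/bin/env python3
"""Run curl, then pretty-print JSON output and make sure it ends with a newline."""
import json
import subprocess
import sys


def tidy(body):
    """Pretty-print `body` if it parses as JSON; otherwise just newline-terminate it."""
    try:
        text = json.dumps(json.loads(body), sort_keys=True, indent=4)
    except ValueError:
        text = body  # not JSON (or empty): leave the payload alone
    if text and not text.endswith("\n"):
        text += "\n"
    return text


def main(curl_args):
    # -s silences curl's progress meter; all other arguments are forwarded unchanged.
    result = subprocess.run(["curl", "-s"] + list(curl_args),
                            stdout=subprocess.PIPE, universal_newlines=True)
    sys.stdout.write(tidy(result.stdout))
    return result.returncode


# Only call curl when arguments were actually given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Saved under a name of your choosing and made executable, it takes the same arguments as curl.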
  • Noah Slater at Aug 22, 2008 at 4:03 pm

    On Fri, Aug 22, 2008 at 11:34:19AM -0400, Damien Katz wrote:
    > I'm leery about having the server emit newlines, when payloads are signed
    > little things like that can cause headaches.

    What do you mean by "signed" in this context?

    This proposal could go two ways:

    * CouchDB consistently omits the trailing newline from text files
    * CouchDB consistently adds the trailing newline to text files

    Where "file" is a reference to the HTTP response entity-body with a media type
    of application/json, which is in turn a text based format.

    Assuming you're talking about cryptographic signing, neither of these scenarios
    affects that because the output format will be entirely consistent.
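
To make the signing point concrete, here is a small Python sketch that uses a plain SHA-256 digest as a stand-in for a real signature (that substitution, and the variable names, are assumptions for illustration):

```python
import hashlib

body = b'{"couchdb":"Welcome","version":"0.8.0-incubating"}'


def digest(payload):
    """Hex SHA-256 of the exact bytes sent on the wire."""
    return hashlib.sha256(payload).hexdigest()


# The two formats hash differently, so a server that mixed them would break verification...
without_newline = digest(body)
with_newline = digest(body + b"\n")
formats_differ = without_newline != with_newline

# ...but either format is stable as long as it is applied consistently.
stable = digest(body + b"\n") == with_newline
```

The digest changes only if the format changes between signing and verification, which is why consistency matters more than which of the two formats is chosen.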

    As I mentioned in my last email, I have outlined some concrete upsides of the
    trailing newline but no one has come forward with a concrete downside.

    Best,
  • Chris Anderson at Aug 22, 2008 at 5:12 pm

    On Fri, Aug 22, 2008 at 11:34:19AM -0400, Damien Katz wrote:
    > I'm leery about having the server emit newlines, when payloads are signed
    > little things like that can cause headaches.

    I'm not absolutely certain of the merits in this case, but as long as
    we're consistent I don't see the harm in adding newlines to JSON
    output. There's nothing in the JSON RFC that says we can't.

    I've added a ticket and a patch for when we decide which way to go:
    https://issues.apache.org/jira/browse/COUCHDB-107

    --
    Chris Anderson
    http://jchris.mfdz.com
  • Noah Slater at Aug 22, 2008 at 3:58 pm

    On Fri, Aug 22, 2008 at 04:36:08PM +0200, Jan Lehnardt wrote:
    > Thing is that CouchDB is talking HTTP to curl (or any other HTTP client) and
    > not through POSIX interfaces with the OS and other programs.

    This point is moot as curl is an application designed for a POSIX environment,
    therefore its IO handling is designed for POSIX standards, therefore it doesn't
    matter where the input is coming from, only how the client expects to be able to
    handle it.

    But in any case, using curl like this is misdirection. We haven't built CouchDB
    to be user agent specific and there are many more POSIX aware clients out there.

    > So maybe curl should add a newline there to be POSIX compliant.

    This would be awful and certainly not curl's mandate.

    Principle of least surprise, etc.

    > But I can see curl saying "I don't touch the payload" for a good reason, too.

    Right.

    > I don't really know what's practical now, but I don't think your argumentation
    > is tight.

    Can you point out any flaws in my argument other than the above, which I think I
    have refuted reasonably well?

    The summary of my argument is as follows:

    * POSIX defines text files as being terminated by newline characters
    * POSIX aware software, including a very large subset of potential CouchDB
    user agents, expects that text files are terminated by newline characters
    * CouchDB does not currently terminate text files with newline characters and
    this has already proved troublesome in a minor fashion but could easily cause
    larger problems for unknown use cases.
    * Patching CouchDB to terminate text files with newline characters does not
    have any negative consequences pointed out so far.

    Even if this was some hazy idea of best practice and not a standardised
    procedure, we're dealing with a situation that only has upsides and no known
    down sides. Combine that with the triviality of the patch involved and I really
    can't see the case for rejecting this change.

    > Maybe we can bring that before the resident ASF HTTP experts?

    Sounds reasonable.
  • Chris Anderson at Aug 21, 2008 at 11:44 pm

    On Wed, Aug 20, 2008 at 6:50 PM, Chris Anderson wrote:
    > I'd like to have pretty-print JSON also. It might be an easy flag to
    > add to the json:encode function.

    Ah but the others are right that this belongs in the client. I don't
    see any harm in appending a newline to json output, though.



    --
    Chris Anderson
    http://jchris.mfdz.com
  • Christopher Lenz at Aug 21, 2008 at 10:18 pm

    On 21.08.2008, at 03:10, Noah Slater wrote:
    > On Wed, Aug 20, 2008 at 01:04:49PM -0500, Jason Huggins wrote:
    >> I was thinking about creating a patch to add a "\n" at the end of
    >> every request, but I figured that would be a request of last resort.
    >
    > I did this very patch about 6 months ago but it was rejected and I can't quite
    > remember why. Now that we're at the ASF I would like to call an informal vote.
    >
    > Fellow committers, what say ye to adding a trailing \n to JSON output?

    I think when you first brought this up, the patch would add a trailing
    newline to all responses, which would potentially corrupt binary
    responses such as for attachments.

    If this is done for JSON responses only, I don't see any problems with
    adding a trailing newline.
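
The JSON-only rule can be sketched in a few lines. This is Python for illustration rather than the Erlang an actual patch would use, and finalize_body is a hypothetical helper, not a CouchDB function:

```python
def finalize_body(body: bytes, content_type: str) -> bytes:
    """Append a newline to JSON bodies only; return every other body byte-for-byte."""
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json" and not body.endswith(b"\n"):
        return body + b"\n"
    return body


json_out = finalize_body(b'{"ok":true}', "application/json")
png_out = finalize_body(b"\x89PNG\r\n", "image/png")
```

Keying the decision on the media type is what keeps attachments and other binary responses untouched.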

    [Not in direct response to this mail, but rather others on this
    thread:] Pretty-printing OTOH... I don't think it's worth it. This
    stuff isn't really intended for direct consumption by humans, and if
    you need prettified output, just post-process it with one of the
    bazillion available libs. Not something CouchDB should concern itself
    with.

    Cheers,
    --
    Christopher Lenz
    cmlenz at gmx.de
    http://www.cmlenz.net/
  • Jan Lehnardt at Aug 21, 2008 at 10:30 pm

    On Aug 22, 2008, at 00:17, Christopher Lenz wrote:
    > On 21.08.2008, at 03:10, Noah Slater wrote:
    >> On Wed, Aug 20, 2008 at 01:04:49PM -0500, Jason Huggins wrote:
    >>> I was thinking about creating a patch to add a "\n" at the end of
    >>> every request, but I figured that would be a request of last resort.
    >>
    >> I did this very patch about 6 months ago but it was rejected and I can't quite
    >> remember why. Now that we're at the ASF I would like to call an informal vote.
    >>
    >> Fellow committers, what say ye to adding a trailing \n to JSON output?
    >
    > I think when you first brought this up, the patch would add a
    > trailing newline to all responses, which would potentially corrupt
    > binary responses such as for attachments.
    >
    > If this is done for JSON responses only, I don't see any problems
    > with adding a trailing newline.
    >
    > [Not in direct response to this mail, but rather others on this
    > thread:] Pretty-printing OTOH... I don't think it's worth it. This
    > stuff isn't really intended for direct consumption by humans, and if
    > you need prettified output, just post-process it with one of the
    > bazillion available libs. Not something CouchDB should concern
    > itself with.

    I see the benefits, but:

    jan@macnolia ~> curl -X GET http://localhost:5984/
    {"couchdb":"Welcome","version":"0.8.0-incubating"}jan@macnolia ~>
    jan@macnolia ~>
    jan@macnolia ~> ./ccurl.sh -X GET http://localhost:5984/
    {"couchdb":"Welcome","version":"0.8.0-incubating"}
    jan@macnolia ~> cat ccurl.sh
    #!/bin/sh
    curl "$@"
    echo ""jan@macnolia ~>

    So this should be done on the client, in my opinion. Nothing stops
    us from shipping a newline-adding and pretty-printing CouchDB client,
    but I don't think that should be in the server.

    Cheers
    Jan
    --
  • Noah Slater at Aug 21, 2008 at 11:01 pm
    > Pretty-printing OTOH... I don't think it's worth it.

    +1

    On Fri, Aug 22, 2008 at 12:29:46AM +0200, Jan Lehnardt wrote:
    > So this should be done on the client, in my opinion. Nothing stops
    > us from shipping a newline-adding and pretty-printing CouchDB client,
    > but I don't think that should be in the server.

    -1

    Pretty printing can be separate, but we really should be appending a newline to
    JSON responses. If nothing else it makes the output files POSIXly correct.
  • Jan Lehnardt at Aug 22, 2008 at 7:49 am

    On Aug 22, 2008, at 00:58, Noah Slater wrote:
    >> Pretty-printing OTOH... I don't think it's worth it.
    >
    > +1
    >
    > On Fri, Aug 22, 2008 at 12:29:46AM +0200, Jan Lehnardt wrote:
    >> So this should be done on the client, in my opinion. Nothing stops
    >> us from shipping a newline-adding and pretty-printing CouchDB client,
    >> but I don't think that should be in the server.
    >
    > -1
    >
    > Pretty printing can be separate, but we really should be appending a newline to
    > JSON responses. If nothing else it makes the output files POSIXly correct.

    If anything, we should be talking HTTP by the spec. RFC 2616 nowhere states
    that the message or the body part of a request should end in a newline. Adding
    a newline would effectively alter content which in turn potentially breaks clients.

    We could try and guess if a "human user-agent" is making the request but I
    doubt that we can make that reliable.

    We could further add a ?make_json_pretty=true parameter but I think that
    defeats the purpose.

    Cheers
    Jan
    --
  • Noah Slater at Aug 22, 2008 at 11:52 am

    On Fri, Aug 22, 2008 at 09:42:08AM +0200, Jan Lehnardt wrote:
    > If anything, we should be talking HTTP by the spec. RFC 2616 nowhere states
    > that the message or the body part of a request should end in a newline.

    This is incorrect. RFC 2616 explicitly states that the entity-body of an HTTP
    response should be delimited according to the rules of its media type.

    2.2 Basic Rules

    ...

    HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all protocol
    elements except the entity-body (see appendix 19.3 for tolerant
    applications). The end-of-line marker within an entity-body is defined by its
    associated media type, as described in section 3.7.

    Please see my previous email for the exact details of the POSIX specification
    for text mode files and canonical IO processing.

    UNIX applications WILL choke on files that do not end in a proper delimiter.

    > Adding a newline would effectively alter content which in turn potentially
    > breaks clients.

    This is a non sequitur. We are only proposing to add a newline character to JSON
    media types, and so this will have absolutely no harmful effect on clients.

    > We could try and guess if a "human user-agent" is making the request but I
    > doubt that we can make that reliable.

    Perhaps you are conflating the issues here. It's not about prettifying the
    output for humans, it's about correcting the text file format per the standards
    so that the regular UNIX toolchain can process it without problems.

    Best,
  • Jan Lehnardt at Aug 22, 2008 at 2:43 pm
    On Aug 22, 2008, at 13:52, Noah Slater wrote:
    >> We could try and guess if a "human user-agent" is making the request but I
    >> doubt that we can make that reliable.
    >
    > Perhaps you are conflating the issues here. It's not about prettifying the
    > output for humans, it's about correcting the text file format per the standards
    > so that the regular UNIX toolchain can process it without problems.

    I am aware that we are discussing multiple things here. I want CouchDB's
    output to be as correct as possible, even if my statements here are incorrect.

    See my next mail.

    Cheers
    Jan
    --
  • Niket Patel at Aug 21, 2008 at 5:29 am
    Thanks for the idea; this can be helpful sometimes.

    Here is the script in Ruby:

    ---------------------------------------------

    #!/usr/bin/env ruby

    require 'json'
    require 'shellwords'

    # Forward every command-line argument to curl; -s hides the progress meter.
    json = `curl -s #{ARGV.shelljoin}`
    puts JSON.pretty_generate(JSON.parse(json))

    ---------------------------------------------

    Two additions:

    * It doesn't require piping from curl; it forwards its arguments to curl instead.
    * The -s option is supplied by default.

    The script above can be used as a command.

    Thanks
    Niket Patel
