FAQ
I believe I have found a bug in libpq COPY handling with non-blocking
SSL connections. The bug manifests itself by dropping the connection
in PGputCopyData() with "server closed the connection unexpectedly"
message. The connection drop only occurs with nonblocking connections -
blocking connections work as expected.

I'll skip a lot of debugging details, but what I found out was that
the connection drop was caused by a SSL "bad write retry" from
SSL_write(). Which in turn appears to be caused by bad data passed to
SSL_write() retry. So, what happened was that at some point SSL_write()
returned SSL_ERROR_WANT_WRITE meaning that the write needs to be
retried. Now, according to OpenSSL manual the retry SSL_write needs to
be passed the same arguments as the previous call:
http://www.openssl.org/docs/ssl/SSL_write.html#WARNING

However the next SSL_write was passed another buffer with a different
length and it failed with SSL_ERROR_SSL "bad write retry". Which in
turn caused the pqSendSome to drop the connection and failed the
PQputCopyData. Actually, now that I think of it, this is probably not
only related to COPY handling ...

The connection drop can be reproduced by introducing some "network
problems" to the connection, so that the SSL_write needs to be retried.
This can be most easily accomplished by passing a huge buffer to
PQputCopyData. For instance on my laptop I can do about 3 PQputCopyData
with a 100MB buffer before the connection is dropped. Others results may
vary. Also, it doesn't seem to matter what the libpq version is
used - I initially started on 8.3, but later confirmed that the same
problem exists on 8.4, 9.0 and 9.1. Tested with OpenSSL 0.9.8g and
0.9.8o, no idea about other versions.

The following is a log excerpt from my test program (attached) with some
extra logging added to libpq:

PQputCopyData #0 buf:100000000 total:0
pqFlush()
pqFlush() = 0
pqPutMsgEnd()
pqSendSome(99999744)
SSL_write(0x7e59d60, 0x7ff5a7eac010, 99999744) = 99999744:
SSL_error: 0, errno: 0
pqPutMsgEnd() = 0
PQputCopyData #1 buf:100000000 total:100000000
pqPutMsgEnd()
pqSendSome(99999744)
SSL_write(0x7e59d60, 0x7ff5a7eac010, 99999744) = -1: SSL_error: 3,
errno: 11
pqPutMsgEnd() = 0
PQputCopyData #2 buf:100000000 total:200000000
pqFlush()
pqSendSome(100000266)
SSL_write(0x7e59d60, 0x7ff5a7eac010, 100000266) = -1: SSL_error:
3, errno: 11
pqFlush() = 1
pqPutMsgEnd()
pqSendSome(199999488)
SSL_write(0x7e59d60, 0x7ff597eab010, 199999488) = -1: SSL_error:
1, errno: 0
SSL error: bad write retry
pqPutMsgEnd() = -1
PQputCopyData() = -1: server closed the connection unexpectedly

In PQputCopyData #2 it is visible that the first SSL_write called from
pqFlush failed with SSL_ERROR_WANT_WRITE. The next SSL_write should
have been a retry with the same parameters, but instead was passed a
buffer with a different address and length. Hence the "bad write
retry". Some googling turned out similar issues for other projects
using SSL with non-blocking sockets.

The possible workarounds are to disable SSL or to disable non-blocking
libpq connections. Both are not always possible - security reasons, 3rd
party applications, drivers, etc. So I think this should be fixed in
libpq. Not sure exactly how though. It would seem that for the
PQputCopyData the best would be to return 0 to indicate that the
operation should be retried. No idea for the other possible cases of
SSL_write() retry though. What do you think?

regards,
Martin

Search Discussions

  • Robert Haas at Jun 12, 2011 at 1:22 am

    On Thu, Jun 9, 2011 at 10:39 AM, Martin Pihlak wrote:
    In PQputCopyData #2 it is visible that the first SSL_write called from
    pqFlush failed with SSL_ERROR_WANT_WRITE. The next SSL_write should
    have been a retry with the same parameters, but instead was passed a
    buffer with a different address and length. Hence the "bad write
    retry". Some googling turned out similar issues for other projects
    using SSL with non-blocking sockets.
    I'm not surprised. Setting the API so that it requires the same
    buffer (and not just a buffer pointing to the same data, or with the
    same initial contents) was maybe not the greatest design decision.
    The possible workarounds are to disable SSL or to disable non-blocking
    libpq connections. Both are not always possible - security reasons, 3rd
    party applications, drivers, etc. So I think this should be fixed in
    libpq. Not sure exactly how though. It would seem that for the
    PQputCopyData the best would be to return 0 to indicate that the
    operation should be retried. No idea for the other possible cases of
    SSL_write() retry though. What do you think?
    It seems to me we should try to paper over the problem within libpq
    rather than allowing it to percolate any higher up the call stack.
    One idea is that we could add outBuffer2/outBufSize2 to struct
    pg_conn, or something along those lines with less obviously stupid
    naming. Normally those would be unused, but in the special case where
    SSL indicates that we must retry the call with the same arguments, we
    set a flag that "freezes" the out buffer and forces any further data
    to be stuffed into outBuffer2. If or when the operation finally
    succeeds, we then move the data from outBuffer2 into outBuffer.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Martin Pihlak at Jun 15, 2011 at 7:20 pm

    On 06/12/2011 04:22 AM, Robert Haas wrote:
    One idea is that we could add outBuffer2/outBufSize2 to struct
    pg_conn, or something along those lines with less obviously stupid
    naming. Normally those would be unused, but in the special case where
    SSL indicates that we must retry the call with the same arguments, we
    set a flag that "freezes" the out buffer and forces any further data
    to be stuffed into outBuffer2. If or when the operation finally
    succeeds, we then move the data from outBuffer2 into outBuffer.
    Yes, that sounds like a good idea -- especially considering that COPY
    is not the only operation that can cause SSL_write retries.

    Attached is a first attempt at a patch to implement the described two
    buffer approach. This modifies pqSendSome so that whenever a SSL
    write retry is needed it saves the current outBuffer with its length
    and attempted write size to connection's sslRetry* variables. A new
    outBuffer is then allocated and used for any further data pushing.

    After the SSL write retry buffer is set up, any further calls to
    pqSendSome will first attempt to send the contents of the retry buffer,
    returning 1 to indicate that not all of the data could be sent. If the
    retry buffer is finally emptied it is freed and pqSendSome starts
    sending from the regular outBuffer.

    This is of course still "work in progress", needs cleaning up and
    definitely more testing. But at this point before going any further,
    I'd really appreciate a quick review from resident libpq gurus.

    regards,
    Martin
  • Steve Singer at Jun 24, 2011 at 9:14 pm

    On 11-06-15 03:20 PM, Martin Pihlak wrote:
    Yes, that sounds like a good idea -- especially considering that COPY
    is not the only operation that can cause SSL_write retries.


    This is of course still "work in progress", needs cleaning up and
    definitely more testing. But at this point before going any further,
    I'd really appreciate a quick review from resident libpq gurus.
    Martin,

    I'm not a libpq guru but I've given your patch a look over.

    The idea of storing the original buffer in new members of connection
    structure for later re-use seems like a good way to approach this.

    A few things I noticed (that you might be aware of since you mentioned
    it needs cleanup)

    -The patch doesn't compile with non-ssl builds, the debug at the bottom
    of PQSendSome isn't in an #ifdef

    -I don't think your handling the return code properly. Consider this case.

    pqSendSome(some data)
    sslRetryBuf = some Data
    return 1
    pqSendSome(more data)
    it sends all of 'some data'
    returns 0

    I think 1 should be returned because all of 'more data' still needs to
    be sent. I think returning a 0 will break PQsetnonblocking if you call
    it when there is data in both sslRetryBuf and outputBuffer.
    We might even want to try sending the data in outputBuffer after we've
    sent all the data sitting in sslRetryBuf.


    If you close the connection with an outstanding sslRetryBuf you need to
    free it.


    Other than running your test program I didn't do any testing with this
    patch.

    Steve
    regards,
    Martin

  • Robert Haas at Jun 27, 2011 at 2:25 pm

    On Fri, Jun 24, 2011 at 5:14 PM, Steve Singer wrote:
    A few things I noticed (that you might be aware of since you mentioned it
    needs cleanup)

    -The patch doesn't compile with non-ssl builds,  the debug at the bottom of
    PQSendSome isn't in an #ifdef

    -I don't think your handling the return code properly.   Consider this case.

    pqSendSome(some data)
    sslRetryBuf = some Data
    return 1
    pqSendSome(more data)
    it sends all of 'some data'
    returns 0

    I think 1 should be returned because all of 'more data' still needs to be
    sent.  I think returning a 0 will break PQsetnonblocking if you call it when
    there is data in both sslRetryBuf and outputBuffer.
    We might even want to try sending the data in outputBuffer after we've sent
    all the data sitting in sslRetryBuf.


    If you close the connection with an outstanding sslRetryBuf you need to free
    it.
    Based on these comments, I have updated the status of the patch to
    "Waiting on Author".

    https://commitfest.postgresql.org/action/patch_view?id=594

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Martin Pihlak at Jun 28, 2011 at 6:14 pm

    On 06/25/2011 12:14 AM, Steve Singer wrote:
    I'm not a libpq guru but I've given your patch a look over.
    Thanks for the review!

    I have since simplified the patch to assume that partial SSL writes are
    disabled -- according to SSL_write(3) this is the default behaviour.
    Now the SSL retry buffer only holds the data to be retried, the
    remainder is moved to the new outBuffer.
    -The patch doesn't compile with non-ssl builds, the debug at the bottom
    of PQSendSome isn't in an #ifdef
    Ack, there was another one in pqFlush. Fixed that.
    -I don't think your handling the return code properly. Consider this case.

    pqSendSome(some data)
    sslRetryBuf = some Data
    return 1
    pqSendSome(more data)
    it sends all of 'some data'
    returns 0

    I think 1 should be returned because all of 'more data' still needs to
    be sent. I think returning a 0 will break PQsetnonblocking if you call
    it when there is data in both sslRetryBuf and outputBuffer.
    Hmm, I thought I thought about that. There was a check in the original
    patch: "if (conn->sslRetryBytes || (conn->outCount - remaining) > 0)"
    So if the SSL retry buffer was emptied it would return 1 if there was
    something left in the regular output buffer. Or did I miss something
    there?
    We might even want to try sending the data in outputBuffer after we've
    sent all the data sitting in sslRetryBuf.
    IMHO that'd add unnecessary complexity to the pqSendSome. As this only
    happens in non-blocking mode the caller should be well prepared to
    handle the retry.
    If you close the connection with an outstanding sslRetryBuf you need to
    free it.
    Fixed.

    New version of the patch attached.

    regards,
    Martin
  • Steve Singer at Jun 30, 2011 at 1:25 pm

    On 11-06-28 02:14 PM, Martin Pihlak wrote:
    Hmm, I thought I thought about that. There was a check in the original
    patch: "if (conn->sslRetryBytes || (conn->outCount - remaining)> 0)"
    So if the SSL retry buffer was emptied it would return 1 if there was
    something left in the regular output buffer. Or did I miss something
    there?
    The issue I saw in the original patch was that at that point in the
    function, sslRetryBytes could be zero (if the data was sent) but
    conn->outCount - remaining would be an incorrect value since "remaining"
    is the number of bytes left to send from sslRetryBuf NOT
    conn->outBuffer. This is no longer an issue in the updated patch.
    I will try to take a closer look at your updated patch in the next few days.
  • Steve Singer at Jul 3, 2011 at 2:08 am

    On 11-06-28 02:14 PM, Martin Pihlak wrote:
    Thanks for the review!
    I have since simplified the patch to assume that partial SSL writes are
    disabled -- according to SSL_write(3) this is the default behaviour.
    Now the SSL retry buffer only holds the data to be retried, the
    remainder is moved to the new outBuffer.
    That sounds okay. Does it make sense to add in a check to verify that
    SSL didn't send a partial write? I don't know how bad openssl is about
    changing default behaviours or if we are concerned about protecting
    against someone changing the SSL parameters. My inclination is that
    this isn't needed but I'll raise the issue.
    Fixed.
    New version of the patch attached.
    Otherwise this version of the patch looks good to me.

    The only testing I have done is running the test program you sent
    earlier on in the thread and verified that the regression tests all
    pass. Other than something like your test program I'm not sure how else
    this bug can be induced.

    Since the original patch was submitted as a WIP patch and this version
    wasn't sent until well into the commit fest I am not sure if it
    qualifies for a committer during this commitfest or if it needs to wait
    until the next one.




    regards,
    Martin

  • Martin Pihlak at Jul 8, 2011 at 1:36 pm

    On 07/03/2011 05:08 AM, Steve Singer wrote:
    Since the original patch was submitted as a WIP patch and this version
    wasn't sent until well into the commit fest I am not sure if it
    qualifies for a committer during this commitfest or if it needs to wait
    until the next one.
    If possible I would like the fix to be backported as well. This is
    quite a nasty bug and difficult to track down. Especially as the
    actual SSL error messages are masked by the "server closed the
    connection unexpectedly" message.

    regards,
    Martin
  • Robert Haas at Jul 9, 2011 at 3:34 am

    On Fri, Jul 8, 2011 at 9:36 AM, Martin Pihlak wrote:
    On 07/03/2011 05:08 AM, Steve Singer wrote:
    Since the original patch was submitted as a WIP patch and this version
    wasn't sent until well into the commit fest I am not sure if it
    qualifies for a committer during this commitfest or if it needs to wait
    until the next one.
    If possible I would like the fix to be backported as well. This is
    quite a nasty bug and difficult to track down. Especially as the
    actual SSL error messages are masked by the "server closed the
    connection unexpectedly" message.
    I would not be inclined to back-patch this straight away. I agree
    it's an important fix, but I am a little worried about the chances of
    breaking something else... then again, I don't have the only vote
    here.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Tom Lane at Jul 15, 2011 at 9:46 pm

    Robert Haas writes:
    On Fri, Jul 8, 2011 at 9:36 AM, Martin Pihlak wrote:
    On 07/03/2011 05:08 AM, Steve Singer wrote:
    Since the original patch was submitted as a WIP patch and this version
    wasn't sent until well into the commit fest I am not sure if it
    qualifies for a committer during this commitfest or if it needs to wait
    until the next one.
    If possible I would like the fix to be backported as well. This is
    quite a nasty bug and difficult to track down. Especially as the
    actual SSL error messages are masked by the "server closed the
    connection unexpectedly" message.
    I would not be inclined to back-patch this straight away. I agree
    it's an important fix, but I am a little worried about the chances of
    breaking something else... then again, I don't have the only vote
    here.
    I reviewed this patch a bit. I agree that we have a bug here that
    should be fixed and backported, but I don't think this patch is ready.
    Some problems:

    1. The behavior under low-memory conditions is pretty intolerable.
    As coded, a malloc failure here:

    + conn->outBufSize = Max(16 * 1024, remaining);
    + conn->outBuffer = malloc(conn->outBufSize);
    + if (conn->outBuffer == NULL)
    + {
    + printfPQExpBuffer(&conn->errorMessage,
    + "cannot allocate memory for output buffer\n");
    + return -1;
    + }

    leaves libpq's internal state completely corrupt --- outBuffer is null,
    which will result in instant coredump on any future access, but that's
    just the tip of the iceberg because we've also lost buffered data and
    messed up the none-too-clearly-defined-anyway state of which pending
    data is in which buffer.

    In general, I don't think it's a smart idea to proceed by duplicating
    the buffer contents in this situation, particularly not if the most
    obvious way to cause the problem is to have a very large buffer :-(.
    I think the direction to move in ought to be to use the existing buffer
    as-is, and have pqCheckOutBufferSpace refuse to enlarge the buffer while
    we are in "write frozen" state. It should be OK to append data to the
    buffer, though, so long as we remember how much we're allowed to pass to
    SSL_write when we next try to write.

    2. According to the OpenSSL man pages, *both* SSL_read and SSL_write are
    defined to need to be re-called with the identical arguments after a
    WANT_READ or WANT_WRITE return. This means that pqReadData() is also at
    risk for this type of bug. I believe it could only happen in code paths
    where we call pqReadData when there is already some data in the buffer:
    if the caller then consumes some of that data, the next call to
    pqReadData would try to compact the buffer contents before making the
    pqsecure_read call. I think we probably need a "read frozen" state in
    which we won't enlarge or compact the inBuffer until SSL_read succeeds.

    3. As of 9.0, somebody's decided that the backend needs nonblock read
    semantics in some cases. I probably should have complained harder about
    that before it went in, but since it's in, the backend is at risk for
    this same type of issue in secure_read().

    I will mark the patch Returned With Feedback. I think that we need to
    address all these issues before we consider applying it. If we weren't
    hard up against the end of the CommitFest I might have a go at it
    myself, but I can't justify the time right now.

    regards, tom lane
  • Martin Pihlak at Jul 24, 2011 at 12:55 pm

    On 07/16/2011 12:46 AM, Tom Lane wrote:
    I think the direction to move in ought to be to use the existing buffer
    as-is, and have pqCheckOutBufferSpace refuse to enlarge the buffer while
    we are in "write frozen" state. It should be OK to append data to the
    buffer, though, so long as we remember how much we're allowed to pass to
    SSL_write when we next try to write.
    Alternative to freezing the outBuffer would be to set
    SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER mode during SSL initialisation.
    That would enable the buffer address to be changed in-between the
    SSL_write calls, so long as the content remains the same. Attached
    is a patch that uses the single buffer approach described by Tom, but
    with a moving SSL write buffer enabled.

    Modifying pqCheckOutBufferSpace is also an option, but it'd break some
    (arguably already broken) client applications that don't have proper
    retry handling. Notably some versions of psycopg2 have problems with
    handling zero return values from PQputCopyData. So ISTM that from
    backporting perspective the moving write buffer is a bit safer.

    I'll see if I can come up with a test case for the SSL_read retry before
    attempting to fix that.

    regards,
    Martin
  • Tom Lane at Jul 24, 2011 at 4:22 pm

    Martin Pihlak writes:
    On 07/16/2011 12:46 AM, Tom Lane wrote:
    I think the direction to move in ought to be to use the existing buffer
    as-is, and have pqCheckOutBufferSpace refuse to enlarge the buffer while
    we are in "write frozen" state. It should be OK to append data to the
    buffer, though, so long as we remember how much we're allowed to pass to
    SSL_write when we next try to write.
    Alternative to freezing the outBuffer would be to set
    SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER mode during SSL initialisation.
    Hmmm ... that is a really interesting flag. The documentation is
    inadequate, but I googled a bit and found this thread about it:
    http://www.mail-archive.com/openssl-users@openssl.org/msg45440.html
    in which it's stated that (1) the prohibition on moving the buffer is
    actually just a sort of sanity check, and can be turned off using this
    flag; (2) it is okay to increase, but not reduce, the length parameter
    to SSL_write in the next call after a WANT_READ/WANT_WRITE return.

    So this does look like it'd fix the issue for SSL_write, without needing
    to introduce a concept of a "write frozen" buffer state. I am wondering
    though how far back support for this flag exists in OpenSSL, and whether
    the situation is the same for SSL_read or not. I don't see any
    SSL_MODE_ACCEPT_MOVING_READ_BUFFER macro ...

    regards, tom lane
  • Tom Lane at Jul 24, 2011 at 4:45 pm

    I wrote:
    So this does look like it'd fix the issue for SSL_write, without needing
    to introduce a concept of a "write frozen" buffer state. I am wondering
    though how far back support for this flag exists in OpenSSL,
    A bit of archaeology reveals that the flag was introduced in OpenSSL
    0.9.4, released in 1999. So it's probably Old Enough. (BTW, the 0.9.4
    changelog credits this change to one Bodo Moeller ... so the comments
    from him in the other thread I linked to can be considered authoritative
    ...)

    Still wondering about the SSL_read end of it, though.

    regards, tom lane
  • Tom Lane at Jul 24, 2011 at 5:10 pm

    I wrote:
    Still wondering about the SSL_read end of it, though.
    And on that front, some digging around in the OpenSSL source code
    indicates that they do all their work in internal buffers, and transfer
    data into SSL_read's result buffer only when ready to return it.
    So the claim in the documentation that SSL_read has a restriction
    comparable to SSL_write is a lie: there is no case where they'll copy
    some data into the buffer and then return -1.

    So the SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER solution looks like a good
    fix. I'll see about applying it.

    regards, tom lane
  • Tom Lane at Jul 24, 2011 at 8:33 pm

    Martin Pihlak writes:
    If possible I would like the fix to be backported as well. This is
    quite a nasty bug and difficult to track down. Especially as the
    actual SSL error messages are masked by the "server closed the
    connection unexpectedly" message.
    I've applied the simplified fix (just set SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER)
    as well as a patch to improve the error reporting situation.

    regards, tom lane
  • Peter Geoghegan at Jul 24, 2011 at 11:50 pm

    On 24 July 2011 21:33, Tom Lane wrote:
    I've applied the simplified fix (just set SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER)
    as well as a patch to improve the error reporting situation.
    I'm not exactly sure why, and don't have time to investigate right
    now, but this commit (fee476da952a1f02f7ccf6e233fb4824c2bf6af4)
    appears to have broken the build for me:

    gcc -O0 -g -Wall -Wmissing-prototypes -Wpointer-arith
    -Wdeclaration-after-statement -Wendif-labels -Wformat-security
    -fno-strict-aliasing -fwrapv -g -pthread -D_REENTRANT -D_THREAD_SAFE
    -D_POSIX_PTHREAD_SEMANTICS -fpic -DFRONTEND -DUNSAFE_STAT_OK -I.
    -I../../../src/include -D_GNU_SOURCE -I../../../src/port
    -I../../../src/port -DSO_MAJOR_VERSION=5 -c -o fe-misc.o fe-misc.c
    -MMD -MP -MF .deps/fe-misc.Po
    fe-misc.c: In function ‘pqReadData’:
    fe-misc.c:651:11: error: ‘PGconn’ has no member named ‘ssl’
    fe-misc.c:743:11: error: ‘PGconn’ has no member named ‘ssl’
    fe-misc.c:761:10: error: ‘PGconn’ has no member named ‘ssl’
    fe-misc.c: In function ‘pqSendSome’:
    fe-misc.c:841:14: error: ‘PGconn’ has no member named ‘ssl’
    fe-misc.c:861:14: error: ‘PGconn’ has no member named ‘ssl’

    The problem goes away if I check out the commit made immediately prior
    to this one - d0c23026b2499ba9d6797359241ade076a5a677d. I'm building
    with my usual, rather generic settings for hacking.

    --
    Peter Geoghegan       http://www.2ndQuadrant.com/
    PostgreSQL Development, 24x7 Support, Training and Services
  • Andrew Dunstan at Jul 25, 2011 at 12:55 am

    On 07/24/2011 07:49 PM, Peter Geoghegan wrote:
    On 24 July 2011 21:33, Tom Lanewrote:
    I've applied the simplified fix (just set SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER)
    as well as a patch to improve the error reporting situation.
    I'm not exactly sure why, and don't have time to investigate right
    now, but this commit (fee476da952a1f02f7ccf6e233fb4824c2bf6af4)
    appears to have broken the build for me:
    Yeah, looks like it fails unless --with-openssl is set.

    cheers

    andrew
  • Martin Pihlak at Jul 25, 2011 at 7:33 am

    On 07/24/2011 11:33 PM, Tom Lane wrote:
    I've applied the simplified fix (just set SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER)
    as well as a patch to improve the error reporting situation.
    Cool that this turned out to be a one-line fix. Thanks!

    regards,
    Martin

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedJun 9, '11 at 2:39p
activeJul 25, '11 at 7:33a
posts19
users6
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase