FAQ
I attempt to execute a "COPY test FROM STDIN" using PQsendQuery.
Subsequently, after the appropriate poll()/PQconsumeInput/PQisBusy
calls, PQgetResult returns tens of millions of (invalid?) non-null
PGresults. This behavior seems incorrect, and sometimes causes my
application to exhaust memory and crash.

The postgres version is 8.1.3.

I can reproduce the problem in about 50 lines of C. I include below
(1) the code, (2) a psql dump of the table in question, (3) the code's
output.

I'd appreciate any insight or suggestions you may have.

Thanks,
max poletto

======================================================================
(1) code

// compile with: g++ -Wall -O -o pgtest pgtest.cc -lpq

#include <cerrno>
#include <cstdio>
#include <ctime>
#include <vector>
using namespace std;
#include <libpq-fe.h>
#include <sys/poll.h>

void retrieve(PGconn *conn, time_t timeout)
{
vector<PGresult *> res;
while (1) {
int r;
do {
struct pollfd pfds[1];
pfds[0].fd = PQsocket(conn);
pfds[0].events = POLLIN;
pfds[0].revents = 0;
r = poll(pfds, sizeof(pfds) / sizeof(struct pollfd), 1000 *
timeout);
} while (r < 0 && errno == EINTR);
if (r <= 0 || !PQconsumeInput(conn))
return;
int i = 0;
PGresult *oldr = 0;
while (!PQisBusy(conn)) {
PGresult *r = PQgetResult(conn);
if (r) {
res.push_back(r);
if (++i % 5000000 == 0) { printf("%d results\n", i); }
if (r == oldr) { printf("r==oldr (%p)\n", r); }
oldr = r;
} else {
printf("PQgetResult return 0 after %d results\n", i);
return;
}
}
}
}

int main()
{
PGconn *conn = PQconnectdb("dbname=testdb user=postgres");
if (!conn)
return -1;
if (PQsendQuery(conn, "COPY test FROM STDIN") > 0) {
retrieve(conn, 20);
if (PQputCopyEnd(conn, 0) < 0)
return -1;
}
PQfinish(conn);
return 0;
}


======================================================================
(2) psql session

root@tm01-5% psql testdb postgres
Welcome to psql 8.1.3, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

testdb=# \d test
Table "public.test"
Column | Type | Modifiers
--------+---------+-----------
a | integer |
b | integer |
c | integer |

testdb=# select * from test;
a | b | c
---+---+---
1 | 2 | 3
(1 row)

testdb=#


======================================================================
(3) output

root@tm01-5% ./pgtest
5000000 results
10000000 results
15000000 results
20000000 results
25000000 results
PQgetResult return 0 after 25649299 results
4.640u 4.696s 0:09.34 99.8% 0+0k 0+0io 0pf+0w

In this toy example, the process VM size exceeds 2GB before PQgetResult
finally returns 0. The real application, which has already allocated
~1GB prior to the query, runs out of memory (3GB limit on 32-bit linux)
before PQgetResult ever returns 0.

Search Discussions

  • Volkan YAZICI at May 21, 2006 at 9:49 pm

    On May 19 11:51, max.poletto@gmail.com wrote:
    if (PQsendQuery(conn, "COPY test FROM STDIN") > 0) {
    retrieve(conn, 20);
    Shouldn't you be send()'ing instead of retrieve()'ing? COPY tbl FROM
    stdin, requests data from client to COPY FROM stdin TO tbl.


    Regards.
  • Tom Lane at May 21, 2006 at 10:15 pm

    max.poletto@gmail.com writes:
    I attempt to execute a "COPY test FROM STDIN" using PQsendQuery.
    Subsequently, after the appropriate poll()/PQconsumeInput/PQisBusy
    calls, PQgetResult returns tens of millions of (invalid?) non-null
    PGresults.
    You should fix your code to pay some attention to what those results
    are. I'm betting they are error results.
    This behavior seems incorrect, and sometimes causes my
    application to exhaust memory and crash.
    Well, that's because you're not PQclear'ing a result when done with it.

    regards, tom lane
  • Martijn van Oosterhout at May 22, 2006 at 6:54 am

    On Fri, May 19, 2006 at 11:51:00AM -0700, max.poletto@gmail.com wrote:
    I attempt to execute a "COPY test FROM STDIN" using PQsendQuery.
    Subsequently, after the appropriate poll()/PQconsumeInput/PQisBusy
    calls, PQgetResult returns tens of millions of (invalid?) non-null
    PGresults. This behavior seems incorrect, and sometimes causes my
    application to exhaust memory and crash.
    In addition to the issues other people have pointed out, when using
    COPY you transfer the data with PQputCopyData()/PQgetCopyData().

    Have a nice day,
    --
    Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
    From each according to his ability. To each according to his ability to litigate.
  • Max Poletto at May 24, 2006 at 2:13 am
    Thanks for all your replies, but I must clarify some things.

    First, note that what I posted is just a small example that reproduces
    behavior that appears incorrect. The real code is a C++ wrapper
    around libpq that supports non-blocking queries and reuses open
    connections.

    Volkan and Martijn: I know about PQ{put,get}CopyData, but my example
    never gets to that point. I must first determine whether the
    (asynchronous) PQsendQuery() of "COPY test FROM STDIN" succeeded.
    That's all that retrieve() tries to do in my example.

    Tom: of course I should (and eventually do) use PQclear(), but I may
    not want to right away, because I must return to the user a vector of
    result objects (for example, all the result rows from a query).

    I do not expect PQgetResult to return millions of non-null PGresult
    objects after a PQsendQuery("COPY test FROM STDIN"). I expect exactly
    one non-null result, with a result status of PGRES_COPY_IN. Moreover,
    the manual says:

    If a COPY command is issued via PQexec in a string that could
    contain
    additional commands, the application must continue fetching results
    via PQgetResult after completing the COPY sequence. Only when
    PQgetResult returns NULL is it certain that the PQexec command
    string
    is done and it is safe to issue more commands.

    I assumed this to be true for PQexec or "one of the equivalent
    functions" mentioned in the manual, such as PQsendQuery. However, if
    I add the following switch statement to my example:

    while (!PQisBusy(conn)) {
    PGresult *r = PQgetResult(conn);
    if (r) {
    switch (PQresultStatus(r)) {
    case PGRES_COPY_IN:
    break;
    case PGRES_EMPTY_QUERY:
    printf("PGRES_EMPTY_QUERY\n");
    break;
    case PGRES_COMMAND_OK:
    printf("PGRES_COMMAND_OK\n");
    break;
    case PGRES_TUPLES_OK:
    printf("PGRES_TUPLES_OK\n");
    break;
    case PGRES_COPY_OUT:
    printf("PGRES_COPY_OUT\n");
    break;
    case PGRES_BAD_RESPONSE:
    printf("PGRES_BAD_RESPONSE\n");
    break;
    case PGRES_NONFATAL_ERROR:
    printf("PGRES_NONFATAL_ERROR\n");
    break;
    case PGRES_FATAL_ERROR:
    printf("PGRES_FATAL_ERROR\n");
    break;
    }
    res.push_back(r);
    if (++i % 5000000 == 0) { printf("%d results\n", i); }
    if (r == oldr) { printf("r==oldr (%p)\n", r); }
    oldr = r;
    } else {
    printf("PQgetResult return 0 after %d results\n", i);
    return;
    }
    }

    the code still prints only:

    5000000 results
    10000000 results
    15000000 results
    20000000 results
    25000000 results
    PQgetResult return 0 after 25649299 results

    In other words, there are >25M distinct non-null results, and all of
    them have status code PGRES_COPY_IN, and none of them have errors.

    So it appears that I should check whether the first PGresult object
    has a status code of PGRES_COPY_IN, and ignore subsequent PGresults
    even if they are not NULL. I don't object to this interface, but it
    is not what I would conclude after RTFM.

    max
  • Tom Lane at May 24, 2006 at 7:06 pm

    max.poletto@gmail.com writes:
    I do not expect PQgetResult to return millions of non-null PGresult
    objects after a PQsendQuery("COPY test FROM STDIN"). I expect exactly
    one non-null result, with a result status of PGRES_COPY_IN.
    If you call it exactly once, it'll say that exactly once. If you keep
    calling it "millions of times", it'll keep saying that.
    Moreover, the manual says:
    If a COPY command is issued via PQexec in a string that could
    contain
    additional commands, the application must continue fetching results
    via PQgetResult after completing the COPY sequence. Only when
    PQgetResult returns NULL is it certain that the PQexec command
    string
    is done and it is safe to issue more commands.
    Indeed. You forgot to "complete the COPY sequence" before returning
    to the PQgetResult loop. As long as the thing is in COPY mode,
    PQgetResult will return a result saying PGRES_COPY_IN. The point
    of this paragraph is that you might want to consider doing more
    PQgetResults *after* you've ended COPY mode.

    regards, tom lane
  • Massimiliano Poletto at May 24, 2006 at 9:12 pm

    If you call it exactly once, it'll say that exactly once. If you keep
    calling it "millions of times", it'll keep saying that.
    OK. I wonder, though, why at some point it does in fact return 0.
    Indeed. You forgot to "complete the COPY sequence" before returning
    to the PQgetResult loop. As long as the thing is in COPY mode,
    PQgetResult will return a result saying PGRES_COPY_IN. The point
    Got it. Thanks.

    max
  • Tom Lane at May 25, 2006 at 2:03 pm

    "Massimiliano Poletto" <max.poletto@gmail.com> writes:
    If you call it exactly once, it'll say that exactly once. If you keep
    calling it "millions of times", it'll keep saying that.
    OK. I wonder, though, why at some point it does in fact return 0.
    Probably because you ran out of memory to create new PGresult structs.

    regards, tom lane

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedMay 19, '06 at 6:51p
activeMay 25, '06 at 2:03p
posts8
users4
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase