FAQ
Hi DK,

I run into the same error as below:
"Query aborted, unable to fetch data"
This error always occurs when I submit a specific query.

My impalad.INFO log is below.

---
I0311 18:47:48.205356 6272 progress-updater.cc:55] Query
cc5e64ba3e604477:8dd556dcf7612ebf: 92% Complete (703 out of 764)
I0311 18:48:21.708498 14032 thrift-util.cc:53] TSocket::read() recv()
<Host: 10.200.0.115 Port: 41752>Connection reset by peer
I0311 18:48:21.708485 13296 thrift-util.cc:53] TSocket::read() recv()
<Host: 10.200.0.115 Port: 41738>Connection reset by peer
I0311 18:48:21.708406 13275 thrift-util.cc:53] TSocket::read() recv()
<Host: 10.200.0.115 Port: 41737>Connection reset by peer
I0311 18:48:22.720187 14032 thrift-util.cc:53] TThreadedServer client
died: ECONNRESET
I0311 18:48:22.720300 13296 thrift-util.cc:53] TThreadedServer client
died: ECONNRESET
I0311 18:48:22.769686 13275 thrift-util.cc:53] TThreadedServer client
died: ECONNRESET
I0311 18:48:28.131676 13198 impala-server.cc:1226] Cancel():
query_id=cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:28.259261 13198 coordinator.cc:836] Cancel()
query_id=cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:28.259349 13198 plan-fragment-executor.cc:394] Cancel():
instance_id=cc5e64ba3e604477:8dd556dcf7612ec0
I0311 18:48:28.310003 13198 data-stream-mgr.cc:233] cancelling all
streams for fragment=cc5e64ba3e604477:8dd556dcf7612ec0
I0311 18:48:28.310081 13198 data-stream-mgr.cc:127] cancelled stream:
fragment_id=cc5e64ba3e604477:8dd556dcf7612ec0 node_id=6
I0311 18:48:28.310125 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec1
backend=10.200.0.105:22000
I0311 18:48:28.387034 13261 impala-server.cc:1357]
CancelPlanFragment(): instance_id=cc5e64ba3e604477:8dd556dcf7612ec1
I0311 18:48:28.387152 13261 plan-fragment-executor.cc:394] Cancel():
instance_id=cc5e64ba3e604477:8dd556dcf7612ec1
I0311 18:48:28.387187 13261 data-stream-mgr.cc:233] cancelling all
streams for fragment=cc5e64ba3e604477:8dd556dcf7612ec1
I0311 18:48:28.387209 13261 data-stream-mgr.cc:127] cancelled stream:
fragment_id=cc5e64ba3e604477:8dd556dcf7612ec1 node_id=5
I0311 18:48:28.387292 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec2
backend=10.200.0.106:22000
I0311 18:48:29.122704 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec3
backend=10.200.0.107:22000
I0311 18:48:29.143808 13072 impala-beeswax-server.cc:311] close():
query_id=cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:29.175984 13072 impala-server.cc:1012] UnregisterQuery():
query_id=cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:29.237684 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec4
backend=10.200.0.108:22000
E0311 18:48:29.248538 14025 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:29.512716 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec5
backend=10.200.0.109:22000
E0311 18:48:29.570737 13288 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:29.580339 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec6
backend=10.200.0.110:22000
E0311 18:48:29.608491 14033 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:31.394305 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec7
backend=10.200.0.111:22000
E0311 18:48:31.455822 13295 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:31.475584 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec8
backend=10.200.0.112:22000
I0311 18:48:31.575423 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ec9
backend=10.200.0.113:22000
E0311 18:48:31.941292 13299 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:32.417927 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612eca
backend=10.200.0.114:22000
E0311 18:48:32.496443 14026 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
E0311 18:48:32.496633 13287 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:32.773434 13198 coordinator.cc:886] sending
CancelPlanFragment rpc for
instance_id=cc5e64ba3e604477:8dd556dcf7612ecb
backend=10.200.0.115:22000
E0311 18:48:32.787086 6272 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
E0311 18:48:32.791580 14027 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:32.831794 13198 thrift-util.cc:53] TSocket::open()
connect() <Host: 10.200.0.115 Port: 22000>Connection refused
I0311 18:48:32.912494 13198 status.cc:40] Couldn't open transport for
10.200.0.115:22000(connect() failed: Connection refused)
@ 0x7ccc81 (unknown)
@ 0x7b41d4 (unknown)
@ 0x76dce7 (unknown)
@ 0x775c0b (unknown)
@ 0x775da7 (unknown)
@ 0x775e7b (unknown)
@ 0x68638a (unknown)
@ 0x6865ce (unknown)
@ 0x7c1265 (unknown)
@ 0x86e81f (unknown)
@ 0x86df44 (unknown)
@ 0x69922e (unknown)
@ 0x11ef429 (unknown)
@ 0x11f1ed2 (unknown)
@ 0x3e566077f1 (unknown)
@ 0x3e562e570d (unknown)
I0311 18:48:32.956487 13198 coordinator.cc:1126] Final profile for
query_id=cc5e64ba3e604477:8dd556dcf7612ebf
E0311 18:48:33.601567 13292 impala-server.cc:1349] unknown query id:
cc5e64ba3e604477:8dd556dcf7612ebf
I0311 18:48:33.957559 13300 data-stream-mgr.cc:210] DeregisterRecvr():
fragment_id=cc5e64ba3e604477:8dd556dcf7612ec0, node=6
---
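As an aside, the glog prefix that impalad.INFO uses (severity letter, MMDD date, time, thread id, file:line) is easy to sift programmatically. A minimal sketch, not from the thread, for pulling out just the error-severity messages:

```python
import re

# Minimal parser for glog-style lines as found in impalad.INFO:
#   <sev><MMDD> <HH:MM:SS.ffffff> <thread-id> <file:line>] <message>
GLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>[0-9:.]+)\s+"
    r"(?P<tid>\d+) (?P<src>\S+:\d+)\] (?P<msg>.*)$"
)

def error_messages(lines):
    """Yield the message text of every E (error) severity line."""
    for line in lines:
        m = GLOG_LINE.match(line)
        if m and m.group("sev") == "E":
            yield m.group("msg")

sample = [
    "I0311 18:47:48.205356  6272 progress-updater.cc:55] Query 92% Complete",
    "E0311 18:48:29.248538 14025 impala-server.cc:1349] unknown query id: cc5e64ba",
]
for msg in error_messages(sample):
    print(msg)  # -> unknown query id: cc5e64ba
```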

I suspect that it results from a connection timeout in the Thrift service...

Thanks,

suda

2013/3/11 DK <dileepkumar.dk@gmail.com>:
Hi All,

When I submit many queries using a script and "impala-shell", the system hangs
after 10-12 queries, and this is what I see in the query response:

Query aborted, unable to fetch data

Error connecting: <class 'thrift.transport.TTransport.TTransportException'>,
Could not connect to impala02:21000

Has anyone seen similar behavior?

Thanks,
Dileep


  • Henry Robinson at Mar 11, 2013 at 6:05 pm
    Hi Suda -

    I'm not sure that this is a timeout issue, for the following reasons:

    1. We don't set the recv or connection timeout for our Thrift servers or
    clients, and the default is 0
    2. The last error you get is that a connection can't be established to
    10.200.0.115
    because the connection was refused, which usually means that the server
    socket is not open, and this usually means that the server has crashed.

    After you run your query, is the Impala daemon on 10.200.0.115 still
    running? Note that it was this host that caused the original cancellation
    of the query (see the "Connection reset by peer" messages at the top of the
    log). I suspect this host has crashed, so it would be excellent to see its
    logs, and to get as much detail as you can provide about the query,
    including the format and structure of any tables and the text of the query
    itself.

    Thanks,
    Henry

    --
    Henry Robinson
    Software Engineer
    Cloudera
    415-994-6679
  • DK at Mar 11, 2013 at 7:54 pm
    Do you think changing the connection timeout for our Thrift servers or
    clients will help? As you said, the current value is 0, I think.

    Please advise.

    DK
  • Yukinori SUDA at Mar 12, 2013 at 3:11 am
    Hi Henry,

    thanks so much for your advice.

    When monitoring the status of the impalad process on all servers,
    I could see that the process had crashed due to lack of memory (and swap space).

    Our environment is shown below.
    http://goo.gl/3f96F

    Frankly speaking, I became aware that our servers don't have enough
    memory to run Impala.
    I will add memory in the near future.
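    For anyone watching for the same failure mode, the resident memory of each impalad can be monitored from /proc before the machine starts swapping or the OOM killer fires. A Linux-only sketch (not from the thread; looking up the impalad pid is left out):

```python
import os

def rss_kb(pid):
    """Resident set size (VmRSS) in kB for a process, read from
    /proc/<pid>/status. Linux-only; returns 0 if the field is absent."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # value is reported in kB
    return 0

# Demo on the current process; for impalad you would look up its pid first.
print(f"current process RSS: {rss_kb(os.getpid())} kB")
```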

    Thanks,

    suda

  • DK at Mar 12, 2013 at 7:19 pm
    In my case the timeout is set to 60 sec max and 10 sec min, and the
    server has plenty of memory, which Impala appears to use without bound,
    going over 100 GB.
    Still I see:
    Error connecting: <class
    'thrift.transport.TTransport.TTransportException'>, Could not connect to
    impala01host:21000
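    A client-side retry loop is sometimes suggested for this error, but it only papers over transient failures; if the daemon has crashed or been OOM-killed, every attempt fails identically. A sketch of such a retry wrapper (not from the thread):

```python
import socket
import time

def connect_with_retry(host, port, attempts=5, delay=1.0):
    """Open a TCP connection, retrying a few times with a fixed delay.

    If the remote daemon is actually down, every attempt will raise the
    same error and the root cause still has to be fixed server-side.
    """
    last_err = None
    for _ in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=5)
        except OSError as err:
            last_err = err
            time.sleep(delay)
    raise last_err
```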

    Thanks,
    Dileep

    On Monday, March 11, 2013 8:11:43 PM UTC-7, Suda Yukinori wrote:

    Hi Henry,

    thanks so much for your advice.

    When monitering status of impalad process on all servers,
    I could see that the process has crashed due to lack of memory( and swap
    space).

    Our environment is shown below.
    http://goo.gl/3f96F

    Frankly speaking, I became aware that our servers didn't have enough
    memory to use impala.
    I will expand memory in the near future.

    Thanks,

    suda

    2013/3/12 Henry Robinson <he...@cloudera.com <javascript:>>:
    Hi Suda -

    I'm not sure that this is a timeout issue, for the following reasons:

    1. We don't set the recv or connection timeouts for our Thrift servers or
    clients, and the default is 0.
    2. The last error you get is that a connection can't be established to
    10.200.0.115 because the connection was refused, which usually means that
    the server socket is not open, and this usually means that the server has
    crashed.

    After you run your query, is the Impala daemon on 10.200.0.115 still
    running? Note that it was this host that caused the original
    cancellation of
    the query (see the "Connection reset by peer" messages at the top of the
    log). I suspect this host has crashed, so it would be excellent to see its
    logs, and to get as much detail as you can provide about the query,
    including the format and structure of any tables and the text of the query
    itself.

    Thanks,
    Henry

  • Marcel Kornacker at Mar 12, 2013 at 10:25 pm

    On Tue, Mar 12, 2013 at 12:19 PM, DK wrote:
    In my case the timeout is set to 60 sec max and 10 sec minimum, and the
    server has plenty of memory, which Impala appears to use without bound,
    growing past 100 GB.
    Still I see
    Error connecting: <class 'thrift.transport.TTransport.TTransportException'>,
    Could not connect to impala01host:21000
    That probably means that you're running a query that consumes a very
    large amount of memory (e.g., SELECT DISTINCT ...), which eventually
    ends up killing the whole process.

    The next release will allow you to set limits on the per-query memory
    consumption and also memory consumption of the impalad process.
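
    In releases that shipped this feature, the process-wide cap is a startup
    flag on impalad. A sketch, assuming a post-0.7 release (the flag spelling
    and the 16G value are illustrative; check your version's documentation):

    ```
    # Start impalad with a hard cap on process memory, so a runaway query is
    # cancelled instead of driving the daemon into the OOM killer.
    # The value is illustrative.
    impalad -mem_limit=16G
    ```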
  • Scott Ruffing at Apr 18, 2013 at 3:27 pm
    Can anyone provide an update on this?

    "The next release will allow you to set limits on the per-query memory
    consumption and also memory consumption of the impalad process. "

    How do you set these limits in the 0.7 release? I am seeing the impalad
    process consume all available memory and crash, then eventually restart
    itself. This round-robins through the nodes: while one node is crashing,
    the other nodes seem to keep gaining memory until they eventually crash as
    well. Most nodes show only peaks of memory usage, while others seem to
    hold onto it and eventually fail. It seems that when one node fails, the
    others may not finish their work correctly and free the memory they've
    acquired.

    One other question related to setting the limits above: where is the
    documentation for the Impala configuration variables listed in Hue, such
    as MAX_ERRORS, MEM_LIMIT, and so on?
  • Ishaan Joshi at Apr 18, 2013 at 4:11 pm
    Scott,

    Impala 0.7 supports per-process and per-query memory limits. Details on
    the memory limit flag for the impalad process can be found here:
    http://www.cloudera.com/content/cloudera-content/cloudera-docs/ImpalaBeta/0.7/Installing-and-Using-Impala/ciiu_topic_3_1.html

    You will also be able to set per-query memory limits from the
    impala-shell; the set command should list the mem_limit option.

    Thanks,

    -- Ishaan
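
    Following up on the pointer above, per-query limits are set from inside
    the shell session. A sketch (the host name is hypothetical and the option
    spelling may vary by version; verify with the set command in your build):

    ```
    impala-shell -i impala01host:21000
    # then, at the shell prompt:
    #   set                 (lists query options, including the memory limit)
    #   set mem_limit=2g    (cap each query in this session at ~2 GB)
    ```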

    On Thu, Apr 18, 2013 at 8:27 AM, Scott Ruffing wrote:

    Can anyone provide an update on this?

    "The next release will allow you to set limits on the per-query memory
    consumption and also memory consumption of the impalad process. "

    How do you set these limits in the 0.7 release? I am seeing the ImpalaD
    process take up all the memory and crash, eventually restarting itself.
    This kind of round-robins through the nodes. In the process of one node
    crashing, other nodes seem to continually gain memory until they eventually
    crash themselves at some point. Most nodes show peaks of memory usage,
    while other nodes seem to hang onto it and eventually fail. It seems like
    when one node fails, the others may not finish their process correctly and
    free the memory they've acquired.

    One other related question to setting the limits above. Where is the
    documentation on the Imapala Configuration variables? Those listed in Hue
    such as MAX_ERRORS, MEM_LIMIT, and so on.
    On Tuesday, March 12, 2013 5:25:23 PM UTC-5, Marcel Kornacker wrote:
    On Tue, Mar 12, 2013 at 12:19 PM, DK wrote:
    In my case the timeout is set to 60 sec max and 10 sec minimum and also
    server has plenty of memory which looks Impala is using unbounded and goes
    over 100GB.
    Still I see
    Error connecting: <class 'thrift.transport.TTransport.**TTransportException'>,
    Could not connect to impala01host:21000
    That probably means that you're running a query that consumes a very
    large amount of memory (e.g., SELECT DISTINCT ...), which eventually
    ends up killing the whole process.

    The next release will allow you to set limits on the per-query memory
    consumption and also memory consumption of the impalad process.
    Thanks,
    Dileep

    On Monday, March 11, 2013 8:11:43 PM UTC-7, Suda Yukinori wrote:

    Hi Henry,

    thanks so much for your advice.

    When monitering status of impalad process on all servers,
    I could see that the process has crashed due to lack of memory( and
    swap
    space).

    Our environment is shown below.
    http://goo.gl/3f96F

    Frankly speaking, I became aware that our servers didn't have enough
    memory to use impala.
    I will expand memory in the near future.

    Thanks,

    suda

    2013/3/12 Henry Robinson <he...@cloudera.com>:
    Hi Suda -

    I'm not sure that this is a timeout issue, for the following
    reasons:
    1. We don't set the recv or connection timeout for our Thrift
    servers or
    clients, and the default is 0
    2. The last error you get is that a connection can't be established
    to
    10.200.0.115 because the connection was refused, which usually means
    that
    the server socket is not open, and this usually means that the
    server
    has
    crashed.

    After you run your query, is the Impala daemon on 10.200.0.115 still
    running? Note that it was this host that caused the original
    cancellation of
    the query (see the "Connection reset by peer" messages at the top of
    the
    log). I suspect this host has crashed, so it would be excellent to
    see
    its
    logs, and to get as much detail as you can provide about the query,
    including the format and structure of any tables and the text of the
    query
    itself.

    Thanks,
    Henry
    On 11 March 2013 03:33, Yukinori SUDA wrote:

    Hi DK,

    i run into the same error as below.
    "Query aborted, unable to fetch data"
    this error always occurs when submit a specific query.

    my impalad.INFO is below.

    ---
    I0311 18:47:48.205356 6272 progress-updater.cc:55] Query cc5e64ba3e604477:8dd556dcf7612ebf: 92% Complete (703 out of 764)
    I0311 18:48:21.708498 14032 thrift-util.cc:53] TSocket::read() recv() <Host: 10.200.0.115 Port: 41752>Connection reset by peer
    I0311 18:48:21.708485 13296 thrift-util.cc:53] TSocket::read() recv() <Host: 10.200.0.115 Port: 41738>Connection reset by peer
    I0311 18:48:21.708406 13275 thrift-util.cc:53] TSocket::read() recv() <Host: 10.200.0.115 Port: 41737>Connection reset by peer
    I0311 18:48:22.720187 14032 thrift-util.cc:53] TThreadedServer client died: ECONNRESET
    I0311 18:48:22.720300 13296 thrift-util.cc:53] TThreadedServer client died: ECONNRESET
    I0311 18:48:22.769686 13275 thrift-util.cc:53] TThreadedServer client died: ECONNRESET
    I0311 18:48:28.131676 13198 impala-server.cc:1226] Cancel(): query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:28.259261 13198 coordinator.cc:836] Cancel() query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:28.259349 13198 plan-fragment-executor.cc:394] Cancel(): instance_id=cc5e64ba3e604477:8dd556dcf7612ec0
    I0311 18:48:28.310003 13198 data-stream-mgr.cc:233] cancelling all streams for fragment=cc5e64ba3e604477:8dd556dcf7612ec0
    I0311 18:48:28.310081 13198 data-stream-mgr.cc:127] cancelled stream: fragment_id=cc5e64ba3e604477:8dd556dcf7612ec0 node_id=6
    I0311 18:48:28.310125 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec1 backend=10.200.0.105:22000
    I0311 18:48:28.387034 13261 impala-server.cc:1357] CancelPlanFragment(): instance_id=cc5e64ba3e604477:8dd556dcf7612ec1
    I0311 18:48:28.387152 13261 plan-fragment-executor.cc:394] Cancel(): instance_id=cc5e64ba3e604477:8dd556dcf7612ec1
    I0311 18:48:28.387187 13261 data-stream-mgr.cc:233] cancelling all streams for fragment=cc5e64ba3e604477:8dd556dcf7612ec1
    I0311 18:48:28.387209 13261 data-stream-mgr.cc:127] cancelled stream: fragment_id=cc5e64ba3e604477:8dd556dcf7612ec1 node_id=5
    I0311 18:48:28.387292 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec2 backend=10.200.0.106:22000
    I0311 18:48:29.122704 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec3 backend=10.200.0.107:22000
    I0311 18:48:29.143808 13072 impala-beeswax-server.cc:311] close(): query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:29.175984 13072 impala-server.cc:1012] UnregisterQuery(): query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:29.237684 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec4 backend=10.200.0.108:22000
    E0311 18:48:29.248538 14025 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:29.512716 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec5 backend=10.200.0.109:22000
    E0311 18:48:29.570737 13288 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:29.580339 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec6 backend=10.200.0.110:22000
    E0311 18:48:29.608491 14033 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:31.394305 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec7 backend=10.200.0.111:22000
    E0311 18:48:31.455822 13295 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:31.475584 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec8 backend=10.200.0.112:22000
    I0311 18:48:31.575423 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ec9 backend=10.200.0.113:22000
    E0311 18:48:31.941292 13299 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:32.417927 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612eca backend=10.200.0.114:22000
    E0311 18:48:32.496443 14026 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    E0311 18:48:32.496633 13287 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:32.773434 13198 coordinator.cc:886] sending CancelPlanFragment rpc for instance_id=cc5e64ba3e604477:8dd556dcf7612ecb backend=10.200.0.115:22000
    E0311 18:48:32.787086 6272 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    E0311 18:48:32.791580 14027 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:32.831794 13198 thrift-util.cc:53] TSocket::open() connect() <Host: 10.200.0.115 Port: 22000>Connection refused
    I0311 18:48:32.912494 13198 status.cc:40] Couldn't open transport for 10.200.0.115:22000 (connect() failed: Connection refused)
        @ 0x7ccc81 (unknown)
        @ 0x7b41d4 (unknown)
        @ 0x76dce7 (unknown)
        @ 0x775c0b (unknown)
        @ 0x775da7 (unknown)
        @ 0x775e7b (unknown)
        @ 0x68638a (unknown)
        @ 0x6865ce (unknown)
        @ 0x7c1265 (unknown)
        @ 0x86e81f (unknown)
        @ 0x86df44 (unknown)
        @ 0x69922e (unknown)
        @ 0x11ef429 (unknown)
        @ 0x11f1ed2 (unknown)
        @ 0x3e566077f1 (unknown)
        @ 0x3e562e570d (unknown)
    I0311 18:48:32.956487 13198 coordinator.cc:1126] Final profile for query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    E0311 18:48:33.601567 13292 impala-server.cc:1349] unknown query id: cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:33.957559 13300 data-stream-mgr.cc:210] DeregisterRecvr(): fragment_id=cc5e64ba3e604477:8dd556dcf7612ec0, node=6
    ---
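    The diagnostic step Henry describes above (tracing the "Connection reset
    by peer" and "Connection refused" messages back to a host) can be
    scripted. A minimal sketch; a two-line log excerpt is inlined here for
    illustration, but you would point the grep at your real impalad.INFO:

    ```shell
    # Extract the hosts behind reset/refused connections in an impalad log;
    # the host that keeps appearing is the backend that likely crashed.
    cat > /tmp/impalad.INFO.sample <<'EOF'
    I0311 18:48:21.708498 14032 thrift-util.cc:53] TSocket::read() recv() <Host: 10.200.0.115 Port: 41752>Connection reset by peer
    I0311 18:48:32.831794 13198 thrift-util.cc:53] TSocket::open() connect() <Host: 10.200.0.115 Port: 22000>Connection refused
    EOF
    grep -Eo 'Host: [0-9.]+ Port: [0-9]+>Connection (reset by peer|refused)' \
        /tmp/impalad.INFO.sample | awk '{print $2}' | sort -u
    # prints: 10.200.0.115
    ```

    Checking whether impalad is still running on the host this prints (and
    pulling that host's logs) is then the next step.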

    I suspect that it results from a connection timeout in the Thrift
    service...
    Thanks,

    suda

    2013/3/11 DK <dileepk...@gmail.com>:
    Hi All,

    When I submit many queries using a script and "impala-shell", I see
    that after 10-12 query executions the system hangs, and in the query
    response this is what I see:

    Query aborted, unable to fetch data

    Error connecting: <class 'thrift.transport.TTransport.TTransportException'>,
    Could not connect to impala02:21000

    Has anyone seen similar behavior ?

    Thanks,
    Dileep



    --
    Henry Robinson
    Software Engineer
    Cloudera
    415-994-6679
  • Miklos Christine at Apr 18, 2013 at 5:59 pm
    Hello Scott,

    The memory limit is a startup configuration. The configuration guide for
    these options can be found here:
    http://www.cloudera.com/content/cloudera-content/cloudera-docs/ImpalaBeta/0.7/Installing-and-Using-Impala/ciiu_topic_3_1.html
    There is a -mem_limit option that sets the percentage of physical
    memory on the machine that Impala may use.
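    For example, a startup line might look like the following. This is an
    illustrative sketch only: -mem_limit is the flag named above, but the
    70% value and bare invocation are assumptions, so verify the exact
    syntax against the linked guide for your release.

    ```shell
    # Cap the impalad process at roughly 70% of the machine's physical
    # memory at startup (value illustrative).
    impalad -mem_limit=70%
    ```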

    Thanks,
    Miklos

    On Thu, Apr 18, 2013 at 8:27 AM, Scott Ruffing wrote:

    Can anyone provide an update on this?

    "The next release will allow you to set limits on the per-query memory
    consumption and also memory consumption of the impalad process. "

    How do you set these limits in the 0.7 release? I am seeing the ImpalaD
    process take up all the memory and crash, eventually restarting itself.
    This kind of round-robins through the nodes. In the process of one node
    crashing, other nodes seem to continually gain memory until they eventually
    crash themselves at some point. Most nodes show peaks of memory usage,
    while other nodes seem to hang onto it and eventually fail. It seems like
    when one node fails, the others may not finish their process correctly and
    free the memory they've acquired.

    One other related question about setting the limits above: where is the
    documentation on the Impala configuration variables, such as MAX_ERRORS,
    MEM_LIMIT, and so on, that are listed in Hue?
    On Tuesday, March 12, 2013 5:25:23 PM UTC-5, Marcel Kornacker wrote:
    On Tue, Mar 12, 2013 at 12:19 PM, DK wrote:
    In my case the timeout is set to 60 sec max and 10 sec minimum, and the
    server also has plenty of memory, which Impala appears to use without
    bound, going over 100GB.
    Still I see
    Error connecting: <class 'thrift.transport.TTransport.TTransportException'>,
    Could not connect to impala01host:21000
    That probably means that you're running a query that consumes a very
    large amount of memory (e.g., SELECT DISTINCT ...), which eventually
    ends up killing the whole process.

    The next release will allow you to set limits on the per-query memory
    consumption and also memory consumption of the impalad process.
    Thanks,
    Dileep

  • Scott Ruffing at Apr 18, 2013 at 6:09 pm
    Thanks for the responses Ishaan and Miklos.

    I tried going into impala-shell and setting the limit as
    "set mem_limit=70%", and that led to something like:

    Query: show mem_limit
    ERROR: Invalid query memory limit with percent '70%'.

    Then I set it as follows, and it failed as I kind of expected:

    [localhost:21000] > set mem_limit=70;
    MEM_LIMIT set to 70
    [localhost:21000] > select count(*) from table;
    Query: select count(*) from pa_sales_fact;
    Query aborted, unable to fetch data

    Backend 2:Memory limit exceeded

    So setting the value seems to work fine from the command line. But is
    there any documentation available on what all these options are for and
    when to use them, or are these more for Cloudera's internal tuning use
    and not intended to be publicized at this time? The link above shows how
    to set them, but no further details are provided.

    For example, setting this seems to prevent the Impala node from crashing
    because the query cuts out sooner. But it doesn't seem to help limit the
    amount of memory available so the query could actually finish. The only
    solution in this case is to get more memory or reduce the size of your
    tables.
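    Since MEM_LIMIT rejected the percent form and treated 70 as a byte
    count, a usable per-query cap would presumably be a plain number of
    bytes. A hypothetical session sketch (the 2 GB value and table name are
    illustrative, not from any documentation):

    ```shell
    $ impala-shell
    [localhost:21000] > set mem_limit=2147483648;
    MEM_LIMIT set to 2147483648
    [localhost:21000] > select count(*) from pa_sales_fact;
    ```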


  • Chris Aschauer at Apr 18, 2013 at 6:13 pm
    Information on the command line options, including the one you found, can
    be found here:
    http://www.cloudera.com/content/cloudera-content/cloudera-docs/ImpalaBeta/0.7/Installing-and-Using-Impala/ciiu_topic_6_3.html

    On Thu, Apr 18, 2013 at 11:09 AM, Scott Ruffing wrote:

    Thanks for the responses Ishaan and Miklos.

    I tried going to impala-shell and setting the limit as

    "set mem-limit=70%" and that would lead to something like...

    Query: show mem_limit
    ERROR: Invalid query memory limit with percent '70%'.

    Then I set it as follows and it failed as I kind of expected...

    [localhost:21000] > set mem_limit=70
    ;
    MEM_LIMIT set to 70
    [localhost:21000] > select count(*) from table;
    Query: select count(*) from pa_sales_fact;
    Query aborted, unable to fetch data

    Backend 2:Memory limit exceeded

    So setting the value seems to work fine from the command line. But is
    there any documentation available on what all these options are for and
    when to use them, or are these more for Cloudera's internal tuning use and
    not intended on being publicized at this time? The link above shows how to
    set them, but no further details are provided.

    For example, setting this seems to prevent the Impala node from crashing
    because the query cuts out sooner. But it doesn't seem to help limit the
    amount of memory available so the query could actually finish. The only
    solution in this case is to get more memory or reduce the size of your
    tables.


    On Thu, Apr 18, 2013 at 12:59 PM, Miklos Christine wrote:

    Hello Scott,

    The memory limits is a startup configuration. The configuration guide for
    these options can be found here:

    http://www.cloudera.com/content/cloudera-content/cloudera-docs/ImpalaBeta/0.7/Installing-and-Using-Impala/ciiu_topic_3_1.html
    There is a -mem_limit configuration that will set the percentage physical
    memory on the machine for impala.

    Thanks,
    Miklos

    On Thu, Apr 18, 2013 at 8:27 AM, Scott Ruffing wrote:

    Can anyone provide an update on this?

    "The next release will allow you to set limits on the per-query memory
    consumption and also memory consumption of the impalad process. "

    How do you set these limits in the 0.7 release? I am seeing the ImpalaD
    process take up all the memory and crash, eventually restarting itself.
    This kind of round-robins through the nodes. In the process of one node
    crashing, other nodes seem to continually gain memory until they eventually
    crash themselves at some point. Most nodes show peaks of memory usage,
    while other nodes seem to hang onto it and eventually fail. It seems like
    when one node fails, the others may not finish their process correctly and
    free the memory they've acquired.

    One other question related to setting the limits above: where is the
    documentation on the Impala configuration variables, such as MAX_ERRORS
    and MEM_LIMIT, that are listed in Hue?
    On Tuesday, March 12, 2013 5:25:23 PM UTC-5, Marcel Kornacker wrote:
    On Tue, Mar 12, 2013 at 12:19 PM, DK wrote:
    In my case the timeout is set to 60 sec max and 10 sec minimum, and the
    server has plenty of memory, which Impala appears to use without bound,
    going over 100 GB.
    Still I see:
    Error connecting: <class 'thrift.transport.TTransport.TTransportException'>,
    Could not connect to impala01host:21000
    That probably means that you're running a query that consumes a very
    large amount of memory (e.g., SELECT DISTINCT ...), which eventually
    ends up killing the whole process.

    The next release will allow you to set limits on the per-query memory
    consumption and also memory consumption of the impalad process.
    Thanks,
    Dileep

    On Monday, March 11, 2013 8:11:43 PM UTC-7, Suda Yukinori wrote:

    Hi Henry,

    thanks so much for your advice.

    When monitoring the status of the impalad process on all servers,
    I could see that the process had crashed due to lack of memory (and
    swap space).

    Our environment is shown below.
    http://goo.gl/3f96F

    Frankly speaking, I became aware that our servers didn't have enough
    memory to use Impala.
    I will expand memory in the near future.

    Thanks,

    suda

    2013/3/12 Henry Robinson <he...@cloudera.com>:
    Hi Suda -

    I'm not sure that this is a timeout issue, for the following reasons:

    1. We don't set the recv or connection timeout for our Thrift servers or
    clients, and the default is 0
    2. The last error you get is that a connection can't be established to
    10.200.0.115 because the connection was refused, which usually means that
    the server socket is not open, and this usually means that the server has
    crashed.

    After you run your query, is the Impala daemon on 10.200.0.115 still
    running? Note that it was this host that caused the original cancellation of
    the query (see the "Connection reset by peer" messages at the top of the
    log). I suspect this host has crashed, so it would be excellent to see its
    logs, and to get as much detail as you can provide about the query,
    including the format and structure of any tables and the text of the query
    itself.

    Thanks,
    Henry
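    The two failure modes Henry describes can be told apart from the client side
    with a few lines of Python (a generic socket illustration, not part of Impala):
    a refused connection means nothing is listening, while a reset would show up on
    an established connection whose peer died.

```python
import errno
import socket

def probe(host, port, timeout=2.0):
    """Try to open a TCP connection and classify the failure.

    A refused connection means no server socket is open on that port
    (e.g. the daemon has exited); ECONNRESET would instead appear on an
    already-established connection whose peer died mid-conversation.
    """
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        return "listening"
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            return "refused: no server socket open, process probably down"
        raise

# Find a port that is guaranteed to be closed: bind an ephemeral port,
# record its number, then release it before probing.
tmp = socket.socket()
tmp.bind(("127.0.0.1", 0))
closed_port = tmp.getsockname()[1]
tmp.close()
print(probe("127.0.0.1", closed_port))
```

    Running such a probe against port 22000 of the suspect host would quickly
    confirm whether the impalad there is still accepting connections.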
    On 11 March 2013 03:33, Yukinori SUDA wrote:

    Hi DK,

    I run into the same error as below:
    "Query aborted, unable to fetch data"
    This error always occurs when I submit a specific query.

    My impalad.INFO is below.

    ---
    I0311 18:47:48.205356 6272 progress-updater.cc:55] Query
    cc5e64ba3e604477:8dd556dcf7612ebf: 92% Complete (703 out of 764)
    I0311 18:48:21.708498 14032 thrift-util.cc:53] TSocket::read() recv()
    <Host: 10.200.0.115 Port: 41752>Connection reset by peer
    I0311 18:48:21.708485 13296 thrift-util.cc:53] TSocket::read() recv()
    <Host: 10.200.0.115 Port: 41738>Connection reset by peer
    I0311 18:48:21.708406 13275 thrift-util.cc:53] TSocket::read() recv()
    <Host: 10.200.0.115 Port: 41737>Connection reset by peer
    I0311 18:48:22.720187 14032 thrift-util.cc:53] TThreadedServer client
    died: ECONNRESET
    I0311 18:48:22.720300 13296 thrift-util.cc:53] TThreadedServer client
    died: ECONNRESET
    I0311 18:48:22.769686 13275 thrift-util.cc:53] TThreadedServer client
    died: ECONNRESET
    I0311 18:48:28.131676 13198 impala-server.cc:1226] Cancel():
    query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:28.259261 13198 coordinator.cc:836] Cancel()
    query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:28.259349 13198 plan-fragment-executor.cc:394] Cancel():
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec0
    I0311 18:48:28.310003 13198 data-stream-mgr.cc:233] cancelling all
    streams for fragment=cc5e64ba3e604477:8dd556dcf7612ec0
    I0311 18:48:28.310081 13198 data-stream-mgr.cc:127] cancelled stream:
    fragment_id=cc5e64ba3e604477:8dd556dcf7612ec0 node_id=6
    I0311 18:48:28.310125 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec1
    backend=10.200.0.105:22000
    I0311 18:48:28.387034 13261 impala-server.cc:1357]
    CancelPlanFragment(): instance_id=cc5e64ba3e604477:8dd556dcf7612ec1
    I0311 18:48:28.387152 13261 plan-fragment-executor.cc:394] Cancel():
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec1
    I0311 18:48:28.387187 13261 data-stream-mgr.cc:233] cancelling all
    streams for fragment=cc5e64ba3e604477:8dd556dcf7612ec1
    I0311 18:48:28.387209 13261 data-stream-mgr.cc:127] cancelled stream:
    fragment_id=cc5e64ba3e604477:8dd556dcf7612ec1 node_id=5
    I0311 18:48:28.387292 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec2
    backend=10.200.0.106:22000
    I0311 18:48:29.122704 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec3
    backend=10.200.0.107:22000
    I0311 18:48:29.143808 13072 impala-beeswax-server.cc:311] close():
    query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:29.175984 13072 impala-server.cc:1012] UnregisterQuery():
    query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:29.237684 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec4
    backend=10.200.0.108:22000
    E0311 18:48:29.248538 14025 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:29.512716 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec5
    backend=10.200.0.109:22000
    E0311 18:48:29.570737 13288 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:29.580339 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec6
    backend=10.200.0.110:22000
    E0311 18:48:29.608491 14033 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:31.394305 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec7
    backend=10.200.0.111:22000
    E0311 18:48:31.455822 13295 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:31.475584 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec8
    backend=10.200.0.112:22000
    I0311 18:48:31.575423 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ec9
    backend=10.200.0.113:22000
    E0311 18:48:31.941292 13299 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:32.417927 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612eca
    backend=10.200.0.114:22000
    E0311 18:48:32.496443 14026 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    E0311 18:48:32.496633 13287 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:32.773434 13198 coordinator.cc:886] sending
    CancelPlanFragment rpc for
    instance_id=cc5e64ba3e604477:8dd556dcf7612ecb
    backend=10.200.0.115:22000
    E0311 18:48:32.787086 6272 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    E0311 18:48:32.791580 14027 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:32.831794 13198 thrift-util.cc:53] TSocket::open()
    connect() <Host: 10.200.0.115 Port: 22000>Connection refused
    I0311 18:48:32.912494 13198 status.cc:40] Couldn't open transport for
    10.200.0.115:22000(connect() failed: Connection refused)
    @ 0x7ccc81 (unknown)
    @ 0x7b41d4 (unknown)
    @ 0x76dce7 (unknown)
    @ 0x775c0b (unknown)
    @ 0x775da7 (unknown)
    @ 0x775e7b (unknown)
    @ 0x68638a (unknown)
    @ 0x6865ce (unknown)
    @ 0x7c1265 (unknown)
    @ 0x86e81f (unknown)
    @ 0x86df44 (unknown)
    @ 0x69922e (unknown)
    @ 0x11ef429 (unknown)
    @ 0x11f1ed2 (unknown)
    @ 0x3e566077f1 (unknown)
    @ 0x3e562e570d (unknown)
    I0311 18:48:32.956487 13198 coordinator.cc:1126] Final profile for
    query_id=cc5e64ba3e604477:8dd556dcf7612ebf
    E0311 18:48:33.601567 13292 impala-server.cc:1349] unknown query id:
    cc5e64ba3e604477:8dd556dcf7612ebf
    I0311 18:48:33.957559 13300 data-stream-mgr.cc:210] DeregisterRecvr():
    fragment_id=cc5e64ba3e604477:8dd556dcf7612ec0, node=6
    ---

    I suspect that it results from a connection timeout in the Thrift
    service...
    Thanks,

    suda

    2013/3/11 DK <dileepk...@gmail.com>:
    Hi All,

    When I submit many queries using a script and "impala-shell", I see
    that after 10-12 query executions the system hangs, and in the query
    response this is what I see:

    Query aborted, unable to fetch data

    Error connecting: <class
    'thrift.transport.TTransport.TTransportException'>,
    Could not connect to impala02:21000

    Has anyone seen similar behavior?

    Thanks,
    Dileep



    --
    Henry Robinson
    Software Engineer
    Cloudera
    415-994-6679
  • Alex Behm at Apr 18, 2013 at 9:19 pm
    Dear Scott,

    Impala supports two different kinds of memory limits:
    1. A process-wide limit, which is set at the startup of an impalad.
    2. A per-query memory limit, which can be set, e.g., via the shell as you have
    done.

    There is one important difference between the two.
    The per-query memory limit currently does not support the "%" notation
    because the allowed amount of memory for the query will be different on
    each machine. This detail makes it difficult to reason about the memory at
    query compile time.

    That being said, the error message clearly needs improvement. My apologies.

    Best regards,

    Alex
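    Alex's point about the "%" notation can be made concrete: the same percentage
    resolves to a different byte budget on every node, so the planner has no single
    number to reason about at compile time. A toy illustration (the host names and
    memory sizes are made up):

```python
GIB = 1024 ** 3

# Hypothetical cluster: physical memory per node, in bytes.
cluster = {
    "impala01": 48 * GIB,
    "impala02": 96 * GIB,
    "impala03": 128 * GIB,
}

def resolve_percent_limit(percent, nodes):
    """The same percentage yields a different absolute budget on each node."""
    return {host: int(total * percent / 100) for host, total in nodes.items()}

limits = resolve_percent_limit(70, cluster)
# Same "70%", three different byte limits.
assert len(set(limits.values())) == 3
```

    This is also why a process-wide startup flag can accept a percentage (each
    impalad resolves it against its own physical memory) while a per-query option
    cannot.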



  • Henry Robinson at Mar 12, 2013 at 7:28 pm
    Hi Suda -

    Thanks for keeping us up to date - and thanks for your great work producing
    the benchmarks, which have been very interesting to read.

    Let us know if you think Impala's memory consumption is excessive, and
    we'll work with you to track that down.

    Best,
    Henry


  • Henry Robinson at Mar 11, 2013 at 6:10 pm
    Hi DK -

    It sounds like the Impala daemon that you're submitting queries to might
    have crashed. Can you check if the process is still running when you get
    this error?

    Thanks,
    Henry
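
[Editor's note] Henry's check above can be scripted when queries are submitted in a loop. A small sketch (Linux-only, using `pgrep`; the process name "impalad" is taken from the thread, everything else is an assumption):

```python
import subprocess

def process_running(name):
    """Return True if a process with exactly this name is running (pgrep -x, Linux)."""
    result = subprocess.run(["pgrep", "-x", name], capture_output=True)
    return result.returncode == 0

# Example: bail out of a query-submission loop if the daemon has died.
# if not process_running("impalad"):
#     raise RuntimeError("impalad is not running -- check its logs for a crash")
```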
    On 10 March 2013 21:53, DK wrote:

    Hi All,

    When I submit many queries using a script and "impala-shell", I see that
    after 10-12 query executions the system hangs, and in the query response
    this is what I see:

    Query aborted, unable to fetch data

    Error connecting: <class
    'thrift.transport.TTransport.TTransportException'>, Could not connect to
    impala02:21000

    Has anyone seen similar behavior ?

    Thanks,
    Dileep

    --
    Henry Robinson
    Software Engineer
    Cloudera
    415-994-6679
  • DK at Mar 11, 2013 at 7:53 pm
    I can see the process "impalad" is still up and running; I also notice that
    memory utilization is very high: around 90 GB RSS and 131 GB VIRT.

    Thanks,
    DK
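
[Editor's note] DK's RSS/VIRT figures can be tracked over time to see whether the daemon's memory keeps growing between queries. A minimal sketch (Linux-only, reading `/proc/<pid>/status`; the pid lookup and units are the only assumptions):

```python
def memory_usage(pid):
    """Return (VmRSS_kB, VmSize_kB) for a pid, read from /proc (Linux)."""
    rss = vsize = None
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                rss = int(line.split()[1])    # resident set size, kB
            elif line.startswith("VmSize:"):
                vsize = int(line.split()[1])  # virtual size, kB
    return rss, vsize

# Example: print impalad's memory after each query in a submission loop.
# rss_kb, vsz_kb = memory_usage(impalad_pid)  # impalad_pid obtained e.g. via pgrep
# print(f"RSS {rss_kb/1024/1024:.1f} GB, VIRT {vsz_kb/1024/1024:.1f} GB")
```

Logging these two numbers per query makes it easy to tell steady-state usage apart from a leak-like climb toward the 90 GB figure reported above.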
    On Monday, March 11, 2013 11:10:35 AM UTC-7, Henry wrote:

    Hi DK -

    It sounds like the Impala daemon that you're submitting queries to might
    have crashed. Can you check if the process is still running when you get
    this error?

    Thanks,
    Henry
    On 10 March 2013 21:53, DK <dileepk...@gmail.com> wrote:

    Hi All,

    When I submit many queries using a script and "impala-shell", I see that
    after 10-12 query executions the system hangs, and in the query response
    this is what I see:

    Query aborted, unable to fetch data

    Error connecting: <class
    'thrift.transport.TTransport.TTransportException'>, Could not connect to
    impala02:21000

    Has anyone seen similar behavior ?

    Thanks,
    Dileep

    --
    Henry Robinson
    Software Engineer
    Cloudera
    415-994-6679

Discussion Overview
group: impala-user @ hadoop
posted: Mar 11, '13 at 10:33a
active: Apr 18, '13 at 9:19p
posts: 15
users: 9
website: cloudera.com
irc: #hadoop
