Hi Anil,

impala-user is a better list for this question. CC'ing impala-user.

bcc: scm-users

- Vikas

On Tue, Apr 9, 2013 at 4:43 AM, Anil Kumar wrote:

Hi All,

I installed CM, Impala 0.6, and CDH 4.2 on my cluster.

Are there any standard transaction processing benchmarks for NoSQL
databases, like the TPC (Transaction Processing Performance Council)
benchmarks for SQL databases?

I need to convince my clients that Impala is better than Hive, and for
this I need a transaction processing benchmark standard, so that I can
tell my client we ran this standard benchmark and, based on it, Impala
performs better than Hive.

Please help me with this.

Thanks,
Anil

  • Greg Rahn at Apr 9, 2013 at 6:51 pm
    A couple of things:

    First, the TPC does not have any benchmark designed for such workloads, but
    that really isn't an issue IMHO. The only benchmark that should matter to
    your client is the one that is performed using their workload (queries and
    data). There is no guarantee that if product X is better than product Y
    on benchmark A, product X is also better than product Y on
    benchmark B.

    Second, Impala and Hive are not NoSQL databases. Neither is a database at
    all, in fact, and neither supports transactions per se. Just clarifying.
  • Charles Earl at Apr 9, 2013 at 7:20 pm
    This seems like a good point: TPC-H seems like the wrong kind of benchmark for stores like Hive and Impala.
    I wonder whether the Impala team can weigh in, if not on benchmarks, then perhaps on standard kinds of problem sets that might be good for understanding the relative performance of emerging "responsive" platforms? In the conference literature, it still looks like everyone is citing performance on TPC-H-like queries and datasets.
    It does seem like the argument for systems like Shark or Impala is that they provide "responsive" analytics for big data. For example, you would hope that for a given BI workload Impala is "fast enough" that it seems less like a batch submission.
    I realize that "big data benchmarking" is a notion still in its infancy; see http://clds.ucsd.edu/wbdb2012/program.
    Any thoughts welcome.
    C
Discussion Overview
group: impala-user @
categories: hadoop
posted: Apr 9, '13 at 4:42p
active: Apr 9, '13 at 7:20p
posts: 3
users: 3
website: cloudera.com
irc: #hadoop
