I would like to build a fast data query system. I have several
terabytes of time data that I would like to analyze, and I was wondering
whether HBase is the right tool. Currently, I have an HDFS cluster of 100+
nodes and everything is working fine. We are very happy with it. However, it
would be nice to build something on top of the data. I was thinking of HBase,
but are there any other (easier) solutions? HBase looks complicated to
install because of the ZooKeeper component.


--
--- Get your facts first, then you can distort them as you please.--


  • Josh Patterson at Mar 8, 2011 at 2:51 pm
    Rita,
    Specifically what type of data are we talking about, and what type of
    queries are you looking to do? Effectively, what do you need to learn
    from the data?

    Thanks,

    Josh

    --
    Twitter: @jpatanooga
    Solution Architect @ Cloudera
    hadoop: http://www.cloudera.com
    blog: http://jpatterson.floe.tv
  • Ted Dunning at Mar 8, 2011 at 6:10 pm
    Take a look at http://opentsdb.net/ and see if it attacks your time series
    problem in an interesting way for what you are doing.

    Regarding your second comment, ZooKeeper actually makes it easier to install
    HBase because it stabilizes the interactions between the different
    components. There is also an option to have HBase run ZooKeeper itself so
    that you don't have to think about it, although I wouldn't recommend that
    for a serious production install.

    Can you say a bit more about what kind of "time data" you have and what kind
    of analysis you want to do?
  • Rita at Mar 9, 2011 at 3:10 am
    Sorry for being unclear.

    The time series data is very simple:

    Time, Value

    The time is in nanosecond precision and the values are floating point. Some
    datasets span from 2010 to 2011, so as you can imagine there is a lot of
    data.

    I am looking for things like: what are the values from
    2010/03/04 17:00:06.001 to 17:01:06.0009?

    Something simple like that.
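
The kind of lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how HBase or OpenTSDB implement it: the helper names (`to_nanos`, `row_key`, `range_query`) are hypothetical, timestamps are assumed to be UTC, and the sample values are made up. The `row_key` helper shows one common idea for sorted key-value stores, where a fixed-width big-endian timestamp sorts byte-wise in time order so that a time range becomes a contiguous key range.

```python
import bisect
import calendar
import struct
import time

NANOS_PER_SECOND = 1_000_000_000


def to_nanos(stamp):
    """Convert 'YYYY/MM/DD HH:MM:SS[.fraction]' (assumed UTC) to integer
    nanoseconds since the Unix epoch."""
    whole, _, frac = stamp.partition(".")
    seconds = calendar.timegm(time.strptime(whole, "%Y/%m/%d %H:%M:%S"))
    # Right-pad (or truncate) the fraction to exactly nine digits.
    nanos = int((frac + "000000000")[:9]) if frac else 0
    return seconds * NANOS_PER_SECOND + nanos


def row_key(nanos):
    """Hypothetical row key: 8-byte big-endian unsigned integer, so that
    byte-wise ordering matches numeric (time) ordering."""
    return struct.pack(">Q", nanos)


def range_query(times, values, start, end):
    """Return (time, value) pairs with start <= time <= end.
    `times` must already be sorted in ascending order."""
    lo = bisect.bisect_left(times, start)
    hi = bisect.bisect_right(times, end)
    return list(zip(times[lo:hi], values[lo:hi]))


# Made-up sample data in the "Time, Value" layout described above.
times = [
    to_nanos("2010/03/04 17:00:05.5"),
    to_nanos("2010/03/04 17:00:30"),
    to_nanos("2010/03/04 17:01:06.0009"),
    to_nanos("2010/03/04 17:02:00"),
]
values = [1.0, 2.5, 3.25, 4.0]

start = to_nanos("2010/03/04 17:00:06.001")
end = to_nanos("2010/03/04 17:01:06.0009")
hits = range_query(times, values, start, end)
for t, v in hits:
    print(t, v)
```

Integer nanoseconds are used instead of `datetime` because `datetime` only resolves to microseconds. The binary search stands in for whatever ordered index the real store provides.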

Discussion Overview
group: hdfs-user @
categories: hadoop
posted: Mar 8, '11 at 1:16p
active: Mar 9, '11 at 3:10a
posts: 4
users: 3
website: hadoop.apache.org...
irc: #hadoop
