Hello,

I'm using cascading/maple as an HBase tap for HBase 0.92.1. I have not been
able to query by specifying the row-id. Here is my code -

(ns cascatrial.dbhbase
  (:require (cascalog [workflow :as w]
                      [ops :as c]
                      [vars :as v]))
  (:import (org.apache.hadoop.hbase.util Bytes)
           (com.twitter.maple.hbase HBaseTap HBaseScheme)))

(use '[cascalog api playground])

(defn hbase-tap [table-name key-field column-family & value-fields]
  (let [scheme (HBaseScheme. (w/fields key-field)
                             column-family
                             (w/fields value-fields))
        tap (HBaseTap. table-name scheme)]
    tap))

(defn to-string [bytes]
  (if (= nil bytes)
    ""
    (String. bytes)))

(defn trial-query
  []
  (let [h-table (hbase-tap "test" "row1" "cf" "name" "")]
    (?<- (stdout)
         [!l !l1 !l2]
         (h-table !name !temp !temp2)
         (to-string !name :> !l)
         (to-string !temp :> !l1)
         (to-string !temp2 :> !l2))))

This returns -
RESULTS
-----------------------
row1 pranav
row2
row3
-----------------------

Isn't this query supposed to return only row1? I'm not sure if I'm using it
incorrectly or if there is a bug in the tap. Some help?

Thanks!


  • Nathan Marz at Jul 11, 2012 at 5:00 am
    You may want to ask the Cascading list as well; more people there may be
    familiar with the HBase tap.

    --
    Twitter: @nathanmarz
    http://nathanmarz.com
  • Simon Holgate at Jul 12, 2012 at 12:58 am
    Hi Pranav,

    I replied to your query on my blog (http://wp.me/pI0pm-C) but in case
    someone else finds this thread and is interested, I thought I'd report it
    here.

    If you know the row id ('row1' as you say) then you can do something like
    this:

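    ;; HBase classes used by the two functions in this post
    (import '(org.apache.hadoop.hbase HBaseConfiguration)
            '(org.apache.hadoop.hbase.client Get HTable)
            '(org.apache.hadoop.hbase.util Bytes))

    (defn get-key-from-table
      "Gets the designated key from the row in the table"
      [table-name row-id column-family key]
      (let [table (hbase-table table-name)
            g (Get. (Bytes/toBytes row-id))
            r (.get table g)
            nm (.getFamilyMap r (Bytes/toBytes column-family))]
        (.get nm (Bytes/toBytes key))))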

    where hbase-table is:
    (defn hbase-table [table-name]
      ;; Note that (HBaseConfiguration.) is deprecated in HBase 0.95-SNAPSHOT
      ;; and will be replaced by (HBaseConfiguration/create)
      (HTable. (HBaseConfiguration.) table-name))

    So you would use this as:
    (get-key-from-table "pranavs-table" "row1" "cf" "a")

    There may be a better way...

    Cheers,

    Simon
  • Pranav at Jul 16, 2012 at 5:56 am
    Alternatively, a way to get all columns for a column family using maple
    would be great too.

    I don't see org.apache.hadoop.hbase.client.Get being used in the source, so
    I'm guessing it's not supported, but hoping I'm wrong.

    Perhaps Sam could throw some light?

    Thanks!

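Both the tap query and the Get lookup in this thread treat cell values as raw byte arrays, with nil standing in for a missing cell. A minimal sketch of decoding such a value, assuming the cells were written as UTF-8 strings; bytes->string is a hypothetical helper name, not part of HBase or maple:

```clojure
(defn bytes->string
  "Decodes a byte array as UTF-8. Returns an empty string for nil,
  which is how a missing cell is represented in the thread's code."
  [^bytes bs]
  (if (nil? bs)
    ""
    ;; Name the charset explicitly instead of relying on the
    ;; platform default that (String. bs) would use.
    (String. bs "UTF-8")))

;; Usage with a stand-in byte array (no HBase connection needed):
(println (bytes->string (.getBytes "pranav" "UTF-8")))
(println (pr-str (bytes->string nil)))
```

Naming the charset keeps the result the same across machines whose default encodings differ, which matters only if the stored values contain non-ASCII characters.
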
Discussion Overview
group: cascalog-user @
categories: clojure, hadoop
posted: Jul 10, '12 at 11:15a
active: Jul 16, '12 at 5:56a
posts: 4
users: 3
website: clojure.org
irc: #clojure