We don't have that functionality in the HBase loader yet, but technically one can get around this inconsistency by specifying a max timestamp on the HBase scan. As long as the number of versions HBase is configured to keep is larger than the number of updates to a single row during your scan, you'd get a consistent snapshot of the data. There is a JIRA open requesting we add timestamp support....
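
To make the version-count condition concrete, here is a rough toy model in Python. It is not HBase code: VersionedCell and read_snapshot are hypothetical names made up for this sketch, timestamps are plain integers, and the eviction rule (keep only the newest max_versions values) is a simplifying assumption about how version retention behaves. It only shows how the number of retained versions and the number of updates during a scan interact when the reader caps the timestamp.

```python
class VersionedCell:
    """Toy stand-in for one cell that keeps only the newest `max_versions` values."""

    def __init__(self, max_versions):
        self.max_versions = max_versions
        self.versions = []  # (timestamp, value) pairs, newest first

    def put(self, timestamp, value):
        # Record the write, then drop anything beyond the retention limit.
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        del self.versions[self.max_versions:]

    def get(self, max_timestamp):
        """Return the newest value written at or before max_timestamp, else None."""
        for timestamp, value in self.versions:
            if timestamp <= max_timestamp:
                return value
        return None


def read_snapshot(max_versions, updates_during_scan, scan_start=100):
    """Write one value before the scan starts, apply updates while the scan
    is running, then read with the timestamp capped at scan_start."""
    cell = VersionedCell(max_versions)
    cell.put(scan_start - 1, "before-scan")
    for i in range(updates_during_scan):
        cell.put(scan_start + 1 + i, f"update-{i}")
    return cell.get(max_timestamp=scan_start)


if __name__ == "__main__":
    # Fewer updates than retained versions: the pre-scan value is still there.
    print(read_snapshot(max_versions=3, updates_during_scan=2))
    # As many updates as retained versions: the pre-scan value has been evicted.
    print(read_snapshot(max_versions=3, updates_during_scan=3))
```

In this model the capped read keeps seeing the pre-scan value only while the cell retains more versions than the number of updates that arrive during the scan; once the updates catch up with the retention limit, the old value is gone and the capped read finds nothing.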
-----Original Message-----
From: "Mridul Muralidharan" <mridulm@yahoo-inc.com>
To: "user@pig.apache.org" <user@pig.apache.org>
Cc: "Bing Wei" <blackice.wei@gmail.com>
Sent: 4/21/2011 1:19 AM
Subject: Re: pig query on Cassandra
In general (on Hadoop-based systems), if the input is not immutable,
you can end up with issues during task re-execution, etc.
This happens not just for Cassandra but for HBase and other stores too,
wherever you modify data in place.
Regards,
Mridul
On Thursday 21 April 2011 04:29 AM, Bing Wei wrote:
Hi, All.
When I run a Pig query on Cassandra while Cassandra is being updated by
an application at the same time, what will happen? I may get inconsistent
results, right?