Hi Tyler,
That was the scheme I was describing in the original email.
Unfortunately, I can have more than one value per column, so I actually
have to use super columns. This way I can write more than one row key
for any given indexed value. I'm concerned that this may not scale well
(at least on version 0.6). However, after looking at the limitations
page:

http://wiki.apache.org/cassandra/CassandraLimitations

It appears that the "row must fit in memory" limitation has been
removed. I'll move back to this scheme for my querying.
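
For readers following along, here is a rough in-memory sketch of the super column idea above: each indexed value is a super column name, and its subcolumns hold every row key that has that value. This is only a toy model, not Cassandra client code; the class name SuperIndexRow and its methods are made up for illustration.

```python
import bisect
from collections import defaultdict


class SuperIndexRow:
    """Toy model of one index row built from super columns (hypothetical)."""

    def __init__(self):
        self._names = []                     # super column names, kept sorted
        self._subcolumns = defaultdict(set)  # super column name -> row keys

    def insert(self, indexed_value, row_key):
        # A new indexed value becomes a new super column; an existing one
        # simply gains another subcolumn holding the extra row key.
        if indexed_value not in self._subcolumns:
            bisect.insort(self._names, indexed_value)
        self._subcolumns[indexed_value].add(row_key)

    def row_keys_between(self, start, finish):
        # Inclusive range over the sorted super column names.
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, finish)
        return [key
                for name in self._names[lo:hi]
                for key in sorted(self._subcolumns[name])]
```

Two rows sharing the indexed value 10 would both come back from row_keys_between(10, 10), which is the case a plain column per value cannot represent.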

todd
SENIOR SOFTWARE ENGINEER

todd nine| spidertracks ltd | 117a the square
po box 5203 | palmerston north 4441 | new zealand
P: +64 6 353 3395 | M: +64 210 255 8576
E: todd@spidertracks.co.nz W: www.spidertracks.com




On Tue, 2010-10-12 at 23:01 -0500, Tyler Hobbs wrote:

I'm not completely sure I follow your scheme, but it's fairly easy to
support GT, LT, etc. with your own index.

Use a row for your index where the column names are the data values
you want to index. If you set the comparator type (in your example,
this would be LongType), you can perform a LT or GT query just by
getting a slice of the index columns. Store the original data row keys
as the column values, and you're there.

- Tyler
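
A minimal sketch of the index row described above, modelled in memory rather than against a real cluster. The sorted list stands in for a comparator such as LongType, and get_slice stands in for a column slice; IndexRow, greater_than and less_than are hypothetical names, not part of any Cassandra client API.

```python
import bisect


class IndexRow:
    """Toy model of one index row with numerically sorted column names."""

    def __init__(self):
        self._names = []   # column names (the indexed values), kept sorted
        self._values = {}  # column name -> original data row key

    def insert(self, indexed_value, row_key):
        if indexed_value not in self._values:
            bisect.insort(self._names, indexed_value)
        # Writing the same column name again overwrites the old row key.
        self._values[indexed_value] = row_key

    def get_slice(self, start=None, finish=None):
        # Inclusive slice over the sorted column names; None means open-ended.
        lo = 0 if start is None else bisect.bisect_left(self._names, start)
        hi = (len(self._names) if finish is None
              else bisect.bisect_right(self._names, finish))
        return [(name, self._values[name]) for name in self._names[lo:hi]]


def greater_than(row, bound):
    # GT: slice from the bound upward, then drop the bound itself.
    return [key for name, key in row.get_slice(start=bound) if name != bound]


def less_than(row, bound):
    # LT: slice up to the bound, then drop the bound itself.
    return [key for name, key in row.get_slice(finish=bound) if name != bound]
```

With the values 10, 25 and 40 indexed, greater_than(index, 10) returns the row keys stored under 25 and 40. Note that each column name holds a single row key here, so two rows with the same indexed value would overwrite each other.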


On Tue, Oct 12, 2010 at 9:33 PM, Todd Nine wrote:

Thanks Jonathan,

A follow-up question. Will it be possible to migrate existing indexes
in a future release as part of the upgrade path to support LT and LTE
ops without equal? In the meantime, in my DataNucleus plugin I was
thinking I could do something like the following. It's not efficient
for space, but it will work and should hopefully be relatively
efficient for querying.


LT and LTE ops can be thought of as the distance from the MAX value of
any given data type. For instance, if I had a data type "ubershort",
which goes from -200 to 200, I could say that an expression of <= 0 is
really >= (distance) 200 from the maximum. I could use this equation to
calculate the "distance" and persist it in a column named
"<colName>_reverse", which would effectively give me a reverse index.


Then the value would simply be

storedValue = MAXVALUE-userVal.

From there, whenever the user issues a < <= query, I would simply
translate the value via the above equation and < becomes > and <=
becomes >=. Aside from the space issue of storage, do you see any other
problems with this approach for a 0.7 compatible version of my plugin?
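
The translation above can be sketched as follows, using the made-up "ubershort" range from the example. MAX_VALUE, MIN_VALUE, to_reverse and translate are illustrative names only, not part of the plugin.

```python
MAX_VALUE = 200   # upper bound of the example "ubershort" type
MIN_VALUE = -200  # lower bound of the example "ubershort" type

# Operators flip because the stored value shrinks as the user value grows.
_FLIPPED = {"<": ">", "<=": ">="}


def to_reverse(user_val):
    """storedValue = MAXVALUE - userVal, for the <colName>_reverse column."""
    if not MIN_VALUE <= user_val <= MAX_VALUE:
        raise ValueError("value outside the ubershort range")
    return MAX_VALUE - user_val


def translate(op, user_val):
    """Rewrite a < or <= predicate as a > or >= predicate on the reverse column."""
    return _FLIPPED[op], to_reverse(user_val)
```

For example, translate("<=", 0) gives (">=", 200), matching the case described above. The stored values run from 0 to 400 here, so the reverse column needs a type wide enough to hold MAX_VALUE - MIN_VALUE.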

Thanks,
Todd






On Wed, 2010-10-13 at 14:00 +1300, Todd Nine wrote:

Fair enough!


Thanks Jonathan.


todd






On Tue, 2010-10-12 at 18:47 -0500, Jonathan Ellis wrote:

On Tue, Oct 12, 2010 at 6:34 PM, Todd Nine wrote:

> Currently there is only indexing for LT and LTE expressions when an
> EQ operator is present. Will it be possible to use the LT and LTE ops
> without an EQ by the 0.7.0 release?

No.

> If not, which of the following would be more efficient?
>
> 1. Creating a dummy column of 1 byte that is indexed.

This is basically the same as doing a full range scan, only less
efficient.

> 2. Use my previous indexing scheme of 2 Super CF for longs and
> strings to get my < <= operations. Where I use the following scheme.

I'm not sure I follow, but if it's better than doing a full range scan
then it is better than 1. :)

Discussion Overview
group: dev @
categories: cassandra
posted: Oct 12, '10 at 11:34p
active: Oct 13, '10 at 6:45p
posts: 6
users: 3
website: cassandra.apache.org
irc: #cassandra