On Sun, Sep 7, 2008 at 2:41 AM, mark harwood wrote:

for example joins are not possible using SOLR).
It's largely *because* Lucene doesn't do joins that it can be made to scale
out. I've replaced two large-scale database systems this year with
distributed Lucene solutions because this scale-out architecture provided
significantly better performance. These were "semi-structured" systems too.
Lucene's comparatively simplistic data model/query model is both a weakness
and a strength in this regard.
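
To make the scale-out point concrete, here is a minimal sketch in plain Python (not Lucene or SOLR code). It assumes every document lives whole on exactly one shard and that a query never needs a join, so each shard can answer alone and a coordinator only merges the per-shard top-k lists. The names `Shard`, `route` and `distributed_search` are made up for illustration, and raw term frequency stands in for a real scoring function.

```python
import heapq
from collections import Counter


class Shard:
    """One independent index partition: doc_id -> term frequencies."""

    def __init__(self):
        self.docs = {}

    def add(self, doc_id, text):
        self.docs[doc_id] = Counter(text.lower().split())

    def search(self, term, k):
        # Score is plain term frequency (an assumption, not Lucene scoring).
        hits = [(tf[term], doc_id) for doc_id, tf in self.docs.items() if tf[term] > 0]
        return heapq.nlargest(k, hits)


def route(doc_id, n_shards):
    """Deterministically pick the shard that owns a document."""
    return sum(doc_id.encode("utf-8")) % n_shards


def distributed_search(shards, term, k):
    """Ask every shard for its local top-k, then merge into a global top-k."""
    partials = [shard.search(term.lower(), k) for shard in shards]
    return heapq.nlargest(k, (hit for part in partials for hit in part))


if __name__ == "__main__":
    shards = [Shard() for _ in range(3)]
    corpus = {
        "a": "lucene lucene scale",
        "b": "lucene join",
        "c": "database join",
        "d": "lucene lucene lucene",
    }
    for doc_id, text in corpus.items():
        shards[route(doc_id, len(shards))].add(doc_id, text)
    print(distributed_search(shards, "lucene", 2))
```

Because no shard ever needs data held by another shard, adding machines only adds more independent `search` calls plus a cheap merge. A join across documents on different shards would break that property.
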
Hey, maybe the right way to go for a truly scalable and high performance
semi-structured database is to marry HBase (Big-table like data storage)
with SOLR/Lucene. I concur with you in the sense that simplistic data models
coupled with high performance are the killer.

Let me quote this from the original Bigtable paper from Google:

"Bigtable does not support a full relational data model; instead, it
provides clients with a simple data model that supports dynamic control over
data layout and format, and allows clients to reason about the locality
properties of the data represented in the underlying storage. Data is
indexed using row and column names that can be arbitrary strings. Bigtable
also treats data as uninterpreted strings, although clients often serialize
various forms of structured and semi-structured data into these strings.
Clients can control the locality of their data through careful choices in
their schemas. Finally, Bigtable schema parameters let clients dynamically
control whether to serve data out of memory or from disk."
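
As a rough illustration of the data model described in that quote, here is a small in-memory sketch in Python. It is not HBase or Bigtable code: `TinyTable` and its methods are hypothetical names, and the sketch covers only the points quoted above (row and column names as arbitrary strings, values as uninterpreted bytes, locality through row-key choice). Everything else a real system has is left out.

```python
import bisect
import json


class TinyTable:
    """Toy table: (row key, column name) -> uninterpreted bytes, rows kept sorted."""

    def __init__(self):
        self._row_keys = []  # sorted list of row keys
        self._rows = {}      # row key -> {column name: bytes}

    def put(self, row, column, value):
        if not isinstance(value, bytes):
            raise TypeError("values are stored as uninterpreted bytes")
        if row not in self._rows:
            bisect.insort(self._row_keys, row)
            self._rows[row] = {}
        self._rows[row][column] = value

    def get(self, row, column):
        return self._rows.get(row, {}).get(column)

    def scan(self, start, stop):
        """Yield rows with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._row_keys, start)
        hi = bisect.bisect_left(self._row_keys, stop)
        for key in self._row_keys[lo:hi]:
            yield key, dict(self._rows[key])


if __name__ == "__main__":
    table = TinyTable()
    # The client, not the table, decides how to serialize structured data.
    table.put("com.example/a", "meta:info", json.dumps({"lang": "en"}).encode("utf-8"))
    table.put("org.other/x", "meta:info", json.dumps({"lang": "de"}).encode("utf-8"))
    table.put("com.example/b", "meta:info", json.dumps({"lang": "fr"}).encode("utf-8"))
    # Keys sharing a prefix sort next to each other, so one range scan reads them all.
    for key, columns in table.scan("com.example/", "com.example0"):
        print(key, json.loads(columns["meta:info"]))
```

The row keys here are invented examples. The point is only that picking keys with a shared prefix keeps related rows adjacent, which is the kind of locality control the paper refers to. A Lucene/SOLR index over the same rows would then handle the searching that this model does not.
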
