Lucene,

JSON is the format used for all the configuration and property files in the RIA application we are developing. Is Lucene able to create a document from a given JSON file and index it? Is Lucene able to provide a JSON output response from a query made to an index? Does the Tika package provide this?

Local indexing and searching is needed on the local client so Solr is not a solution even though it does provide a search response in JSON format.

VL
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

  • Yonik Seeley at May 30, 2010 at 5:47 pm

    On Sun, May 30, 2010 at 1:33 PM, Visual Logic wrote:
    > JSON is the format used for all the configuration and property files in the RIA application we are developing. Is Lucene able to create a document from a given JSON file and index it? Is Lucene able to provide a JSON output response from a query made to an index? Does the Tika package provide this?

    No, and no.
    XML, JSON, etc, are out of scope for lucene, which is a core search library.
    Tika extracts text from documents like Word and PDF.

    > Local indexing and searching is needed on the local client so Solr is not a solution even though it does provide a search response in JSON format.

    Solr is embeddable as well, so you can directly index/search. But why
    can't you run a separate server?

    -Yonik
    http://www.lucidimagination.com

  • Visual Logic at May 30, 2010 at 6:27 pm
    Thanks for this helpful information.

    Solr is embeddable, but does that not just mean that SolrJ provides the ability to call Solr running on some server? As far as I know, embedding Solr is not really embedding it at all, only using it remotely, where an internet connection needs to be maintained in order to do so. If Solr were truly embeddable, how large a memory footprint would that add to my app?

    For some of my use cases using Solr on a remote server would work fine. For other cases it will not be quick enough, plus I want the user and other tool aspects of the local client to be able to utilize Lucene search for a lot more than only accessing resources on a remote server.

    Sorry to hear there is no out of the box support for JSON input/output in Lucene but that can be overcome with a bit of local code.
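
The "bit of local code" mentioned above might look like the following minimal sketch. It assumes the JSON file has already been parsed into nested `Map` and `List` values by a JSON library of your choice (Java has no JSON parser in its standard library), and it flattens them into dotted name/value pairs that could each become one field of a Lucene document. The class name `JsonFieldFlattener`, the dotted naming scheme, and the sample data are hypothetical; nothing here is part of Lucene itself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Flattens an already-parsed JSON object into dotted field names and string values. */
final class JsonFieldFlattener {

    /** Returns pairs such as "style.color" -> "red", in encounter order. */
    static Map<String, String> flatten(Map<String, ?> json) {
        Map<String, String> fields = new LinkedHashMap<>();
        walk("", json, fields);
        return fields;
    }

    private static void walk(String prefix, Object node, Map<String, String> out) {
        if (node instanceof Map) {
            // Nested object: extend the field name with each key.
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = String.valueOf(e.getKey());
                walk(prefix.isEmpty() ? key : prefix + "." + key, e.getValue(), out);
            }
        } else if (node instanceof List) {
            // Array: append the element index to keep names unique.
            List<?> list = (List<?>) node;
            for (int i = 0; i < list.size(); i++) {
                walk(prefix + "[" + i + "]", list.get(i), out);
            }
        } else if (node != null) {
            // Scalar: store its string form; JSON nulls are skipped.
            out.put(prefix, String.valueOf(node));
        }
    }

    public static void main(String[] args) {
        // Hypothetical configuration, standing in for a parsed JSON file.
        Map<String, Object> style = new LinkedHashMap<>();
        style.put("color", "red");
        style.put("width", 2);
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("renderer", "circle");
        config.put("style", style);
        config.put("tags", List.of("default", "v2"));

        // Each printed pair is a candidate field for one Lucene document.
        flatten(config).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Going the other way, search results could be turned back into JSON by writing each hit's stored fields out with the same JSON library.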

    VL

  • Yonik Seeley at May 30, 2010 at 6:39 pm

    On Sun, May 30, 2010 at 2:27 PM, Visual Logic wrote:
    > Solr is embeddable but does that not just mean that SolrJ only provides the ability to call Solr running on some server?

    Nope - embeddable as in running in the same JVM as your application.

    > For some of my use cases using Solr on a remote server would work fine. For other cases it will not be quick enough,

    Running as a separate server can be on the same host and be very
    quick. Was it too slow when you tried it?
    It's a common misconception that HTTP is slow... it's really just a
    TCP socket (which can be reused with persistent connections) with some
    standardized headers. Solr also has a binary protocol that works just
    fine over HTTP, so it's really not more overhead than doing something
    like talking to a database.

    But the right solution probably depends on the details of your
    specific usecases - if you elaborate on them, people may be able to
    provide more specific recommendations.

    -Yonik
    http://www.lucidimagination.com

  • Visual Logic at May 30, 2010 at 8:11 pm
    This is interesting.

    I'm coming into this with only the knowledge I gained from the first edition of the Lucene in Action book, which I read a few years ago. So when I think of embedded, I think that Solr would be integrated into my app like any API would be. Instead you say Solr is a separate process that could work adjunct to my app. In other words, in my mind it would be an extension module or even a plugin for my app. Would it be possible to start/stop it from my app as well? I hope I'm seeing the picture clearly now.

    What I find most useful in Lucene is its searching capability. Since I have pre-structured data, namely the different configurations that the different tools and tiers of my app use, the analysis stage is not that involved. Once data is indexed, likely in several indexes/cores, I would like to search for specific documents and set them to be the current context for different tools. There are a number of different contexts, and index documents specific to each of them. I guess what I am after is a searchable configuration engine for an RIA.

    Some of my use cases can tolerate a bit of latency but for others it would not work so well. Here are two of the more dynamic use cases (slightly simplified to make them clear) where searching has to be blazingly fast:

    1 - I have a draw tool that places different shape types while dragging across a canvas (each drag forms a string of shapes). I track the types of shapes the current drag has placed so far and then want to recommend other, similar shape strings (previously drawn strings are indexed as they are made) in order to help the user complete the current draw string. The query would be created on the fly based on what shapes have been drawn so far and then used to search the index; the search would also need to be refreshed every time the user moves the mouse. This use case is similar to type-ahead phrase completion, but completely mouse driven, with searches fired in the background and immediate graphics on the canvas. Will Solr running as a separate local process be able to keep up with the user-driven drawing?

    2 - My app features a Painter with Renderers for different shapes. The many possible layout and style details for these shapes and their relation to each other are defined as configurations per Renderer. The index would contain multiple versions of a configuration for a given Renderer. A search would then be used to bring up a set of configurations, with one configuration for each Renderer. This set would then be the active rules (context) for rendering each type of shape. Then, as the user creates shapes by drawing different shape types, they are rendered by their specific Renderer. The given Renderer in turn looks up all the layout and styling rules it should have from the active set of configurations. The question is whether the lookups can be on-the-fly searches against an index, or whether it is better to pull the set of configurations from the index and transfer them to hashmaps that the Renderers would query instead of querying the index directly. Would Solr be able to keep up with all the real-time lookups the Renderers would be making?
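
The second option in use case 2, pulling the configuration set out of the index once and serving the Renderers from hashmaps, can be sketched roughly as follows. This is only an illustration under assumptions: the `indexSearch` function stands in for whatever Lucene or Solr query actually returns the configurations, and the class name `ActiveConfigCache`, the renderer names, and the rule keys are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Holds the active configuration set in memory, keyed by renderer name. */
final class ActiveConfigCache {

    // Stand-in for the real index query: context query -> (renderer -> rules).
    private final Function<String, Map<String, Map<String, String>>> indexSearch;

    // Replaced as a whole on each context switch, so readers never see a partial set.
    private volatile Map<String, Map<String, String>> active = new HashMap<>();

    ActiveConfigCache(Function<String, Map<String, Map<String, String>>> indexSearch) {
        this.indexSearch = indexSearch;
    }

    /** Runs one search and makes its result the active set of rules. */
    void activate(String contextQuery) {
        active = new HashMap<>(indexSearch.apply(contextQuery));
    }

    /** In-memory lookup used while rendering; it never touches the index. */
    String rule(String renderer, String key, String fallback) {
        Map<String, String> rules = active.get(renderer);
        return rules == null ? fallback : rules.getOrDefault(key, fallback);
    }

    public static void main(String[] args) {
        // Hypothetical search that always returns the same small configuration set.
        ActiveConfigCache cache = new ActiveConfigCache(query ->
                Map.of("circle", Map.of("fill", "blue", "stroke", "1"),
                       "square", Map.of("fill", "green")));

        cache.activate("context:default");
        System.out.println(cache.rule("circle", "fill", "none"));
        System.out.println(cache.rule("triangle", "fill", "none"));
    }
}
```

With this shape, the index is consulted only when the context changes, and every per-shape lookup during drawing is a plain map access.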

    VL

  • Otis Gospodnetic at Jun 1, 2010 at 2:36 am
    VL,

    Solr (not Lucene, but you can embed Solr) has JsonUpdateRequestHandler, which lets you send docs to Solr for indexing in JSON (instead of the usual XML):
    http://search-lucene.com/c/Solr:/src/java/org/apache/solr/handler/JsonUpdateRequestHandler.java


    And you can get Solr to respond with JSON, as you pointed out:
    http://wiki.apache.org/solr/SolJSON
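
As a rough illustration of the JSON response side, the sketch below only builds a select URL that asks Solr for JSON via the `wt=json` parameter described on the SolJSON wiki page. The base URL, the `/select` path, the `renderer` field name, and the helper class `SolrJsonQueryUrl` are assumptions for the example and would need to match the actual Solr installation and schema.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds a Solr query URL that requests a JSON-formatted response. */
final class SolrJsonQueryUrl {

    /** baseUrl is assumed to look like "http://localhost:8983/solr" (hypothetical host and port). */
    static String build(String baseUrl, String query, int rows) {
        // URL-encode the query so characters such as ':' and spaces are safe.
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + q + "&rows=" + rows + "&wt=json";
    }

    public static void main(String[] args) {
        // "renderer" is a hypothetical field name.
        System.out.println(build("http://localhost:8983/solr", "renderer:circle", 10));
    }
}
```

The resulting URL can be fetched with any HTTP client; reusing one client instance keeps the connection persistent, in line with the earlier point about HTTP overhead.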

    Otis
    ----
    Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
    Lucene ecosystem search :: http://search-lucene.com/


    ----- Original Message ----
    From: Visual Logic <visual.logic@gmail.com>
    To: java-user@lucene.apache.org
    Sent: Sun, May 30, 2010 1:33:19 PM
    Subject: Using JSON for index input and search output


Discussion Overview
group: java-user
categories: lucene
posted: May 30, '10 at 5:35p
active: Jun 1, '10 at 2:36a
posts: 6
users: 3
website: lucene.apache.org
