It is in fact 1 byte per field (that stores norms), per document. So
if you have 7 fields in the doc that store norms, that uses up 7 bytes.
And, because the storage is non-sparse, even documents which don't
have a given field X will still use up 1 byte, if field X stores norms.
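A rough sketch of that arithmetic, in plain Java with no Lucene
dependency. The class name, method name, and document count are made up
for illustration; only the "1 byte per norms field, per document" rule
comes from the explanation above.

```java
class NormsMemoryEstimate {
    // Norms are stored non-sparsely: one byte per document for every field
    // that stores norms, whether or not a given document has that field.
    static long normsBytes(long numDocs, int fieldsWithNorms) {
        return numDocs * fieldsWithNorms;
    }

    public static void main(String[] args) {
        long numDocs = 10000000L; // hypothetical index size
        int fieldsWithNorms = 7;  // the 7 fields from the example above
        System.out.println(normsBytes(numDocs, fieldsWithNorms) + " bytes");
    }
}
```

With these assumed numbers, the norms alone take about 70 MB once
loaded.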
Also, beware when disabling norms: you must disable norms for every
single occurrence of that field in any document in your index. If
even one document exists that did not disable norms for that field
then that will "spread" to all other docs, during segment merging.
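A toy model of that rule in plain Java. This is not Lucene code and the
names are made up; it only restates the behavior described above, where
a single occurrence of the field with norms enabled is enough to bring
norms back for every document after a merge.

```java
class OmitNormsMergeModel {
    // A merged segment can leave norms out for a field only if every
    // occurrence of that field being merged had norms disabled.
    static boolean omitNormsAfterMerge(boolean[] omitNormsPerOccurrence) {
        for (boolean omitted : omitNormsPerOccurrence) {
            if (!omitted) {
                return false; // one occurrence with norms "spreads" to all docs
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Three occurrences of field X; the second one kept its norms.
        boolean[] occurrences = { true, false, true };
        System.out.println(omitNormsAfterMerge(occurrences));
    }
}
```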
Mike
Otis Gospodnetic wrote:
Is that really 1 byte for each document? Not 1 byte for each field
of each document?
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
From: Doron Cohen <cdoronc@gmail.com>
To: java-user@lucene.apache.org
Sent: Monday, August 18, 2008 12:11:28 AM
Subject: Re: Index of Lucene
Norms information comes mainly from the lengths of documents (actually,
the length of each field within a document), which allows search-time
scoring to take document length into account. In practice, the norms
stored in the index may also include other information, such as
index-time boosts for a document or for a field. A single byte is
stored for each field, so the actual value is compressed. At search
time, norms are loaded into memory, and so consume 1 byte for each
document.
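As a rough sketch of what goes into that value: Lucene's
DefaultSimilarity computes the length norm as 1/sqrt(number of terms in
the field), and the index-time boosts are multiplied in before the
result is compressed into one byte. The plain-Java code below has no
Lucene dependency, its class and method names are made up, and it leaves
out the lossy one-byte encoding step.

```java
class NormSketch {
    // Length norm in the style of DefaultSimilarity: shorter fields get
    // a larger value.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Uncompressed norm: index-time boosts multiplied by the length norm.
    static float norm(float docBoost, float fieldBoost, int numTerms) {
        return docBoost * fieldBoost * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        System.out.println(norm(1.0f, 1.0f, 4));   // 4-term field, no boosts
        System.out.println(norm(1.0f, 2.0f, 100)); // 100-term field, field boost 2
    }
}
```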
It is possible to disable norms for a field while indexing. This is
explained better in the javadoc for Similarity, and here:
http://lucene.apache.org/java/2_3_2/scoring.html
Doron
On Mon, Aug 18, 2008 at 5:59 AM, blazingwolf7 wrote:
Hi,
I am currently using Lucene for indexing. After I index a file, I open
the index in Luke to check it, and there is one part that I am curious
about. In Luke, under the Document tab, I randomly select a document
and display it. At the bottom there are 4 columns: Field, ITSVopLBC,
Norm and String Value.
I am wondering, what is Norm for? Where is it created at indexing
time, and which method calculates it?
Could anyone advise me on this? Thanks for the help.
--
View this message in context:
http://www.nabble.com/Index-of-Lucene-tp19025490p19025490.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
---------------------------------------------------------------------