I know in advance the structure of a whole tsquery: it has already
been reduced and its lexemes have already been computed.
I'd like to write it directly in memory without having to go
through pushValue/makepol.

However, I'm not quite sure what the layout of a tsquery in memory
is, and I still haven't been able to find the macros that could
help me [1].

Before doing it the trial-and-error way, can somebody just give me an
example?
I'm not quite sure about my interpretation of the comments in the
documentation.

This is how I'd write
X:AB | YY:C | ZZZ:D

TSQuery
  vl_len_ (total # of bytes of the whole following structure:
           QueryItems * size + total lexeme length)
  size (# of QueryItems in the query)
  QueryItem
    type QI_OPR
    oper OP_OR
    left -> distance from QueryItem X:AB
  QueryItem
    type QI_OPR
    oper OP_OR
    left -> distance from QueryItem ZZZ:D
  QueryItem (X)
    type QI_VAL
    weight 1100
    valcrc ???
    length 1
    distance
  QueryItem (YY)
    type QI_VAL
    weight 0010
    valcrc ???
    length 2
    distance
  QueryItem (ZZZ)
    type QI_VAL
    weight 0001
    valcrc ???
    length 3
    distance
  X
  YY
  ZZZ

[1] The equivalent of POSTDATALEN and WEP_GETWEIGHT, macros to compute
the size of the various parts of a TSQuery, etc.
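
The size arithmetic asked about above can be sketched standalone, without the server headers. The names below are stand-ins: they mirror what the header src/include/tsearch/ts_type.h appears to declare (HDRSIZETQ, COMPUTESIZE, GETQUERY, GETOPERAND, and the QueryOperand/QueryOperator structs), but the field layout is an assumption and should be checked against the header of the server version in use. The sketch also assumes that each operand string is stored with a terminating NUL and that an operand's `distance` is its byte offset into the operand area.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins only: NOT the real PostgreSQL definitions. */
#define SK_VARHDRSZ 4                 /* assumed size of the varlena length word */

typedef struct {
    int8_t   type;                    /* would be QI_VAL */
    uint8_t  weight;                  /* bitmask of accepted weights */
    int32_t  valcrc;                  /* CRC of the lexeme, normally set by pushValue */
    uint32_t length:12, distance:20;  /* lexeme length, offset into operand area */
} SketchOperand;

typedef struct {
    int8_t   type;                    /* would be QI_OPR */
    int8_t   oper;                    /* e.g. OP_OR */
    uint32_t left;                    /* item offset to the left operand */
} SketchOperator;

typedef union {
    int8_t         type;
    SketchOperator qoperator;
    SketchOperand  qoperand;
} SketchItem;

#define SK_HDRSIZETQ (SK_VARHDRSZ + sizeof(int32_t))
#define SK_COMPUTESIZE(nitems, lenofoperand) \
    (SK_HDRSIZETQ + (nitems) * sizeof(SketchItem) + (lenofoperand))

/* n operands joined by binary operators need n - 1 operator items. */
size_t sketch_item_count(size_t noperands)
{
    return noperands == 0 ? 0 : 2 * noperands - 1;
}

/* Total bytes of the operand area; optionally records each operand's offset. */
size_t sketch_operand_bytes(const char *const *lexemes, size_t n, size_t *offsets)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (offsets != NULL)
            offsets[i] = used;
        used += strlen(lexemes[i]) + 1;   /* +1 for the assumed trailing NUL */
    }
    return used;
}

/* Total allocation size for a query made of n operands and n - 1 operators. */
size_t sketch_query_size(const char *const *lexemes, size_t n)
{
    return SK_COMPUTESIZE(sketch_item_count(n),
                          sketch_operand_bytes(lexemes, n, NULL));
}
```

For X, YY and ZZZ this gives 5 items and, under the NUL assumption, 9 operand bytes at offsets 0, 2 and 5. A query built this way would still need valcrc filled in by hand, since that is otherwise done inside pushValue.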

I couldn't find any place in the code where a TSQuery is built in "one
shot" instead of using pushValue.

Another thing I'd like to know: which of these is going to be preferred
during a scan,
'java:1A,2B '::tsvector @@ to_tsquery('java:A | java:B');
vs.
'java:1A,2B '::tsvector @@ to_tsquery('java:AB');
?
They look equivalent. Are they?

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

  • Teodor Sigaev at Feb 4, 2010 at 7:13 pm

    > Before doing it the trial-and-error way, can somebody just give me an
    > example?
    > I'm not quite sure about my interpretation of the comments in the
    > documentation.
    > TSQuery
    > [skipped]

    Right, valcrc is computed in pushValue.

    > I couldn't find any place in the code where a TSQuery is built in "one
    > shot" instead of using pushValue.

    That's because in all those places we may have to parse a rather complex
    structure. A simple OR-ed query could be hardcoded as
    pushValue('X');
    pushValue('YY');
    pushOperator(OP_OR);
    pushValue('ZZZ');
    pushOperator(OP_OR);

    You need to call pushValue/pushOperator in reverse Polish (postfix) order:
    operands first, then the operator that joins them.
    Note that you can also use another order:
    pushValue('X');
    pushValue('YY');
    pushValue('ZZZ');
    pushOperator(OP_OR);
    pushOperator(OP_OR);

    So the first example will produce ( X | YY ) | ZZZ, the second one
    X | ( YY | ZZZ ).
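
To see why the two call orders group differently, here is a minimal standalone sketch that replays a sequence of pushes as postfix tokens and prints the resulting grouping. It does not call into PostgreSQL; `sketch_postfix_to_infix` and its token convention ("|" and "&" are operators, everything else is an operand) are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

#define SK_MAXDEPTH 16
#define SK_BUFLEN   128

/*
 * Render a postfix token sequence as fully parenthesized infix.
 * Returns 0 on success, -1 on malformed input or if a buffer would overflow.
 */
int sketch_postfix_to_infix(const char *const *tokens, size_t ntokens,
                            char *out, size_t outlen)
{
    char   stack[SK_MAXDEPTH][SK_BUFLEN];
    size_t depth = 0;

    for (size_t i = 0; i < ntokens; i++) {
        const char *tok = tokens[i];

        if (strcmp(tok, "|") == 0 || strcmp(tok, "&") == 0) {
            char joined[SK_BUFLEN];
            int  n;

            if (depth < 2)
                return -1;            /* operator without two operands */
            n = snprintf(joined, sizeof joined, "( %s %s %s )",
                         stack[depth - 2], tok, stack[depth - 1]);
            if (n < 0 || (size_t) n >= sizeof joined)
                return -1;
            depth -= 2;               /* pop both operands ... */
            strcpy(stack[depth++], joined);   /* ... push the joined result */
        } else {
            if (depth >= SK_MAXDEPTH || strlen(tok) >= SK_BUFLEN)
                return -1;
            strcpy(stack[depth++], tok);
        }
    }

    if (depth != 1 || strlen(stack[0]) >= outlen)
        return -1;                    /* leftover operands or output too small */
    strcpy(out, stack[0]);
    return 0;
}
```

Feeding it X, YY, |, ZZZ, | yields ( ( X | YY ) | ZZZ ), while X, YY, ZZZ, |, | yields ( X | ( YY | ZZZ ) ), matching the two groupings described above.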



    > Another thing I'd like to know: which of these is going to be preferred
    > during a scan,
    > 'java:1A,2B '::tsvector @@ to_tsquery('java:A | java:B');
    > vs.
    > 'java:1A,2B '::tsvector @@ to_tsquery('java:AB');
    > ?
    > They look equivalent. Are they?

    Yes, but the second one should be more efficient.
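
The equivalence can be illustrated with a small standalone sketch of the weight test. It assumes the query operand stores its weights as a bitmask (A = 1000, B = 0100, C = 0010, D = 0001, as in the layout earlier in the thread), that each tsvector position carries a weight number from 0 (D) to 3 (A), and that a zero mask means "any weight". These are assumptions for illustration, and the function names are hypothetical, not the server's.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Query-side weight bits, as in the layout posted earlier in the thread. */
enum { SK_W_D = 1 << 0, SK_W_C = 1 << 1, SK_W_B = 1 << 2, SK_W_A = 1 << 3 };

/* Does a position of weight pos_weight (0 = D .. 3 = A) satisfy the mask? */
bool sketch_weight_matches(uint8_t query_mask, unsigned pos_weight)
{
    if (query_mask == 0)
        return true;                  /* assumed: no weight restriction */
    return (query_mask & (1u << pos_weight)) != 0;
}

/* Does any position of one lexeme satisfy the operand's weight mask? */
bool sketch_any_position_matches(uint8_t query_mask,
                                 const unsigned *pos_weights, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (sketch_weight_matches(query_mask, pos_weights[i]))
            return true;
    return false;
}
```

With positions {3, 2} (that is, 1A,2B), testing the mask `SK_W_A | SK_W_B` once gives the same answer as testing `SK_W_A` and `SK_W_B` separately and OR-ing the results. The combined form needs one operand instead of two operands plus an operator, which is presumably where the efficiency gain comes from.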
    --
    Teodor Sigaev E-mail: teodor@sigaev.ru
    WWW: http://www.sigaev.ru/
  • Ivan Sergio Borgonovo at Feb 5, 2010 at 2:13 am

    On Thu, 04 Feb 2010 22:13:02 +0300 Teodor Sigaev wrote:
    >> Before doing it the trial-and-error way, can somebody just give
    >> me an example?
    >> I'm not quite sure about my interpretation of the comments in
    >> the documentation.
    >> TSQuery
    >> [skipped]
    > Right, valcrc is computed in pushValue

    Anyway, the structure I posted is correct, isn't it?
    Are there macros equivalent to POSTDATALEN and WEP_GETWEIGHT, and a
    macro to get the memory size of a TSQuery?
    I think I've seen macros that could help me determine the size of
    a TSQuery, but I haven't noticed anything like POSTDATALEN, which
    would come in very handy for traversing a TSQuery.

    I was thinking of skipping pushValue and building the TSQuery directly in
    memory, since my queries have a very simple structure and are easy
    to reduce.
    Still, the memory size is not straightforward to know in advance.
    For OR queries it is easy, but for AND queries I'll have to loop over
    a tsvector, filter the weights according to a passed parameter and
    see how many times I have to duplicate a lexeme, once for each weight.

    e.g.

    tsvector_to_tsquery(
    'pizza:1A,2B risotto:2C,4D barolo:5A,6C', '&', 'ACD'
    );

    should be turned into

    pizza:A & risotto:C & risotto:D & barolo:A & barolo:C

    I noticed you actually loop over the tsvector in tsvectorout to
    allocate the memory for the string buffer, and I was wondering whether
    that is really worth doing in my case as well.
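
The counting pass described above can be sketched standalone. The sketch assumes the tsvector has already been summarized as one weight bitmask per lexeme (the OR of the weights of its positions, with A = 1000 down to D = 0001); the function names are hypothetical and nothing here touches the real tsvector format.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_W_D 0x1
#define SK_W_C 0x2
#define SK_W_B 0x4
#define SK_W_A 0x8

/* Number of weight bits set in the low four bits of mask. */
static unsigned sketch_popcount4(uint8_t mask)
{
    unsigned count = 0;
    for (mask &= 0x0F; mask != 0; mask >>= 1)
        count += mask & 1u;
    return count;
}

/*
 * One operand is emitted per (lexeme, weight) pair that survives the filter,
 * e.g. risotto with weights C and D under filter ACD gives risotto:C and
 * risotto:D.
 */
size_t sketch_count_operands(const uint8_t *lexeme_masks, size_t nlexemes,
                             uint8_t filter)
{
    size_t total = 0;
    for (size_t i = 0; i < nlexemes; i++)
        total += sketch_popcount4(lexeme_masks[i] & filter);
    return total;
}

/* n operands chained with a binary operator need n - 1 operator items. */
size_t sketch_count_items(size_t noperands)
{
    return noperands == 0 ? 0 : 2 * noperands - 1;
}
```

For the example above (pizza with A and B, risotto with C and D, barolo with A and C, filter ACD) the operand count is 1 + 2 + 2 = 5, so the query would need 9 items in total. The operand area size still requires adding each surviving lexeme's length once per emitted operand.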

    Any good recipe in Moscow? ;)

    thanks

    --
    Ivan Sergio Borgonovo
    http://www.webthatworks.it

Discussion Overview: group pgsql-hackers, category postgresql, posted Feb 4, 2010 at 6:24 pm, last active Feb 5, 2010 at 2:13 am, 3 posts, 2 users.
