Dear Damian, dear P::RD Lovers,

I love P::RD, but I would be happy if the next version had additional
modifiers for alternations. Consider the following problem:

I have an alternation in which some subrules are mandatory and others are
optional (see the subrule statement).

$grammar = << 'EOGRAMMAR';
contact : statement(s)
statement : email | name | phone | fax
email : 'email' '=' value # mandatory
name : 'name' '=' value # mandatory
phone : 'phone' '=' value # optional
fax : 'fax' '=' value # optional
value : /".*?"/
EOGRAMMAR

For example, email and name are required, while phone and fax are optional.
At the moment I have to handle this in action code that works with sets, which
means I have an additional layer of grammar in the action code.

(here it's just a simple example, in reality I have a complex config file
for managing network devices, with for example required parameters like
snmp readcommunity or IP address and optional parameters like location
or description).

It would be very, very helpful to get rid of the extra
'grammar' action code if we could write the alternations
perhaps in the following manner:

statement : email! | name! | phone | fax

with the semantics that email and name are required and phone and fax
are optional. I don't know how difficult it would be for Damian to
handle an additional layer of state inside the parser, or whether
this is impossible in principle, but if it is possible it would
help clarity, because the grammar stays in the grammar part and
not in the action part.

And the following would also be sometimes useful:

exor : Mr ^ Mrs ^ hybrid

At the moment, similar things are also best handled with sets in the
action code.

Best Regards
Charly
--
Karl Gaissmaier Computing Center,University of Ulm,Germany
Email:karl.gaissmaier@rz.uni-ulm.de Network Administration
Tel.: ++49 731 50-22499


  • Karl Gaissmaier at Apr 15, 2002 at 11:21 am
    Hi Marco, (a Cc: goes to the list)

    Marco Baringer writes:
    Karl Gaissmaier <karl.gaissmaier@rz.uni-ulm.de> writes:
    statement : email! | name! | phone | fax

    with the semantics that email and name are required and phone and fax
    are optional. I don't know how difficult it would be for Damian to
    handle an additional layer of state inside the parser, or whether
    this is impossible in principle, but if it is possible it would
    help clarity, because the grammar stays in the grammar part and
    not in the action part.
    but what you have there is an alternation between four things, but two
    of them are required, i'm not really sure what you mean.

    do you mean that a statement consists of two to four productions one
    of which must be 'email' one of which must be 'name' and there can be
    at most one 'phone' and at most one 'fax', and in any order? As far as
    "normal" grammars go what you want is a production alternating all the
    possible sequences:
    Yes, that's what I want. I know that the solution could be to permute
    the pieces, but this blows up the grammar with n!.

    I also know that the problem arises because the pieces are allowed
    to be unordered, but this is the normal way for config files; you
    do not want to restrict the user to a particular order.

    It makes no sense to the user that:

    person {
    email = "foo@bar.baz"
    name = "foo"
    phone = "12345"
    fax = "98765"
    }

    is allowed only in one specific order.
    statement : statement_piece { statement_action(); }

    statement_piece : email name |
    name email |
    email name phone |
    ....
    name email fax |
    ....
    fax phone name email

    it should not be difficult to auto generate this (obviously requiring
    an order would greatly reduce the number of permutations). i could
    imagine something as simple as:

    $grammar = "statement : " . gen_prod(req => [ qw( email name ) ],
    opt => [ qw( phone fax ) ]) .
    " { statement_action(); }\n" .
    ...
    ;
    Sure, it would be possible, but it makes the grammar difficult to read,
    and the number of productions grows with n!.
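
For a sense of scale, here is a minimal sketch of what the gen_prod helper suggested above might look like. Only the name gen_prod and its req/opt arguments come from the thread; the implementation and the helper names permutations and subsets are assumptions made for illustration. It emits every ordering of the required subrules combined with every subset of the optional ones, longest alternatives first, on the assumption that the parser takes the first production that matches and could otherwise stop at a shorter prefix.

```perl
use strict;
use warnings;

# All orderings of a list (simple recursive permutation).
sub permutations {
    my @items = @_;
    return ([]) unless @items;
    my @result;
    for my $i (0 .. $#items) {
        my @rest = @items;
        my ($picked) = splice(@rest, $i, 1);
        push @result, [ $picked, @$_ ] for permutations(@rest);
    }
    return @result;
}

# All subsets of a list, via a bitmask over its indices.
sub subsets {
    my @items = @_;
    my $last_mask = (1 << scalar @items) - 1;
    my @result;
    for my $mask (0 .. $last_mask) {
        push @result,
            [ map { $items[$_] } grep { $mask & (1 << $_) } 0 .. $#items ];
    }
    return @result;
}

# Hypothetical generator: returns the right-hand side of a production.
sub gen_prod {
    my %arg = @_;
    my @alternatives;
    for my $subset (subsets(@{ $arg{opt} })) {
        push @alternatives, permutations(@{ $arg{req} }, @$subset);
    }
    # Longest alternatives first, so a shorter prefix cannot win early.
    @alternatives = sort { @$b <=> @$a } @alternatives;
    return join ' | ', map { join ' ', @$_ } @alternatives;
}

my $production = gen_prod(req => [qw(email name)], opt => [qw(phone fax)]);
my @alternatives = split / \| /, $production;
print scalar(@alternatives), " alternatives\n";
```

For two required and two optional subrules this already yields 2! + 2*3! + 4! = 38 alternatives, which is the growth being objected to here.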
    however, what you're talking about is a semantic validation of input,
    something which usually happens after parsing and hence isn't dealt
    with by grammars.
    Hmm, I'm not sure that this is semantics. The semantics is that the
    value of email could be used in a To: field, but the required pieces
    are still syntax, as we can see from the fact that this could be done by
    permuting the pieces.
    And the following would also be sometimes useful:

    exor : Mr ^ Mrs ^ hybrid
    isn't this just:

    exor : Mr | Mrs | hybrid

    this requires that the exor production be one and only one of Mr, Mrs
    and hybrid. what am i missing?
    perhaps not fully described (or understood?) by me:

    chooselist: salutation(s)
    salutation: Mr ^ Mrs

    That means Mr and Mrs may each appear only once; every duplicate
    is automatically rejected.

    regards
    Charly

    --
    Karl Gaissmaier Computing Center,University of Ulm,Germany
    Email:karl.gaissmaier@rz.uni-ulm.de Network Administration
    Tel.: ++49 731 50-22499
  • Orton, Yves at Apr 15, 2002 at 11:49 am

    I've an alternation where some subrules are mandatory but others are
    optional (Subrule statement).
    This doesn't make much sense to me. An alternation is a rule that can be
    matched by any _one_ of a selection of rules. So that means you can't say
    that some of those alternations are _mandatory_ because that would mean you
    are implying that more than one of the options must be matched, and since
    that is impossible you have a problem... :-)
    $grammar = << 'EOGRAMMAR';
    contact : statement(s)
    statement : email | name | phone | fax
    email : 'email' '=' value # mandatory
    name : 'name' '=' value # mandatory
    phone : 'phone' '=' value # optional
    fax : 'fax' '=' value # optional
    value : /".*?"/
    EOGRAMMAR

    for example email and name will be required and phone or fax are optional.
    In the moment I have to deal with action code working with sets. That means
    I have an additional layer of grammar in the action code.
    Hmm, if you mean the "additional layer of grammar" is the logic that ensures
    that contact is not successful unless it contains both an email and a name
    then I can see what you mean, but you have to realize that you _can_ do this
    with a normal context-free grammar, but you probably don't want to.

    contact : email name fax(?) phone(?)
    | email name phone(?) fax(?)
    | email phone(?) name fax(?)
    | phone(?) email name fax(?)
    | email fax(?) name phone(?)
    | email fax(?) phone(?) name
    | email phone(?) fax(?) name
    | phone(?) email fax(?) name
    | fax(?) email name phone(?)
    | fax(?) email phone(?) name
    | fax(?) phone(?) email name
    | phone(?) fax(?) email name
    | name email fax(?) phone(?)
    | name email phone(?) fax(?)
    | name phone(?) email fax(?)
    | phone(?) name email fax(?)
    | name fax(?) email phone(?)
    | name fax(?) phone(?) email
    | name phone(?) fax(?) email
    | phone(?) name fax(?) email
    | fax(?) name email phone(?)
    | fax(?) name phone(?) email
    | fax(?) phone(?) name email
    | phone(?) fax(?) name email
    Since it's a permutation (Algorithm::FastPermute rocks!) this probably isn't
    the optimal way to proceed.

    To me the core issue that you face is that the general case of what you want
    is not possible using a context-free grammar without explicitly listing
    every possible permutation. So since it's context-sensitive behaviour you
    want, and no parser can handle such behaviour directly, you need to consider
    that what you want to do should _not_ be part of the grammar anyway.
    (Consider that this is a relatively common state of affairs. It's not
    possible, for instance, using a context-free grammar to ensure that variables
    are declared before they are used.)

    What I would do is create a small utility function that can handle your
    needs. Anyway, here's what I did to your example...

    use Parse::RecDescent;
    #use diagnostics;

    # this allows check_mandatory to be simplified
    $::RD_AUTOACTION = q { $item[1] };
    $grammar = << 'EOGRAMMAR';

    {
    sub check_mandatory {
    my ($mand_array,$item_array)=@_;

    # count how often each name has been seen, so repeats do not overwrite keys
    my %counts;
    # Transform list of mandatory items into hash
    my %mand=map {($_=>1)} @$mand_array;
    # We will convert the LOL that we receive into a hash
    my %ret;
    foreach my $elem (@$item_array) {
    my $name=$elem->[0];
    delete $mand{$name};
    # repeated names get a numeric suffix: email, email1, email2, ...
    my $key=$counts{$name} ? $name.$counts{$name} : $name;
    $counts{$name}++;
    $ret{ $key }=$elem->[1];
    }
    return !(scalar keys %mand) ? \%ret : undef;
    }
    }

    contact : statement(s) {
    $return=check_mandatory([
    'email','name' ],$item[1]);
    $return
    }
    statement : mandatory
    | optional
    mandatory : email | name
    optional : phone | fax
    email : 'email' '=' value { [$item[0],$item{value}]}# mandatory
    name : 'name' '=' value { [$item[0],$item{value}]}# mandatory
    phone : 'phone' '=' value { [$item[0],$item{value}]}# optional
    fax : 'fax' '=' value { [$item[0],$item{value}]}# optional
    value : /"[^"]*"/
    EOGRAMMAR

    # acquire $text
    $parser = new Parse::RecDescent ($grammar) or die "Bad grammar!\n";

    my $ret=$parser->contact(<<'EOTEST') or print "Bad text!\n";
    email = "camel@perl.org"
    name = "Joe Camel"
    phone = "(555)-555-5555"
    fax = "(555)-555-1111"
    EOTEST

    use Data::Dumper;
    print Dumper $ret;
  • Orton, Yves at Apr 15, 2002 at 12:26 pm

    yes that's what I want. I know that the solution could be the
    permutation of the pieces but this blows up the grammar with n!.
    Sure does... :-)
    however, what you're talking about is a semantic validation of input,
    something which usally happens after parsing and hence isn't dealt
    with by grammars.
    Well, that's not quite correct. It's not that uncommon for parsers to handle
    some semantic validation during the parsing phase. Especially recursive
    descent ones.
    hmm, I'm not sure that this is semantik. The semantik is, that the
    value of email could be used in a To: field, but the required pieces
    are still syntax, as we can see that this could be done with permuting
    the pieces.
    Yes, but that syntax is not context-free. And as I said and Marco implied,
    parsers can only directly handle context-free grammars (and P::RD can only
    handle a subset of CFGs due to it being a recursive descent parser anyway
    :-) )
    And the following would also be sometimes useful:

    exor : Mr ^ Mrs ^ hybrid

    Hmm, again (afaict) as presented this is a non-context-free constraint.

    But with a bit of gymnastics you can do this with P:RD using regexes:

    salutation : /Mrs?/ first_name last_name
    first_name : /(?!Mrs?)\S+/
    last_name : /\S+/

    Or using lookahead matches

    salutation : 'Mr' ...!'Mrs'
    | 'Mrs' ...!'Mr'
    Cheers,
    Yves
  • Karl Gaissmaier at Apr 15, 2002 at 1:18 pm
    Hi Orton,

    thanks for your explanation of CFGs. I already feared that my knowledge
    of parsers is not deep enough to understand the real underlying
    problem, and why Damian hasn't already given us this feature.

    As I stated already in my first mail:
    ........... Don't know how difficult it would be for Damian to
    have an additional layer of state to handle inside the parser or wether
    this is by principle not possible, but if it is possible it would
    help for clarity, because the grammar stay's in the grammar part and
    not in the action part.
    ....
    exor : Mr ^ Mrs ^ hybrid
    Hmm, again (afaict) as presented this is a non-context-free constraint.

    But with a bit of gymnastics you can do this with P:RD using regexes:

    salutation : /Mrs?/ first_name last_name
    first_name : /(?!Mrs?)\S+/
    last_name : /\S+/

    Or using lookahead matches

    salutation : 'Mr' ...!'Mrs'
    | 'Mrs' ...!'Mr'
    but this works only for terminals, and this is again not what
    I need. I've chosen Mr and Mrs as subrule names just for an easy
    demonstration of what is needed, but in general this is necessary
    for subrules:

    exor : subrule-1 ^ subrule-2 ^ subrule-3
    subrule-1: subrule-11 ^ subrule-12 ....
    ......

    but I guess this is again not possible with CFG parsers.

    Anyway, the generated parser code is pure perl and we can solve
    any of these problems with perl code in actions, so why should this
    not be possible in P::RD directly?

    You see, I have no idea how complicated it would be, but perhaps it's
    a naive thought of mine:

    "Anyone who is able to create such a beast like Parse::RecDescent
    is able to do anything!"

    regards
    Charly

    --
    Karl Gaissmaier Computing Center,University of Ulm,Germany
    Email:karl.gaissmaier@rz.uni-ulm.de Network Administration
    Tel.: ++49 731 50-22499
  • Orton, Yves at Apr 15, 2002 at 2:31 pm
    Hi Orton,
    Actually it's Yves...

    :-)
    thanks for your explanation on CFG. I already feared that my knowledge
    about parsers are not deep enough to understand the real underlying
    problem why Damian didn't already spent us this feature.
    Well, I suspect he's more concerned with features that are traditionally in
    the domain of parsing context-free grammars, a well studied and quite
    complex field.
    exor : Mr ^ Mrs ^ hybrid
    Hmm, again (afaict) as presented this is a non-context-free
    constraint.
    But with a bit of gymnastics you can do this with P:RD
    using regexes:
    salutation : /Mrs?/ first_name last_name
    first_name : /(?!Mrs?)\S+/
    last_name : /\S+/

    Or using lookahead matches

    salutation : 'Mr' ...!'Mrs'
    | 'Mrs' ...!'Mr'
    but this works only for terminals
    No, using lookahead matches applies to nonterminals as well. But you're
    (sortof) correct about the regex solutions.
    and this is again not what
    I need. I've choosen Mr and Mrs as subrule names just for an easy
    demonstration what is needed, but in general this is necessary
    for subrules:

    exor : subrule-1 ^ subrule-2 ^ subrule-3
    subrule-1: subrule-11 ^ subrule-12 ....
    Umm, ok now maybe im confused, but I would go with Marco on this. This is
    exactly the same as

    option : optlist
    optlist : rule1 | rule2 | rule3
    rule1 : subrule1 | subrule2

    Optlist may now only match 1 of rule1 rule2 or rule3, and likewise if it
    does match rule1 it may only be subrule1 or subrule2.

    I believe the impact of changing the above to

    option : optlist(s)
    optlist : rule1 | rule2 | rule3
    rule1 : subrule1 | subrule2

    which allows any selection of the rules, is what is causing you problems. It is the
    rule option that allows multiple optlist elements to be chosen, not optlist
    itself.
    but I guess this is again not possible with CFG parsers.
    Er, I'm not so sure. I think it may be just that the grammar needs to be
    written with a little more care. Don't forget that two dissimilar grammars
    can parse the same language. And one of those grammars may not be supported
    by P::RD. This can be seen by looking at the examples in the docs,
    specifically the ones dealing with the <leftop:> directive.
    Anyway, the generated parser code is pure perl and we can solve
    any of these problems with perl code in actions, why shall this
    not be possible for P::RD direct?
    Parsing a CFG is a mechanical process. Anticipating every users needs for
    contextual sensitivity would mean P::RD still wouldnt be written yet. :-)
    You see, I've no guess how complicated it will be but perhaps it's
    a greenly thought from me:

    "Anyone who is able to create such a beast like Parse::RecDescent
    is able to do anything!"
    Yah. TheDamian is a pretty awesome programmer but I suspect he's not going
    to implement your requests, as they are already implementable with a bit of
    elbow-grease using the current module and would probably only be useful to
    such a small set of users that it wouldn't make sense. Personally I'd far
    prefer to see Parse::FastDescent and Perl6 finished sooner rather than later. :-)

    BTW, you may find that reading the Red Dragon book (Compilers: Principles,
    Techniques, and Tools by Aho, Sethi, Ullman) helps with some of the conceptual
    issues.

    Yves
  • Karl Gaissmaier at Apr 15, 2002 at 3:13 pm
    Hi Yves, (sorry Mr. Orton)
    salutation : /Mrs?/ first_name last_name
    first_name : /(?!Mrs?)\S+/
    last_name : /\S+/

    Or using lookahead matches

    salutation : 'Mr' ...!'Mrs'
    | 'Mrs' ...!'Mr'
    but this works only for terminals
    No, using lookahead matches applies to nonterminals as well. But you're
    (sortof) correct about the regex solutions.
    sure, I was speaking about the regex solution not the lookahead
    and this is again not what
    I need. I've choosen Mr and Mrs as subrule names just for an easy
    demonstration what is needed, but in general this is necessary
    for subrules:

    exor : subrule-1 ^ subrule-2 ^ subrule-3
    subrule-1: subrule-11 ^ subrule-12 ....
    Umm, ok now maybe im confused, but I would go with Marco on this. This is
    exactly the same as

    option : optlist
    optlist : rule1 | rule2 | rule3
    rule1 : subrule1 | subrule2

    Optlist may now only match 1 of rule1 rule2 or rule3, and likewise if it
    does match rule1 it may only be subrule1 or subrule2.

    I believe the impact of changing the above to

    option : optlist(s)
    optlist : rule1 | rule2 | rule3
    rule1 : subrule1 | subrule2

    which allows any selection of the rule1 is causing you problems. It is the
    rule option that allows multple optlist elements to be chosen, not optlist
    itself.
    no, you didn't really catch my problem. I'll try it again:

    option : optlist(s)
    optlist: ip | pw
    ip : 'ipaddr' '=' value
    pw : 'passw' '=' value
    value : /".*?"/

    then the input:
    ipaddr = "10.0.0.1";
    passw = "private";

    or

    ipaddr = "10.0.0.1";

    or

    passw = "private";

    should succeed but not

    ipaddr = "10.0.0.1";
    passw = "private";
    ipaddr = "192.168.10.1"

    and I see no chance to do this without {action code}.
    It could be done with:

    option : optlist(s)
    optlist: ip ^ pw
    ip : 'ipaddr' '=' value
    pw : 'passw' '=' value
    value : /".*?"/

    That means any of the options ip or pw, but no option more than
    once. The wish is not so strange; you will stumble over it when
    you try, for example, to write a parser for named or dhcp cfg files,
    and of course in my config files for network management.
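
The "no option more than once" check itself is small when written as plain Perl. The sketch below is one assumed way to write it; the helper name first_duplicate is hypothetical, and it presumes that each matched option has already been reduced to its keyword (ipaddr, passw) before the check runs, for instance in an action attached to the option rule.

```perl
use strict;
use warnings;

# Return the first name that occurs more than once, or undef if all are unique.
sub first_duplicate {
    my @names = @_;
    my %seen;
    for my $name (@names) {
        return $name if $seen{$name}++;
    }
    return;
}

for my $input ([qw(ipaddr passw)], [qw(ipaddr passw ipaddr)]) {
    my $dup = first_duplicate(@$input);
    printf "%s: %s\n", join(' ', @$input),
        defined $dup ? "rejected, duplicate $dup" : 'accepted';
}
```

Returning the offending name rather than a plain true/false value leaves room for a useful error message, which matters when the duplicate comes from a typo or an include file.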

    regards
    Charly

    --
    Karl Gaissmaier Computing Center,University of Ulm,Germany
    Email:karl.gaissmaier@rz.uni-ulm.de Network Administration
    Tel.: ++49 731 50-22499
  • Orton, Yves at Apr 15, 2002 at 3:44 pm

    no, you didn't really catch my problem. I'll try it again:
    Actually I do. But thats ok.
    option : optlist(s)
    optlist: ip | pw
    ip : 'ipaddr' '=' value
    pw : 'passw' '=' value
    value : /".*?"/

    then the input:
    ipaddr = "10.0.0.1";
    passw = "private";

    or

    ipaddr = "10.0.0.1";

    or

    passw = "private";

    should succeed but not

    ipaddr = "10.0.0.1";
    passw = "private";
    ipaddr = "192.168.10.1"
    Why should this _necessarily_ fail? This could be a backup machine, right?
    (The point here is that you have a bunch of contextual constraints that
    derive from the fact that these symbols (words) have meaning. P::RD does
    not understand meaning, all it looks for is patterns, and then translates
    them into actions/data structures)
    and I see no chance to do this without {action code}.
    As I said in my earlier post, this is correct but (sorry) trivial. Modify
    the sub I gave you there to suit your needs and plug it in where you need
    contextual considerations.
    It could be done with:

    option : optlist(s)
    optlist: ip ^ pw
    ip : 'ipaddr' '=' value
    pw : 'passw' '=' value
    value : /".*?"/
    I don't think that this would generalize well. Tell me what the following
    should do:

    toplevel : rule(s)
    rule : A ^ B ^ C | A ^ C | A | B

    (Dont forget that your addition to the syntax needs to handle all the cases)
    that means any of the options ip or pw but not any option more than
    one time. The wish is not so strange, you will stumble over it when
    you try for example to write a parser for named or dhcp cfg files,
    and of course in my config files for network management.
    I agree. The wish is not strange. It is precisely what happens when a
    compiler flags a dual declaration as being incorrect.

    But it also misses the point. It's not context-free and IMO not
    generalizable.

    But more importantly you have identified the location of the problem
    incorrectly. It's not the alternation that is the problem. It's the (s) that
    is causing the trouble. It and only it is why you can have any selection of
    items. Remove it and the language matched will be one or the other, but
    _not_ both.

    Consider how you would vocalize what you want:

    I want a name and or an email, in addition to an optional phone number and
    fax number

    which is not

    I want a list of items that may be an email, a name a phone number or a fax
    number.

    which is what the rule "contact: statement(s)" means (expanding statement(s)
    out).

    Yves
  • Karl Gaissmaier at Apr 15, 2002 at 6:33 pm
    Hi Yves,
    "Orton, Yves" wrote:
    no, you didn't really catch my problem. I'll try it again:
    Actually I do. But thats ok.
    ok

    ipaddr = "10.0.0.1";
    passw = "private";
    ipaddr = "192.168.10.1"
    Why should this _necessarily_ fail. This could be a backup machine right?
    sure, but that depends totally on the application dealing with this
    cfg file. If for this cfg file only one option is handled then this is
    typically a typo by the user, and think about #include features, the
    user will sometimes get lost and then the parser has to help him.
    It's the same with programming, you appreciate also perl -w :-)
    (The point here is that you have a bunch of contextual constraints that derive
    from the fact that these symbols (words) have meaning. P::RD does not
    understand meaning, all it looks for is patterns, and then translates them
    into actions/data structures)
    The symbols have no special meaning, the problem is the missing
    order for the symbols.

    P::RD understands constraints already very much, all modifiers
    and all directives introduce contextual constraints, it's just a matter
    of standpoint. These additional meanings for e.g.
    subrule! and subrule ^ subrule are not different to lookaheads, rejects,
    commits and the usual production rules.
    and I see no chance to do this without {action code}.
    As i said in my earlier post this is correct, but (sorry) trivial. Modify the
    sub I gave you there to suit your needs and plug it in where you need
    contextual considerations.
    The point is not that I can't do this with action code (in fact
    I do this quite regularly) but, as I see it, the grammar gets scattered
    over productions and action code and is hard for third parties to
    understand and maintain.

    statement: A! | B! | C | D

    is so easy to understand: At least A and B, optionally C and/or D
    but without ORDER, in comparison to action codes and greps and maps
    and line noise.
    I dont think that this would generalize well. Tell me what the following
    should do

    toplevel : rule(s)
    rule : A ^ B ^ C | A ^ C | A | B
    I think there is no ambiguity for the parser generator; the
    requirements are strange but still imaginable:

    if there is an A and a B and a C then only one A and one B and one C is allowed
    if there is an A and a C then only one A and one C is allowed
    if there is only an A, then more than one A is allowed
    if there is only a B, then more than one B is allowed

    just straightforward!

    look for the following input stream:

    A A B C

    first try A ^ B ^ C (leftmost production)
    this second A prohibits A^B^C and A^C
    backtracking:
    try A, subrule succeeded
    try again A^B^C
    A^B^C still matches
    A^B^C subrule succeeded

    ....
    I agree. The wish is not strange. It is precisely what happens when a compiler
    flags a dual declaration as being incorrect.
    yep, that's the key! The same goes for my options in very long,
    structured cfg files (think about named.conf and dhcpd.conf and includes
    and all that)
    But it also misses the point. Its not context free and IMO not
    generallizable.
    I think it's generalizable, at least the A! | B! | C | D suggestion,
    perhaps not context free. But I think that P::RD is by far no way just
    a CFG ParserGenerator, it's a beast. I stumbled so many times over these
    two missing features (mandatory and xor) and my "feeling" tells me, there
    must be a gentler(general) solution than just doing this in action codes.
    But more importantly you have identified the location of the problem
    incorrectly. Its not the alternation that is the problem. Its the (s) that is
    causing the trouble. It and only it are why you can have any selection of
    items. Remove it and the language matched will be one or the other, but _not_
    both.

    Consider how you would vocalize what you want:

    I want a name and or an email, in addition to an optional phone number and fax
    number
    I disagree, the problem is: I need an email and a name, optionally
    a phone number and perhaps a fax number, but there should be no
    order introduced for these parameters. That's the point: "No order".

    When I don't wish an order in the options, then I need all
    permutations:

    contact: email name phone(s?) fax(s?)
    contact: email phone(s?) fax(s?) name
    ......
    contact: fax(s?) phone(s?) name email

    if I wish to do this in the grammar. The only thing to do
    this not with permutations is with action code, and now we are again
    at the beginning of the thread :-(

    Perhaps Damian can bring some light in the darkness.

    Regards and THANKS for discussing!

    Charly

    --
    Karl Gaissmaier Computing Center,University of Ulm,Germany
    Email:karl.gaissmaier@rz.uni-ulm.de Network Administration
  • Jonathan Mayer at Apr 16, 2002 at 1:31 am
    Apologies for butting in where my opinion is not asked for, but ...
    statement: A! | B! | C | D

    is so easy to understand: At least A and B, optionally C and/or D
    but without ORDER, in comparison to action codes and greps and maps
    and line noise.
    At what point does "A!" become mandatory? In the
    block: statement(s) # A or B must be part of block
    construct? Or in the
    program: block(s) # A or B must be part of program
    block: statement(s)
    construct? There are times when both forms are useful -- but defining
    "mandatory" as part of the syntax for the singular "statement"
    construct is limiting.

    Also, what if the programmer wants a more complicated logical function
    on the set of statements that comprises a "minimal" block?

    It seems to me, P::RD already has the functionality you desire, in a
    much more flexible form. What's wrong with:

    statement: A | B | C | D
    block: "{" statement(s) "}"
    {
    # some code to test for the presence of A and B,
    # else return undef
    }

    I'd hate to see P:RD fall into same trap as regexps: P:RD doesn't need
    to be a complete programming language. P:RD is fine as a perl
    accessory.

    jm.
  • Karl Gaissmaier at Apr 16, 2002 at 1:49 pm
    Hi Jonathan,

    Jonathan Mayer wrote:
    Apologies for butting in where my opinion is not asked for, but ...
    you're welcome, it's a mailing list, isn't it.

    statement: A! | B! | C | D

    is so easy to understand: At least A and B, optionally C and/or D
    but without ORDER, in comparison to action codes and greps and maps
    and line noise.
    At what point does "A!" become mandatory? In the
    block: statement(s) /* A or B must be part of block */
    construct? Or in the
    program: block(s) /* A or B must be part of program */
    block: statement(s)
    construct? There are times whem both forms are useful -- but defining
    "mandatory" as part of the syntax for the singular "statement"
    construct is limitting.
    Hmm, what is wrong with requiring it for the block construct? Then
    it is automatically true for program as well.
    Also, what if the programmer wants a more complicated logical function
    on the set of statements that comprises a "minimal" block?

    It seems to me, P:RD already has the functionality you desire, in a
    much more flexible form. What's wrong with:

    statement: A | B | C | D
    block: "{" statement(s) "}"
    {
    # some code to test for the presence of A and B,
    # else return undef
    }
    The syntax is scattered between grammar rules and action code, and the
    presence check isn't always as intuitive as in this primitive example.
    A and B are also complex subrules, and it is really only a question
    of style.
    I'd hate to see P:RD fall into same trap as regexps: P:RD doesn't need
    to be a complete programming language. P:RD is fine as a perl
    accessory.
    yes, it's really fine, even if it stays as it already is.

    Regards
    Charly
    --
    Karl Gaissmaier Computing Center,University of Ulm,Germany
    Email:karl.gaissmaier@rz.uni-ulm.de Network Administration
    Tel.: ++49 731 50-22499
  • Damian Conway at Apr 16, 2002 at 5:45 am
    Karl Gaissmaier wrote:

    ....some suggestions for additional features of Parse::RecDescent, specifically
    the ability to specify "required alternatives" within a repeated subrule:
    statement: A! | B! | C | D
    and "mutually exclusive alternations":
    rule : A ^ B ^ C

    The problem with these proposed features is that they are not *localized*.

    That is, the presence of an A! doesn't affect the rule that the A! is in,
    it affects the rule that calls the rule that the A! is in. And that is difficult
    to implement in a recursive descent parser.

    But I think that P::RD is by far no way just a CFG ParserGenerator, it's a beast.
    <grin> I should quote you in the documentation. ;-)

    I stumbled so many times over these two missing features (mandatory and xor)
    and my "feeling" tells me, there must be a gentler(general) solution
    than just doing this in action codes.
    Clearly this has been a problem for you. But I would have solved it like so:

    $grammar = << 'EOGRAMMAR';
    contact : statement(s)
    { $return = { map %$_, @{$item[-1]} } }
    <reject: do{!($return->{email} && $return->{name})} >
    statement : email | name | phone | fax
    email : 'email' '=' value {{email=>$item[-1]}}
    name : 'name' '=' value {{name=>$item[-1]}}
    phone : 'phone' '=' value {{phone=>$item[-1]}}
    fax : 'fax' '=' value {{fax=>$item[-1]}}
    value : /".*?"/
    EOGRAMMAR

    Since you needed to collect the data anyway, the actions would have had to be
    there no matter what. So the overhead for the requirements testing is just the single
    <reject> directive.

    Likewise, if (for example) phone and fax were mutually exclusive, you could
    just extend that to:

    $grammar = << 'EOGRAMMAR';
    contact : statement(s)
    { $return = { map %$_, @{$item[-1]} } }
    <reject: do{!($return->{email} && $return->{name})} >
    <reject: do{ $return->{phone} && $return->{fax} } >
    statement : email | name | phone | fax
    email : 'email' '=' value {{email=>$item[-1]}}
    name : 'name' '=' value {{name=>$item[-1]}}
    phone : 'phone' '=' value {{phone=>$item[-1]}}
    fax : 'fax' '=' value {{fax=>$item[-1]}}
    value : /".*?"/
    EOGRAMMAR
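
    If the number of such constraints grows, the conditions could be kept in two
    small tables instead of one hand-written expression per rule. The following
    is a sketch in plain Perl under that assumption; @REQUIRED, @EXCLUSIVE and
    constraint_violations are hypothetical names, and the sketch tests keys with
    exists rather than for truth as the directives above do.

```perl
use strict;
use warnings;

# Hypothetical constraint tables: edit these, not the checking logic.
my @REQUIRED  = qw(email name);
my @EXCLUSIVE = ( [qw(phone fax)] );    # at most one key per group

# Return a list of problem descriptions; an empty list means "valid".
sub constraint_violations {
    my ($contact) = @_;
    my @problems;
    for my $key (@REQUIRED) {
        push @problems, "missing $key" unless exists $contact->{$key};
    }
    for my $group (@EXCLUSIVE) {
        my @present = grep { exists $contact->{$_} } @$group;
        push @problems, "only one of: @present" if @present > 1;
    }
    return @problems;
}

my @problems = constraint_violations(
    { email => '"x"', phone => '"1"', fax => '"2"' }
);
print "$_\n" for @problems;
```

    In scalar context the function yields the number of problems, so a single
    <reject> directive could presumably test its result on $return, provided the
    helper is reachable from the code the parser generates (for example through
    a fully qualified name).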

    Perhaps Damian can bring some light into the darkness.

    Perhaps. But I doubt it. ;-)

    However, since you've taken the trouble to mention (and obviously think about)
    this problem, I will certainly devote some time to it myself and see if I can
    develop a better solution; one that's consistent with a recursive descent implementation.

    Damian
  • Karl Gaissmaier at Apr 16, 2002 at 2:00 pm
    Hi Damian,

    Damian Conway wrote:
    Karl Gaissmaier wrote:

    ...some suggestions for additional features of Parse::RecDescent, specifically
    the ability to specify "required alternatives" within a repeated subrule:
    statement: A! | B! | C | D
    and "mutually exclusive alternations":
    rule : A ^ B ^ C
    The problem with these proposed features is that they are not *localized*.

    That is, the presence of an A! doesn't affect the rule that the A! is in,
    it affects the rule that calls the rule that the A! is in. And that is difficult
    to implement in a recursive descent parser.

    As I already feared in my first mail about adding an additional layer of state.

    But I think that P::RD is by no means just a CFG parser generator; it's a beast.
    <grin> I should quote you in the documentation. ;-)

    You're welcome.

    I stumbled so many times over these two missing features (mandatory and xor)
    and my "feeling" tells me there must be a gentler (more general) solution
    than just doing this in action code.
    Clearly this has been a problem for you. But I would have solved it like so:

    $grammar = << 'EOGRAMMAR';
    contact : statement(s)
    { $return = { map %$_, @{$item[-1]} } }
    <reject: do{!($return->{email} && $return->{name})} >
    statement : email | name | phone | fax
    email : 'email' '=' value {{email=>$item[-1]}}
    name : 'name' '=' value {{name=>$item[-1]}}
    phone : 'phone' '=' value {{phone=>$item[-1]}}
    fax : 'fax' '=' value {{fax=>$item[-1]}}
    value : /".*?"/
    EOGRAMMAR

    Since you needed to collect the data anyway, the actions would have had to be
    there no matter what. So the overhead for the requirements testing is just the single
    <reject> directive.

    Again, it's just a question of maintenance. If I introduce a different
    subrule in statement, I have to change the action code logic. I think
    that is why you introduced, for example, the %item hash: because we are
    never perfect. My stupid thinking:

    Everything already done very well by Damian Conway will not hurt
    me any more through my imperfection :-)

    Likewise, if (for example) phone and fax were mutually exclusive, you could
    just extend that to:

    $grammar = << 'EOGRAMMAR';
    contact : statement(s)
    { $return = { map %$_, @{$item[-1]} } }
    <reject: do{!($return->{email} && $return->{name})} >
    <reject: do{ $return->{phone} && $return->{fax} } >
    statement : email | name | phone | fax
    email : 'email' '=' value {{email=>$item[-1]}}
    name : 'name' '=' value {{name=>$item[-1]}}
    phone : 'phone' '=' value {{phone=>$item[-1]}}
    fax : 'fax' '=' value {{fax=>$item[-1]}}
    value : /".*?"/
    EOGRAMMAR

    Sure, the same comments as before.

    Perhaps Damian can bring some light into the darkness.
    Perhaps. But I doubt it. ;-)

    However, since you've taken the trouble to mention (and obviously think about)
    this problem, I will certainly devote some time to it myself and see if I can
    develop a better solution; one that's consistent with a recursive descent implementation.

    Thanks in advance, even if P::RD stays as it is already!

    Best Regards
    Charly
    --
    Karl Gaissmaier Computing Center,University of Ulm,Germany
    Email:karl.gaissmaier@rz.uni-ulm.de Network Administration
    Tel.: ++49 731 50-22499

Discussion Overview
group: recdescent @
categories: perl
posted: Apr 15, '02 at 9:35a
active: Apr 16, '02 at 2:00p
posts: 13
users: 4
website: metacpan.org...