Hello,
I have written a small pyparsing parser to recognize dates in the style
"november 1st". I wrote something to the effect of:

expression = task + date

and tried to parse "Doctor's appointment on november 1st", hoping that
task would be "Doctor's appointment" and date would be "on november
1st" (on its own, date does match "on november 1st"). I have tried
defining task as Regex(".*?"), ZeroOrMore(Word(alphas)), etc., but I can't
get the combined expression to match: task consumes everything and date
is not tried until the end of the string.

Can anyone help?


  • Harry George at Dec 11, 2006 at 2:57 pm

    In reply to poromenos at gmail.com:

    As described, this is a Natural Language Processing (NLP) problem,
    which means you will have far more trouble working out what you
    want to do than coding it. Also, dates are notoriously tough to
    parse because there are so many variants, so there are libraries
    that do just that.
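
To give a feel for the variants problem, here is a minimal sketch using only the standard library's datetime.strptime. The FORMATS list and the parse_date helper are hypothetical names made up for illustration, and month abbreviations such as "Nov" assume an English (default C) locale. Anything that matches none of the listed formats simply comes back as None.

```python
from datetime import datetime

# Hypothetical list of accepted formats; extend it from your own inputs.
FORMATS = [
    "%Y-%m-%d",           # 2006-11-01
    "%d-%b-%y",           # 1-Nov-06 (month abbreviation is locale dependent)
    "%Y-%m-%dT%H:%M:%S",  # 2006-11-01T09:30:00
]

def parse_date(text):
    """Return a datetime for the first format that fits, or None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # this format did not fit; try the next one
    return None

if __name__ == "__main__":
    for sample in ["2006-11-01", "1-Nov-06", "next week"]:
        print(sample, "->", parse_date(sample))
```

Each new way of writing a date needs another entry in the list, and free-form phrases never fit any fixed format, which is why the fixed-format approach runs out of steam quickly.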

    If you want to tackle it systematically:

    1. Get a "corpus" of texts which illustrate the ways the users might
    state the date. E.g., "2006-11-01", "1-Nov-06", "November 1",
    "Nov. first", "first of November", "10 days prior to Veterans Day",
    "next week", .....

    2. If you can control the input, so much the better: either use a
    form which forces specific values for day, month, year, hour, and
    minute, or require ISO 8601 format (yyyy-mm-ddThh:mm:ss).

    3. Determine the syntax rules for each example. If possible, abstract
    these to general rules which work on more than one example.

    4. At this point, you should know enough to decide if it is a:

    a) Regular expression, parseable with a regexp engine

    b) Context-free grammar (CFG), often parseable with an LL(1) or LALR(1) parser.

    c) Context-sensitive grammar, parseable with an ad hoc parser with special rules.

    d) Free text, not parseable in the normal sense, but perhaps
    understandable with statistical analysis NLP techniques.

    e) Hodgepodge not amenable to machine analysis.

    5. Then we could look at using pyparsing. But we'd have to see
    the pyparsing code you tried.
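
For the one sentence quoted in the question, case (a) may already be enough. The following is a minimal sketch with the standard re module, not pyparsing, and it rests on assumptions not stated in the original post: the date always comes last and always has the form "on <month> <day>" with an optional ordinal suffix. The names MONTHS, PATTERN and split_task_and_date are invented for illustration.

```python
import re

# Assumed month vocabulary; full English month names only.
MONTHS = ("january|february|march|april|may|june|july|"
          "august|september|october|november|december")

# Lazy task text, then "on <month> <day>[suffix]" anchored at the end.
# The regex engine backtracks, so the lazy task grows until the date fits.
PATTERN = re.compile(
    r"^(?P<task>.+?)\s+"
    r"(?P<date>on\s+(?:" + MONTHS + r")\s+\d{1,2}(?:st|nd|rd|th)?)$",
    re.IGNORECASE,
)

def split_task_and_date(text):
    """Return (task, date) as strings, or None if no trailing date is found."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    return match.group("task"), match.group("date")

if __name__ == "__main__":
    print(split_task_and_date("Doctor's appointment on november 1st"))
```

The point of the sketch is the backtracking: a pyparsing expression such as task + date does not go back and shorten what task has already consumed, so task has to be told where to stop. pyparsing's SkipTo class exists for that kind of scan-ahead.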

    --
    Harry George
    PLM Engineering Architecture

Discussion Overview
group: python-list
category: python
posted: Dec 10, '06 at 12:39a
active: Dec 11, '06 at 2:57p
posts: 2
users: 2
website: python.org

