FAQ
Hello guys! Need your precious help again!

In every html file I have, on the very first line, a page_id for counter
counting purposes, in the format of a comment like this:

<!-- 1 -->
<!-- 2 -->
<!-- 3 -->

and so on. Every html file has its own page_id.

How can I grab that string representation of a number from inside
the .html file using regex and convert it to an integer value?

# ==============================
# open current html template and get the page ID number
# ==============================

f = open( '/home/webville/public_html/' + page )

#read first line of the file
firstline = f.readline()

page_id = re.match( '<!-- \d -->', firstline )
print ( page_id )


  • Νίκος at Aug 7, 2010 at 5:31 pm
    I also don't know what's wrong with this line:

    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]

    hostmatch = re.search('cyta', host)

    if cookie.has_key('visitor') != 'nikos' or hostmatch is None:
        # do stuff

    The 'stuff' never gets executed, while I want it to be, as long as I
    don't have a regex match!
  • MRAB at Aug 7, 2010 at 6:27 pm

    Try printing out repr(host). Does it contain "cyta"?
  • Νίκος at Aug 7, 2010 at 6:48 pm

    Yes, it does contain it, as print showed!

    Is something wrong with this line in logic or syntax?

    if cookie.has_key('visitor') != 'nikos' or re.search('cyta', host) is None:
        # do database stuff
  • MRAB at Aug 7, 2010 at 7:07 pm

    You said "i want them to be as long as i dont have regex match".

    re.search('cyta', host) will return None if there's no match, but you
    said "Yes it does contain it", so there _is_ a match, therefore:

    hostmatch is None

    is False.
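
A small sketch of the point above, using two made-up hostnames (both are assumptions for illustration, not values from the thread). re.search returns a match object when the pattern occurs anywhere in the string and None otherwise, so the `is None` test is true only for the host that lacks 'cyta':

```python
import re

# Hypothetical hostnames, for illustration only.
hosts = ["dsl-1-2-3.cyta.example", "pool-4-5-6.isp.example"]

def lacks_cyta(host):
    """Return True when 'cyta' does not occur anywhere in host."""
    return re.search("cyta", host) is None

# One boolean per hostname, in the same order as `hosts`.
results = [lacks_cyta(h) for h in hosts]
print(results)
```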
  • Νίκος at Aug 7, 2010 at 7:36 pm

    The code block inside the if structure must be executed ONLY if the
    'visitor' cookie is not set in the client's browser or the hostname
    of the client doesn't contain the string 'cyta'.

    # ======================================
    # do not increment the counter if a cookie is already set in the
    # visitor's browser
    # ======================================

    if cookie.has_key('visitor') != 'nikos' or re.search('cyta', host) is None:

    I still don't get it :)
  • Thomas Jollans at Aug 7, 2010 at 7:52 pm

    On 08/07/2010 09:36 PM, Νίκος wrote:

        cookie.has_key('visitor') != 'nikos'

    This is always True. has_key returns a bool, which is never equal to any
    string, even 'nikos'.
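
To make that concrete, here is a minimal sketch in which a plain dict stands in for the cookie object (that substitution is an assumption; the thread does not show how `cookie` is built). dict.has_key was removed in Python 3, so the sketch uses the `in` operator, which gives the same yes/no answer:

```python
# A plain dict stands in for the cookie object (assumption for this sketch).
cookie = {"visitor": "nikos"}

# Membership test: the Python 3 spelling of has_key, yields a bool.
has_visitor = "visitor" in cookie

# Comparing that bool with a string: the two are never equal,
# so != is True no matter what the cookie holds.
always_true = has_visitor != "nikos"

# Comparing the stored value with the string is a different test.
value_differs = cookie.get("visitor") != "nikos"

print(has_visitor, always_true, value_differs)
```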
  • MRAB at Aug 7, 2010 at 8:07 pm

    I missed that bit! :-)

    Anyway, the OP said "the 'stuff' never gets executed". Kinda puzzling...
  • Νίκος at Aug 7, 2010 at 8:29 pm

    if cookie.has_key('visitor') or re.search('cyta', host) is None:

    addresses the problem :-)

    Thanks a lot, Thomas and MRAB, for ALL your help!
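
For comparison, one possible reading of the condition as it was described in words earlier in the thread ("the 'visitor' cookie is not set ... or the hostname ... doesn't contain ... 'cyta'") could be sketched as follows. The helper name `should_count` and the plain-dict cookie are hypothetical, and this tests for the cookie being absent, which is not the same as the line above; which of the two the site actually needs is not settled in the thread.

```python
import re

def should_count(cookies, host):
    """Hypothetical helper: True when the counter block should run.

    cookies: a dict-like mapping of cookie names to values (assumption).
    host:    the visitor's resolved hostname.
    """
    cookie_missing = "visitor" not in cookies
    host_lacks_cyta = re.search("cyta", host) is None
    return cookie_missing or host_lacks_cyta

# No cookie at all: the block runs even for a 'cyta' host.
print(should_count({}, "dsl.cyta.example"))
```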
  • MRAB at Aug 7, 2010 at 6:24 pm

    Νίκος wrote:

        page_id = re.match( '<!-- \d -->', firstline )
        print ( page_id )

    Use group capture:

    page_id = re.match(r'<!-- (\d+) -->', firstline).group(1)
    print(page_id)
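
The captured group is always a string, so the conversion asked about in the original question still needs an explicit int() call. A minimal sketch, assuming the first line looks like the examples in the question; the function name `read_page_id` is made up for illustration, and the None check covers files whose first line has no such comment:

```python
import re

def read_page_id(firstline):
    """Return the page ID from a line like '<!-- 12 -->', or None if absent."""
    match = re.match(r"<!-- (\d+) -->", firstline)
    if match is None:
        return None
    # group(1) is the text matched by (\d+); int() turns it into a number.
    return int(match.group(1))

page_id = read_page_id("<!-- 12 -->\n")
print(page_id)
```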
  • Νίκος at Aug 7, 2010 at 6:51 pm

    Worked like a charm! Thanks a lot!

    So the match method here not only searched for the string representation
    of the number but also converted it to an integer as well?

    Does r stand for "retrieve the string" here?

    And group?

    When a regex searches a .txt file and retrieves something from it, does
    it always retrieve it as a string, or can it get it as a number as well?
  • Thomas Jollans at Aug 7, 2010 at 7:03 pm

    On 08/07/2010 08:51 PM, Νίκος wrote:

        r stand for retrieve the string here?

    r"xyz" is a raw string literal. That means that backslash escapes are
    turned off -- r'\n' == '\\n'
  • MRAB at Aug 7, 2010 at 7:17 pm

    Νίκος wrote:

        r stand for retrieve the string here?

    The 'r' prefix makes it a 'raw string literal'. That means that the
    string literal won't treat backslashes as special. Before raw string
    literals were added to the Python language I would have needed to write:

    '<!-- (\\d+) -->'

    instead.

    (Actually, that's not strictly true in this case, because \d doesn't
    have a special meaning in Python strings, but it's a good idea to use raw
    string literals habitually when writing regexes in order to reduce the
    chance of forgetting them when they _are_ necessary. Well, that's what I
    think, anyway. :-))
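
A short sketch of the difference, using only the pattern from this thread plus '\n' as an example of an escape that Python does treat specially. As an aside, newer Python 3 releases emit a warning for unrecognized escapes such as '\d' in a non-raw literal, which is one more reason for the habit described above.

```python
# The raw literal and the doubled-backslash literal are the same string.
raw = r"<!-- (\d+) -->"
escaped = "<!-- (\\d+) -->"
same_pattern = raw == escaped

# '\n' IS special in a normal literal: one newline character.
newline_len = len("\n")
# In a raw literal it stays as two characters: a backslash and an 'n'.
raw_newline_len = len(r"\n")

print(same_pattern, newline_len, raw_newline_len)
```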
  • Νίκος at Aug 7, 2010 at 7:37 pm

    Couldn't agree more!

    As the saying goes, better safe than sorry! :-)

Discussion Overview
group: python-list @
categories: python
posted: Aug 7, '10 at 5:29p
active: Aug 7, '10 at 8:29p
posts: 14
users: 3
website: python.org
