Hi all,

I am trying to parse an HTML string with BeautifulSoup.

The string is:

<table colWidths='530.0' style='Table_Main_Table'>
<tr>
<td>
<blockTable colWidths='54.0,80.0,67.0' style='Table_Tax_Header'>
<tr>
<th>
<p style='terp_tblheader_Details_Centre'>Tax</p></th>
<th>
<p style='terp_tblheader_Details_Right'>Base</p></th>
<th>
<p style='terp_tblheader_Details_Right'>Amount</p></th>
</tr>
</blockTable>
</td>
</tr>
</table>


rtables = soup.findAll(re.compile('table$'))

The resulting rtables is:

[<table colwidths="530.0" style="Table_Main_Table">
<tr>
<td>
<blocktable colwidths="54.0,80.0,67.0" style="Table_Tax_Header">
</blocktable></td></tr><tr>
<th>
<p style="terp_tblheader_Details_Centre">Tax</p></th>
<th>
<p style="terp_tblheader_Details_Right">Base</p></th>
<th>
<p style="terp_tblheader_Details_Right">Amount</p></th>
</tr>
</table>, <blocktable colwidths="54.0,80.0,67.0" style="Table_Tax_Header">
</blocktable>]



The tr that should be inside the blocktable appears directly inside the
table instead, while the blocktable itself is left empty.

Is there any way to get the tr in the right place (inside the blocktable)?

--
Regards,
S.Selvam
SG E-ndicus Infotech Pvt Ltd.
http://e-ndicus.com/

" I am because we are "


  • Selvam at Jan 5, 2011 at 10:15 am

    Replying to myself:

    BeautifulSoup.BeautifulSoup.NESTABLE_TABLE_TAGS['tr'].append('blocktable')

    Adding this line solved the issue.
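
    For readers wondering why that one line helps, here is a toy model of the nesting heuristic, written against the standard library only. It is not BeautifulSoup's actual code. The class ToyNestingParser, the helper tr_parents, and the shortened SNIPPET are made up for illustration, and DEFAULT_NESTABLE is assumed to resemble the table-related defaults in BeautifulSoup 3. The idea: when a "nestable" tag opens, the parser walks up the open-tag stack to the nearest ancestor that is allowed to contain it and closes everything in between.

```python
from html.parser import HTMLParser

# Assumption: modelled on the table-related defaults in BeautifulSoup 3.
# Each tag maps to the ancestors it is allowed to nest directly inside.
DEFAULT_NESTABLE = {
    "table": [],
    "tr": ["table", "tbody", "tfoot", "thead"],
    "td": ["tr"],
    "th": ["tr"],
}

# Shortened version of the markup from the question (attributes dropped).
SNIPPET = ("<table><tr><td><blockTable><tr><th><p>Tax</p></th></tr>"
           "</blockTable></td></tr></table>")


class ToyNestingParser(HTMLParser):
    """Toy model only: records which parent each start tag ends up under."""

    def __init__(self, nestable):
        super().__init__()
        self.nestable = nestable
        self.stack = ["[root]"]      # currently open tags
        self.placements = []         # (tag, parent) in document order

    def handle_starttag(self, tag, attrs):
        allowed = self.nestable.get(tag)
        if allowed is not None:
            # Walk up to the nearest ancestor allowed to contain this tag
            # and implicitly close everything opened below it.
            for i in range(len(self.stack) - 1, 0, -1):
                if self.stack[i] in allowed:
                    del self.stack[i + 1:]
                    break
        self.placements.append((tag, self.stack[-1]))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Close the most recent matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i] == tag:
                del self.stack[i:]
                break


def tr_parents(nestable):
    """Return the parent tag of every <tr>, in document order."""
    parser = ToyNestingParser(nestable)
    parser.feed(SNIPPET)
    parser.close()
    return [parent for tag, parent in parser.placements if tag == "tr"]


# Same change as in the reply, applied to a copy of the toy table.
patched = {tag: list(parents) for tag, parents in DEFAULT_NESTABLE.items()}
patched["tr"].append("blocktable")

print(tr_parents(DEFAULT_NESTABLE))  # ['table', 'table']
print(tr_parents(patched))           # ['table', 'blocktable']
```

    With the default table, the inner tr climbs past blocktable, td and the outer tr until it reaches table, which leaves blocktable empty, as in the output shown in the question. Once 'blocktable' is listed as an allowed parent for tr, the walk stops immediately and the tr stays where it was written.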

    --
    Regards,
    S.Selvam
    SG E-ndicus Infotech Pvt Ltd.
    http://e-ndicus.com/

    " I am because we are "
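
    A different route, offered only as a sketch: the markup in the question happens to be well-formed XML, so a strict XML parser can read it without applying any HTML nesting rules. The example below uses the standard library's xml.etree.ElementTree instead of BeautifulSoup. It assumes the real input is always well-formed; the variable names (RML, tables, block, headers) are illustrative. Unlike the output shown in the question, tag names keep their original case here, so the "ends in table" check is done case-insensitively.

```python
import xml.etree.ElementTree as ET

# The markup from the question, unchanged.
RML = """<table colWidths='530.0' style='Table_Main_Table'>
<tr>
<td>
<blockTable colWidths='54.0,80.0,67.0' style='Table_Tax_Header'>
<tr>
<th>
<p style='terp_tblheader_Details_Centre'>Tax</p></th>
<th>
<p style='terp_tblheader_Details_Right'>Base</p></th>
<th>
<p style='terp_tblheader_Details_Right'>Amount</p></th>
</tr>
</blockTable>
</td>
</tr>
</table>"""

root = ET.fromstring(RML)

# Every element whose tag name ends in "table" (case-insensitive),
# roughly what findAll(re.compile('table$')) was meant to collect.
tables = [el for el in root.iter() if el.tag.lower().endswith("table")]

# The <tr> stays inside <blockTable>, so its header cells are reachable.
block = root.find(".//blockTable")
headers = [p.text for p in block.findall("./tr/th/p")]

print([el.tag for el in tables])  # ['table', 'blockTable']
print(headers)                    # ['Tax', 'Base', 'Amount']
```

    This only works while the input is valid XML; ET.fromstring raises ParseError on unclosed tags or other malformed markup, which is exactly the kind of input BeautifulSoup is built to tolerate.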

Discussion Overview
group: python-list
category: python
posted: Jan 5, '11 at 9:28a
active: Jan 5, '11 at 10:15a
posts: 2
users: 1
website: python.org

1 user in discussion

Selvam: 2 posts

People

Translate

site design / logo © 2022 Grokbase