i am looking for an idea on how to handle un-nesting tags.
i know i can use something built on top of htmltidy, but i'm rather
wondering if this could be done using only the python standard library. my
input tags cannot be crossed (i mean "<a> w1 <b> w2 </a> w3 </b>" is
impossible in my input).
actually i had produced some data with :
some input : (line number / content)
where in fact i should have :
i am wondering how i can repair that.
i have built a small script which already does that, but as i know there
are clever brains here, maybe i will get some better suggestions...
(i need to clean up/rewrite my code, but here is how it works: it first
finds the paired opening/closing tags, their widths and positions, then, from
the smallest to the largest, it encloses the previous text inside the
current tag and builds a text that will be the next one to be enclosed,
and so on.)
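
a minimal sketch of only the first step described above (finding the paired tags with their positions and widths, smallest first), using just `re` from the standard library. it assumes tags are bare names with no attributes, as in the "<a> w1 <b> ..." example; the name find_tag_pairs and the tuple layout are made up for illustration and are not taken from the original script:

```python
import re

# Assumption: tags look like <name> or </name>, with no attributes.
TAG_RE = re.compile(r"<(/?)(\w+)>")


def find_tag_pairs(text):
    """Return (name, start, end, width) for each matched tag pair.

    start is the offset of the opening tag, end is the offset just past
    the closing tag, and width is end - start. Pairs are sorted from the
    smallest to the largest width, so inner pairs come before the pairs
    that contain them.
    """
    stack = []   # currently open tags as (name, start)
    pairs = []
    for match in TAG_RE.finditer(text):
        is_closing, name = match.group(1), match.group(2)
        if not is_closing:
            stack.append((name, match.start()))
            continue
        # A closing tag must match the most recently opened tag;
        # anything else means crossed or unbalanced tags.
        if not stack or stack[-1][0] != name:
            raise ValueError(
                "unexpected </%s> at offset %d" % (name, match.start())
            )
        _, start = stack.pop()
        end = match.end()
        pairs.append((name, start, end, end - start))
    if stack:
        raise ValueError("unclosed tag <%s>" % stack[-1][0])
    pairs.sort(key=lambda pair: pair[3])
    return pairs


if __name__ == "__main__":
    for pair in find_tag_pairs("<a> w1 <b> w2 </b> w3 </a>"):
        print(pair)
```

since the input is never crossed, a plain stack is enough to pair the tags; a crossed input such as "<a> w1 <b> w2 </a> w3 </b>" makes this sketch raise ValueError instead of guessing.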