I am opening an XML file with lxml and doing a lot of editing before saving it out as a new XML file, and that is working fine. Within the opened XML I have URL links to a web page. On that web page there are values I would like to record and use in the open XML. I have searched but can't find where to start.
Kind regards.
Update:
I am using the code below to glean the URL from the XML, and it is working. I can read the page data into a variable, and it prints fine:
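    import urllib2  # Python 2 standard library

    # tree is the lxml tree already opened earlier
    url = tree.find("//video/products/product/read_only_info/read_only_value[@key='storeurl-gb']")
    if url is not None:
        url = url.text
        data = urllib2.urlopen(url)
        data = data.read()  # raw HTML of the page
        print(data)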
How can I find a particular string buried in the web page? Here is the piece of the web page data I want to get:
    <div id="content">
      <div class="padder">
        <div id="title" class="intro">
          <div class="left">
            <h1>this title</h1>
            <span rating-system="bbfc" rating-id="37" class="content-rating">15</span>
            <h2>this more text</h2>
          </div>
          <div class="right">
            <a href="https://rthuere.erwerwer.ghty4e.fdfsdf.com" class="view-more">view more in sci-fi & fantasy</a>
          </div>
I need the value "view more in sci-fi & fantasy", or whatever other value is there.
Kind regards.
If you want the text of the a-nodes, you can do it using BeautifulSoup:
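    from bs4 import BeautifulSoup  # assuming BeautifulSoup 4 is installed

    # html_page is the HTML string you read from the URL
    soup = BeautifulSoup(html_page)
    for link in soup.findAll('a'):
        print(link.text)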
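If you only want that one link rather than every a-node, here is a minimal sketch that uses only the standard library, so it runs without installing anything. It is written for Python 3 (`html.parser`), not the Python 2 of your snippet, and it assumes the link you are after always carries `class="view-more"` as in your sample. The class name `ViewMoreParser`, the sample string and the example.com URL are made up for illustration. With BeautifulSoup 4 the same filter would be `soup.find('a', class_='view-more')`.

```python
from html.parser import HTMLParser


class ViewMoreParser(HTMLParser):
    """Collects the text of every <a> whose class list contains 'view-more'."""

    def __init__(self):
        super().__init__()
        self._inside = False   # True while we are inside a matching <a>
        self._parts = []       # text chunks seen inside the current <a>
        self.links = []        # finished link texts

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and "view-more" in classes:
            self._inside = True
            self._parts = []

    def handle_data(self, data):
        if self._inside:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside:
            self.links.append("".join(self._parts).strip())
            self._inside = False


# Hypothetical sample modelled on the fragment in the question.
sample = (
    '<div class="right">'
    '<a href="https://example.com/genre" class="view-more">'
    'view more in sci-fi &amp; fantasy</a></div>'
)

parser = ViewMoreParser()
parser.feed(sample)
parser.close()
print(parser.links)
```

To use it on the real page, feed it the text you already read from the URL instead of `sample`. In Python 3 `urlopen(...).read()` returns bytes, so decode it first.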
Does this answer your question?