Really, just how hard can it be to read some HTML into Python and query it with XPath?
Well, as usual, the Python manual is pretty patchy, and there are myriad third-party libraries in various states of disrepair. After a while, I got this to work:
    from lxml import etree

    # Use the forgiving HTML parser rather than the strict XML one
    parser = etree.HTMLParser()
    tree = etree.parse("stuff.html", parser)

    # Second cell of every row in the table
    cells = tree.xpath("//table[@class='guildBattlesInner']/tbody/tr/td[2]")

    for td in cells:
        # get() returns '' when the cell has no class attribute
        if 'highlight' in td.get('class', ''):
            print(td.text)
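The class check in the loop above can probably be pushed into the XPath expression itself with `contains()`. Here is a minimal sketch of that idea. The `SAMPLE_HTML` markup and the `highlighted_cells` helper are made up for illustration; I am assuming the real page has the same table class and an explicit `<tbody>`.

```python
from lxml import etree

# Hypothetical sample markup standing in for stuff.html
SAMPLE_HTML = """
<html><body>
<table class="guildBattlesInner">
  <tbody>
    <tr><td>1</td><td class="name highlight">Alpha</td></tr>
    <tr><td>2</td><td class="name">Beta</td></tr>
    <tr><td>3</td><td>Gamma</td></tr>
    <tr><td>4</td><td class="highlight">Delta</td></tr>
  </tbody>
</table>
</body></html>
"""


def highlighted_cells(html_text):
    """Return the text of every second-column cell whose class contains 'highlight'."""
    # etree.HTML parses a string with the lenient HTML parser and returns the root element
    root = etree.HTML(html_text)
    # td[2] picks the second cell of each row; the second predicate filters on the class
    cells = root.xpath(
        "//table[@class='guildBattlesInner']/tbody/tr"
        "/td[2][contains(@class, 'highlight')]"
    )
    return [td.text for td in cells]


print(highlighted_cells(SAMPLE_HTML))
```

Like the `find()` version, `contains()` is a plain substring match, so it would also hit a class such as `highlighted`. One more thing to watch: as far as I can tell lxml does not add a `<tbody>` for you the way browsers do, so the `/tbody/` step only matches when the source markup really contains one.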