Screen Scraping With HtmlAgilityPack And XPath
[This question has a relative that lives at: Selective screen scraping with HTMLAgilityPack and XPath ] I have some HTML to parse which has the following general appearance: ...
Solution 1:
For each table cell, the following query looks for a child a element with a non-empty href attribute and returns that href value. If the cell has no such element, the cell's inner text is used instead:
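// One inner sequence per row, one string per cell.
var dataList =
    currentDoc.DocumentNode.Descendants("tr")
        .Select(tr => from td in tr.Descendants("td")
                      // Direct child <a> whose href is non-empty, or null if none.
                      let a = td.SelectSingleNode("a[@href!='']")
                      select a == null ? td.InnerText :
                                         a.Attributes["href"].Value);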
Both the outer and the inner query are lazily evaluated, so feel free to add ToList() calls if you need the results materialized.
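For illustration, here is a minimal self-contained sketch of the same query with the ToList() calls added. The class name TableScraper, the helper ExtractRows, and the sample markup are made up for this example and are not part of the original answer; it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is installed

public static class TableScraper
{
    // Hypothetical helper: returns one list per <tr>, one string per <td>.
    public static List<List<string>> ExtractRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode.Descendants("tr")
            .Select(tr => (from td in tr.Descendants("td")
                           // Direct child <a> with a non-empty href, or null.
                           let a = td.SelectSingleNode("a[@href!='']")
                           select a == null
                               ? td.InnerText
                               : a.Attributes["href"].Value)
                          .ToList())   // materialize the cells of this row
            .ToList();                 // materialize the rows
    }

    public static void Main()
    {
        // Made-up sample markup, only for demonstration.
        const string html =
            "<table>" +
            "<tr><td><a href=\"/item/1\">First</a></td><td>plain text</td></tr>" +
            "<tr><td><a href=\"\">empty</a></td><td>42</td></tr>" +
            "</table>";

        foreach (var row in ExtractRows(html))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

Note that tr.Descendants("td") also picks up cells of nested tables; if the real markup nests tables, tr.Elements("td") may be the safer choice.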