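Today let's use the Windows 10 product key listings on Ruten (露天) as a demo XD
Suppose I want the title and price of every product other than the featured (注目) listings. To get them, I need to know the XPath to that text.
A quick look at the page source
suggests the XPath expressions should be:
# All titles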
//dl[@class='search_form s_grid']//div[@class='prod_info']//h5//a
# All prices
//dl[@class='search_form s_grid']//div[@class='prod_info']//ul//li//span[@class='price'][1]
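Before pointing these expressions at the live page, it can help to try them on a small fragment. The sketch below does that with Html Agility Pack's HtmlDocument.LoadHtml. The markup in SampleHtml is hypothetical: it is invented to mirror the class names in the expressions above and is not copied from Ruten. XPathSample and SelectTexts are likewise made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class XPathSample
{
    // Hypothetical markup, invented to mirror the class names used in the
    // XPath expressions above. It is NOT the real Ruten page source.
    public const string SampleHtml = @"
<dl class='search_form s_grid'>
  <dd>
    <div class='prod_info'>
      <h5><a href='#'>Sample item A</a></h5>
      <ul><li><span class='price'>100</span></li></ul>
    </div>
  </dd>
  <dd>
    <div class='prod_info'>
      <h5><a href='#'>Sample item B</a></h5>
      <ul><li><span class='price'>250</span></li></ul>
    </div>
  </dd>
</dl>";

    // Returns the trimmed inner text of every node matched by xpath,
    // or an empty list when nothing matches.
    public static List<string> SelectTexts(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            // SelectNodes can return null instead of an empty collection.
            return result;
        }

        foreach (HtmlNode node in nodes)
        {
            result.Add(node.InnerText.Trim());
        }
        return result;
    }

    static void Main()
    {
        var titles = SelectTexts(SampleHtml,
            "//dl[@class='search_form s_grid']//div[@class='prod_info']//h5//a");
        var prices = SelectTexts(SampleHtml,
            "//dl[@class='search_form s_grid']//div[@class='prod_info']//ul//li//span[@class='price'][1]");

        // Pair titles and prices by position, stopping at the shorter list.
        int count = Math.Min(titles.Count, prices.Count);
        for (int i = 0; i < count; i++)
        {
            Console.WriteLine("{0} price={1}", titles[i], prices[i]);
        }
    }
}
```

Note that the class test is an exact string match: @class='search_form s_grid' only matches when the attribute value is exactly that string, so the expression stops matching if the site adds or reorders class names.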
Next, use Html Agility Pack to load the page, then pass the XPath expressions to it to get the title and price nodes.
Code:
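using HtmlAgilityPack;
using System;
using System.Windows.Forms;
// Alias so that HtmlDocument means the Html Agility Pack type, not System.Windows.Forms.HtmlDocument.
using HtmlDocument = HtmlAgilityPack.HtmlDocument;

namespace WindowsFormsApp1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // LoadFromBrowser loads the page through a browser control, so script-rendered content is included.
            HtmlWeb web = new HtmlWeb();
            HtmlDocument document = web.LoadFromBrowser("https://find.ruten.com.tw/s/?cateid=001100060001&q=win10");
            var docNode = document.DocumentNode;

            var titleNodes = docNode.SelectNodes(@"//dl[@class='search_form s_grid']//div[@class='prod_info']//h5//a");
            var priceNodes = docNode.SelectNodes(@"//dl[@class='search_form s_grid']//div[@class='prod_info']//ul//li//span[@class='price'][1]");

            // SelectNodes can return null when nothing matches, so check before using the collections.
            if (titleNodes == null || priceNodes == null)
            {
                Console.WriteLine("No matching nodes found.");
                return;
            }

            // Stop at the shorter collection so a missing price cannot cause an out-of-range index.
            int count = Math.Min(titleNodes.Count, priceNodes.Count);
            for (int index = 0; index < count; index++)
            {
                Console.WriteLine("{0} price={1}", titleNodes[index].InnerText, priceNodes[index].InnerText);
            }
        }
    }
}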
Result: