How do I get the text and hyperlinks from a web page? (50 points)

  • Thread starter: windows.net

I have the following piece of text. How should I write code to extract the hyperlink text and URLs from it?
<table width="100%" border="0" cellspacing="0" cellpadding="4">
<tr>
<td bgcolor="F0F0F0" class="S" align="right">
显示 1-10 条 共 41 条 5 页 
<strong>1</strong>
<a href="/athena/offerlist/qianliwei-default-2-false.html">2</a>
<a href="/athena/offerlist/qianliwei-default-3-false.html">3</a>
<a href="/athena/offerlist/qianliwei-default-4-false.html">4</a>
<a href="/athena/offerlist/qianliwei-default-5-false.html">5</a>
</td>
</tr>
</table>
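I want to extract the 5 from "共 41 条 5 页" (5 pages in total), plus both the link and the text for pages 2, 3, 4 and 5. The output should be plain text in this format: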
5
2 /athena/offerlist/qianliwei-default-2-false.html
3 /athena/offerlist/qianliwei-default-3-false.html
4 /athena/offerlist/qianliwei-default-4-false.html
5 /athena/offerlist/qianliwei-default-5-false.html
 
I'm following this too.
 
Following...

 
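If it is a web page loaded in a browser control, you can enumerate the links through the DOM:
// Needs MSHTML (and ComCtrls for TListItem) in the uses clause.
// wb is assumed to be a TWebBrowser that has finished loading the page.
var
  doc: IHTMLDocument2;
  all: IHTMLElementCollection;
  item: OleVariant;
  listitem: TListItem;
  len, i: Integer;
begin
  doc := wb.Document as IHTMLDocument2;
  all := doc.Get_links; // doc.links also works
  len := all.length;
  for i := 0 to len - 1 do
  begin
    item := all.item(i, varEmpty); // EmptyParam also works
    listitem := OpenAllLinkForm.ListView_link.Items.Add;
    listitem.Caption := item.href;             // link URL
    listitem.SubItems.Add(item.innerText);     // link text
  end; // end for
end;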
 
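One approach is to use ScriptControl to run a JavaScript script and apply a regular expression inside that script.
A regular expression can find the text that matches a pattern quickly and effectively.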
uses
ComObj;

procedure TForm1.Button1Click(Sender: TObject);
var
Obj:OleVariant;
Str:string;
begin
Obj:=CreateOleObject('ScriptControl');
Obj.Language:='JavaScript';
// test(S) returns one "href,text" line for each <a href="...">...</a> found in S
Obj.AddCode('function test(S){'
+'var re=/<a href=\"(.[^<>]*)\">(.[^<>]*)<\/a>/ig;'
+'var r1; var outStr="";'
+'while ((r1=re.exec(S))!=null){outStr+=r1[1]+","+r1[2]+"\n"} return outStr}');
Str:='<table width="100%" border="0" cellspacing="0" cellpadding="4">'
+'<tr>'
+'<td bgcolor="F0F0F0" class="S" align="right">'
+' 显示 1-10 条 共 41 条 5 页'
+' <strong>1</strong>'
+' <a href="/athena/offerlist/qianliwei-default-2-false.html">2</a>'
+' <a href="/athena/offerlist/qianliwei-default-3-false.html">3</a>'
+' <a href="/athena/offerlist/qianliwei-default-4-false.html">4</a>'
+' <a href="/athena/offerlist/qianliwei-default-5-false.html">5</a>'
+'</td>'
+'</tr>'
+'</table>';
ShowMessage(Obj.run('test',Str));
end;
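
The script above returns the links as "href,text" and does not pick up the page count that the question also asks for. As a rough sketch, the JavaScript side could be extended as shown below to produce the requested layout. The names `extractPager` and `formatPager` are made up for illustration, and the code assumes the pager text always ends the page count with "页" as in the sample. It sticks to old-style syntax (`var`, plain functions) on the assumption that this is what the engine behind ScriptControl accepts; it also runs unchanged under Node.js.

```javascript
// Illustrative sketch; extractPager and formatPager are hypothetical names,
// not part of the original answer.
function extractPager(html) {
  var result = { pageCount: null, links: [] };

  // Page count: the number that directly precedes "页" (e.g. "5 页").
  var pageMatch = /(\d+)\s*页/.exec(html);
  if (pageMatch) {
    result.pageCount = parseInt(pageMatch[1], 10);
  }

  // Links: group 1 is the href value, group 2 is the anchor text.
  var re = /<a\s+href="([^"]*)"[^>]*>([^<]*)<\/a>/ig;
  var m;
  while ((m = re.exec(html)) !== null) {
    result.links.push({ text: m[2], url: m[1] });
  }
  return result;
}

// Builds the layout asked for in the question:
// first line is the page count, then one "text url" line per link.
function formatPager(pager) {
  var lines = [];
  if (pager.pageCount !== null) {
    lines.push(String(pager.pageCount));
  }
  for (var i = 0; i < pager.links.length; i++) {
    lines.push(pager.links[i].text + ' ' + pager.links[i].url);
  }
  return lines.join('\n');
}

// Same sample markup as in the question.
var sample = [
  '<table width="100%" border="0" cellspacing="0" cellpadding="4">',
  '<tr>',
  '<td bgcolor="F0F0F0" class="S" align="right">',
  ' 显示 1-10 条 共 41 条 5 页',
  ' <strong>1</strong>',
  ' <a href="/athena/offerlist/qianliwei-default-2-false.html">2</a>',
  ' <a href="/athena/offerlist/qianliwei-default-3-false.html">3</a>',
  ' <a href="/athena/offerlist/qianliwei-default-4-false.html">4</a>',
  ' <a href="/athena/offerlist/qianliwei-default-5-false.html">5</a>',
  '</td>',
  '</tr>',
  '</table>'
].join('\n');

// console only exists when run outside ScriptControl (e.g. under Node.js).
if (typeof console !== 'undefined') {
  console.log(formatPager(extractPager(sample)));
}
```

To use it from Delphi, the two functions would be passed to `Obj.AddCode` as one string and called with something like `Obj.Run('extractPager', Str)`. Returning the already formatted string from a single wrapper function is probably simpler than passing the object back through OLE.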
 