所以说有点小bug吗!不过上面的帖子已经改好了,你再试试!
下面的结果是通过
http://www.delphibbs.com/delphibbs/dispq.asp?lid=1414973
得到的
rules.asp
register.asp
download.asp
index.asp
listroom.asp
listq.asp
listu.asp
dispu.asp
askqn.asp
uonline.asp
calendar.asp
http://richsearch.com
mailto:yysun@263.net
这程序没有考虑特别文件名的问题,不过也很好改,检查所得结果的后三个字符就行了。
你要是做不了的话我帮你做。
/////////////////////////////////////////////////////////////////////////////
2002-11-5 23:19
下面这段代码是可以得到网页中所有exe文件链接的代码,记得找一个有exe链接的网页试试
你要试163的话什么也得不到,最好自己在本地先做一个测试页。
unit Unit1;
interface
uses
Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
Dialogs, StdCtrls, ExtCtrls, IdBaseComponent, IdComponent,
IdTCPConnection, IdTCPClient, IdHTTP, ComCtrls;
type
TForm1 = class(TForm)
IdHTTP1: TIdHTTP;
Panel1: TPanel;
Edit1: TEdit;
Button1: TButton;
Memo1: TMemo;
ProgressBar1: TProgressBar;
procedure Button1Click(Sender: TObject);
procedure IdHTTP1Workbegin
(Sender: TObject;
AWorkMode: TWorkMode;
const AWorkCountMax: Integer);
procedure IdHTTP1WorkEnd(Sender: TObject;
AWorkMode: TWorkMode);
procedure IdHTTP1Work(Sender: TObject;
AWorkMode: TWorkMode;
const AWorkCount: Integer);
private
{ Private declarations }
public
{ Public declarations }
procedure getLink(str:String);
end;
var
Form1: TForm1;
implementation
{$R *.dfm}
procedure TForm1.Button1Click(Sender: TObject);
var
body:String;
begin
Memo1.Clear;
body:=idHTTP1.Get(Edit1.Text);
//得到html源文件
//Memo1.Text:=body;
getLink(body);
end;
procedure TForm1.IdHTTP1Workbegin
(Sender: TObject;
AWorkMode: TWorkMode;
const AWorkCountMax: Integer);
begin
ProgressBar1.Max:= AWorkCountMax;
end;
procedure TForm1.IdHTTP1WorkEnd(Sender: TObject;
AWorkMode: TWorkMode);
begin
ProgressBar1.Position:= 0;
end;
procedure TForm1.IdHTTP1Work(Sender: TObject;
AWorkMode: TWorkMode;
const AWorkCount: Integer);
begin
ProgressBar1.Position:= AWorkCount;
end;
procedure TForm1.getLink(str: String);
var
i,j:Integer;
isLink:Boolean;
tmpurl:String;
function geturl(linkStr:String):String;
var
eqpos,m,n,lsLen:Integer;
tmpLink:String;
begin
Result:='';
eqpos:=Pos('=',linkStr);
lsLen:=Length(linkStr);
if linkStr[eqpos+1]='"' then
begin
for m := 0 to lsLen-1do
begin
if linkStr[m+eqpos+2]='"' then
break;
end;
tmpLink:=copy(linkStr,eqpos+2,m);
if ExtractFileExt(tmpLink)='.exe' then
Result:=tmpLink;
end
else
begin
for n := 0 to lsLen-1do
begin
if (linkStr[n+eqpos+1]=' ') or (linkStr[n+eqpos+1]='>') then
break;
end;
tmpLink:=copy(linkStr,eqpos+1,n);
if ExtractFileExt(tmpLink)='.exe' then
Result:=tmpLink;
end;
end;
begin
isLink:=false;
for i := 0 to Length(str)-1do
begin
if str
+str[i+1]='<a' then
isLink:=true;
if str+str[i+1]+str[i+2]+str[i+3]='<src' then
isLink:=true;
if isLink then
begin
for j := 0 to Length(str)-i-1do
begin
if str[i+j]='>' then
begin
break;
end;
end;
tmpurl:=geturl(copy(str,i,j+1));
if Pos('exe',tmpurl)>0 then
Memo1.Lines.Add(tmpurl);
isLink:=false;
end;
end;
end;
end.
//////////////////////////////////////////////////////////////////////////
2002-11-6 10:00
to sydan:下载网页源文件的代码也要mail啊!?
不就简简单单的一句idHTTP1.Get(url)吗!
to heipi2002:
>>代码里面判断好像有点问题,html代码里面<a href=a.exe >和<a href="a.exe">效
>>果相同<a href>和<a href... >效果也是一样的,所以代码思路可能会出现问题;
这个问题我考虑到了,你试试程序就知道了。有没有"都是一样的。
to Richard3000:
我用了ExtractFileExt函数,误差可能会小点。[]