High-point bounty: looking for an HTML node parsing function (200 points)

sth2007 · 2008-12-02

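// strHtml   : HTML source code
// TagName   : node (tag) name
// AttrName  : node attribute name
// AttrValue : node attribute value
// outItems  : output of matching nodes, one matching node per line
procedure xQuery(strHtml, TagName, AttrName, AttrValue: string; outItems: TStringList);

For example, given the HTML below, I want the source code of the node with TagName=div, AttrName=id, AttrValue=content1 (that is, <div id="content1">...</div>). How can this be implemented?

Also, if the AttrName or AttrValue parameter is empty, it means match everything!

Thanks!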

<body>
<div id="content1">
<div>

</div>

<div>

</div>

<div>

</div>
</div>

<div id="content2">
<div>

</div>

<div>

</div>

<div>

</div>
</div>

</body>

sth2007 · 2008-12-02

Bumping my own thread!

HapBegin · 2008-12-02

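Load the HTML into a TStringList and analyze it line by line!
HTML syntax is not strict or well-formed, so it probably won't be easy to parse, haha.
Bumping to help.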

mscode · 2008-12-02

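Load it into a WebBrowser control and use the WebBrowser's methods to call the JavaScript-style DOM methods: document.getElementsByTagName('*') returns a collection, and then you iterate over it.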
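A minimal sketch of this DOM approach in Delphi, assuming the page has already finished loading in a TWebBrowser and that the MSHTML unit shipped with Delphi is available. The unit name DomQuery and the procedure name DomQueryTags are hypothetical. It asks the document for all elements with the given tag name and filters them by attribute. Note that outerHTML is the browser's re-serialized markup, so it may differ from the original source text (tag case, quoting).

```pascal
unit DomQuery; // hypothetical unit name

interface

uses
  Classes, Variants, MSHTML;

// Adds the outerHTML of every matching element to outItems.
// An empty AttrName or AttrValue matches every element with that tag name.
procedure DomQueryTags(const Doc: IHTMLDocument2;
  const TagName, AttrName, AttrValue: string; outItems: TStrings);

implementation

procedure DomQueryTags(const Doc: IHTMLDocument2;
  const TagName, AttrName, AttrValue: string; outItems: TStrings);
var
  Tags: IHTMLElementCollection;
  Elem: IHTMLElement;
  i: Integer;
begin
  // document.all.tags(name) returns the elements with that tag name
  Tags := Doc.all.tags(TagName) as IHTMLElementCollection;
  for i := 0 to Tags.length - 1 do
  begin
    Elem := Tags.item(i, 0) as IHTMLElement;
    // getAttribute returns Null when the attribute is missing;
    // VarToStr turns Null into an empty string
    if (AttrName = '') or (AttrValue = '') or
       (VarToStr(Elem.getAttribute(AttrName, 0)) = AttrValue) then
      outItems.Add(Elem.outerHTML);
  end;
end;

end.
```

A call might look like DomQueryTags(WebBrowser1.Document as IHTMLDocument2, 'div', 'id', 'content1', Memo1.Lines), where WebBrowser1 and Memo1 are assumed form components.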

wql · 2008-12-03

If you have studied lexical analysis from compiler theory, this is easy! Haha!

linbren · 2008-12-04

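mscode's approach is worth trying. Delphi also has other web controls, and the basic usage is the same: get the source,
then use the control's functions such as document.getElementsByTagName, and so on.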

hfghfghfg · 2008-12-04

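If you are only looking for tags,
you can parse it directly as XML:
// ss, AfterOpen, waitComplete, t and addDel are defined elsewhere in my code
doc := TXMLDocument.Create(self);
try
  ss.Position := 0;                 // rewind the stream holding the markup
  doc.AfterOpen := AfterOpen;
  hEvent := CreateEvent(nil, true, false, nil);
  try
    doc.LoadFromStream(ss);
  except
    begin
      // the markup is not well-formed XML: give up
      FreeAndNil(doc);
      exit;
    end;
  end;
  waitComplete(hEvent);             // wait until the document has been opened
  CloseHandle(hEvent);
  hEvent := 0;
  t := nil;
  addDel := false;
  for j := 0 to doc.ChildNodes.Count - 1 do
  begin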
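As a self-contained illustration of the XML idea, here is a rough sketch that walks the node tree recursively and collects matching elements. It only works when the markup is well-formed XML (XHTML), which ordinary HTML often is not. The names Collect and RunDemo are made up for this example, and the unit names are the ones used by Delphi versions of that era.

```pascal
program XmlQueryDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes, Variants, ActiveX, XMLIntf, XMLDoc;

// Recursively collects the XML source of every element that matches.
// An empty AttrName or AttrValue matches every element named TagName.
procedure Collect(const Node: IXMLNode;
  const TagName, AttrName, AttrValue: string; outItems: TStrings);
var
  i: Integer;
begin
  if Node.NodeType <> ntElement then Exit; // skip text, comments, etc.
  if SameText(Node.NodeName, TagName) and
     ((AttrName = '') or (AttrValue = '') or
      (Node.HasAttribute(AttrName) and
       (VarToStr(Node.Attributes[AttrName]) = AttrValue))) then
    outItems.Add(Node.XML);
  for i := 0 to Node.ChildNodes.Count - 1 do
    Collect(Node.ChildNodes[i], TagName, AttrName, AttrValue, outItems);
end;

procedure RunDemo;
var
  Doc: IXMLDocument;
  Items: TStringList;
begin
  Items := TStringList.Create;
  try
    // LoadXMLData raises an exception if the text is not well-formed
    Doc := LoadXMLData('<body><div id="content1"><div>a</div></div>' +
      '<div id="content2"><div>b</div></div></body>');
    Collect(Doc.DocumentElement, 'div', 'id', 'content1', Items);
    WriteLn('matches: ', Items.Count);
  finally
    Items.Free;
  end;
end;

begin
  CoInitialize(nil); // the default MSXML vendor needs COM on this thread
  try
    RunDemo; // all interfaces are released before CoUninitialize runs
  finally
    CoUninitialize;
  end;
end.
```

Keeping the document work inside RunDemo is deliberate: interface references, including hidden temporaries, are released when that procedure returns, which happens before CoUninitialize.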

sth2007 · 2008-12-05

I can't use mshtml; it has to be string analysis plus regular expressions only.
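One way to meet this constraint is plain string scanning with a nesting counter, sketched below. This is not a full HTML parser and it uses no regular expressions: it finds each opening tag, checks the attribute, then counts nested opening and closing tags of the same name until the depth returns to zero. The helpers IsTagAt and GetAttr are made up for this sketch, and it assumes balanced tags, no '>' inside attribute values, no self-closing tags of the queried name, and no tag-like text inside comments or scripts.

```pascal
program XQueryScanDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes, StrUtils;

const
  WS = ' '#9#10#13; // characters treated as whitespace inside a tag

// True if S has Prefix ('<div' or '</div') at P and the tag name ends there.
function IsTagAt(const S, Prefix: string; P: Integer): Boolean;
begin
  Result := SameText(Copy(S, P, Length(Prefix)), Prefix) and
    (P + Length(Prefix) <= Length(S)) and
    (Pos(S[P + Length(Prefix)], WS + '/>') > 0);
end;

// Looks up AttrName in an opening tag such as <div id="x">.
// Value is only meaningful when the result is True.
function GetAttr(const Tag, AttrName: string; out Value: string): Boolean;
var
  i, n, Start: Integer;
  Name: string;
  Quote: Char;
begin
  Result := False;
  n := Length(Tag);
  i := 2;
  while (i <= n) and (Pos(Tag[i], WS + '/>') = 0) do Inc(i); // skip tag name
  while (i <= n) and not Result do
  begin
    while (i <= n) and (Pos(Tag[i], WS + '/>') > 0) do Inc(i); // separators
    Start := i;
    while (i <= n) and (Pos(Tag[i], WS + '=/>') = 0) do Inc(i);
    Name := Copy(Tag, Start, i - Start);
    while (i <= n) and (Pos(Tag[i], WS) > 0) do Inc(i);
    Value := '';
    if (i <= n) and (Tag[i] = '=') then
    begin
      repeat Inc(i) until (i > n) or (Pos(Tag[i], WS) = 0);
      Start := i;
      if (i <= n) and ((Tag[i] = '"') or (Tag[i] = '''')) then
      begin // quoted value: read up to the matching quote
        Quote := Tag[i];
        Inc(Start);
        repeat Inc(i) until (i > n) or (Tag[i] = Quote);
      end
      else // unquoted value: read up to whitespace or '>'
        while (i <= n) and (Pos(Tag[i], WS + '>') = 0) do Inc(i);
      Value := Copy(Tag, Start, i - Start);
      Inc(i); // step over the closing quote or terminator
    end;
    Result := (Name <> '') and SameText(Name, AttrName);
  end;
end;

// Adds the source of every matching node to outItems.
// An empty AttrName or AttrValue matches every node named TagName.
procedure xQuery(const strHtml, TagName, AttrName, AttrValue: string;
  outItems: TStringList);
var
  P, Q, TagEnd, Depth: Integer;
  Value: string;
begin
  P := PosEx('<', strHtml, 1);
  while P > 0 do
  begin
    TagEnd := PosEx('>', strHtml, P);
    if IsTagAt(strHtml, '<' + TagName, P) and (TagEnd > 0) and
       ((AttrName = '') or (AttrValue = '') or
        (GetAttr(Copy(strHtml, P, TagEnd - P + 1), AttrName, Value) and
         (Value = AttrValue))) then
    begin
      Depth := 1; // count nested tags of the same name until depth is 0
      Q := TagEnd;
      while Depth > 0 do
      begin
        Q := PosEx('<', strHtml, Q + 1);
        if Q = 0 then Break; // unbalanced markup: give up on this node
        if IsTagAt(strHtml, '<' + TagName, Q) then Inc(Depth)
        else if IsTagAt(strHtml, '</' + TagName, Q) then Dec(Depth);
      end;
      if Depth = 0 then TagEnd := PosEx('>', strHtml, Q) else TagEnd := 0;
      if TagEnd > 0 then outItems.Add(Copy(strHtml, P, TagEnd - P + 1));
    end;
    P := PosEx('<', strHtml, P + 1);
  end;
end;

var
  Items: TStringList;
begin
  Items := TStringList.Create;
  try
    xQuery('<body><div id="content1"><div>a</div></div>' +
      '<div id="content2"></div></body>', 'div', 'id', 'content1', Items);
    if Items.Count > 0 then WriteLn(Items[0]);
  finally
    Items.Free;
  end;
end.
```

With the sample string in the main block this should print <div id="content1"><div>a</div></div>. Each match is stored as one TStringList item even if the node spans several lines, so read it through Items[i] rather than Items.Text if "one node per entry" matters.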

hfghfghfg · 2008-12-06

Is even TXMLDocument not allowed?
There is an HTML parsing component,
THtmlParser.
See whether it works for you.

sth2007 · 2008-12-06

hfghfghfg, you mean the THtmlParser by the author of uindex, right? I have used it.
