How do I extract HTML elements using IE? I have VC++ code; can someone translate it for me? No WebBrowser, and no string parsing (100 points)

  • Thread starter: wzgss

wzgss

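I need to extract the elements used in an HTML page and analyze them, for example the name and type of each input.
I started out with string manipulation, but that approach is bound to have bugs if any case is overlooked.
I have read that IE can be used to do this, but I don't know how to go about it in practice.
Could anyone point me to some material?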
 
The material I found doesn't work.
I do know how to enumerate all the elements of an HTML document; I'm not sure whether that helps.
var
  doc: IHTMLDocument2;
  i, len: Integer;
  FElements: IHTMLElementCollection;
begin
  // requires MSHTML in the uses clause
  doc := WebBrowser1.Document as IHTMLDocument2;
  FElements := doc.all;
  len := FElements.length;
  for i := 0 to len - 1 do
    // item returns IDispatch, so cast it before reading tagName
    ComboBox1.Items.Add((FElements.item(i, 0) as IHTMLElement).tagName);
end;
 
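I have VC++ code; can someone translate it for me? For details see http://www.codeproject.com/internet/parse_html.asp
I am especially interested in the first part, which obtains an IHTMLDocument2 without the WebBrowser class, because I need this functionality inside a DLL.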
CWaitCursor wait;
if(m_csFilename.IsEmpty()){
AfxMessageBox(_T("Please specify the file to parse"));
return;
}
CFile f;

//let's open the file and read it into a CString (any buffer would do)
if (f.Open(m_csFilename, CFile::modeRead|CFile::shareDenyNone)) {
m_wndLinksList.ResetContent();
CString csWholeFile;
f.Read(csWholeFile.GetBuffer(f.GetLength()), f.GetLength());
csWholeFile.ReleaseBuffer(f.GetLength());
f.Close();

//declare our MSHTML variables and create a document
MSHTML::IHTMLDocument2Ptr pDoc;
MSHTML::IHTMLDocument3Ptr pDoc3;
MSHTML::IHTMLElementCollectionPtr pCollection;
MSHTML::IHTMLElementPtr pElement;

HRESULT hr = CoCreateInstance(CLSID_HTMLDocument, NULL, CLSCTX_INPROC_SERVER,
IID_IHTMLDocument2, (void**)&pDoc);

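//put the code into SAFEARRAY and write it into document
SAFEARRAY* psa = SafeArrayCreateVector(VT_VARIANT, 0, 1);
VARIANT *param;
bstr_t bsData = (LPCTSTR)csWholeFile;
hr = SafeArrayAccessData(psa, (LPVOID*)&param);
param->vt = VT_BSTR;
//give the array its own copy: SafeArrayDestroy frees the BSTR it holds,
//and bsData frees its own string when it goes out of scope
param->bstrVal = bsData.copy();
hr = SafeArrayUnaccessData(psa);

hr = pDoc->write(psa);
hr = pDoc->close();

SafeArrayDestroy(psa);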

//I'll use IHTMLDocument3 to retrieve tags. Note it is available only in IE5+
//If you don't want to use it, u can just run through all tags in HTML
//(IHTMLDocument2->all property)
pDoc3 = pDoc;

//display HREF parameter of every link (A tag) in ListBox
pCollection = pDoc3->getElementsByTagName(L"A");
for(long i=0; i<pCollection->length; i++){
pElement = pCollection->item(i, (long)0);
if(pElement != NULL){
//second parameter says that you want to get text inside attribute as is
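
For the part the question stresses (getting an IHTMLDocument2 without a WebBrowser, then reading each input's name and type), a minimal Delphi sketch might look like the one below. It is not a line-by-line translation of the CodeProject code. It assumes the MSHTML import unit (shipped with Delphi, or generated from the type library) and the Variants unit (Delphi 6 or later), and that COM is already initialized on the calling thread, which matters inside a DLL. It also assumes MSHTML finishes parsing right after write/close, which may not hold for every page. The unit name HtmlInputs and the procedure ListInputs are hypothetical names used only for illustration.

```delphi
unit HtmlInputs; // hypothetical unit name

interface

uses
  Classes;

// Hypothetical helper: parses HTML held in memory and appends one line per
// INPUT element (its name and type) to Output.
procedure ListInputs(const HTML: WideString; Output: TStrings);

implementation

uses
  SysUtils, ActiveX, ComObj, Variants, MSHTML;

procedure ListInputs(const HTML: WideString; Output: TStrings);
var
  Doc: IHTMLDocument2;
  Inputs: IHTMLElementCollection;
  Elem: IHTMLElement;
  V: OleVariant;
  i: Integer;
begin
  // Create a standalone MSHTML document; no TWebBrowser is involved.
  // Assumption: the caller has already initialized COM on this thread.
  Doc := CreateComObject(CLASS_HTMLDocument) as IHTMLDocument2;

  // IHTMLDocument2.write expects a SAFEARRAY of variants, so wrap the
  // HTML text in a one-element variant array.
  V := VarArrayCreate([0, 0], varVariant);
  V[0] := HTML;
  Doc.write(PSafeArray(TVarData(V).VArray));
  Doc.close;

  // tags returns IDispatch, so cast it to a collection first.
  Inputs := Doc.all.tags('INPUT') as IHTMLElementCollection;
  for i := 0 to Inputs.length - 1 do
  begin
    Elem := Inputs.item(i, 0) as IHTMLElement;
    // getAttribute returns a variant that may be Null; VarToStr turns
    // Null into an empty string.
    Output.Add(Format('name=%s type=%s',
      [VarToStr(Elem.getAttribute('name', 0)),
       VarToStr(Elem.getAttribute('type', 0))]));
  end;
end;

end.
```

A caller could pass ComboBox1.Items or a TStringList as Output. If the collection comes back empty for a large document, one thing to check is whether the document's readyState has reached 'complete' before the elements are read.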
 
Send me an email and I'll send you a copy of the code.
Mainly I'm worried that I won't see your reply here.
 
Thank you, songshuang@topgroup.com.cn
 
悲酥清风
Please email to me ( jxsandal@21cn.com )
thanks!
 
悲酥清风
Please email it to me as well,
newsweep@tom.com
Thanks!
 
悲酥清风
Could you send me a copy too?
yuhe@ustc.edu
Sorry for the trouble, and thanks!
 