How can I read the contents of the different text boxes in a web page? Spy++ cannot get a handle for each text box. What can I do? ( Points: 50 )

batconv · 2007-04-20

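I want to read the contents of the text boxes in a dynamically generated page and write new contents into them, but Spy++ cannot find a window handle for each individual text box and button. What should I do? I'd appreciate advice from the experts, ideally with an example. Thanks. My mail: batconv@163.com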

暗夜中独舞 · 2007-04-20

The following article was copied from VCBASE.

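DOM in Practice: Enumerating the Elements of a Web Page
1. Abstract
If your program needs to monitor a web page opened in the browser, simulate user actions on it, extract user input on the fly, modify the page dynamically, and so on, please take a moment to read on. The material and sample program in this article all focus on how to walk the forms in an HTML page and enumerate the properties of their fields. Other page elements, such as images, links, and scripts, can be handled just as easily with the same approach.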

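2. Document hierarchy of a web page
Internet Explorer manages page data through the DOM (Document Object Model). A container (IWebBrowser2/IHTMLWindow2) holds the page's document (IHTMLDocument2). A document can in turn consist of zero or more frames, which are managed through a frames collection (IHTMLFramesCollection2). The container of each frame is again an IHTMLWindow2 which, like IWebBrowser2, holds its own document (IHTMLDocument2). Our first task is therefore to get hold of an IHTMLDocument2 interface. Because a document may contain frames, and those frames contain child documents that may themselves contain frames, and so on, collecting all the documents requires a recursive traversal.
Once we have the document (IHTMLDocument2), the next task is to get its forms (IHTMLFormElement). A document can contain zero or more forms, and these are managed by a collection (IHTMLElementCollection), so we must first get the collection and then enumerate the forms in it.
Once we have a form (IHTMLFormElement), the rest is simple: retrieve the form's elements (also called form fields, IHTMLInputElement) one by one, and their properties can then be read and written.
After all that explanation, I suspect readers new to this still haven't followed.

Heh, let me show it as a diagram instead, which should be clearer.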
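
The hierarchy and the recursive walk described above can also be modelled with ordinary data structures. The sketch below is only an illustration: Document, Form, Field, collectFields and makeSamplePage are hypothetical names invented for this example. They stand in for IHTMLDocument2, IHTMLFormElement and the form fields, and the code does not call the real MSHTML interfaces.

```cpp
// Plain C++ model of the page hierarchy. These structs are hypothetical
// stand-ins, NOT the real MSHTML COM interfaces.
#include <string>
#include <vector>

struct Field {                      // plays the role of a form field
    std::string name;
    std::string type;
    std::string value;
};

struct Form {                       // plays the role of IHTMLFormElement
    std::vector<Field> fields;
};

struct Document {                   // plays the role of IHTMLDocument2
    std::vector<Form> forms;        // like the collection from get_forms()
    std::vector<Document> frames;   // each frame holds its own document
};

// Append every field of every form in doc, then descend into its frames.
void collectFields(const Document& doc, std::vector<Field>& out) {
    for (const Form& form : doc.forms)
        for (const Field& field : form.fields)
            out.push_back(field);
    for (const Document& child : doc.frames)
        collectFields(child, out);  // recursion: frames may contain frames
}

// Convenience overload that returns the collected fields.
std::vector<Field> collectFields(const Document& doc) {
    std::vector<Field> out;
    collectFields(doc, out);
    return out;
}

// Sample page: a top-level document with one form, holding a frame whose
// document holds one form and a further nested frame.
Document makeSamplePage() {
    Form innerForm;
    innerForm.fields.push_back(Field{"pwd", "password", "secret"});
    Document inner;
    inner.forms.push_back(innerForm);

    Form frameForm;
    frameForm.fields.push_back(Field{"q", "text", "hello"});
    Document frame;
    frame.forms.push_back(frameForm);
    frame.frames.push_back(inner);

    Form topForm;
    topForm.fields.push_back(Field{"user", "text", "alice"});
    topForm.fields.push_back(Field{"go", "submit", "OK"});
    Document top;
    top.forms.push_back(topForm);
    top.frames.push_back(frame);
    return top;
}
```

Calling collectFields(makeSamplePage()) visits the top-level form first and then each nested frame in turn. With the real COM interfaces the walk would presumably have the same shape, with get_forms() and get_frames() taking the place of the two vectors.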

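3. Implementation

<1> Get the IHTMLDocument2 interface pointer. Depending on how the IE browser is being run, there are several ways to get the document pointer.
<1.1> Your program uses MFC's CHtmlView to browse pages.
This is the simplest case: call CHtmlView::GetHtmlDocument().
<1.2> Your program uses the "Web Browser" ActiveX control.
This is also fairly simple: call CWebBrowser2::GetDocument().
<1.3> Your program is an ActiveX control written with ATL.
Call IOleClientSite::GetContainer to get the IOleContainer interface, then use QueryInterface() to obtain the IHTMLDocument2 interface. The main code is as follows: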

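CComPtr < IOleContainer > spContainer;
m_spClientSite->GetContainer( &spContainer );
CComQIPtr < IHTMLDocument2 > spDoc = spContainer;
if ( spDoc )
{
    // We now have the IHTMLDocument2 interface pointer
}

<1.4> Your program is an ActiveX control written with MFC.
Call COleControl::GetClientSite() to get the IOleClientSite interface; from there the steps are the same as in <1.3>.
<1.5> The IE browser is running as a separate process.
Every running browser (IE and Windows Explorer) registers itself in ShellWindows, so we get the instances through IShellWindows (this is the method the sample program uses). The main code is as follows: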

#include <atlbase.h>
#include <exdisp.h>   // IShellWindows, IWebBrowser2
#include <mshtml.h>

// Assumes COM has already been initialized on this thread (CoInitialize).
void FindFromShell()
{
    CComPtr<IShellWindows> spShellWin;
    HRESULT hr = spShellWin.CoCreateInstance( CLSID_ShellWindows );
    if ( FAILED( hr ) ) return;

    long nCount = 0;
    spShellWin->get_Count( &nCount );   // number of browser instances

    for ( long i = 0; i < nCount; i++ )
    {
        CComPtr<IDispatch> spDisp;
        hr = spShellWin->Item( CComVariant( i ), &spDisp );
        if ( FAILED( hr ) ) continue;

        CComQIPtr<IWebBrowser2> spBrowser = spDisp;
        if ( !spBrowser ) continue;

        spDisp.Release();
        hr = spBrowser->get_Document( &spDisp );
        if ( FAILED( hr ) ) continue;

        CComQIPtr<IHTMLDocument2> spDoc = spDisp;
        if ( !spDoc ) continue;   // not an HTML document (e.g. a folder view)

        // At this point we have the IHTMLDocument2 interface pointer
    }
}

<1.6> The IE browser control is hosted by some process inside a child window. First get the top-level window handle of that process (with FindWindow() or any other workable method), then enumerate all of its child windows and check each window class name for "Internet Explorer_Server" to find the browser's window handle. Finally, send a message to that window to get the document's interface pointer. The main code is as follows:

#include <atlbase.h>
#include <mshtml.h>
#include <oleacc.h>
#include <tchar.h>
#pragma comment ( lib, "oleacc" )

BOOL CALLBACK EnumChildProc( HWND hwnd, LPARAM lParam )
{
    TCHAR szClassName[100];

    // The third argument is a length in characters, not in bytes
    ::GetClassName( hwnd, szClassName, sizeof(szClassName) / sizeof(TCHAR) );
    if ( _tcscmp( szClassName, _T("Internet Explorer_Server") ) == 0 )
    {
        *(HWND*)lParam = hwnd;
        return FALSE;   // stop at the first IE control child window
    }
    return TRUE;        // keep enumerating child windows
}

void FindFromHwnd( HWND hWnd )
{
    HWND hWndChild = NULL;
    ::EnumChildWindows( hWnd, EnumChildProc, (LPARAM)&hWndChild );
    if ( NULL == hWndChild ) return;

    UINT nMsg = ::RegisterWindowMessage( _T("WM_HTML_GETOBJECT") );
    DWORD_PTR lRes = 0;
    if ( !::SendMessageTimeout( hWndChild, nMsg, 0L, 0L, SMTO_ABORTIFHUNG, 1000, &lRes ) ) return;

    CComPtr<IHTMLDocument2> spDoc;
    HRESULT hr = ::ObjectFromLresult( (LRESULT)lRes, IID_IHTMLDocument2, 0, (LPVOID*)&spDoc );
    if ( FAILED( hr ) ) return;

    // At this point we have the IHTMLDocument2 interface pointer
}

<2> Once you have the IHTMLDocument2 interface pointer, go to step <4> if the page has a single frame. If it has multiple frames (child frames), you also need to walk all of them. These child frames (IHTMLWindow2) are stored in a collection (IHTMLFramesCollection2), and getting the collection pointer is simple: read the property IHTMLDocument2::get_frames().
<3> First get the number of child frames with IHTMLFramesCollection2::get_length(), then call IHTMLFramesCollection2::item() in a loop to get the IHTMLWindow2 pointer of each child frame, and go back to step <1> for each one.
<4> A document may contain several forms, so by the same logic you first get the collection of forms (IHTMLElementCollection; it is not only used for forms, collections of other elements such as images use it too). This is also simple: read the property IHTMLDocument2::get_forms().
<5> IHTMLElementCollection::get_length() gives the number of forms, and you can then fetch each form pointer in a loop with IHTMLElementCollection::item().
<6> The item() function in step <5> returns an IDispatch pointer. By calling QueryInterface() on it you can get a pointer to a specific input type. The code is as follows:
// Assume spDisp is the IDispatch pointer obtained from IHTMLElementCollection::item()
CComQIPtr < IHTMLInputTextElement > spInputText(spDisp);
CComQIPtr < IHTMLInputButtonElement > spInputButton(spDisp);
CComQIPtr < IHTMLInputHiddenElement > spInputHidden(spDisp);
......
if ( spInputText )
{
    // a text input field
}
else if ( spInputButton )
{
    // a button input field
}
else if ( spInputHidden )
{
    // a hidden input field
}
else if ........ // other input types
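Because the approach above uses interface pointers of specific types, it runs efficiently. But querying with QueryInterface and then branching on the result is clearly tedious, so it is best suited to programs that target specific pages whose design is known in advance. In the sample program I instead work directly through the IDispatch interface, which is slightly slower to execute but makes the program simpler. The main code, with explanations, is as follows: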
#include < atlbase.h >
CComModule _Module; // required, because we use CComDispatchDriver, ATL's smart-pointer wrapper class for IDispatch
#include < atlcom.h >
......
long nElemCount=0; // total number of form fields
spFormElement->get_length( &nElemCount );

for(long j=0; j< nElemCount; j++)
{
    CComDispatchDriver spInputElement; // smart pointer for IDispatch
    spFormElement->item( CComVariant( j ), CComVariant(), &spInputElement );

    CComVariant vName,vVal,vType; // field name, field value, field type
    spInputElement.GetPropertyByName( L"name", &vName );
    spInputElement.GetPropertyByName( L"value",&vVal );
    spInputElement.GetPropertyByName( L"type", &vType );
    // The advantage of the IDispatch smart pointer is that reading and setting properties, as above, is simple
    // Calling methods is also very convenient: Invoke0(), Invoke1(), Invoke2() ...
    ......
}

暗夜中独舞 · 2007-04-20

Here is another piece of Delphi code, which I wrote myself, that reads the password from a form.
function TForm1.GetDocInterface(hwnd: THandle): IHtmlDocument2;
var
  hInst: THandle;
  hr: HResult;
  lRes: Cardinal;
  MSG: Integer;
  spDisp: IDispatch;
  spDoc: IHTMLDocument;
  spWin: IHTMLWindow2;
  ObjectFromLresult: TObjectFromLresult;
begin
  Result := nil;
  hInst := LoadLibrary('Oleacc.dll');
  if hInst = 0 then Exit;
  try
    @ObjectFromLresult := GetProcAddress(hInst, 'ObjectFromLresult');
    if @ObjectFromLresult <> nil then
    begin
      // Ask the "Internet Explorer_Server" window for its document object
      lRes := 0;
      MSG := RegisterWindowMessage('WM_HTML_GETOBJECT');
      SendMessageTimeOut(hwnd, MSG, 0, 0, SMTO_ABORTIFHUNG, 1000, lRes);
      hr := ObjectFromLresult(lRes, IHTMLDocument2, 0, spDoc);
      if SUCCEEDED(hr) then
      begin
        spDisp := spDoc.Script;
        spDisp.QueryInterface(IHTMLWindow2, spWin);
        //spWin:=IHTMLWindow2(spDisp);
        if spWin <> nil then
          Result := spWin.document;
      end;
    end;
  finally
    FreeLibrary(hInst);
  end;
end;

procedure TForm1.GetPassword(pdoc2: IHTMLDocument2; pt: TPoint);
var
  pElement: IHTMLElement;
  pPwdElement: IHTMLInputTextElement;
  hr: HRESULT;
begin
  if pDoc2 = nil then Exit;
  pElement := pDoc2.elementFromPoint(pt.X, pt.Y);
  if pElement = nil then Exit;   // no element at that point
  hr := pElement.QueryInterface(IID_IHTMLInputTextElement, pPwdElement);
  if SUCCEEDED(hr) then
  begin
    if (pPwdElement.type_ = 'password') and (pPwdElement.value <> '') then
    begin
      Edit1.Text := pPwdElement.value;
    end else
    if (pPwdElement.type_ = 'text') and (pPwdElement.value <> '') then
      Edit2.Text := pPwdElement.value;
  end;
end;

batconv · 2007-04-20

Thanks a lot, I'll give it a try.

暗夜中独舞 · 2007-04-20

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, MSHTML, OleCtrls, SHDocVw, StdCtrls, Activex, ExtCtrls;

type
  TObjectFromLResult = function(LRESULT: lResult; const IID: TIID; WPARAM: wParam; out pObject): HRESULT; stdcall;

  TForm1 = class(TForm)
    Edit1: TEdit;
    Timer1: TTimer;
    Label1: TLabel;
    Button2: TButton;
    Edit2: TEdit;
    Edit3: TEdit;
    Label2: TLabel;
    Label3: TLabel;
    procedure Timer1Timer(Sender: TObject);
    procedure Button2Click(Sender: TObject);
  private
    procedure GetPassword(pdoc2: IHTMLDocument2; pt: TPoint);
    function GetDocInterface(hwnd: THandle): IHtmlDocument2;
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

// Get the caret position, in screen coordinates
function GetPoint(): TPoint;
var
  FocusHandle: HWND;
  ForegroundThread: DWORD;
  CurrentPos: TPoint;
begin
  // Attach to the foreground window's input queue first, so that GetFocus
  // and GetCaretPos report that thread's focus window and caret
  ForegroundThread := GetWindowThreadProcessId(GetForegroundWindow(), nil);
  AttachThreadInput(GetCurrentThreadId, ForegroundThread, True);
  FocusHandle := GetFocus();
  GetCaretPos(CurrentPos);
  Windows.ClientToScreen(FocusHandle, CurrentPos);
  AttachThreadInput(GetCurrentThreadId, ForegroundThread, False);
  Result := CurrentPos;
end;

function TForm1.GetDocInterface(hwnd: THandle): IHtmlDocument2;
var
  hInst: THandle;
  hr: HResult;
  lRes: Cardinal;
  MSG: Integer;
  spDisp: IDispatch;
  spDoc: IHTMLDocument;
  spWin: IHTMLWindow2;
  ObjectFromLresult: TObjectFromLresult;
begin
  Result := nil;
  hInst := LoadLibrary('Oleacc.dll');
  if hInst = 0 then Exit;
  try
    @ObjectFromLresult := GetProcAddress(hInst, 'ObjectFromLresult');
    if @ObjectFromLresult <> nil then
    begin
      // Ask the "Internet Explorer_Server" window for its document object
      lRes := 0;
      MSG := RegisterWindowMessage('WM_HTML_GETOBJECT');
      SendMessageTimeOut(hwnd, MSG, 0, 0, SMTO_ABORTIFHUNG, 1000, lRes);
      hr := ObjectFromLresult(lRes, IHTMLDocument2, 0, spDoc);
      if SUCCEEDED(hr) then
      begin
        spDisp := spDoc.Script;
        spDisp.QueryInterface(IHTMLWindow2, spWin);
        //spWin:=IHTMLWindow2(spDisp);
        if spWin <> nil then
          Result := spWin.document;
      end;
    end;
  finally
    FreeLibrary(hInst);
  end;
end;

procedure TForm1.GetPassword(pdoc2: IHTMLDocument2; pt: TPoint);
var
  pElement: IHTMLElement;
  pPwdElement: IHTMLInputTextElement;
  hr: HRESULT;
begin
  if pDoc2 = nil then Exit;
  pElement := pDoc2.elementFromPoint(pt.X, pt.Y);
  if pElement = nil then Exit;   // no element at that point
  hr := pElement.QueryInterface(IID_IHTMLInputTextElement, pPwdElement);
  if SUCCEEDED(hr) then
  begin
    if (pPwdElement.type_ = 'password') and (pPwdElement.value <> '') then
    begin
      Edit1.Text := pPwdElement.value;
    end else
    if (pPwdElement.type_ = 'text') and (pPwdElement.value <> '') then
      Edit2.Text := pPwdElement.value;
  end;
end;

procedure TForm1.Timer1Timer(Sender: TObject);
var
  pt: TPoint;
  handle: THandle;
  buffer: array[0..99] of Char;   // receives the window class name
  strbuffer: string;
begin
  //GetCursorPos(pt);
  pt := GetPoint;
  handle := WindowFromPoint(pt);
  if handle <> 0 then
  begin
    GetClassName(handle, buffer, Length(buffer));
    strbuffer := StrPas(buffer);
    if strbuffer = 'Internet Explorer_Server' then
    begin
      // pt:=ScreenToClient(pt);
      Windows.ScreenToClient(handle, pt);
      Edit3.Text := IntToStr(pt.X) + ' ' + IntToStr(pt.Y);
      GetPassword(GetDocInterface(handle), pt);
    end;
  end;
end;

procedure TForm1.Button2Click(Sender: TObject);
begin
  // Toggle the polling timer
  if Timer1.Enabled then
  begin
    Timer1.Enabled := False;
    Button2.Caption := 'Start';
  end else
  begin
    Timer1.Enabled := True;
    Button2.Caption := 'Stop';
  end;
end;

end.

This is the complete code!

batconv · 2007-04-22

Thanks.
