如何从IE中拷贝文本?(50分)

  • 主题发起人 主题发起人 grhunter
  • 开始时间 开始时间
G

grhunter

Unregistered / Unconfirmed
GUEST, unregistred user!
; 不能使用clipboard.astext,我想应该用getbuffer,不知对不对?请各位高手解答。
 
来自:grhunter 时间:00-5-2 18:55:13


下面是MSDN里面的说法。原理明白了,但还是不知道在Delphi里面应该如何做。还有,
前面问题里面应该是clipboard.gettextbuff,写错了。

HTML Clipboard Format

The requirements for transferring HTML text by means of the clipboard differ
depending on scenario. This article is concerned with cutting and pasting
fragments of an HTML document. There may be requirements for transferring
entire HTML documents through the clipboard; however, this article is driven
by a requirement to transfer fragments of selected HTML text. As such, a
method that required the entire HTML document to be copied to the clipboard
is seen as too heavyweight.

The CF_HTML clipboard format allows a fragment of raw HTML text and its
context to be stored on the clipboard as ASCII. This allows the context
of the HTML fragment, which consists of all preceding surrounding tags,
to be examined by an application so that the surrounding tags of the HTML
fragment can be noted with their attributes. Although it is up to
applications to decide how to interpret such fragments, some basic
guidelines are included here based on IE4/MSHTML implementations.

The official name of the clipboard (the string used by RegisterClipboardFormat)
is HTML Format.

Description


Scenarios


Description
CF_HTML is entirely text format (to be, among other things, in the HTML
spirit, and uses UTF-8) and includes a description, an optional context,
and, within the context, the fragment.

The following is an example of a clipboard:

Version:0.9
StartHTML:71
EndHTML:170
StartFragment:140
EndFragment:160
StartSelection:140
EndSelection:160
<!DOCTYPE>
<HTML>
<HEAD>
<TITLE> The HTML Clipboard</TITLE>
<BASE HREF="http://sample/specs">
</HEAD>
<BODY>
<UL>
<!--StartFragment -->
<LI> The Fragment </LI>
<!--EndFragment -->
</UL>
</BODY>
</HTML>

The description includes the clipboard version number and offsets, indicating
where the context and the fragment start and end. The description is a list
of ASCII text keywords (min/maj is not meaningful) followed by a string
and separated by a colon (:).

Version: vv version number of the clipboard. Starting version is 0.9.
StartHTML: bytecount from the beginning of the clipboard to the start of the
context, or -1 if no context.
EndHTML: bytecount from the beginning of the clipboard to the end of the
context, or -1 if no context.
StartFragment: bytecount from the beginning of the clipboard to the start
of the fragment.
EndFragment: bytecount from the beginning of the clipboard to the end of
the fragment.
StartSelection: bytecount from the beginning of the clipboard to the start
of the selection.
EndSelection: bytecount from the beginning of the clipboard to the end of
the selection.


The StartSelection and EndSelection keywords are optional and must both be
omitted if you do not want the application to generate this information.

Other information of this kind could be added here later, since the HTML
starts at the StartHTMLoffset. For example, multiple pairs of
StartFragment / EndFragment could be added later to support noncontiguous
selection of fragments.

For the convenience of the programs generating the bytecounts, bytecounts
could be left padded by zeros. For example, programs generating the
bytecounts could arbitrarily affect ten (10) zeros to each keyword
(StartHTML: 0000000000) and then, when the exact StartHTML bytecount is
known (say, 71), the program can replace the appropriate number of zeros
by the bytecount (StartHTML: 0000000071).

The only character set supported by the clipboard is Unicode in its UTF-8
encoding. Because the first characters of UTF-8 and ASCII match, the
description is always ASCII, but the bytes of the context (starting at
StartHTML) could be using any other characters coded in UTF-8.

End of lines in the clipboard format header could be CR or CR/LF or LF.

The fragment contains pure, valid HTML representing the area the user has
selected (to Copy, for example). This contains the selected text plus the
opening tags and attributes of any element that has an end tag within the
selected text, and end tags at the end of the fragment for any start tag
included. This is all information required for basic pasting of an HTML
fragment.

The fragment should be preceded and followed by the HTML comments
<!--StartFragment--> and <!--EndFragment--> (no space allowed between
the !-- and the text) to conveniently indicate where the fragment starts
and ends. Thus the start and end of the fragment are indicated by the
presence of these comments and by StartFragment and EndFragment byte counts
in the description. Tools are expected to produce this information. This
redundancy has been introduced to be able to rapidly find the start of
the fragment (from the byte count) and mark the position of the fragment
directly in the HTML tree.

The selection indicates inside the fragment the exact HTML area the user
has selected (to Copy, for example). This adds more information to the
fragment by indicating the exact selected text, without the opening tags
and end tags that have been added to ensure the fragment is well-formed HTML.

The selection is optional, as sufficient information is included in the
fragment for basic pasting. If the selection is not stored, both
StartSelection and EndSelection are not stored in the header.

The context is a valid, complete HTML document. This article contains the
fragment and all preceding surrounding tags (start and end tags; these
preceding surrounding tags represent all the parent nodes of the fragment,
until the HTML node). The article also contains the complete HEAD, and
allows the BASE and TITLE elements, for example, to be included so this
additional information can be obtained. An application copying a fragment
of HTML to the clipboard can choose to create a BASE element to include in
the context if such an element is not already present so that partial URLs
in the fragment can be resolved.

The context is optional, as sufficient information is included in the
fragment for basic pasting of an HTML fragment to take place. If the
context is not stored, the fragment only is stored and the
StartHTML=EndHTML=-1.

Scenarios
The following scenarios describe how the IE4/MSHTML HTML editor handles
HTML cut and paste; other applications may or may not follow these scenarios.
The clipboard format described here is intended to allow flexibility for how
an application chooses to function. (These scenarios show only good HTML,
that is, no overlapping tags.)

Simple Fragment of HTML.
HTML text:
<BODY>This is normal <B>This is bold </B><I><B>This is bold
italic </B>This is italic </I></BODY>
Appears as:
This is normal This is bold This is bold italic This is italic
The text between the ** is selected and copied to the clipboard:
This is normal This is **bold This is bold italic This** is italic
This is what will be on the clipboard (note this is IE4/MSHTML's interpretation):
Version:0.9
StartHTML:71
EndHTML:160
StartFragment:130
EndFragment:150
StartSelection:130
EndSelection:150
<!DOCTYPE ...>
<HTML>
<BODY>
<!--StartFragment-->
<B>bold </B><I><B>This is bold italic </B>This</I>
<!--EndFragment-->
</BODY>
</HTML>
In this scenario only the BODY tag and the HTML tag appear in the context as
it precedes the selected fragment. Note that start tags and end tags are
included in the context. The selection, as delimited by StartSelection and
EndSelection, is shown in bold.
Fragment of a table in HTML.
HTML text:
<BODY><TABLE BORDER><TR><TH ROWSPAN=2>Head1</TH><TD>Item 1
</TD> <TD>Item 2</TD> <TD>Item 3</TD> <TD>Item 4</TD>
</TR><TR><TD>Item 5</TD> <TD>Item 6</TD> <TD>Item 7
</TD> <TD>Item 8</TD></TR><TR><TH>Head2</TH><TD>
Item 9</TD> <TD>Item 10</TD> <TD>Item 11</TD> <TD>Item 12
</TD></TR></TABLE></BODY>
Appears as: Head1 Item 1 Item 2 Item 3 Item 4
Item 5 Item 6 Item 7 Item 8
Head2 Item 9 Item 10 Item 11 Item 12


The Item 6, Item7, Item 10, and Item 11 elements of the table are selected as
a block and copied to the clipboard.
This is what will be on the clipboard (note this is IE4/MSHTML's interpretation):
<!DOCTYPE ...>
<HTML><BODY><TABLE BORDER>
<!--StartFragment-->
<TR><TD>Item 6</TD> <TD>Item 7</TD></TR><TR><TD>Item 10
</TD> <TD>Item 11</TD></TR>
<!--EndFragment-->
</TABLE>
</BODY></HTML>
The selection, as delimited by StartSelection and EndSelection, is shown in
bold.
Pasting a fragment of an ordered list into plain text.
HTML text:
<BODY><OL TYPE = a><LI>Item 1<LI>Item 2<LI>Item 3<LI>Item 4
<LI>Item 5<LI>Item 6</OL></BODY>
Appears as:
Item 1
Item 2
Item 3
Item 4
Item 5
Item 6
The user selects and copies items 3 through 5 to the clipboard.
The following HTML is in the clipboard:
<DOCTYPE...><HTML><BODY><OL TYPE = a>
<!-- StartFragment-->
<LI>Item 3<LI>Item 4<LI>Item 5
<!-- EndFragment-->
</OL></BODY></HTML>
The selection, as delimited by StartSelection and EndSelection, is show in bold.
If this fragment is now pasted into an empty document, the following HTML will
be created:
<BODY><OL TYPE = a><LI>Item 3<LI>Item 4<LI>Item 5</OL>
</BODY>
Appearing as:
Item 3
Item 4
Item 5
Pasting a partially selected region.
HTML text:
<P> IE4/MSHTML is a WYSIWYG Editor that supports :<UL><LI>Cut<LI>
Copy<LI>Paste
</UL> <P>This is a Great Tool !
Appears as:
IE4/MSHTML is a WYSIWYG Editor that supports :

Cut
Copy
Paste
The user selects from "WYSIWYG" until "Cop".
The following HTML is in the clipboard:
<DOCTYPE...><HTML><BODY>
<!-- StartFragment-->
<P>
WYSIWYG Editor, which supports
<UL><LI>Cut<LI>Cop
</UL>
<!-- EndFragment-->
</BODY></HTML>
The selection, as delimited by StartSelection and EndSelection, is shown in
bold.
The user selects from "opy" until "Great".
The following HTML is in the clipboard:
<DOCTYPE...><HTML><BODY>
<!-- StartFragment-->
<UL><LI>
opy<LI>Paste</UL><P> This is a Great
</P>
<!-- EndFragment-->
</BODY></HTML>
The selection, as delimited by StartSelection and EndSelection, is shown in
bold.
 
若只是需要文本的话,clipboard.paste
 
下面的代码得到Webbrowser里面选中的文字并将其送到剪贴板。
var
Doc: IHtmlDocument2;
TxtRange: IHtmlTxtRange;
begin
Doc :=WebBrowser1.Document as IHtmlDocument2;
TxtRange :=Doc.Selection.CreateRange as IHtmlTxtRange;
Clipboard.SetTextBuf(Pchar(TxtRange.Get_text));
end;
 
多人接受答案了。
 
后退
顶部