socks 2 http 编程问题！ (120分)

coolbaby · 2002-07-27

环境：我们公司的NAT权限没开，所以无法通过udp与外界发送数据包，终端通过一个
http代理服务器上网。可以用Socks Online上qq等，不过奇慢！
现在程序编写中遇到以下问题，请高手指教。

1.如果一个数据包要借http proxy转发的话，是不是也可以直接构造一个http请求，在这个
http请求里，把这个数据包放进去。如果可以的话，该放在http请求的哪个位置，或者，
如何组包发给http proxy。
2.http-tunnel是不是要用到https,而且，https用的是443端口，对于我这种情况是否可行？
Socks Online 好象用的就是http-tunnel.

coolbaby · 2002-07-28

gxcooo · 2002-07-28

我认为Socks Online 是通过http请求转发的数据包（好像Socks Online 有服务器端）。
所以你必须有可以IP直通的服务器来转发客户端的请求。

xiao_min · 2002-07-28

呵呵，下星期四左右我可以回去，到时候我们再调试

wlmmlw · 2002-07-28

这么难的问题。

盛利 · 2002-07-28

我到是用过socks2http这么一个软件，外国的
你要自己写一个啊
佩服！

coolbaby · 2002-07-28

to 盛利：
没办法，socksOnline实在是太慢了，无法忍受。
至于socks2http，我一直都没成功的用过他，也是照其他软件那样配置的
socks2http好象根本就不接收qq的数据。不知道你是如何配置的。
能否指点一二

coolbaby · 2002-07-28

如果谁知道这个问题，也就差不多了
就是http如何发送一个二进制的数据包，比如常见的asp里的无组件上传
web 邮局里的粘贴附件等。。。
他们的http请求是什么格式的

gxcooo · 2002-07-28

asp里的无组件上传是需要自己编码的

coolbaby · 2002-07-28

对呀，要的就是他的http编码格式

snowsky · 2002-07-28

请参照RFC-1867 HTTP 请求

encType= "multipart/form-data"

以下是RFC-1867

Network Working Group E. Nebel
Request For Comments: 1867 L. Masinter
Category: Experimental Xerox Corporation
November 1995

Form-based File Upload in HTML

Status of this Memo

This memo defines an Experimental Protocol for the Internet
community. This memo does not specify an Internet standard of any
kind. Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.

1. Abstract

Currently, HTML forms allow the producer of the form to request
information from the user reading the form. These forms have proven
useful in a wide variety of applications in which input from the user
is necessary. However, this capability is limited because HTML forms
don't provide a way to ask the user to submit files of data. Service
providers who need to get files from the user have had to implement
custom user applications. (Examples of these custom browsers have
appeared on the www-talk mailing list.) Since file-upload is a
feature that will benefit many applications, this proposes an
extension to HTML to allow information providers to express file
upload requests uniformly, and a MIME compatible representation for
file upload responses. This also includes a description of a
backward compatibility strategy that allows new servers to interact
with the current HTML user agents.

The proposal is independent of which version of HTML it becomes a
part.

2. HTML forms with file submission

The current HTML specification defines eight possible values for the
attribute TYPE of an INPUT element: CHECKBOX, HIDDEN, IMAGE,
PASSWORD, RADIO, RESET, SUBMIT, TEXT.

In addition, it defines the default ENCTYPE attribute of the FORM
element using the POST METHOD to have the default value
"application/x-www-form-urlencoded".

This proposal makes two changes to HTML:

1) Add a FILE option for the TYPE attribute of INPUT.
2) Allow an ACCEPT attribute for INPUT tag, which is a list of
media types or type patterns allowed for the input.

In addition, it defines a new MIME media type, multipart/form-data,
and specifies the behavior of HTML user agents when interpreting a
form with ENCTYPE="multipart/form-data" and/or <INPUT type="file">
tags.

These changes might be considered independently, but are all
necessary for reasonable file upload.

The author of an HTML form who wants to request one or more files
from a user would write (for example):

<FORM ENCTYPE="multipart/form-data" ACTION="_URL_" METHOD=POST>

File to process: <INPUT NAME="userfile1" TYPE="file">

<INPUT TYPE="submit" VALUE="Send File">

</FORM>

The change to the HTML DTD is to add one item to the entity
"InputType". In addition, it is proposed that the INPUT tag have an
ACCEPT attribute, which is a list of comma-separated media types.

... (other elements) ...

<!ENTITY % InputType "(TEXT | PASSWORD | CHECKBOX |
RADIO | SUBMIT | RESET |
IMAGE | HIDDEN | FILE )">
<!ELEMENT INPUT - 0 EMPTY>
<!ATTLIST INPUT
TYPE %InputType TEXT
NAME CDATA #IMPLIED -- required for all but submit and reset
VALUE CDATA #IMPLIED
SRC %URI #IMPLIED -- for image inputs --
CHECKED (CHECKED) #IMPLIED
SIZE CDATA #IMPLIED --like NUMBERS,
but delimited with comma, not space
MAXLENGTH NUMBER #IMPLIED
ALIGN (top|middle|bottom) #IMPLIED
ACCEPT CDATA #IMPLIED --list of content types
>

... (other elements) ...

3. Suggested implementation

While user agents that interpret HTML have wide leeway to choose the
most appropriate mechanism for their context, this section suggests
how one class of user agent, WWW browsers, might implement file
upload.

3.1 Display of FILE widget

When a INPUT tag of type FILE is encountered, the browser might show
a display of (previously selected) file names, and a "Browse" button
or selection method. Selecting the "Browse" button would cause the
browser to enter into a file selection mode appropriate for the
platform. Window-based browsers might pop up a file selection window,
for example. In such a file selection dialog, the user would have the
option of replacing a current selection, adding a new file selection,
etc. Browser implementors might choose let the list of file names be
manually edited.

If an ACCEPT attribute is present, the browser might constrain the
file patterns prompted for to match those with the corresponding
appropriate file extensions for the platform.

3.2 Action on submit

When the user completes the form, and selects the SUBMIT element, the
browser should send the form data and the content of the selected
files. The encoding type application/x-www-form-urlencoded is
inefficient for sending large quantities of binary data or text
containing non-ASCII characters. Thus, a new media type,
multipart/form-data, is proposed as a way of efficiently sending the
values associated with a filled-out form from client to server.

3.3 use of multipart/form-data

The definition of multipart/form-data is included in section 7. A
boundary is selected that does not occur in any of the data. (This
selection is sometimes done probabilisticly.) Each field of the form
is sent, in the order in which it occurs in the form, as a part of
the multipart stream. Each part identifies the INPUT name within the
original HTML form. Each part should be labelled with an appropriate
content-type if the media type is known (e.g., inferred from the file
extension or operating system typing information) or as
application/octet-stream.

If multiple files are selected, they should be transferred together
using the multipart/mixed format.

While the HTTP protocol can transport arbitrary BINARY data, the
default for mail transport (e.g., if the ACTION is a "mailto:" URL)
is the 7BIT encoding. The value supplied for a part may need to be
encoded and the "content-transfer-encoding" header supplied if the
value does not conform to the default encoding. [See section 5 of
RFC 1521 for more details.]

The original local file name may be supplied as well, either as a
'filename' parameter either of the 'content-disposition: form-data'
header or in the case of multiple files in a 'content-disposition:
file' header of the subpart. The client application should make best
effort to supply the file name; if the file name of the client's
operating system is not in US-ASCII, the file name might be
approximated or encoded using the method of RFC 1522. This is a
convenience for those cases where, for example, the uploaded files
might contain references to each other, e.g., a TeX file and its .sty
auxiliary style description.

On the server end, the ACTION might point to a HTTP URL that
implements the forms action via CGI. In such a case, the CGI program
would note that the content-type is multipart/form-data, parse the
various fields (checking for validity, writing the file data to local
files for subsequent processing, etc.).

3.4 Interpretation of other attributes

The VALUE attribute might be used with <INPUT TYPE=file> tags for a
default file name. This use is probably platform dependent. It might
be useful, however, in sequences of more than one transaction, e.g.,
to avoid having the user prompted for the same file name over and
over again.

The SIZE attribute might be specified using SIZE=width,height, where
width is some default for file name width, while height is the
expected size showing the list of selected files. For example, this
would be useful for forms designers who expect to get several files
and who would like to show a multiline file input field in the
browser (with a "browse" button beside it, hopefully). It would be
useful to show a one line text field when no height is specified
(when the forms designer expects one file, only) and to show a
multiline text area with scrollbars when the height is greater than 1
(when the forms designer expects multiple files).

4. Backward compatibility issues

While not necessary for successful adoption of an enhancement to the
current WWW form mechanism, it is useful to also plan for a migration
strategy: users with older browsers can still participate in file
upload dialogs, using a helper application. Most current web browers,
when given <INPUT TYPE=FILE>, will treat it as <INPUT TYPE=TEXT> and
give the user a text box. The user can type in a file name into this
text box. In addition, current browsers seem to ignore the ENCTYPE
parameter in the <FORM> element, and always transmit the data as
application/x-www-form-urlencoded.

Thus, the server CGI might be written in a way that would note that
the form data returned had content-type application/x-www-form-
urlencoded instead of multipart/form-data, and know that the user was
using a browser that didn't implement file upload.

In this case, rather than replying with a "text/html" response, the
CGI on the server could instead send back a data stream that a helper
application might process instead; this would be a data stream of
type "application/x-please-send-files", which contains:

* The (fully qualified) URL to which the actual form data should
be posted (terminated with CRLF)
* The list of field names that were supposed to be file contents
(space separated, terminated with CRLF)
* The entire original application/x-www-form-urlencoded form data
as originally sent from client to server.

In this case, the browser needs to be configured to process
application/x-please-send-files to launch a helper application.

The helper would read the form data, note which fields contained
'local file names' that needed to be replaced with their data
content, might itself prompt the user for changing or adding to the
list of files available, and then repackage the data & file contents
in multipart/form-data for retransmission back to the server.

The helper would generate the kind of data that a 'new' browser
should actually have sent in the first place, with the intention that
the URL to which it is sent corresponds to the original ACTION URL.
The point of this is that the server can use the *same* CGI to
implement the mechanism for dealing with both old and new browsers.

The helper need not display the form data, but *should* ensure that
the user actually be prompted about the suitability of sending the
files requested (this is to avoid a security problem with malicious
servers that ask for files that weren't actually promised by the

user.) It would be useful if the status of the transfer of the files
involved could be displayed.

5. Other considerations

5.1 Compression, encryption

This scheme doesn't address the possible compression of files. After
some consideration, it seemed that the optimization issues of file
compression were too complex to try to automatically have browsers
decide that files should be compressed. Many link-layer transport
mechanisms (e.g., high-speed modems) perform data compression over
the link, and optimizing for compression at this layer might not be
appropriate. It might be possible for browsers to optionally produce
a content-transfer-encoding of x-compress for file data, and for
servers to decompress the data before processing, if desired; this
was left out of the proposal, however.

Similarly, the proposal does not contain a mechanism for encryption
of the data; this should be handled by whatever other mechanisms are
in place for secure transmission of data, whether via secure HTTP or
mail.

5.2 Deferred file transmission

In some situations, it might be advisable to have the server validate
various elements of the form data (user name, account, etc.) before
actually preparing to receive the data. However, after some
consideration, it seemed best to require that servers that wish to do
this should implement this as a series of forms, where some of the
data elements that were previously validated might be sent back to
the client as 'hidden' fields, or by arranging the form so that the
elements that need validation occur first. This puts the onus of
maintaining the state of a transaction only on those servers that
wish to build a complex application, while allowing those cases that
have simple input needs to be built simply.

The HTTP protocol may require a content-length for the overall
transmission. Even if it were not to do so, HTTP clients are
encouraged to supply content-length for overall file input so that a
busy server could detect if the proposed file data is too large to be
processed reasonably and just return an error code and close the
connection without waiting to process all of the incoming data. Some
current implementations of CGI require a content-length in all POST
transactions.

If the INPUT tag includes the attribute MAXLENGTH, the user agent
should consider its value to represent the maximum Content-Length (in

bytes) which the server will accept for transferred files. In this
way, servers can hint to the client how much space they have
available for a file upload, before that upload takes place. It is
important to note, however, that this is only a hint, and the actual
requirements of the server may change between form creation and file
submission.

In any case, a HTTP server may abort a file upload in the middle of
the transaction if the file being received is too large.

5.3 Other choices for return transmission of binary data

Various people have suggested using new mime top-level type
"aggregate", e.g., aggregate/mixed or a content-transfer-encoding of
"packet" to express indeterminate-length binary data, rather than
relying on the multipart-style boundaries. While we are not opposed
to doing so, this would require additional design and standardization
work to get acceptance of "aggregate". On the other hand, the
'multipart' mechanisms are well established, simple to implement on
both the sending client and receiving server, and as efficient as
other methods of dealing with multiple combinations of binary data.

5.4 Not overloading <INPUT>:

Various people have wondered about the advisability of overloading
'INPUT' for this function, rather than merely providing a different
type of FORM element. Among other considerations, the migration
strategy which is allowed when using <INPUT> is important. In
addition, the <INPUT> field *is* already overloaded to contain most
kinds of data input; rather than creating multiple kinds of <INPUT>
tags, it seems most reasonable to enhance <INPUT>. The 'type' of
INPUT is not the content-type of what is returned, but rather the
'widget-type'; i.e., it identifies the interaction style with the
user. The description here is carefully written to allow <INPUT
TYPE=FILE> to work for text browsers or audio-markup.

5.5 Default content-type of field data

Many input fields in HTML are to be typed in. There has been some
ambiguity as to how form data should be transmitted back to servers.
Making the content-type of <INPUT> fields be text/plain clearly
disambiguates that the client should properly encode the data before
sending it back to the server with CRLFs.

5.6 Allow form ACTION to be "mailto:"

Independent of this proposal, it would be very useful for HTML
interpreting user agents to allow a ACTION in a form to be a

"mailto:" URL. This seems like a good idea, with or without this
proposal. Similarly, the ACTION for a HTML form which is received via
mail should probably default to the "reply-to:" of the message.
These two proposals would allow HTML forms to be served via HTTP
servers but sent back via mail, or, alternatively, allow HTML forms
to be sent by mail, filled out by HTML-aware mail recipients, and the
results mailed back.

5.7 Remote files with third-party transfer

In some scenarios, the user operating the client software might want
to specify a URL for remote data rather than a local file. In this
case, is there a way to allow the browser to send to the client a
pointer to the external data rather than the entire contents? This
capability could be implemented, for example, by having the client
send to the server data of type "message/external-body" with
"access-type" set to, say, "uri", and the URL of the remote data in
the body of the message.

5.8 File transfer with ENCTYPE=x-www-form-urlencoded

If a form contains <INPUT TYPE=file> elements but does not contain an
ENCTYPE in the enclosing <FORM>, the behavior is not specified. It
is probably inappropriate to attempt to URN-encode large quantities
of data to servers that don't expect it.

5.9 CRLF used as line separator

As with all MIME transmissions, CRLF is used as the separator for
lines in a POST of the data in multipart/form-data.

5.10 Relationship to multipart/related

The MIMESGML group is proposing a new type called multipart/related.
While it contains similar features to multipart/form-data, the use
and application of form-data is different enough that form-data is
being described separately.

It might be possible at some point to encode the result of HTML forms
(including files) in a multipart/related body part; this is not
incompatible with this proposal.

5.11 Non-ASCII field names

Note that mime headers are generally required to consist only of 7-
bit data in the US-ASCII character set. Hence field names should be
encoded according to the prescriptions of RFC 1522 if they contain
characters outside of that set. In HTML 2.0, the default character

set is ISO-8859-1, but non-ASCII characters in field names should be
encoded.

6. Examples

Suppose the server supplies the following HTML:

<FORM ACTION="http://server.dom/cgi/handle"
ENCTYPE="multipart/form-data"
METHOD=POST>
What is your name? <INPUT TYPE=TEXT NAME=submitter>
What files are you sending? <INPUT TYPE=FILE NAME=pics>
</FORM>

and the user types "Joe Blow" in the name field, and selects a text
file "file1.txt" for the answer to 'What files are you sending?'

The client might send back the following data:

Content-type: multipart/form-data, boundary=AaB03x

--AaB03x
content-disposition: form-data; name="field1"

Joe Blow
--AaB03x
content-disposition: form-data; name="pics"; filename="file1.txt"
Content-Type: text/plain

... contents of file1.txt ...
--AaB03x--

If the user also indicated an image file "file2.gif" for the answer
to 'What files are you sending?', the client might client might send
back the following data:

Content-type: multipart/form-data, boundary=AaB03x

--AaB03x
content-disposition: form-data; name="field1"

Joe Blow
--AaB03x
content-disposition: form-data; name="pics"
Content-type: multipart/mixed, boundary=BbC04y

--BbC04y
Content-disposition: attachment; filename="file1.txt"

Content-Type: text/plain

... contents of file1.txt ...
--BbC04y
Content-disposition: attachment; filename="file2.gif"
Content-type: image/gif
Content-Transfer-Encoding: binary

...contents of file2.gif...
--BbC04y--
--AaB03x--

7. Registration of multipart/form-data

The media-type multipart/form-data follows the rules of all multipart
MIME data streams as outlined in RFC 1521. It is intended for use in
returning the data that comes about from filling out a form. In a
form (in HTML, although other applications may also use forms), there
are a series of fields to be supplied by the user who fills out the
form. Each field has a name. Within a given form, the names are
unique.

multipart/form-data contains a series of parts. Each part is expected
to contain a content-disposition header where the value is "form-
data" and a name attribute specifies the field name within the form,
e.g., 'content-disposition: form-data; name="xxxxx"', where xxxxx is
the field name corresponding to that field. Field names originally in
non-ASCII character sets may be encoded using the method outlined in
RFC 1522.

As with all multipart MIME types, each part has an optional Content-
Type which defaults to text/plain. If the contents of a file are
returned via filling out a form, then the file input is identified as
application/octet-stream or the appropriate media type, if known. If
multiple files are to be returned as the result of a single form
entry, they can be returned as multipart/mixed embedded within the
multipart/form-data.

Each part may be encoded and the "content-transfer-encoding" header
supplied if the value of that part does not conform to the default
encoding.

File inputs may also identify the file name. The file name may be
described using the 'filename' parameter of the "content-disposition"
header. This is not required, but is strongly recommended in any case
where the original filename is known. This is useful or necessary in
many applications.

8. Security Considerations

It is important that a user agent not send any file that the user has
not explicitly asked to be sent. Thus, HTML interpreting agents are
expected to confirm any default file names that might be suggested
with <INPUT TYPE=file VALUE="yyyy">. Never have any hidden fields be
able to specify any file.

This proposal does not contain a mechanism for encryption of the
data; this should be handled by whatever other mechanisms are in
place for secure transmission of data, whether via secure HTTP, or by
security provided by MOSS (described in RFC 1848).

Once the file is uploaded, it is up to the receiver to process and
store the file appropriately.

9. Conclusion

The suggested implementation gives the client a lot of flexibility in
the number and types of files it can send to the server, it gives the
server control of the decision to accept the files, and it gives
servers a chance to interact with browsers which do not support INPUT
TYPE "file".

The change to the HTML DTD is very simple, but very powerful. It
enables a much greater variety of services to be implemented via the
World-Wide Web than is currently possible due to the lack of a file
submission facility. This would be an extremely valuable addition to
the capabilities of the World-Wide Web.

coolbaby · 2002-07-29

谢谢snowsky，我看了rfc1867
可惜讲得不是很详细，还是没讲具体的封包格式
我再看看吧

wingerk · 2002-08-02

傻，CSDN有一篇文章分析HTTP TUNNEL的实现的，自己找找看吧

远帆 · 2002-08-02

我认为这个不单单是写一个客户端就能搞定的。你还需要一台服务器和服务器端软件啊。
socksOnline为什么慢？还不是因为服务器不行吗？

Luis Pater · 2002-08-02

好东西！抬一把！

coolbaby · 2002-08-02

to 远帆:
我上面的意思就是要一个客户端和一个服务器端，我所问的就是客户端如何通过http协议把消息
发给服务器端。

to wingerk:
那篇文章看了，没什么用，基本上。

coolbaby · 2002-08-03

能否给个示范的http请求头
必须能携带一定长度的数据
300以上字节就可以了

wingerk · 2002-08-05

那篇文章还是有一定的参考价值，比如那里就有你所需要的HTTP请求头，以及指定可以传送
的数据量（就是CONTENT-LENGTH那项），其实可以不设，断开连接，一般是由服务器来控制，
只要你写的服务器不断开就仍然保持连接，当然，也不排除PROXY会把连接断掉，如果有这种
情况，则只要把CONTENT-LENGTH设得很大，以及断开再重连，应该可以解决。
我记得那篇文章对这些东西都讲得很详细。实现的一些思路也有介绍。仔细看看吧。

wingerk · 2002-08-05

看看别人说的：
flywhc (2002-3-13 15:13:34)

看他写的分析就知道这个人沽名钓誉，不太懂http协议！这里使用的是waysonline等等
许多站都提供的http -> sock5代理转换类似的技术看他的分析，其实这个http -tunnel
的原理是: 用两个单向的tcp连接代替一个标准的双向tcp连接。因为http协议规定一个
tcp连接要么是request，要么是response，不可以同时收发。两个tcp连接中，第一个
tcp连接使用post方法，因为post方法的http body部分理论上可以发送无限长度的数据，
这就模拟了一个tcp连接的client发送部分而另一个是作为client接受，需要能接收无限
长度的数据，因此会这样： HTTP/1.1 200 OK Content-Length: 102400 这里的102400给
个最大的下载数据量，当传够以后应该重新建立连接再次GET发送请求，但其实不断开连
接也是可以的，既然server 和client都是自己做的，不必遵守http，这里我认为是用于
应付某些proxy的需要。这样有发送有接受就构成了完整的一个tcp连接，后面传什么数
据当然就不需要GET/POST等等东西，而且完全符合HTTP协议！理论上不会被发现的。这
样作的好处就是可以做一个穿透只支持http防火墙的木马，但条件是你有一个静态IP的
server。网络神偷使用公共个人主页保存IP地址解决了这个问题，但还使用了UDP协议，
而且使用标准的tcp连接，不能穿透标准http proxy。我还有一个设想，用个
http/mail/ftp server做中转，这样C/S两边就可以都在firewall后，而且被黑者用什么
方法都无法知道是谁在控制自己的机器，而且可以做到批处理离线控制节省控制者的网
费。唉，一直想做，可惜没空作

coolbaby · 2002-08-05

好的，谢谢wingerk
我有时间再看看那篇文档

你有qq么？有时间交流一下
另外，你有flywhc 的qq么
我也很想和他交流一下。

socks 2 http 编程问题！ (120分)

coolbaby

Unregistered / Unconfirmed

coolbaby

Unregistered / Unconfirmed

gxcooo

Unregistered / Unconfirmed

xiao_min

Unregistered / Unconfirmed

wlmmlw

Unregistered / Unconfirmed

盛利

Unregistered / Unconfirmed

coolbaby

Unregistered / Unconfirmed

coolbaby

Unregistered / Unconfirmed

gxcooo

Unregistered / Unconfirmed

coolbaby

Unregistered / Unconfirmed

snowsky

Unregistered / Unconfirmed

coolbaby

Unregistered / Unconfirmed

wingerk

Unregistered / Unconfirmed

远帆

Unregistered / Unconfirmed

Luis Pater

Unregistered / Unconfirmed

coolbaby

Unregistered / Unconfirmed

coolbaby

Unregistered / Unconfirmed

wingerk

Unregistered / Unconfirmed

wingerk

Unregistered / Unconfirmed

coolbaby

Unregistered / Unconfirmed

Similar threads