[thelist] charset, multipart/form-data and application/x-www-form-urlencoded
Bill Moseley
moseley at hank.org
Sat Jan 23 11:40:11 CST 2010
I have a working application that is all utf-8. On my web forms I either
use this:
<form method="post" action="..." accept-charset="utf-8">
Or if I have any upload fields on the form I use:
<form method="post" action="..." enctype="multipart/form-data"
accept-charset="utf-8">
Note that the accept-charset attribute asks the client to encode in utf-8.
With the two browsers I tested (Firefox 3 and Chrome), the respective
Content-Type headers in the POST are:
Content-Type: application/x-www-form-urlencoded
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Content-Type: multipart/form-data;
boundary=---------------------------124668924421214781111253174633
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
(I realize the Accept-Charset header is not relevant to my question below.)
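As a quick check that neither Content-Type value above carries a charset parameter, here is a minimal sketch assuming Python on the server side (the post does not say what the application is written in). The helper name declared_charset is made up for illustration; it uses the standard library's email.message.Message to parse the header value.

```python
from email.message import Message


def declared_charset(content_type: str):
    """Return the charset parameter of a Content-Type value, or None if absent.

    Hypothetical helper for illustration; not part of any framework.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header has no charset parameter at all.
    return msg.get_content_charset()


urlencoded = "application/x-www-form-urlencoded"
multipart = (
    "multipart/form-data; "
    "boundary=---------------------------124668924421214781111253174633"
)

print(declared_charset(urlencoded))  # no charset parameter present
print(declared_charset(multipart))   # only a boundary parameter present
```

For both headers shown above the helper finds no charset, which is the situation the question below is about.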
I'm curious why the browser is not telling me what character encoding it
used. Do I just have to assume that the character encoding is what I
specified in the accept-charset in the <form> element? Obviously, clients
don't have to read my form before posting. I do decode all content as utf-8
(and thus an error will be generated if invalid utf8 is detected).
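The "decode everything as utf-8 and fail on invalid input" approach could look roughly like the following for an urlencoded body. This is a sketch under assumptions: Python is chosen only for illustration, decode_form_body is a hypothetical name, and the real application may well do this differently.

```python
from urllib.parse import parse_qs


def decode_form_body(body: bytes) -> dict:
    """Parse an application/x-www-form-urlencoded body, assuming UTF-8.

    Hypothetical helper: raises UnicodeDecodeError if the percent-decoded
    octets are not valid UTF-8, instead of silently replacing them.
    """
    # A well-formed urlencoded body is pure ASCII; anything else is rejected.
    text = body.decode("ascii")
    # errors="strict" turns invalid UTF-8 sequences into an exception.
    return parse_qs(
        text, keep_blank_values=True, encoding="utf-8", errors="strict"
    )


# "café" sent as UTF-8 decodes cleanly.
print(decode_form_body(b"name=caf%C3%A9"))

# The same word sent as ISO-8859-1 is not valid UTF-8 and is rejected.
try:
    decode_form_body(b"name=caf%E9")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

Strict decoding catches clients that ignored accept-charset, but only when their octets happen to be invalid UTF-8.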
It just seems odd. When sending a series of octets that represent text to a
remote server, it seems like the client would need to specify the character
encoding used to encode those octets.
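To make the ambiguity concrete, here is a small illustrative sketch (Python chosen only for the example; the function name interpretations is made up) showing that the same octets yield different text depending on which encoding the receiver assumes.

```python
def interpretations(octets: bytes, charsets: list) -> dict:
    """Decode the same octets under each candidate charset.

    Hypothetical helper for illustration only.
    """
    return {name: octets.decode(name) for name in charsets}


# The octets a UTF-8 client would send for "café".
octets = "café".encode("utf-8")

result = interpretations(octets, ["utf-8", "iso-8859-1"])
print(result["utf-8"])       # café
print(result["iso-8859-1"])  # cafÃ©
```

Both decodes succeed without error, so the octets alone do not tell the server which reading the client intended.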
Am I missing some fundamental part of HTTP?
--
Bill Moseley
moseley at hank.org