<FORM> Character Encoding

Suppose you have a page with a <FORM> accepting user input. How can your results page script ensure that it has received the correct characters for all languages?

Text information typed into an input field in a <FORM> will be returned to the server using the page's encoding character set (charset). The same characters will be encoded in different ways for different charsets.

For example, the text "£10" for an input field called "Text" will be encoded as follows in charset "ISO-8859-1":

  • Text=%A310

    In the "UTF-8" charset, the following is sent:

  • Text=%C2%A310
    where each %XX is the hexadecimal value of a single byte, so a character that needs two bytes in UTF-8 appears as two %XX groups.
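The two encodings above can be reproduced outside ASP. The following is a minimal sketch in Python, used here purely for illustration and not part of Dynamic-CD. It relies on the standard `urllib.parse` module, and the helper name `encode_field` is hypothetical. It only approximates what a browser does when it submits a form.

```python
from urllib.parse import urlencode

def encode_field(name, value, charset):
    """Percent-encode one form field using the given page charset."""
    return urlencode({name: value}, encoding=charset)

# A pound sign followed by "10", as in the example above.
latin1 = encode_field("Text", "£10", "iso-8859-1")
utf8 = encode_field("Text", "£10", "utf-8")

print(latin1)  # Text=%A310
print(utf8)    # Text=%C2%A310
```

The pound sign is one byte (A3) in ISO-8859-1 and two bytes (C2 A3) in UTF-8, which is why the two query strings differ.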

    If you enter the non-Western characters 中文 (character codes 20013 and 25991) into this field using the "ISO-8859-1" charset, then the following would be sent for these two Chinese characters:

  • Text=%26%2320013%3B%26%2325991%3B

    Decoding the hex characters, you can see that the browser is using HTML character escape sequences, so this is equivalent to:

  • Text=&#20013;&#25991;
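The escape behaviour described above can be imitated with Python's `xmlcharrefreplace` error handler, which replaces any character the charset cannot represent with a numeric character reference before the bytes are percent-encoded. This is a sketch under that assumption, not a description of any particular browser's implementation; `encode_with_escapes` is a hypothetical helper name.

```python
from urllib.parse import quote

def encode_with_escapes(value, charset):
    # Characters the charset cannot represent become &#NNNN; first,
    # then every reserved byte (&, #, ;) is percent-encoded.
    return quote(value, safe="", encoding=charset, errors="xmlcharrefreplace")

sent_latin1 = encode_with_escapes("中文", "iso-8859-1")
sent_utf8 = encode_with_escapes("中文", "utf-8")

print("Text=" + sent_latin1)  # Text=%26%2320013%3B%26%2325991%3B
print("Text=" + sent_utf8)    # Text=%E4%B8%AD%E6%96%87
```

With UTF-8 no escape sequences are needed, because every character can be represented directly as bytes.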

    However, this escape encoding is not used with every charset, so some characters may not be sent to the server. A safe technique that ensures that all characters are sent is to use the "UTF-8" charset. (Note that your entire page needs to use this charset.)

    The browser does not provide any information about how it has encoded the input. The best you can do is to assume that the received data has been returned using the charset of the originating page. Some browsers allow the user to change the page encoding; if the user does so, this assumption fails and your script will decode the input incorrectly.
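To see why a wrong assumption matters, the following Python sketch (illustrative only; `decode_field` is a hypothetical helper) decodes the same received bytes under different assumed charsets.

```python
from urllib.parse import unquote

def decode_field(raw, assumed_charset):
    # Undecodable bytes are replaced with U+FFFD rather than raising an error.
    return unquote(raw, encoding=assumed_charset, errors="replace")

right = decode_field("%C2%A310", "utf-8")       # data was UTF-8, decoded as UTF-8
wrong = decode_field("%C2%A310", "iso-8859-1")  # data was UTF-8, decoded as ISO-8859-1
lost = decode_field("%A310", "utf-8")           # data was ISO-8859-1, decoded as UTF-8

print(right)       # £10
print(wrong)       # Â£10
print(repr(lost))  # '\ufffd10'
```

In the second case an extra character appears, and in the third the pound sign is lost altogether. Neither case raises an error, so the server has no way to detect the mismatch from the data alone.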

    Western ASP correctly decodes ISO-8859-1 encoded data, although non-Western characters sent as HTML character escape sequences are not decoded.
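If a script has to handle such escape sequences itself, the decoding step amounts to converting each numeric character reference back into its character. The Python sketch below illustrates the idea using the standard `html.unescape` function; it is not Dynamic-CD code, and `decode_escaped_field` is a hypothetical name.

```python
import html
from urllib.parse import unquote

def decode_escaped_field(raw, charset="iso-8859-1"):
    # Step 1: undo the percent-encoding using the page charset.
    text = unquote(raw, encoding=charset)
    # Step 2: turn character references such as &#20013; into characters.
    return html.unescape(text)

decoded = decode_escaped_field("%26%2320013%3B%26%2325991%3B")
print(decoded)  # 中文
```

Note that this cannot distinguish an escape sequence generated by the browser from one that the user typed literally, and that `html.unescape` also converts named references such as `&amp;`.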

    If the page with your <FORM> uses a different charset, then you must set the ASP codepage and the page charset to match. For UTF-8, the codepage must be set to 65001, e.g.:

    <%@ CODEPAGE = 65001%>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

    For a list of the codepage that corresponds to each charset, see the MSDN library reference.

    See also Response.CodePage and Session.CodePage.

    Be careful when displaying user input, as it may be a security risk: for example, it may contain HTML markup or script.
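A common precaution is to escape HTML special characters before writing user input back into a page. The sketch below shows the idea in Python with the standard `html.escape` function; it is illustrative only, and `safe_for_display` is a hypothetical helper name.

```python
import html

def safe_for_display(user_text):
    # Replace &, <, > and quote characters with HTML entities so the
    # browser shows the text instead of interpreting it as markup.
    return html.escape(user_text, quote=True)

shown = safe_for_display('<script>alert("x")</script>')
print(shown)  # &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```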

Copyright © 2000-2016 phdcc