
Data Conversions in CGI Programs (Original)
This page describes the data conversions that the server performs on CGI input and output.
Data Conversions in CGI Programs
The Internet is an ASCII world, whereas the iSeries is primarily an extended binary-coded decimal interchange code (EBCDIC) world. The server can therefore perform ASCII to EBCDIC conversions before sending data to CGI programs, and EBCDIC to ASCII conversions before sending data back to the browser. Data reaches a CGI program through environment variables and standard input (stdin). The HTTP and HTML specifications allow you to tag text data with a character set (the charset parameter on the Content-Type header). However, this practice is not widely in use today (although technically required for HTTP 1.0/1.1 compliance). According to these specifications, text data that is not tagged can be assumed to be in the default character set ISO-8859-1 (Latin-1, a superset of US-ASCII). AS/400 correlates this character set with ASCII CCSID 819.
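The difference between the two encodings can be illustrated with a minimal Python sketch. It assumes that Python's `latin-1` codec stands in for CCSID 819 and `cp037` for EBCDIC CCSID 37. The function names are hypothetical, and the code only mimics the kind of conversion the server performs, not the server's own implementation.

```python
# Assumed codec mapping: CCSID 819 -> "latin-1", CCSID 37 -> "cp037".
ASCII_CODEC = "latin-1"
EBCDIC_CODEC = "cp037"


def ascii_to_ebcdic(data: bytes) -> bytes:
    """Re-encode bytes received from the network (CCSID 819) as CCSID 37."""
    return data.decode(ASCII_CODEC).encode(EBCDIC_CODEC)


def ebcdic_to_ascii(data: bytes) -> bytes:
    """Re-encode bytes produced on the EBCDIC side (CCSID 37) as CCSID 819."""
    return data.decode(EBCDIC_CODEC).encode(ASCII_CODEC)


request = b"name=A"                 # bytes as a browser would send them
converted = ascii_to_ebcdic(request)

# The same text has different byte values in the two code pages.
print(request.hex(), converted.hex())
```

The letter "A" is the octet 0x41 in CCSID 819 and 0xC1 in CCSID 37, which is why unconverted data is unreadable on the other side.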
The server can process the input to your CGI program in three basic ways (%%MIXED%%, %%EBCDIC%%, and %%BINARY%%), plus %%EBCDIC_JCD%%, a variant of %%EBCDIC%% for servers running under a Japanese CCSID. You control which mode is used by specifying an overall server directive (CGIMode or CGIConvMode) or an optional parameter on the Exec or Post-Script script directives:
CGIMode Mode
Exec request-template program-path [server-IP-address or hostname] [Mode]
Post-Script program_path_and_name [server-IP-address or hostname] [Mode]
Where Mode is one of the following:
%%MIXED%% or %%MIXED/MIXED%% - This is the default.
%%EBCDIC%% or %%EBCDIC/MIXED%%
%%EBCDIC/EBCDIC%%
%%BINARY%% or %%BINARY/MIXED%%
%%BINARY/EBCDIC%%
%%BINARY/BINARY%%
%%EBCDIC_JCD%% or %%EBCDIC_JCD/MIXED%%
%%EBCDIC_JCD/EBCDIC%%
The CgiMode can be thought of as two logical pieces, the input mode and the output mode, separated by a "/". If only the input mode is provided, the output mode defaults to MIXED for compatibility.
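The split into input and output halves can be sketched in Python as follows. The function name `split_cgi_mode` is hypothetical, and the set of accepted combinations is taken from the list of Mode values above.

```python
from typing import Tuple

# Input/output combinations listed for the Mode parameter.
VALID_MODES = {
    ("MIXED", "MIXED"),
    ("EBCDIC", "MIXED"), ("EBCDIC", "EBCDIC"),
    ("BINARY", "MIXED"), ("BINARY", "EBCDIC"), ("BINARY", "BINARY"),
    ("EBCDIC_JCD", "MIXED"), ("EBCDIC_JCD", "EBCDIC"),
}


def split_cgi_mode(value: str) -> Tuple[str, str]:
    """Split a value such as '%%EBCDIC/EBCDIC%%' into (input, output) modes."""
    text = value.strip().strip("%")          # drop the surrounding %% markers
    input_mode, _, output_mode = text.partition("/")
    output_mode = output_mode or "MIXED"     # output mode defaults to MIXED
    if (input_mode, output_mode) not in VALID_MODES:
        raise ValueError("unsupported CGI mode: " + value)
    return input_mode, output_mode


print(split_cgi_mode("%%BINARY%%"))
print(split_cgi_mode("%%EBCDIC_JCD/EBCDIC%%"))
```

Rejecting unlisted combinations is a choice made for this sketch. The source does not state how the server reacts to them.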
In addition, the system provides the following CGI environment variables to the CGI program:
- CGI_MODE - which input conversion mode the server is using (%%MIXED%% or %%EBCDIC%% or %%BINARY%% or %%EBCDIC_JCD%%)
- CGI_ASCII_CCSID - which ASCII CCSID the data was converted from
- CGI_EBCDIC_CCSID - which EBCDIC CCSID the data was converted into
- CGI_OUTPUT_MODE - which output conversion mode the server is using (%%MIXED%% or %%EBCDIC%% or %%BINARY%%)
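A CGI program might collect these four variables in one place before deciding how to treat its input. The following Python sketch shows one way to do that. The helper names and the sample values are hypothetical, and the exact text the server places in each variable (for example, whether the %% markers are included) is not confirmed by this page.

```python
import os
from typing import Mapping, Optional


def _to_int(value: Optional[str]) -> Optional[int]:
    """Return the CCSID as an int, or None if it is missing or not numeric."""
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def read_conversion_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Collect the conversion-related CGI environment variables."""
    return {
        "input_mode": environ.get("CGI_MODE"),
        "output_mode": environ.get("CGI_OUTPUT_MODE"),
        "ascii_ccsid": _to_int(environ.get("CGI_ASCII_CCSID")),
        "ebcdic_ccsid": _to_int(environ.get("CGI_EBCDIC_CCSID")),
    }


# Illustrative values only; a real request supplies its own.
sample = {
    "CGI_MODE": "%%EBCDIC%%",
    "CGI_OUTPUT_MODE": "%%EBCDIC%%",
    "CGI_ASCII_CCSID": "819",
    "CGI_EBCDIC_CCSID": "37",
}
print(read_conversion_settings(sample))
```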
So, for the input half of the CgiMode, we have this short explanation:
%%MIXED%% - environment variables are in CCSID 37, stdin data is converted to the CCSID of the job, and escape sequences remain in their ASCII representations. This is the default.
%%EBCDIC%% - the system converts environment variables and stdin data, including escape sequences, to the CCSID of the job.
%%BINARY%% - the server performs no conversion for stdin data and QUERY_STRING; all other environment variables are in the CCSID of the job.
%%EBCDIC_JCD%% - same as %%EBCDIC%% except for servers running under a Japanese CCSID. Servers running under a Japanese CCSID use the JCD utility to determine which Japanese CCSID to use to convert from.
CGI Conversion Modes
The following section explains CGI input conversion modes in more detail.
- MIXED
- This mode is the default mode of operation for the server. The system converts values for CGI environment variables, including QUERY_STRING, to EBCDIC CCSID 37, and converts stdin data to the CCSID of the job. However, the system still represents the encoded characters "%xx" by their ASCII (CCSID 819) octet values, so the CGI program must convert these further into EBCDIC to process the data. For more information, see the symptom "Special characters are not being converted or handled as expected" in Web Programming Guide Chapter 10, "Troubleshooting your CGI programs."
- EBCDIC
- In this mode, the server converts everything into the EBCDIC CCSID of the job. The server checks the entity bodies for a charset tag. If it finds one, the server converts the data from the corresponding ASCII CCSID to the EBCDIC CCSID of the job. If the server does not find a charset tag, it uses the value of the DefaultNetCCSID configuration directive as the conversion CCSID. In addition, the system converts escaped octets from ASCII to EBCDIC, eliminating the need to perform this conversion in the CGI program.
- BINARY
- In this mode, the server processes environment variables (except QUERY_STRING) the same way as EBCDIC mode. The server performs no conversions on either QUERY_STRING or stdin data.
- EBCDIC_JCD
- Japanese browsers can potentially send data in one of three code pages: JIS (ISO-2022-JP), S-JIS (PC-Windows), or EUC (UNIX). In this mode, the server uses a well-known JCD utility to determine which code page to use (if not explicitly specified by a charset tag) to convert stdin data.
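The practical difference between MIXED and EBCDIC modes shows up when a program decodes "%xx" escape sequences. The Python sketch below assumes the input has already been read as text, that the job CCSID is 37 (codec `cp037`), and that CCSID 819 maps to `latin-1`. The function name `unescape` is hypothetical.

```python
import re

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")

# Code page in which the two hex digits of an escape are to be interpreted.
# MIXED: the digits still name an ASCII (CCSID 819) octet.
# EBCDIC: the server has already converted the octet to EBCDIC (CCSID 37 assumed).
_ESCAPE_CODEC = {"MIXED": "latin-1", "EBCDIC": "cp037"}


def unescape(text: str, mode: str) -> str:
    """Replace each %xx escape with the character it stands for in this mode."""
    codec = _ESCAPE_CODEC[mode]

    def _decode(match: "re.Match[str]") -> str:
        octet = int(match.group(1), 16)
        return bytes([octet]).decode(codec)

    return _ESCAPE.sub(_decode, text)


# The same form field as each mode would present it.
print(unescape("name=%41%20B", "MIXED"))
print(unescape("name=%C1%40B", "EBCDIC"))
```

Both calls yield the same text because 0x41 and 0x20 are "A" and space in CCSID 819, while 0xC1 and 0x40 are the same characters in CCSID 37.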
Conversion action for text in CGI Stdin
This table summarizes the type of conversion that is performed by the server for each CGI mode.
| CGI_MODE | Conversion | Stdin encoding | Environment variables | QUERY_STRING encoding | argv encoding |
|---|---|---|---|---|---|
| %%BINARY%% | None | No conversion | FsCCSID | No conversion | No conversion |
| %%EBCDIC%% | NetCCSID to FsCCSID | FsCCSID | FsCCSID | FsCCSID | FsCCSID |
| %%EBCDIC%% - with charset tag received | Calculate target EBCDIC CCSID based on received ASCII charset tag | EBCDIC equivalent of received charset | FsCCSID | FsCCSID | FsCCSID |
| %%EBCDIC_JCD%% | Detect input based on received data. Convert data to FsCCSID | FsCCSID | FsCCSID | Detect ASCII input based on received data. Convert data to FsCCSID | Detect ASCII input based on received data. Convert data to FsCCSID |
| %%MIXED%% (Compatibility mode) | NetCCSID to FsCCSID (received charset tag is ignored) | FsCCSID with ASCII escape sequences | CCSID 37 | CCSID 37 with ASCII escape sequences | CCSID 37 with ASCII escape sequences |
Data Conversions on CGI Output
The CgiMode has been enhanced to include an output mode. The output mode, the second half of the CgiMode directive, works as follows:
%%MIXED%% - HTTP header output is in CCSID 37. However, the two characters following the "%" in an escape sequence must be the EBCDIC representation of the ASCII code point (for example, a space is written as %20). An example of an HTTP header that may contain escape sequences is the Location header. HTTP body output is treated as described below.
%%EBCDIC%% - HTTP header output is in CCSID 37. However, the two characters following the "%" in an escape sequence must be the EBCDIC representation of the EBCDIC code point (for example, a space is written as %40). An example of an HTTP header that may contain escape sequences is the Location header. HTTP body output is treated as described below.
%%BINARY%% - HTTP header output is in CCSID 819, and the two characters following the "%" in an escape sequence are the ASCII representation of the ASCII code point. An example of an HTTP header that may contain escape sequences is the Location header. HTTP body output is treated as described below.
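The three output modes differ in two respects: the code page of the header text and the code page whose code point appears in each escape sequence. The Python sketch below builds a Location header line for each mode. It assumes `cp037` for CCSID 37 and `latin-1` for CCSID 819, escapes only spaces to stay short, and omits the line terminator. The function names are hypothetical.

```python
# Code page used to encode the header text itself.
HEADER_CODEC = {"MIXED": "cp037", "EBCDIC": "cp037", "BINARY": "latin-1"}
# Code page whose code point is written after the "%" in an escape sequence.
ESCAPE_CODEC = {"MIXED": "latin-1", "EBCDIC": "cp037", "BINARY": "latin-1"}


def escape_char(ch: str, output_mode: str) -> str:
    """Return the %xx escape for one character under the given output mode."""
    octet = ch.encode(ESCAPE_CODEC[output_mode])[0]
    return "%{:02X}".format(octet)


def location_header(url: str, output_mode: str) -> bytes:
    """Build the bytes of a Location header line (without line terminator)."""
    escaped = url.replace(" ", escape_char(" ", output_mode))
    line = "Location: " + escaped
    return line.encode(HEADER_CODEC[output_mode])


for mode in ("MIXED", "EBCDIC", "BINARY"):
    print(mode, location_header("/my page", mode).hex())
```

In MIXED and BINARY modes the space becomes %20 (the ASCII code point), while in EBCDIC mode it becomes %40 (the CCSID 37 code point). Only BINARY mode emits the header bytes in CCSID 819.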
For HTTP body standard-output (stdout) data that is sent from the CGI program, the server recognizes and uses the charset or CCSID parameter on the text/* Content-Types. If you specify an ASCII CCSID or charset, the server performs no conversions on the data. Otherwise, the system uses the specified value instead of the DefaultFsCCSID when converting data back to the browser. The system sets an appropriate charset tag for all text/* Content-Types that it sends back to the browser. The exceptions are the %%MIXED%% (or %%MIXED/MIXED%%) and %%BINARY/BINARY%% modes, and output whose charset/CCSID parameter is set to 65535.
Conversion action and charset tag generation for text in CGI Stdout
This table summarizes the type of conversion that is performed and the charset tag that is returned to the browser by the server.
| CGI Stdout CCSID/Charset in HTTP header | Conversion action | Server reply charset tag |
|---|---|---|
| EBCDIC CCSID/Charset | Calculate EBCDIC to ASCII conversion based on supplied EBCDIC CCSID/Charset | Calculated ASCII charset |
| ASCII CCSID/Charset | No conversion | Stdout CCSID/Charset as Charset |
| 65535 | No conversion | None |
| None (%%BINARY%% or %%BINARY/MIXED%% or %%BINARY/EBCDIC%%) | Default Conversion - FsCCSID to NetCCSID | NetCCSID as charset |
| None (%%BINARY/BINARY%%) | No conversion | None |
| None (%%EBCDIC%% or %%EBCDIC/MIXED%% or %%EBCDIC/EBCDIC%%) | Default Conversion - FsCCSID to NetCCSID | NetCCSID as charset |
| None (%%EBCDIC%% or %%EBCDIC/MIXED%% or %%EBCDIC/EBCDIC%% with charset tag received on HTTP request) | Use inverse of conversion calculated for stdin | Charset as received on HTTP request |
| None (%%EBCDIC_JCD%% or %%EBCDIC_JCD/MIXED%% or %%EBCDIC_JCD/EBCDIC%%) | Default Conversion - FsCCSID to NetCCSID | NetCCSID as charset |
| None (%%MIXED%% or %%MIXED/MIXED%%) | Default Conversion - FsCCSID to NetCCSID | None (compatibility mode) |
| Invalid | CGI error 500 generated by server | |
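The rows of this table that apply when the CGI program supplies no charset or CCSID can be expressed as a small lookup. The Python sketch below is one reading of those rows. The function name and the returned description strings are hypothetical, and the code describes the expected server behavior rather than performing any conversion.

```python
from typing import Optional, Tuple


def untagged_stdout_handling(cgi_mode: str,
                             request_charset: Optional[str] = None) -> Tuple[str, str]:
    """Return (conversion action, reply charset tag) for untagged text output."""
    input_mode, _, output_mode = cgi_mode.strip("%").partition("/")
    output_mode = output_mode or "MIXED"      # output mode defaults to MIXED

    if (input_mode, output_mode) == ("BINARY", "BINARY"):
        return ("no conversion", "none")
    if input_mode == "MIXED":
        # Compatibility mode: data is converted but no charset tag is added.
        return ("FsCCSID to NetCCSID", "none")
    if input_mode == "EBCDIC" and request_charset:
        # A charset tag arrived on the request, so the stdin conversion is inverted.
        return ("inverse of stdin conversion", request_charset)
    if input_mode in ("BINARY", "EBCDIC", "EBCDIC_JCD"):
        return ("FsCCSID to NetCCSID", "NetCCSID")
    raise ValueError("unknown CGI mode: " + cgi_mode)


print(untagged_stdout_handling("%%BINARY/BINARY%%"))
print(untagged_stdout_handling("%%EBCDIC%%", request_charset="ISO-8859-2"))
```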
The server also sets the environment variable CGI_OUTPUT_MODE to the CGI output conversion mode that it is using for this request. Valid values are %%EBCDIC%%, %%MIXED%%, or %%BINARY%%. The program can use this information to determine what conversion, if any, will be performed by the server on CGI output.