Hi,
These are the steps that I followed to implement character encoding.
We are upgrading the application to support data entered in French.
The application is supposed to store and retrieve data entered in French.
French alphabet: even before making any changes for character encoding, everything except the characters Œ, œ and Ÿ was stored and retrieved properly.
French alphabet: <code>A a (À à, Â â), (Æ æ), B b, C c (Ç ç), D d, E e (É é, È è, Ê ê, Ë ë), F f, G g, H h, I i (Î î, Ï ï), J j, K k, L l, M m, N n (Ñ ñ), O o (Ô ô), (Œ œ), P p, Q q, R r, S s, T t, U u (Ù ù, Û û, Ü ü), V v, W w, X x, Y y (Ÿ ÿ), Z z</code>
When control passes from the JSP to the servlet, the characters Œ, œ and Ÿ are not preserved. They get passed as rectangles. On storing to the database they get stored as inverted question marks.
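A minimal Java sketch of one possible reason these three characters behave differently from the other accented letters. It assumes the affected characters are Œ (U+0152), œ (U+0153) and Ÿ (U+0178), and that some step in the chain falls back to ISO-8859-1; neither is confirmed above. The class name `Latin1Check` and its method are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

final class Latin1Check {
    // Returns true if every character of the text has a code in the given charset.
    static boolean fitsIn(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        String[] samples = { "\u00e9", "\u00e7", "\u0152", "\u0153", "\u0178" };
        for (String s : samples) {
            System.out.printf("U+%04X  ISO-8859-1: %b  UTF-8: %b%n",
                    (int) s.charAt(0), fitsIn(s, "ISO-8859-1"), fitsIn(s, "UTF-8"));
        }
    }
}
```

Characters that a charset cannot represent are typically replaced by a substitute such as a question mark when the text is encoded, which would match the symptom described.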
To handle this, I followed these steps:
1. The following filter and response wrapper classes were added.
<code>
import java.io.IOException;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

public class UTF8EncodingFilter implements javax.servlet.Filter
{
    public void init( FilterConfig filterConfig ) throws ServletException
    {
        // This would be a good place to collect a parameterized
        // default encoding type. For brevity, we're going to
        // use a hard-coded value in this example.
    }

    public void doFilter( ServletRequest request,
                          ServletResponse response,
                          FilterChain filterChain )
        throws IOException, ServletException
    {
        // Wrap the response object. You should create a mechanism
        // to ensure the response object only gets wrapped once.
        // In this example, the response object will inappropriately
        // get wrapped multiple times during a forward.
        response = new UTF8EncodingServletResponse( (HttpServletResponse) response );
        // Specify the encoding to assume for the request so
        // the parameters can be properly decoded.
        request.setCharacterEncoding( "UTF-8" );
        // "UTF-8" is a charset name, not a MIME type, so it is set with
        // setCharacterEncoding() (Servlet 2.4+) instead of setContentType().
        response.setCharacterEncoding( "UTF-8" );
        System.out.println( "UTF8EncodingFilter : doFilter() -> Both request & response are in UTF-8 format." );
        filterChain.doFilter( request, response );
    }

    public void destroy()
    {
        // no-op
    }
}
------------------------------------------------------------------------------------------------------------
/*
 * Created on Oct 30, 2007
 *
 * To change the template for this generated file go to
 * Window>Preferences>Java>Code Generation>Code and Comments
 */
import javax.servlet.http.HttpServletResponse;

public class UTF8EncodingServletResponse
    extends javax.servlet.http.HttpServletResponseWrapper
{
    private boolean encodingSpecified = false;

    public UTF8EncodingServletResponse( HttpServletResponse response )
    {
        super( response );
    }

    public void setContentType( String type )
    {
        String explicitType = type;
        // If a specific encoding has not already been set by the app,
        // let's see if this is a call to specify it. If the content
        // type doesn't explicitly set an encoding, make it UTF-8.
        // A null type is passed through untouched.
        if (!encodingSpecified && type != null)
        {
            String lowerType = type.toLowerCase();
            // See if this is a call to explicitly set the character encoding.
            if (lowerType.indexOf( "charset" ) < 0)
            {
                // If no character encoding is specified, we still need to
                // ensure the app is specifying text content.
                if (lowerType.startsWith( "text/" ))
                {
                    // App is sending a text response, but no encoding
                    // is specified, so we'll force it to UTF-8.
                    explicitType = type + "; charset=UTF-8";
                }
            }
            else
            {
                // App picked a specific encoding, so let's make
                // sure we don't override it.
                encodingSpecified = true;
            }
        }
        // Delegate to supertype to record encoding.
        super.setContentType( explicitType );
    }
}
</code>
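The rule that UTF8EncodingServletResponse.setContentType applies can be restated as a pure function, which makes it easy to check what header value a given argument would produce. This is only a sketch: `ContentTypeRule` and `withDefaultCharset` are hypothetical names, the `encodingSpecified` flag is left out, and no servlet API is needed on the classpath.

```java
import java.util.Locale;

final class ContentTypeRule {
    // Mirrors the wrapper's decision: append "; charset=UTF-8" only to
    // text/* types that do not already name a charset.
    static String withDefaultCharset(String type) {
        if (type == null) {
            return null;
        }
        String lower = type.toLowerCase(Locale.ROOT);
        if (lower.indexOf("charset") < 0 && lower.startsWith("text/")) {
            return type + "; charset=UTF-8";
        }
        return type;
    }

    public static void main(String[] args) {
        System.out.println(withDefaultCharset("text/html"));
        System.out.println(withDefaultCharset("text/html; charset=ISO-8859-1"));
        System.out.println(withDefaultCharset("image/png"));
        System.out.println(withDefaultCharset("UTF-8"));
    }
}
```

Note that a bare charset name such as "UTF-8" does not start with "text/", so under this rule it passes through unchanged and would be sent as the content type as-is.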
WEB.XML changes
-----------------
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" "http://java.sun.com/dtd/web-app_2_3.dtd">
<web-app id="WebApp">
    <display-name>App Name</display-name>
    <filter>
        <filter-name>UTF8 Filter</filter-name>
        <filter-class>com.vie.remoteDiagnostics.view.servlets.UTF8EncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
    </filter>
    <filter-mapping>
        <filter-name>UTF8 Filter</filter-name>
        <servlet-name>UploadServlet</servlet-name>
    </filter-mapping>
- UploadServlet is the servlet to which the request is passed when the input page is submitted.
- If <url-pattern>/*</url-pattern> is used instead of <servlet-name>UploadServlet</servlet-name>, the characters get converted to a format like this: �? ��), F f, G g, H h, I i (�? ��, �? ��), J j, K k, L l, M m, N n (�? ��), O o (�? ��), (�? �?),
JSP page changes:
-----------------
<HEAD>
<META http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
(The above page directive is added only on the page where the data is displayed, not on the page where the user provides the input. If this directive is added to the input page, the entered characters get converted to the same format as shown above, like �? ��), O o (�? ��), (�? �. )
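The garbling described in these notes would be consistent with a mismatch between the encoding the browser uses to submit the form and the encoding the server assumes when decoding it, although nothing here establishes that this is the actual cause. The sketch below reproduces such a mismatch in both directions with java.net.URLEncoder and java.net.URLDecoder; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class FormEncodingMismatch {
    // Encodes text the way a form submitted from a page in pageCharset would,
    // then decodes it the way a server assuming serverCharset would.
    static String submitAndDecode(String text, String pageCharset, String serverCharset) {
        try {
            String wire = URLEncoder.encode(text, pageCharset);
            return URLDecoder.decode(wire, serverCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset", e);
        }
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // written as an escape to avoid source encoding issues

        // Matching charsets: the text survives the round trip.
        System.out.println(submitAndDecode(eAcute, "UTF-8", "UTF-8").equals(eAcute));

        // Mismatched charsets in either direction: the text is altered.
        System.out.println(submitAndDecode(eAcute, "UTF-8", "ISO-8859-1").equals(eAcute));
        System.out.println(submitAndDecode(eAcute, "ISO-8859-1", "UTF-8").equals(eAcute));
    }
}
```

If this is what is happening, the page encoding of the input JSP and the encoding set by the filter would need to agree for every request the filter is mapped to.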
Tomcat: 5.5.9 changes
---------------------
In server.xml in the conf directory, URIEncoding="UTF-8" is added to both connectors:
<Connector port="8080" maxHttpHeaderSize="8192" maxThreads="150" minSpareThreads="25" maxSpareThreads="75" enableLookups="false" redirectPort="8443" acceptCount="100" connectionTimeout="20000" disableUploadTimeout="true" URIEncoding="UTF-8" />
<Connector port="8009" enableLookups="false" redirectPort="8443" URIEncoding="UTF-8" protocol="AJP/1.3" />
Copied the "Catalina.bat" and Catalina.xml files to the bin folder because they were not there initially in Tomcat 5.5.9.
-Dfile.encoding=UTF-8 is added to Catalina.bat as below:
%_EXECJAVA% %JAVA_OPTS% %CATALINA_OPTS% %DEBUG_OPTS% -Dfile.encoding=UTF-8 -Djava.endorsed.dirs="%JAVA_ENDORSED_DIRS%" -classpath "%CLASSPATH%" -Dcatalina.base="%CATALINA_BASE%" -Dcatalina.home="%CATALINA_HOME%" -Djava.io.tmpdir="%CATALINA_TMPDIR%" %MAINCLASS% %CMD_LINE_ARGS% %ACTION%
goto end
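For context, -Dfile.encoding sets the JVM's default charset, which matters for APIs that are called without naming a charset. Code that names the charset explicitly does not depend on it. A small sketch of that difference, with a hypothetical class name and assuming Java 7 or later for StandardCharsets:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DefaultCharsetDemo {
    // Decodes with an explicitly named charset; file.encoding plays no part.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Decodes with the JVM default charset, which file.encoding can influence.
    static String decodeWithDefault(byte[] bytes) {
        return new String(bytes);
    }

    public static void main(String[] args) {
        byte[] utf8 = { (byte) 0xC5, (byte) 0x93 }; // UTF-8 bytes of U+0153

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("explicit UTF-8 ok: "
                + decode(utf8, StandardCharsets.UTF_8).equals("\u0153"));
        System.out.println("default charset ok: "
                + decodeWithDefault(utf8).equals("\u0153"));
    }
}
```

Running it with and without -Dfile.encoding=UTF-8 shows whether the flag is actually reaching the JVM.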
Even after making all these changes, the characters Œ, œ and Ÿ are still getting stored as inverted question marks.
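One way to narrow down where the characters are lost is to log the code points of the value at each hop, for example in the servlet right after request.getParameter and again after reading the value back from the database. A minimal sketch using only the JDK; `CodePointDump` and `dump` are hypothetical names.

```java
final class CodePointDump {
    // Formats each code point as U+XXXX so the output does not depend on
    // the console or log file encoding.
    static String dump(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unicode escape for the first letter keeps the source ASCII-only.
        System.out.println(dump("\u0152uvre"));
    }
}
```

An intact Œ shows up as U+0152. A plain question mark is U+003F, an inverted question mark is U+00BF, and the Unicode replacement character is U+FFFD, so the first hop that logs one of those instead of U+0152 is where the value was damaged.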
When I searched the Tomcat bug list, it said that there is a patch (http://issues.apache.org/jira/browse/OFBIZ-281) that has to be applied for encoding in Tomcat 5.5.9, but the patch points to the CatalinaContainer.java class, which is in the OFBiz framework (an open source framework). I couldn't find this class in the Tomcat 5.5.9 installation.
It would be really great if someone could throw some light on this.
Is there anything wrong with the steps followed, or is this really a Tomcat 5.5.9 issue?
Please help.