
force users to use english character set

 
Alex Hank
Greenhorn
Posts: 16
The site I work on uses several forms to collect information. When users enter characters from non-English character sets, it sometimes causes trouble with our systems.

How can I force users to use the English character set, or how do I convert their input to English characters before I put it in the database?

I am not sure whether this is done on the client side or the server side.

Can someone please shed some light on this?


thanks

Alex
 
Ulf Dittmer
Rancher
Posts: 42968
You can't stop users from submitting in whatever character set they please, unless you filter all data in JavaScript before it is submitted and remove all non-US-ASCII values. I would seriously question what kind of system is confused by non-US-ASCII characters in this day and age. Does it not allow users to have accented names, for instance? If it must be done, you should filter the data on the server.
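As a rough sketch of the server-side check, the following tests whether a submitted value fits in US-ASCII before anything is filtered. The class and method names (AsciiCheck, isUsAscii) are made up for illustration; only the standard CharsetEncoder.canEncode call is relied on.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the names are illustrative, not from the original post.
final class AsciiCheck {

    // Returns true when every character of the input fits in 7-bit US-ASCII.
    static boolean isUsAscii(String input) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(input);
    }

    public static void main(String[] args) {
        System.out.println(isUsAscii("Alex Hank"));   // plain ASCII name
        System.out.println(isUsAscii("Ren\u00e9e"));  // contains e-acute (U+00E9)
    }
}
```

A check like this lets the server decide whether to reject, flag, or convert a value, instead of silently changing it.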
 
Alex Hank
Greenhorn
Posts: 16
Our database guy (my boss) says that when he moves the data between a Linux and a Windows machine, records with non-English characters can cause trouble: the transfer stops at the record that contains them.

He has a method to do the conversion in the database, but I guess he would rather not have to. We use a RedBack database (I wish we used MySQL).

He also says that in the past we have had problems printing badges that contain non-English characters.

I am just following the boss's orders.

So should this be done on the client side with JavaScript before submit?

Is there a way to do it with JSP?

thanks

alex
 
Paul Bourdeaux
Ranch Hand
Posts: 783
You could filter it on the server in whatever action the form submits to. If you are submitting to a servlet, just put a filter in the doPost method and remove or replace the non-English characters.
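Here is a minimal sketch of what such a filter could look like. To keep it runnable without the servlet API, it works on a plain map with the same shape as request.getParameterMap(). FormSanitizer, toAscii and sanitize are hypothetical names, and replacing with '?' is just one possible policy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that a doPost method could call on its parameters.
final class FormSanitizer {

    // Replaces every character outside 7-bit US-ASCII with '?'.
    static String toAscii(String value) {
        if (value == null) {
            return "";
        }
        return value.replaceAll("[^\\x00-\\x7F]", "?");
    }

    // Cleans the first value of each parameter. The input has the same
    // shape as the map returned by request.getParameterMap().
    static Map<String, String> sanitize(Map<String, String[]> params) {
        Map<String, String> clean = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> entry : params.entrySet()) {
            String[] values = entry.getValue();
            String first = (values != null && values.length > 0) ? values[0] : null;
            clean.put(entry.getKey(), toAscii(first));
        }
        return clean;
    }
}
```

In a servlet, the cleaned map would be built at the top of doPost and used in place of the raw parameters from then on.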
 
Ben Souther
Sheriff
Posts: 13411
RedBack is middleware for U2 databases (UniVerse and UniData).
Both databases treat certain characters as control characters (mostly character codes in the 250 to 255 range). Somewhere in your code, you will need to check for those characters and either escape them with your own sequence, change them to another character, or blow up when a user tries to enter them.
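To illustrate the "escape them with your own sequence" option, here is a small sketch. It assumes the reserved characters are exactly codes 250 through 255, as mentioned above, and it invents a "~xNN" escape format. Both the range and the format are assumptions to adjust for your own system; MarkEscaper is a hypothetical name.

```java
// Hypothetical escaper; the reserved range and the "~xNN" format are assumptions.
final class MarkEscaper {

    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= 250 && c <= 255) {
                // Write the reserved character as ~x followed by its hex code.
                out.append("~x").append(Integer.toHexString(c).toUpperCase());
            } else if (c == '~') {
                // Double a literal tilde so it cannot be mistaken for an escape.
                out.append("~~");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the tilde itself is escaped, the original text can be recovered later, which matters if users expect to see their names rendered properly.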
 
Alex Hank
Greenhorn
Posts: 16
Thanks for the advice.

I figured it out by creating the method charFix.

<%!
String replace(String s, String one, String another) {
// In a string replace one substring with another
if (s.equals("")) return "";
String res = "";
int i = s.indexOf(one,0);
int lastpos = 0;
while (i != -1) {
res += s.substring(lastpos,i) + another;
lastpos = i + one.length();
i = s.indexOf(one,lastpos);
}
res += s.substring(lastpos); // the rest
return res;
}

String charFix(String s){
//REPLACE ALL NONENGLISH CHARACTERS WITH ENGLISH CHARACTERS
if (s.equals("")) return "";
String res = s;
String badChar = "";

// Note: a leading zero makes a Java integer literal octal, so 013 and 034 would mean 11 and 28
badChar = new Character((char)13).toString();res = replace(res,badChar, ""); //carriage return
badChar = new Character((char)34).toString();res = replace(res,badChar, ""); //"
badChar = new Character((char)0).toString();res = replace(res,badChar, ""); //NUL

badChar = new Character((char)192).toString();res = replace(res,badChar, "A");
badChar = new Character((char)193).toString();res = replace(res,badChar, "A");
badChar = new Character((char)194).toString();res = replace(res,badChar, "A");
badChar = new Character((char)195).toString();res = replace(res,badChar, "A");
badChar = new Character((char)196).toString();res = replace(res,badChar, "A");
badChar = new Character((char)197).toString();res = replace(res,badChar, "A");
badChar = new Character((char)198).toString();res = replace(res,badChar, "A");

badChar = new Character((char)200).toString();res = replace(res,badChar, "E");
badChar = new Character((char)201).toString();res = replace(res,badChar, "E");
badChar = new Character((char)202).toString();res = replace(res,badChar, "E");
badChar = new Character((char)203).toString();res = replace(res,badChar, "E");

badChar = new Character((char)204).toString();res = replace(res,badChar, "I");
badChar = new Character((char)205).toString();res = replace(res,badChar, "I");
badChar = new Character((char)206).toString();res = replace(res,badChar, "I");
badChar = new Character((char)207).toString();res = replace(res,badChar, "I");

badChar = new Character((char)210).toString();res = replace(res,badChar, "O");
badChar = new Character((char)211).toString();res = replace(res,badChar, "O");
badChar = new Character((char)212).toString();res = replace(res,badChar, "O");
badChar = new Character((char)213).toString();res = replace(res,badChar, "O");
badChar = new Character((char)214).toString();res = replace(res,badChar, "O");

badChar = new Character((char)217).toString();res = replace(res,badChar, "O");
badChar = new Character((char)218).toString();res = replace(res,badChar, "O");
badChar = new Character((char)219).toString();res = replace(res,badChar, "O");
badChar = new Character((char)220).toString();res = replace(res,badChar, "O");

badChar = new Character((char)224).toString();res = replace(res,badChar, "a");
badChar = new Character((char)225).toString();res = replace(res,badChar, "a");
badChar = new Character((char)226).toString();res = replace(res,badChar, "a");
badChar = new Character((char)227).toString();res = replace(res,badChar, "a");
badChar = new Character((char)228).toString();res = replace(res,badChar, "a");
badChar = new Character((char)229).toString();res = replace(res,badChar, "a");
badChar = new Character((char)230).toString();res = replace(res,badChar, "a");

badChar = new Character((char)232).toString();res = replace(res,badChar, "e");
badChar = new Character((char)233).toString();res = replace(res,badChar, "e");
badChar = new Character((char)234).toString();res = replace(res,badChar, "e");
badChar = new Character((char)235).toString();res = replace(res,badChar, "e");

badChar = new Character((char)236).toString();res = replace(res,badChar, "i");
badChar = new Character((char)237).toString();res = replace(res,badChar, "i");
badChar = new Character((char)238).toString();res = replace(res,badChar, "i");
badChar = new Character((char)239).toString();res = replace(res,badChar, "i");

badChar = new Character((char)241).toString();res = replace(res,badChar, "n");

badChar = new Character((char)242).toString();res = replace(res,badChar, "o");
badChar = new Character((char)243).toString();res = replace(res,badChar, "o");
badChar = new Character((char)244).toString();res = replace(res,badChar, "o");
badChar = new Character((char)245).toString();res = replace(res,badChar, "o");
badChar = new Character((char)246).toString();res = replace(res,badChar, "o");

badChar = new Character((char)248).toString();res = replace(res,badChar, "o"); //248 is o with a stroke, not a u
badChar = new Character((char)249).toString();res = replace(res,badChar, "u");
badChar = new Character((char)250).toString();res = replace(res,badChar, "u");

return res;
}
%>
 
Ben Souther
Sheriff
Posts: 13411
If your business model allows you to change characters that way, this is an acceptable solution. If your customers expect to see their legal names rendered properly, you might consider creating some escape sequences for the non-English characters.

If you do go with this approach, you might want to stop by the Performance forum for some tips on doing this more efficiently.

I'm guessing that looping through the string once and comparing each character in a switch statement will be better than re-reading the entire string for every possible character you could encounter.
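Something along these lines is what I have in mind. It is only a sketch covering a subset of the characters from the charFix method above (the class name CharFixSwitch is made up), and I haven't measured it against the original.

```java
// Hypothetical single-pass variant of charFix; only a subset of the mappings is shown.
final class CharFixSwitch {

    static String charFix(String s) {
        StringBuilder res = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case 0: case '\r': case '"':
                    break; // dropped entirely
                case 192: case 193: case 194: case 195: case 196: case 197:
                    res.append('A');
                    break;
                case 224: case 225: case 226: case 227: case 228: case 229:
                    res.append('a');
                    break;
                case 232: case 233: case 234: case 235:
                    res.append('e');
                    break;
                case 241:
                    res.append('n');
                    break;
                default:
                    res.append(c); // anything not listed passes through unchanged
            }
        }
        return res.toString();
    }
}
```

Each input character is examined once, and the StringBuilder avoids creating a new String for every replacement.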
 
Yuriy Zilbergleyt
Ranch Hand
Posts: 429


Shouldn't the replacements for codes 217 through 220 be 'U' instead of 'O'? I didn't check the codes, I'm just going by the pattern.

-Yuriy
 
Ulf Dittmer
Rancher
Posts: 42968
Of course, the ones you're replacing are just a select few. These days, people use Unicode fonts that may have thousands of characters. You might especially want to handle two-byte characters.
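One way to cover far more accented letters than a hand-written list is to decompose the text with java.text.Normalizer and then drop the combining marks. This is a sketch of a different technique from the charFix method above, not a drop-in replacement, and the class name AccentStripper is made up.

```java
import java.text.Normalizer;

// Hypothetical helper built on the standard java.text.Normalizer class.
final class AccentStripper {

    static String stripAccents(String input) {
        // NFD splits an accented letter into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, which are then removed.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripAccents("Ren\u00e9e \u00c5str\u00f6m")); // prints "Renee Astrom"
    }
}
```

Letters that have no decomposition, such as the o with a stroke (code 248) or the ae ligature (code 230), pass through unchanged and would still need an explicit mapping.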
 