
Validating other language chars

 
Mohammed Yousuff
Ranch Hand
Posts: 198
There was a requirement to validate input in other languages (possibly Hindi or Tamil) in Java. As we know, localization only takes care of localizing strings based on the locale; it has nothing to do with input in other languages.

I was thinking: is there any standard way to identify whether a Unicode character is a letter or a digit? And how can I find the length of an input string in another language (because string.length() will give the wrong length)?

Please also let me know if you know of any framework for this. Thank you ;)
 
Edwin Dalorzo
Ranch Hand
Posts: 961
I am not sure I understand your question. As far as I can tell, you would like to know the language of a given input.

I am pretty sure there are several ways to do it, but you could easily check the Unicode block of every character to know for sure whether it is Tamil. For instance, something like this could work:
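
A minimal sketch of that block check, assuming Java 8 or later for `String.codePoints()`. The class and method names (`TamilCheck`, `isAllTamil`) are made up for illustration, and treating an empty string as "not Tamil" is an arbitrary choice:

```java
import java.lang.Character.UnicodeBlock;

public class TamilCheck {

    // Returns true only if every code point in the text belongs to the
    // Tamil Unicode block (U+0B80..U+0BFF). Spaces, punctuation and Latin
    // letters are outside that block, so mixed input yields false.
    static boolean isAllTamil(String text) {
        if (text.isEmpty()) {
            return false; // assumption: empty input is rejected
        }
        return text.codePoints()
                   .allMatch(cp -> UnicodeBlock.of(cp) == UnicodeBlock.TAMIL);
    }

    public static void main(String[] args) {
        // The Tamil word for "Tamil", written with Unicode escapes.
        String tamil = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD";
        System.out.println(isAllTamil(tamil));           // true
        System.out.println(isAllTamil(tamil + " text")); // false
    }
}
```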



By doing this, you could ensure that all characters in a given text are Tamil characters. Simply use the java.lang.Character.UnicodeBlock class as I did.

Beware, however, that this only checks whether the characters belong to the Tamil Unicode block. If the input contains mixed characters, this would yield false. I do not know whether numbers in Tamil are written in other character blocks. Perhaps you could determine the percentage of Tamil characters in the text and make decisions from there.

This is just an idea; it is up to you how to use it.
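
The percentage idea could be sketched roughly as below. For what it is worth, the Tamil decimal digits (U+0BE6 to U+0BEF) sit in the same Tamil block, and `Character.isDigit` recognizes them. The names `TamilRatio`, `tamilRatio` and `isTamilDigit` are hypothetical, and ignoring whitespace when counting is an assumption, not a rule:

```java
import java.lang.Character.UnicodeBlock;

public class TamilRatio {

    // Fraction (0.0 to 1.0) of non-whitespace code points that fall in
    // the Tamil block. Returns 0.0 for blank input.
    static double tamilRatio(String text) {
        long total = text.codePoints()
                         .filter(cp -> !Character.isWhitespace(cp))
                         .count();
        if (total == 0) {
            return 0.0;
        }
        long tamil = text.codePoints()
                         .filter(cp -> UnicodeBlock.of(cp) == UnicodeBlock.TAMIL)
                         .count();
        return (double) tamil / total;
    }

    // True if the code point is a decimal digit from the Tamil block.
    static boolean isTamilDigit(int cp) {
        return UnicodeBlock.of(cp) == UnicodeBlock.TAMIL && Character.isDigit(cp);
    }

    public static void main(String[] args) {
        // Two Tamil code points (KA + vowel sign AA) and two Latin letters.
        System.out.println(tamilRatio("\u0B95\u0BBE ab"));   // 0.5
        System.out.println(isTamilDigit(0x0BE7));            // true (Tamil digit one)
        System.out.println(Character.digit(0x0BE7, 10));     // 1
    }
}
```

A caller could then accept the input when the ratio is above some threshold; what threshold makes sense depends on the application.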

Does this help somehow?
 
Mohammed Yousuff
Ranch Hand
Posts: 198
Thanks Edwin for your quick response ;)

Let me rephrase my question: if a person gives Tamil text as input, I should be able to tell from Java whether the given string contains Tamil numbers or Tamil characters.

How do I find the length of the given input? The reason I ask is that in Tamil the two characters in "கா" represent only one letter.
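
For the length part, one rough approach is to count code points while skipping combining marks (vowel signs and the pulli), so that a consonant plus its vowel sign counts once. This is only a sketch and not a full grapheme-cluster implementation; `TamilLength`, `countLetters` and `isCombiningMark` are hypothetical names. `java.text.BreakIterator.getCharacterInstance()` is the standard library route for user-perceived characters, but how it treats Indic vowel signs may differ between JDK versions, so it is worth testing on your own runtime.

```java
public class TamilLength {

    // Counts code points that are not combining marks, so a consonant
    // followed by a vowel sign or pulli is counted as one letter.
    static int countLetters(String text) {
        return (int) text.codePoints()
                         .filter(cp -> !isCombiningMark(cp))
                         .count();
    }

    // Unicode general categories Mn, Mc and Me.
    static boolean isCombiningMark(int cp) {
        int type = Character.getType(cp);
        return type == Character.NON_SPACING_MARK
            || type == Character.COMBINING_SPACING_MARK
            || type == Character.ENCLOSING_MARK;
    }

    public static void main(String[] args) {
        String kaa = "\u0B95\u0BBE"; // KA + vowel sign AA, the example above
        System.out.println(kaa.length());      // 2 (UTF-16 code units)
        System.out.println(countLetters(kaa)); // 1
    }
}
```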
 
Edwin Dalorzo
Ranch Hand
Posts: 961
Sorry pal, you will need to elaborate on your question. I still do not understand exactly what you would like to do. Provide some more information and post some code showing what you are trying to accomplish, and perhaps we can help you out.
 