Is there a list of languages supported by a single char variable in Java?

 
Vijay Tyagi
Ranch Hand
Posts: 52
Is there a list of languages supported by a single char variable in Java?
I mean a single char variable, not two chars joined together: the range 0x0000 to 0xFFFF, which I understand is referred to as the "Basic Multilingual Plane" (BMP).
 
Paul Clapham
Sheriff
Posts: 22828
I don't understand the question. What does it mean (to you) for a character to support a language?
 
Vijay Tyagi
Ranch Hand
Posts: 52

I mean that all the symbols (letters, numerals and special characters) of that particular human language will each fit into a single char (range 0x0000 to 0xFFFF), one at a time.
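To make that criterion concrete, here is a minimal sketch of how one might test it for a given piece of text. The class and method names (SingleCharCheck, fitsInSingleChars) are made up for illustration, and the two sample code points are only examples; the calls themselves (String.codePoints and Character.isBmpCodePoint) are standard library methods in Java 8 and later.

```java
public class SingleCharCheck {
    // True when every code point in the text lies in the BMP (U+0000..U+FFFF),
    // i.e. each symbol occupies exactly one Java char.
    static boolean fitsInSingleChars(String text) {
        return text.codePoints().allMatch(Character::isBmpCodePoint);
    }

    public static void main(String[] args) {
        String ethiopic = "\u1200";       // U+1200, an Ethiopic syllable inside the BMP
        String cjkExtB = "\uD840\uDC00";  // U+20000, stored as a surrogate pair (two chars)

        System.out.println(fitsInSingleChars("Hello"));
        System.out.println(fitsInSingleChars(ethiopic));
        System.out.println(fitsInSingleChars(cjkExtB));
    }
}
```

This only answers the question for one string at a time, not for a whole language, but it shows what "fits into a single char" means in terms of the API.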

 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 16060
I'm not even sure if such a question really makes sense, but if you want to know more about the Basic Multilingual Plane, then you can find information on that on the Unicode.org website. Wikipedia also has an explanation.
 
Campbell Ritchie
Marshal
Posts: 56541
There are some languages which do not have characters at all, because the people who speak them have not developed writing. Remember there are several thousand languages spoken on earth, and Unicode is a way of recording their writing systems. The languages preceded Unicode, so what you mean is how many languages Unicode has managed to record with characters ≤ 0xFFFF. It is probably easier to find which languages use characters ≥ 0x010000. Go to the Unicode website, open the code charts for the different scripts, and see which code point ranges they cover. For example, Ethiopic starts at U+1200 (≤ 0xFFFF) and CJK Extension B starts at U+20000 (≥ 0x010000). CJK means Chinese, Japanese and Korean.
You might find lists of characters somewhere else on the Unicode website, if you look for its history. That is all I can think of.
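If you would rather ask the JDK than read the charts, something like the following sketch lists which scripts have characters beyond 0xFFFF and which stay entirely inside the BMP. The class and method names are hypothetical; Character.UnicodeScript, Character.isDefined and the code point constants are standard since Java 7. Note that it reports scripts, not languages, and the answer depends on the Unicode version bundled with your JDK, so treat the output as approximate.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

public class ScriptPlanes {
    // Collects the scripts of all assigned code points in [from, to].
    private static Set<UnicodeScript> scriptsInRange(int from, int to) {
        Set<UnicodeScript> result = EnumSet.noneOf(UnicodeScript.class);
        for (int cp = from; cp <= to; cp++) {
            if (Character.isDefined(cp)) {
                result.add(UnicodeScript.of(cp));
            }
        }
        // Drop pseudo-scripts that do not correspond to a writing system.
        result.remove(UnicodeScript.UNKNOWN);
        result.remove(UnicodeScript.COMMON);
        result.remove(UnicodeScript.INHERITED);
        return result;
    }

    // Scripts with at least one assigned code point above U+FFFF.
    static Set<UnicodeScript> scriptsBeyondBmp() {
        return scriptsInRange(Character.MIN_SUPPLEMENTARY_CODE_POINT,
                              Character.MAX_CODE_POINT);
    }

    // Scripts whose assigned code points all lie in U+0000..U+FFFF.
    static Set<UnicodeScript> scriptsWithinBmp() {
        Set<UnicodeScript> result =
                scriptsInRange(0, Character.MIN_SUPPLEMENTARY_CODE_POINT - 1);
        result.removeAll(scriptsBeyondBmp());
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Entirely in the BMP: " + scriptsWithinBmp());
        System.out.println("Use code points >= 0x010000: " + scriptsBeyondBmp());
    }
}
```

A script appearing in the second list does not mean its everyday text needs two chars; Han, for instance, has its most common ideographs in the BMP and only the extensions beyond it.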

There are some languages which share character sets; for example all Western European languages use similar Latin letters, and can mostly be fitted into U+0000…U+00FF, with a few accented letters (such as Welsh ŵ, U+0175) just above that range. That would include Romany, English, Lallans, Gaelic, Cornish, Welsh, Latin and Huguenot. Those, together with Yiddish, are all languages which have been spoken in Great Britain for several hundred years, so you can see what a big list you would have. Yiddish, Hebrew and Chinese do not fit that range: Yiddish and Hebrew use the Hebrew block (U+0590…U+05FF) and Chinese uses the CJK blocks. In some areas, e.g. New Guinea, there is a different language in each village, and people cannot talk to those who live ten miles from them.
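As a rough illustration of letters from different languages landing in different blocks, here is a small sketch that looks up the block of a few sample code points. The class name, method name and the choice of samples are mine, purely for illustration; Character.UnicodeBlock.of and Character.charCount are standard library calls.

```java
public class BlockLookup {
    // Name of the Unicode block a code point belongs to,
    // or "no block" when the JDK does not assign it to one.
    static String blockOf(int codePoint) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        return block == null ? "no block" : block.toString();
    }

    public static void main(String[] args) {
        int[] samples = {
            0x0041,  // A, used by most Western European languages
            0x00E9,  // e with acute accent
            0x0175,  // w with circumflex, used in Welsh
            0x05D0,  // Hebrew letter alef, also used for Yiddish
            0x4E2D,  // a common Chinese ideograph
            0x20000  // first code point of CJK Extension B
        };
        for (int cp : samples) {
            // charCount is 1 for BMP code points and 2 for supplementary ones.
            System.out.printf("U+%04X  %s  chars=%d%n",
                    cp, blockOf(cp), Character.charCount(cp));
        }
    }
}
```

Everything in that sample list except the last entry needs only one char.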
 
Vijay Tyagi
Ranch Hand
Posts: 52
Thanks.
According to these sources, the BMP (plane 0) supports most modern languages.
 
Paul Clapham
Sheriff
Posts: 22828
For related information, have a look at the Letter Database from the Estonian Language Institute. As you might expect it's an Estonian-centric site in that it only considers languages spoken in Europe and the former Soviet Union, but it's still interesting to browse through there and see the complexities involved in just that subset of languages.
 