Problem loading Urdu language text from database

 
Amar Majeed
Greenhorn
Posts: 3
The website is displaying symbols instead of the original Urdu text when it loads the text from the database.
pakistan political forum
 
Stephan van Hulst
Saloon Keeper
Posts: 15529
Welcome to CodeRanch!

Do you mean the blocks that read, for instance, "زکاٰۃ مانگنا چھوڑ دیں"?

This is a classic encoding problem. You are using a different character set to interpret the data coming from the database than the character set that you used to store the data in the database, OR you're using a different character set to display the page than the character set you used to store the data in the database.

Since the rest of the page seems to display Urdu correctly, I'm going to assume you're either storing or retrieving the strings in the database incorrectly. How are you storing the strings? How are you getting them out of the database?
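
To illustrate the mismatch described above, here is a minimal Python sketch of how text like the quoted sample can come about. It assumes the text was written as UTF-8 and later read as Windows-1252 (`cp1252`); that pairing is only a guess that happens to be consistent with the first word of the sample, not something the thread confirms. The variable names are hypothetical.

```python
# Hypothetical sample: an Urdu word written with escapes so the exact
# code points are unambiguous (5 code points).
original = "\u0632\u06a9\u0627\u0670\u06c3"

# Step 1: the text is stored as UTF-8, two bytes per Urdu letter here.
stored_bytes = original.encode("utf-8")

# Step 2: something reads those bytes as Windows-1252 (the assumed culprit),
# so every single byte becomes its own Latin character.
garbled = stored_bytes.decode("cp1252")

# The garbled string is twice as long as the original.
print(len(original), len(stored_bytes), len(garbled))
```

Under these assumptions, `garbled` holds the same ten characters as the first word of the sample quoted above. The bytes themselves are undamaged; only the interpretation is wrong, which is why this kind of corruption is usually reversible.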
 
Stephan van Hulst
Saloon Keeper
Posts: 15529
Can you also please tell us the default encoding of the server where the web application is running? Windows-1256? ISO 8859-6?
 
Amar Majeed
Greenhorn
Posts: 3
The encoding is UTF-8 and the server collation is utf8mb4_unicode_ci. From now on it's storing and retrieving correctly. Is there any way to decode the old data back into the original Urdu text?
 
Amar Majeed
Greenhorn
Posts: 3
How do I get rid of these symbols (the old data) and convert them into Urdu text? I have more than 5,000 posts that are otherwise just scrap...
 
Saloon Keeper
Posts: 27807
If you can figure out what the old encoding scheme was, you should be able to dump the defective items, convert them to the proper encoding, and reload them.
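
One way to figure out the old encoding is trial and error: re-encode a garbled string with each candidate encoding and check whether the resulting bytes decode cleanly as UTF-8. The sketch below is only an illustration. The function names `try_repair` and `looks_like_urdu`, the candidate list, and the 0.8 threshold are all assumptions, and it presumes the original bytes really were UTF-8.

```python
def try_repair(garbled, wrong_encodings=("cp1252", "latin-1", "cp1256", "iso8859_6")):
    """Return {encoding: repaired_text} for every candidate that round-trips."""
    results = {}
    for enc in wrong_encodings:
        try:
            # Undo the wrong decode, then decode the raw bytes as UTF-8.
            repaired = garbled.encode(enc).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # this candidate cannot explain the garbled text
        results[enc] = repaired
    return results


def looks_like_urdu(text, threshold=0.8):
    """Rough check: most non-space characters fall in the Arabic block."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return False
    arabic = sum(1 for ch in chars if "\u0600" <= ch <= "\u06ff")
    return arabic / len(chars) >= threshold


# First word of the garbled sample from earlier in the thread, as escapes.
sample = "\u00d8\u00b2\u00da\u00a9\u00d8\u00a7\u00d9\u00b0\u00db\u0192"
candidates = {enc: text for enc, text in try_repair(sample).items()
              if looks_like_urdu(text)}
```

A candidate that round-trips is not automatically right, so it is still worth reading a few repaired posts by eye before converting everything. Note also that Python's `cp1252` codec leaves a handful of byte values undefined, so some rows may fail to round-trip even when the guess is correct.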

A thought occurs to me - you can use an ETL utility (like Pentaho DI/Kettle) to set up this process without any arduous coding (after all, that's what you want to do: ETL - Extract, Transform and (re)Load). Dump the offending items out as text files (make sure you have a multilingual text editor!), trying different encoding schemes for the source until you get something readable. Then you can create the part of the pipeline that reloads the repaired values. Needless to say, you should back up the database first - or better yet, clone the database and run your experiments on the clone.
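
If an ETL tool is not at hand, the same extract, transform, and reload cycle can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the table name `posts_clone`, the columns `id` and `body`, the `cp1252` guess, and the use of an in-memory SQLite database as a stand-in for a clone of the real one.

```python
import sqlite3


def repair(text, wrong_encoding="cp1252"):
    """Undo 'UTF-8 read as wrong_encoding'; return text unchanged on failure."""
    try:
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


def repair_table(conn, table="posts_clone"):
    """Extract every row, transform the body, and reload only changed rows."""
    rows = conn.execute(f"SELECT id, body FROM {table}").fetchall()
    changed = []
    for row_id, body in rows:
        fixed = repair(body)
        if fixed != body:
            changed.append((fixed, row_id))
    conn.executemany(f"UPDATE {table} SET body = ? WHERE id = ?", changed)
    conn.commit()
    return len(changed)


# Stand-in for the cloned database: one garbled row and one healthy row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts_clone (id INTEGER PRIMARY KEY, body TEXT)")
good = "\u0632\u06a9\u0627\u0670\u06c3"
bad = good.encode("utf-8").decode("cp1252")
conn.executemany("INSERT INTO posts_clone (id, body) VALUES (?, ?)",
                 [(1, bad), (2, good)])

fixed_count = repair_table(conn)
bodies = [body for (body,) in conn.execute("SELECT body FROM posts_clone ORDER BY id")]
```

With `cp1252` as the guess, rows that already hold correct Urdu are left alone, because those characters cannot be encoded as Windows-1252 and `repair` falls back to the original text. Running this against a clone first, as suggested above, keeps a bad guess from doing any lasting damage.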
 