Transformation Problem For Arabic/French Character : HTML to XML

 
Tanzy Akhtar
Ranch Hand
Posts: 110
Hi,

I have a program that converts HTML into XML.
I am facing a problem when transforming French/Arabic characters.

The code is as follows--



The transformer used is --



and the version of the jar is xalan-2.7.0.

Any idea what I am missing?

Thanks,
Tanzy.
 
Aneesh Vijendran
Ranch Hand
Posts: 125
Hi,

By default Xalan uses UTF-8, so there should be no problem with the transformer/parser. The problem is more likely in your XSL.

Could you try specifying the UTF-8 encoding in your XSL? You can do that by adding the following to the XSL file:





You can look at the syntax in more detail here:

http://www.w3schools.com/xsl/el_output.asp
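
The same output encoding can also be requested from the Java side through the standard JAXP API. Below is a minimal sketch, not the code from the original post: the class and method names are made up for illustration, and it uses an identity transform (no stylesheet) on the JDK's default transformer rather than a real HTML-to-XML stylesheet on Xalan 2.7.0. The point it tries to show is that the transformer is fed bytes and writes bytes, so no Reader or Writer can fall back to the platform default charset along the way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration only.
final class Utf8TransformSketch {

    // Runs an identity transform and returns the serialized result as UTF-8 bytes.
    static byte[] transformToUtf8(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Java-side equivalent of encoding="UTF-8" on xsl:output.
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            // Byte streams on both sides: the XML machinery does the decoding
            // and encoding itself instead of a platform-default Reader/Writer.
            ByteArrayInputStream in =
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new StreamSource(in), new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep the sample independent of the source file encoding.
        String xml = "<p>\u00e9t\u00e9 \u0645\u0631\u062d\u0628\u0627</p>";
        byte[] result = transformToUtf8(xml);
        // The console itself may not be able to display these characters.
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```

If the result is later written to a file, writing the byte array as-is avoids a second, accidental re-encoding.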


Cheers
Aneesh
 
Tanzy Akhtar
Ranch Hand
Posts: 110
Thank you, Aneesh, for the response.

<xsl:output
method="xml"
encoding="UTF-8"

....
...
/>



This has already been done.

So there must be some other issue.

Thanks,
Tanzy.
 
Aneesh Vijendran
Ranch Hand
Posts: 125
Could you please attach the XSL and XML here, using the attachments feature?

Cheers
Aneesh
 
Tanzy Akhtar
Ranch Hand
Posts: 110
Aneesh, this program runs fine if I am not using the Xalan transformer.

The problem occurs only when using Xalan.
Is there any other transformer that I should use instead of Xalan?
 
Aneesh Vijendran
Ranch Hand
Posts: 125
I didn't get the point:

Aneesh, this program runs fine if I am not using the Xalan transformer.



Are you getting

???

in the transformed XML, or an exception? Could you please let me know the exact problem?

You can try using Cocoon, but I would say the problem is not with the transformer. I have done numerous transformations with a variety of Unicode Indian characters.

Cheers
Aneesh
 
Rob Spoor
Sheriff
Posts: 22862

tanzy akhtar wrote:I am facing problem when transforming french/arabic character.


What's the problem? http://faq.javaranch.com/java/TellTheDetails
 
Tanzy Akhtar
Ranch Hand
Posts: 110

Are you getting

???



Yes, exactly, Aneesh. That is my problem: after the transformation, the Arabic/French characters get replaced by "???".


Sorry Rob, I did not specify my problem earlier; thanks for pointing that out.
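
For readers wondering where "???" usually comes from: a common cause is text being encoded with a charset that cannot represent the characters, in which case Java substitutes a replacement byte. The sketch below only illustrates that general mechanism; the class name is hypothetical, and it is an assumption (not something confirmed above) that this is what happens inside the program in question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
final class QuestionMarkDemo {

    // Encodes the text to bytes and decodes it again with the same charset.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String french = "\u00e9t\u00e9";                       // "été"
        String arabic = "\u0645\u0631\u062d\u0628\u0627";     // five Arabic letters

        // US-ASCII has no accented letters, so each one becomes '?'.
        System.out.println(roundTrip(french, StandardCharsets.US_ASCII));    // ?t?
        // ISO-8859-1 has no Arabic letters, so all five become '?'.
        System.out.println(roundTrip(arabic, StandardCharsets.ISO_8859_1));  // ?????
        // UTF-8 can represent both, so nothing is lost.
        System.out.println(roundTrip(french, StandardCharsets.UTF_8).equals(french)); // true
        System.out.println(roundTrip(arabic, StandardCharsets.UTF_8).equals(arabic)); // true
    }
}
```

Once a character has been turned into '?', the information is gone, so the fix has to happen at the point where the bytes are produced, not where they are displayed.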
 
Tanzy Akhtar
Ranch Hand
Posts: 110
Before doing the actual transformation, I replace the "nbsp" character with "\n".
Below is the program that gets executed before the transformation takes place--



There may be some problem here when copying.
 
Rob Spoor
Sheriff
Posts: 22862
&nbsp; is not the same as a line break. The nearest equivalent is a space. That's where the name comes from: non-breaking space.
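
In Java terms, that suggests replacing the entity with either the numeric character reference for U+00A0 (which XML understands without any declaration) or an ordinary space, rather than "\n". A minimal sketch, with hypothetical class and method names that are not taken from the program discussed here:

```java
// Hypothetical helper for illustration only.
final class NbspSketch {

    // Keeps the non-breaking space, written as a numeric character
    // reference that is valid in XML without a DTD.
    static String toNumericReference(String html) {
        return html.replace("&nbsp;", "&#160;");
    }

    // Alternative: downgrade the non-breaking space to a plain space.
    static String toPlainSpace(String html) {
        return html.replace("&nbsp;", " ");
    }

    public static void main(String[] args) {
        String html = "<td>10&nbsp;kg</td>";
        System.out.println(toNumericReference(html)); // <td>10&#160;kg</td>
        System.out.println(toPlainSpace(html));       // <td>10 kg</td>
    }
}
```

Which variant is appropriate depends on whether the non-breaking behaviour still matters in the XML output.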
 
Tanzy Akhtar
Ranch Hand
Posts: 110
Thanks Rob.

Well, is that nbsp; creating the problem in my case?
 
Aneesh Vijendran
Ranch Hand
Posts: 125
Tanzy,

Shouldn't it be &nbsp; rather than nbsp;?

What I think is that there isn't any problem with your code; I guess it's rather a problem with your browser. Did you try setting the browser charset to Unicode?

(Mozilla) View -> Character Encoding -> Unicode
(IE) View -> Encoding -> Unicode

Let me know how that goes.

Cheers
Aneesh
 
Rob Spoor
Sheriff
Posts: 22862

tanzy akhtar wrote:Thanks Rob.

Well is that nbsp; creating problem in my case?


Possibly; &nbsp; is a valid HTML entity, but unless you declare it explicitly again, it's not in XML. For instance, the following XML document gives me the following errors when running through xmllint:
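
The same rule can be checked from Java with the JDK's built-in XML parser. This is a rough sketch rather than the xmllint session mentioned above; the class name and the sample documents are made up for illustration. It parses one document that uses &nbsp; without declaring it, one that declares it in an internal DTD subset, and one that uses the numeric reference instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class for illustration only.
final class NbspEntityCheck {

    // Returns true if the default JDK parser accepts the document as well-formed.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Thrown for well-formedness errors such as an undeclared entity.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Entity used but never declared: rejected.
        System.out.println(isWellFormed("<root>a&nbsp;b</root>"));
        // Entity declared in the internal DTD subset: accepted.
        System.out.println(isWellFormed(
                "<!DOCTYPE root [<!ENTITY nbsp \"&#160;\">]><root>a&nbsp;b</root>"));
        // Numeric character reference needs no declaration: accepted.
        System.out.println(isWellFormed("<root>a&#160;b</root>"));
    }
}
```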
 
Aneesh Vijendran
Ranch Hand
Posts: 125
Rob, I guess it's because &nbsp; is not an XML entity; it's only an HTML entity. So I guess his problem is an encoding issue.

Cheers
Aneesh
 
Tanzy Akhtar
Ranch Hand
Posts: 110

Shouldn't it be &nbsp; rather than nbsp;?



Yes, Aneesh. That is the character I meant.
While I was posting, the ampersand character became invisible.

Thanks for correcting this.

The browser setting has also been changed, but it's still not working.

 
Aneesh Vijendran
Ranch Hand
Posts: 125
Hi,

I have sent you a PM with my email.


Cheers
Aneesh
 
Rob Spoor
Sheriff
Posts: 22862

Aneesh Vijendran wrote: I have sent you a PM with my email.


http://faq.javaranch.com/java/UseTheForumNotEmail
Don't use email or private messages to come to a solution; other people will not see those so you are withholding that solution from everybody else.
 
Aneesh Vijendran
Ranch Hand
Posts: 125
Rob, I would definitely post the solution here if I ever come to one. If there are some files which he can't put on CodeRanch, what could be done?

Cheers
Aneesh
 
Rob Spoor
Sheriff
Posts: 22862
He should obfuscate them: replace all sensitive data. I once had to export our customer database; I ended up giving them all my manager's name in the export.
 
Aneesh Vijendran
Ranch Hand
Posts: 125
Yeah, you are right.

Oops, manager's details, lol! He might have got a hundred calls regarding market research and stuff, lol!
 
Tanzy Akhtar
Ranch Hand
Posts: 110
lolz...
nice conversation..
 
Tanzy Akhtar
Ranch Hand
Posts: 110
Hi,

I found a workaround for my problem.

Just pass "UTF-8" as a parameter wherever an input stream or output stream is created.

It's working fine for me.
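
For anyone landing here later, a minimal sketch of that idea, assuming the streams are wrapped in readers and writers somewhere in the program. The class and method names are hypothetical and this is not the actual program from this thread; it only shows the charset being named explicitly on both the reading and the writing side instead of relying on the platform default.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class Utf8CopySketch {

    // Copies text from in to out, decoding and encoding explicitly as UTF-8.
    // Both streams are closed when the copy finishes.
    static void copy(InputStream in, OutputStream out) {
        try (BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8));
             BufferedWriter writer = new BufferedWriter(
                     new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                // Any text processing on the characters would go here.
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Constructors such as `new InputStreamReader(in)` without a charset use the platform default, which is a common place for non-Latin text to get damaged on machines whose default is not UTF-8.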

Thank you, Rob and Aneesh, for the useful guidance.

Life Rocks,
Tanzy.
 
Aneesh Vijendran
Ranch Hand
Posts: 125
Excellent!!!