How can I get SAX to report character references?

 
Greenhorn
Posts: 14
My problem is this: in a file I have &#160; and é, and I want to get the same characters back out of SAX. When I am in characters(), I need to know that char(160) came from a character reference, so that I write out &#160;, and that char(233) was plain text, so that I write out é. Is there a way to make SAX report character references in the same way as entities?
Remus
 
Ranch Hand
Posts: 1179
Mac OS X Eclipse IDE
I don't have a problem getting 'é' and the non-breaking space (char 160) out of the parser using SAX.
At the top of my XML file, I have this line:

The important part here is that you find an encoding that fits your characters.
Rene
[ July 30, 2002: Message edited by: Rene Larsen ]
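A minimal sketch of the encoding advice above, for readers who want to try it. The declaration line from the post is not shown, so the ISO-8859-1 declaration here is only an assumption chosen for illustration, and the class and method names (EncodingDemo, textOf) are hypothetical. It uses only the standard JAXP/SAX API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class EncodingDemo {

    /** Parses the document and returns all text passed to characters(). */
    static String textOf(byte[] xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Assumed declaration: the encoding must match the bytes actually in the file.
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<r>\u00e9&#160;</r>";   // a literal e-acute and a character reference
        String text = textOf(doc.getBytes(StandardCharsets.ISO_8859_1));
        for (char c : text.toCharArray()) {
            System.out.println((int) c);   // numeric value of each character
        }
    }
}
```

With a matching encoding both characters come through intact (233 and 160), but they arrive as ordinary characters: nothing in characters() says which one was written as a reference in the source.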
 
Rene Larsen
Ranch Hand
Posts: 1179
Mac OS X Eclipse IDE
And you can always override this method:

Rene
[ July 30, 2002: Message edited by: Rene Larsen ]
 
Remus Stratulat
Greenhorn
Posts: 14
That was not what I wanted.
If the text contains 'é' and &#233;, I want to get both back exactly as written. With your methods I only get é.
Here are two solutions to this problem that I have found since I posted this.
1. In characters(char ch[], int start, int length), ch[] contains all the text of the input *except* when a character reference is processed; then ch[] = {'é'} (this é comes from &#233; and not from a literal é).
I found this by looking into the Xerces sources.
2. After talking with somebody at Apache, I found that there is an undocumented feature: http://apache.org/xml/features/scanner/notify-char-refs.
Set this to true and you will get character references reported in the same way as entities.
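The second solution could look roughly like the sketch below. It is not taken from the thread: the class name CharRefEcho and the method echoText are hypothetical, and it assumes that the parser reports a reference such as &#233; through LexicalHandler.startEntity() with a name beginning with '#' (for example "#233"). Parsers other than Apache Xerces may reject or silently ignore the feature, in which case the sketch falls back to echoing the decoded characters.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class CharRefEcho {
    static final String NOTIFY_CHAR_REFS =
            "http://apache.org/xml/features/scanner/notify-char-refs";
    static final String LEXICAL_HANDLER =
            "http://xml.org/sax/properties/lexical-handler";

    /** Returns the text content, writing reported character references back as references. */
    static String echoText(String xml) {
        StringBuilder out = new StringBuilder();
        // DefaultHandler2 is both a ContentHandler and a LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            private boolean inCharRef = false;

            @Override
            public void startEntity(String name) {
                // Assumption: character references are named "#233", "#xE9", ...
                if (name.startsWith("#")) {
                    out.append('&').append(name).append(';');
                    inCharRef = true;
                }
            }

            @Override
            public void endEntity(String name) {
                if (name.startsWith("#")) {
                    inCharRef = false;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip the decoded character; the reference was already written.
                if (!inCharRef) {
                    out.append(ch, start, length);
                }
            }
        };
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            try {
                reader.setFeature(NOTIFY_CHAR_REFS, true);
            } catch (SAXException e) {
                // Feature not recognized or not supported: references stay invisible.
            }
            reader.setContentHandler(handler);
            reader.setProperty(LEXICAL_HANDLER, handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A literal e-acute followed by the same character as a reference.
        System.out.println(echoText("<r>\u00e9&#233;</r>"));
    }
}
```

The sketch does not re-escape '&' or '<' in ordinary text, so it is not a complete serializer. Whether the reference is actually preserved depends on the parser on the classpath honouring the feature.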
 
Remus Stratulat
Greenhorn
Posts: 14
In the text above I was referring to é and &#233;.
 