How to put clob in org.w3c.dom.Document

 
Greenhorn
Posts: 4
Does anyone know how to put Clobs into an org.w3c.dom.Document?
According to the org.w3c.dom.Document API, the createTextNode method only takes a String as its input parameter.
Thanks!
Yi
 
Rancher
Posts: 43081
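Do you mean Clob as in java.sql.Clob? That is really just character data, so there should be no problem: read it into a String (for example with getSubString or getCharacterStream) and pass that String to createTextNode. The "C" in Clob stands for "character", and anything that consists of characters in Java can be represented as a String.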
 
Paul Clapham
Sheriff
Posts: 28372
The createTextNode method only allows Strings because XML is a text-based format. Everything in XML is text. So if you want to put Java objects into XML you must convert them to text somehow.

As Ulf says, it's easy to convert a Clob into text.
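A minimal sketch of that conversion, assuming the Clob fits in memory. The class name `ClobToDom`, the helper names, and the element name `payload` are made up for illustration; `SerialClob` only stands in for a Clob that would normally come from a `ResultSet`.

```java
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class ClobToDom {
    // Reads the whole Clob into a String. Clob positions are 1-based.
    static String clobToString(Clob clob) throws SQLException {
        long length = clob.length();
        if (length == 0) {
            return "";
        }
        // Math.toIntExact throws ArithmeticException if the length does not fit in an int.
        return clob.getSubString(1, Math.toIntExact(length));
    }

    // Builds <payload>text</payload>; the element name is hypothetical.
    static Document wrapInDocument(Clob clob)
            throws SQLException, ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("payload");
        root.appendChild(doc.createTextNode(clobToString(clob)));
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // In-memory stand-in for a Clob read from a database.
        Clob clob = new SerialClob("Hello, clob".toCharArray());
        Document doc = wrapInDocument(clob);
        System.out.println(doc.getDocumentElement().getTextContent()); // prints "Hello, clob"
    }
}
```

`getCharacterStream()` is the alternative when you would rather read the Clob in chunks, although the DOM text node still ends up holding the full text in memory.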
 
Ian Chen
Greenhorn
Posts: 4
Thanks for your responses. I converted my Clobs to Strings, put them in the Document instance, and everything worked.
My only concern is that Strings may not be large enough to hold all the possible Clobs that my users want to upload. Are you aware of any upper limit on String size?
Ian
 
Paul Clapham
Sheriff
Posts: 28372
The maximum theoretical length of a String is Integer.MAX_VALUE, which is about 2 billion characters. The maximum practical length of a String is limited by the actual memory available.
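Because `Clob.length()` returns a `long`, it can be checked against both limits before anything is loaded. The sketch below assumes an application-chosen cap; `ClobSizeGuard`, `MAX_CHARS`, and `fitsInString` are hypothetical names, and the cap value is arbitrary.

```java
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

class ClobSizeGuard {
    // Hypothetical practical limit in chars; tune it to the heap you actually have.
    static final long MAX_CHARS = 10_000_000L;

    // True when the Clob is short enough to load as one String under the cap above.
    static boolean fitsInString(Clob clob) throws SQLException {
        long length = clob.length(); // a long, so it may exceed what a String can hold
        return length <= Integer.MAX_VALUE && length <= MAX_CHARS;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Integer.MAX_VALUE); // prints 2147483647
        System.out.println(fitsInString(new SerialClob("small".toCharArray()))); // prints true
    }
}
```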
 
Ian Chen
Greenhorn
Posts: 4
Great! Thanks again for your replies.
Ian
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
I hope you realize that any number of characters and character sequences could cause an XML disaster when inserted as a Text node. Things are only slightly better when inserted as CDATA.
Examples:
1. & < and > have special XML meaning
2. some characters such as MS Word "smart punctuation" are illegal Unicode
3. some control characters such as ctrl-z are illegal Unicode
Bill
 
Paul Clapham
Sheriff
Posts: 28372

Originally posted by William Brogden:

1. & < and > have special XML meaning
2. some characters such as MS Word "smart punctuation" are illegal Unicode
3. some control characters such as ctrl-z are illegal Unicode

1. The org.w3c.dom.Document and other standard XML classes take care of this. The users of the classes don't have to concern themselves with escaping ampersands and less-thans.

2. Not true. You'll find those characters between U+2018 and U+201F. If you have them in your data correctly then the standard XML classes will handle them correctly. What is true is that people often paste those characters into text files without regard to the proper encoding of those files, or hand-generate XML which doesn't declare its encoding properly.

3. This one is true. I don't know what happens if you pass a String containing one of those characters into a DOM text node.
 
William Brogden
Author and all-around good cowpoke
Posts: 13078

Originally posted by Paul Clapham:

The org.w3c.dom.Document and other standard XML classes take care of this. The users of the classes don't have to concern themselves with escaping ampersands and less-thans.

I suspect you have not actually tried to handle all the bizarre situations one can get into when trying to handle user text input.
Bill
 
Paul Clapham
Sheriff
Posts: 28372

Originally posted by William Brogden:

I suspect you have not actually tried to handle all the bizarre situations one can get into when trying to handle user text input.
Bill

No, I haven't. I know there can be problems with non-ASCII characters if input is coming from a browser where the encoding isn't handled properly by the browser and/or the server, for example, but I still say the XML serializer will handle conversion of ampersands to the escaped form. The programmer doesn't have to escape them before putting them into a DOM text node.
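A quick way to check the escaping claim is to serialize a text node that contains raw markup characters. This is only a sketch using the JDK's default identity `Transformer`; the class name `EscapeDemo` and the element name `note` are made up, and other serializers may differ in details such as whether `>` is escaped.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class EscapeDemo {
    // Puts the raw text into a DOM text node and returns the serialized XML.
    static String serialize(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("note"); // hypothetical element name
        root.appendChild(doc.createTextNode(text)); // no manual escaping here
        doc.appendChild(root);

        // An identity transform acts as the serializer.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The ampersands and the less-than sign come out as &amp; and &lt;.
        System.out.println(serialize("a < b && c > d"));
    }
}
```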
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
My view of the & problem is probably colored by my experience dealing with a client's XML design in which a CDATA section contained more XML-formatted data. I had to write a browser-based editor, and converting between stored text and displayed text was a frustrating job.

I hit the "smart punctuation" problem both with cut-and-paste and with files produced by "output as text" from MS Word.

In XML 1.0, the only legal control characters below U+0020 are tab, carriage return, and line feed. You may hit a <ctrl>z in text generated by an older application that uses it as an end-of-file marker. Really old word processor formats used other control characters. XML parsers will throw an exception when hitting one of those characters on input, but I don't know if output writers would turn them into something legal.
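One defensive option is to drop characters that XML 1.0 does not allow before the text ever reaches createTextNode. The sketch below follows the `Char` production of the XML 1.0 specification; the class and method names are hypothetical, and silently stripping is just one policy (replacing or rejecting are others).

```java
class XmlCharFilter {
    // XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXml10(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
                || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // Returns a copy of the input with every illegal code point removed.
    static String stripIllegal(String input) {
        StringBuilder cleaned = new StringBuilder(input.length());
        input.codePoints()
                .filter(XmlCharFilter::isLegalXml10)
                .forEach(cleaned::appendCodePoint);
        return cleaned.toString();
    }

    public static void main(String[] args) {
        // Ctrl-Z (0x1A) is removed; the tab and the curly quotes are kept.
        String dirty = "end" + (char) 0x1A + "\tof \u2018file\u2019";
        System.out.println(stripIllegal(dirty).length()); // prints 12
    }
}
```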
Bill
 