
How to convert an HTML file with images and tabular data to XML?

 
SunilK Chauhan
Ranch Hand
Posts: 62
Hi Guys,

I am stuck trying to convert an HTML file containing images and tabular data into a simple XML file, so that I can later recover all the images and the HTML table data from the XML file itself.
I don't know how to store images and tabular data in an XML file, since XML holds only text.

Kindly help me out here.
 
Tim Moores
Saloon Keeper
Posts: 3967
You can store images in XML files by converting them to ASCII using something like the base-64 encoding. Then you can use a CDATA section for it in the XML.
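
A minimal sketch of that idea, using only the JDK (`java.util.Base64` and the built-in DOM/transformer classes). The `image` element and its `name`/`encoding` attributes are made-up names for illustration, not any standard; the receiving system would need to agree on them. Base-64 text contains no characters that are special in XML, so the CDATA section is optional here, but it does no harm.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ImageXmlSketch {

    // Wrap raw image bytes in an <image> element (hypothetical name)
    // whose content is the base-64 text inside a CDATA section.
    static String toXml(String name, byte[] imageBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element image = doc.createElement("image");
        image.setAttribute("name", name);
        image.setAttribute("encoding", "base64");
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        image.appendChild(doc.createCDATASection(encoded));
        doc.appendChild(image);

        // Serialize the DOM tree to a string.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Receiving side: parse the XML and decode the text back into the original bytes.
    static byte[] fromXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        String encoded = doc.getDocumentElement().getTextContent().trim();
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) throws Exception {
        // The 8-byte PNG signature stands in for a real image file.
        byte[] fakeImage = {(byte) 0x89, 'P', 'N', 'G', 13, 10, 26, 10};
        String xml = toXml("logo.png", fakeImage);
        System.out.println(xml);
        System.out.println("round trip ok: " + Arrays.equals(fakeImage, fromXml(xml)));
    }
}
```

For a real file, the bytes could come from something like `Files.readAllBytes(path)` instead of the stand-in array.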

What is the particular issue with tables? If you're familiar with what HTML tables look like (using TABLE, TR and TD elements), surely you can use something like that in your XML file?

But let's take a step back: why do you think it's a good idea to store the contents of an HTML file as an XML file?
 
SunilK Chauhan
Ranch Hand
Posts: 62
I want to put the tabular data from the HTML into an XML file. I will then send this XML file to my existing desktop system, which will fetch all the data from it automatically.

That's why I need a single XML file that carries all the information itself.

There is another way, too: I can convert the HTML data to RTF, manually set the RTF content in my XML, and then send it to my existing system. That approach works well.

That's why I am trying to convert to RTF, or directly to XML if possible, with images and all.
 
Tim Moores
Saloon Keeper
Posts: 3967
So are you looking for a manual solution, or a programmatic one? It sounds like you have already found a manual one. The programmatic one I described above.
 
SunilK Chauhan
Ranch Hand
Posts: 62
I have not implemented any solution yet. But my understanding is that there are options for converting an HTML file to RTF, to base-64 encoding, or directly to an XML file that I can send to my system.

My HTML file may contain images or tabular data (i.e. <tr><td>  </td></tr>).

I want to convert this file using any of these formats, but I don't know how to convert it.

For that purpose I have also tried the PDML libraries, but they report that RTF is not a supported function as of now.
 
Tim Moores
Saloon Keeper
Posts: 3967
Well, if you want to tackle it programmatically, the first step might be to convert the HTML into XML (using a library such as NekoHtml or TagSoup), which would then make the result amenable to more orderly processing, possibly using XML libraries.
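
As a rough sketch of the "more orderly processing" step, assume the cleanup library has already produced well-formed markup with lowercase element names (that step is not shown, and the case of the names may depend on the library's settings). The JDK's own DOM parser can then walk the rows and cells. The output element names `table`, `row` and `cell` are invented for illustration; header cells (`th`) and nested tables are ignored to keep it short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TableToXmlSketch {

    // Escape the characters that are special in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Input must already be well-formed, e.g. the output of an HTML cleanup step.
    static String convert(String wellFormedHtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(wellFormedHtml)));
        StringBuilder out = new StringBuilder("<table>");
        NodeList rows = doc.getElementsByTagName("tr");
        for (int r = 0; r < rows.getLength(); r++) {
            out.append("<row>");
            NodeList cells = ((Element) rows.item(r)).getElementsByTagName("td");
            for (int c = 0; c < cells.getLength(); c++) {
                // Keep only the text of each cell; inner markup is dropped.
                out.append("<cell>")
                   .append(escape(cells.item(c).getTextContent().trim()))
                   .append("</cell>");
            }
            out.append("</row>");
        }
        return out.append("</table>").toString();
    }

    public static void main(String[] args) throws Exception {
        String html = "<html><body><table>"
                + "<tr><td>Name</td><td>Qty</td></tr>"
                + "<tr><td>Bolt &amp; nut</td><td>4</td></tr>"
                + "</table></body></html>";
        System.out.println(convert(html));
    }
}
```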

The images would need to be treated specially: downloading them, converting them to base-64, and adding them to the final XML as CDATA sections.
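
The image handling could be sketched along these lines, again with JDK classes only and again assuming well-formed, lowercase markup. The `loader` parameter is a hypothetical hook that returns the bytes for a given `src` value (in practice it might read a file or open a URL); the `image` element name is likewise made up.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmbedImagesSketch {

    // Replaces every <img src="..."/> with <image name="..."><![CDATA[base-64]]></image>.
    static Document embed(String wellFormedHtml, Function<String, byte[]> loader) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(wellFormedHtml)));
        NodeList imgs = doc.getElementsByTagName("img");
        // The NodeList is live and shrinks as nodes are replaced, so walk it backwards.
        for (int i = imgs.getLength() - 1; i >= 0; i--) {
            Element img = (Element) imgs.item(i);
            String src = img.getAttribute("src");
            String encoded = Base64.getEncoder().encodeToString(loader.apply(src));

            Element image = doc.createElement("image");
            image.setAttribute("name", src);
            image.appendChild(doc.createCDATASection(encoded));
            img.getParentNode().replaceChild(image, img);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        String html = "<p>Logo: <img src=\"logo.png\"/></p>";
        // Stand-in loader: uses the src string itself as the "image" bytes.
        Document doc = embed(html, src -> src.getBytes(StandardCharsets.UTF_8));
        Element image = (Element) doc.getElementsByTagName("image").item(0);
        System.out.println(image.getAttribute("name") + " -> " + image.getTextContent());
    }
}
```

Passing the loader in keeps the download step separate from the XML rewriting, so either part can be swapped out or tested on its own.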
 