
HTML to XML (WordML) conversion

 
karthik venkatesan
Greenhorn
Posts: 26
Hi,

I need to convert HTML tags to WordML format without losing the HTML formatting (bold, italic, etc.). Is there any way to accomplish this? For example, suppose I have HTML input like the following:

<html>
<head>
</head>
<body>
<B> This is bold text</B>
</body>
</html>

The output wordml should be able to print the text "This is bold text" in bold.



The output should be...
 
Paul Clapham
Sheriff
Posts: 21416
Sure, there's a way to do it. You just convert the <B> element to its equivalent in WordML. Generally speaking I would convert the HTML to well-formed XML (using JTidy or TagSoup or some similar product), then use XSLT to transform that into WordML.

However, I don't know WordML, so I can't tell you how to use it to mark text as bold. But as I said, I am sure it can be done. So if you actually meant to ask how to do that in WordML, then sorry, I don't know.
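
For anyone who wants to try the "well-formed XML, then XSLT" route, here is a minimal sketch of the XSLT step in Java. Treat the WordML side as an assumption to check against the schema: it uses the Word 2003 WordprocessingML namespace and expresses bold as `<w:b/>` inside `<w:rPr>`. The class and method names are made up for illustration, and the input is assumed to have already been cleaned into well-formed XML (by JTidy, TagSoup or similar).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class HtmlToWordMl {

    // Assumption: Word 2003 WordprocessingML namespace; bold is <w:b/> in <w:rPr>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:w='http://schemas.microsoft.com/office/word/2003/wordml'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      // Root: wrap the body content in a WordML document skeleton (head is skipped).
      + "<xsl:template match='/html'>"
      + "<w:wordDocument><w:body><xsl:apply-templates select='body'/></w:body></w:wordDocument>"
      + "</xsl:template>"
      // Simplification: the whole body becomes a single paragraph.
      + "<xsl:template match='body'><w:p><xsl:apply-templates/></w:p></xsl:template>"
      // <b> (or <B>) becomes a run with the bold property set.
      + "<xsl:template match='b|B'>"
      + "<w:r><w:rPr><w:b/></w:rPr><w:t><xsl:value-of select='.'/></w:t></w:r>"
      + "</xsl:template>"
      // Any other non-blank text becomes a plain run.
      + "<xsl:template match='text()[normalize-space()]'>"
      + "<w:r><w:t><xsl:value-of select='.'/></w:t></w:r>"
      + "</xsl:template>"
      // Whitespace-only text between tags is dropped.
      + "<xsl:template match='text()'/>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to well-formed (X)HTML and returns the result as a string. */
    static String convert(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><head></head><body><B> This is bold text</B></body></html>";
        System.out.println(convert(xhtml));
    }
}
```

This only covers bold text in a single paragraph. A document that Word will actually open probably needs more around it (section properties, styles and so on), so take it as a starting point rather than a finished converter.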
 
karthik venkatesan
Greenhorn
Posts: 26
Thanks for the reply, Paul. I got your point. But the problem is that we don't know how users are going to form the input using HTML tags (for example, they may have nested tables, image links, ...).

So it may not be possible to write a generic XSL template to convert unknown input containing HTML tags. I read online that Microsoft has released a generic template to convert WordML to HTML, but not the reverse, which is what I need.

So I would appreciate hearing from anybody who has come across the same issue and found a solution (if any).
 
Paul Clapham
Sheriff
Posts: 21416
Ah, I see, you are looking for a generic XSLT to convert HTML into WordML. And you don't want to write it yourself because it's a large and complicated task and surely somebody must have done it already? I would definitely agree with that.

But I don't exactly see a solution when I search the web for one. If you look at this article, for example, you can see the complexities involved in converting HTML to XSL-FO, and exactly the same would be required in converting to WordML.
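
One way to keep a stylesheet somewhat generic without listing every possible tag is to drive it from the text nodes instead of the elements: each piece of text becomes one WordML run, and its formatting is worked out from whichever ancestors it happens to have. Unknown tags are then simply transparent. The sketch below illustrates that idea only. It makes the same assumptions about WordML as before (Word 2003 namespace, `<w:b/>` and `<w:i/>` inside `<w:rPr>`), uses a hypothetical class name, puts everything in a single paragraph, and does nothing for tables or images, which is where the real complexity lies.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class GenericHtmlToWordMl {

    // Assumption: Word 2003 WordprocessingML namespace and element names.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:w='http://schemas.microsoft.com/office/word/2003/wordml'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      // Visit every non-blank text node under <body>, in document order,
      // regardless of which (possibly unknown) elements wrap it.
      + "<xsl:template match='/'>"
      + "<w:wordDocument><w:body><w:p>"
      + "<xsl:apply-templates select='//body//text()[normalize-space()]'/>"
      + "</w:p></w:body></w:wordDocument>"
      + "</xsl:template>"
      // Each text node becomes one run; formatting comes from its ancestors,
      // so nested tags such as <b><i>..</i></b> are flattened onto the run.
      + "<xsl:template match='text()'>"
      + "<w:r>"
      + "<xsl:if test='ancestor::b or ancestor::strong or ancestor::i or ancestor::em'>"
      + "<w:rPr>"
      + "<xsl:if test='ancestor::b or ancestor::strong'><w:b/></xsl:if>"
      + "<xsl:if test='ancestor::i or ancestor::em'><w:i/></xsl:if>"
      + "</w:rPr>"
      + "</xsl:if>"
      + "<w:t><xsl:value-of select='.'/></w:t>"
      + "</w:r>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to well-formed (X)HTML and returns the result as a string. */
    static String convert(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // <div> and <span> have no rule of their own; their text still comes through.
        String xhtml = "<html><body><div><b>bold <i>both</i></b> <span>plain</span></div></body></html>";
        System.out.println(convert(xhtml));
    }
}
```

The trade-off is that anything not explicitly recognised loses its formatting but keeps its text, which may or may not be acceptable for user-supplied input.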
 
Neerav Narielwala
Ranch Hand
Posts: 106
I have been using WordML as a content management tool and using XSLT to convert it to XHTML via Cocoon. Is this of use to you? If so, I can post an example.
 