We use Word XML templates. We would like to replace predefined words with user text obtained from a database.
But there is a problem: the text to substitute, e.g. "%001", can be separated by formatting tags in the XML source. Something like this:
We would like to have a special replace algorithm which can find the above tag-separated text and replace it this way:
Does anybody have an idea how to do it?
We don't want to parse the whole document, even if it is valid XML.
Maybe it would be better to describe the whole problem at large:
Our application can output data as reports in various formats using Jasper server.
There is no way to teach our customers how to edit .jrxml templates.
MS Word is rather different. Customers know MS Word and would like to modify the reports according to their needs.
So we created this solution for them:
- We give them predefined XML Word Template(s). On it we put dynamic text words such as %001, %002, %003,...
- Those words will be dynamically substituted by data obtained from sql query. Query always returns only one row.
- Those data are already formatted inside sql (date formats, decimal places,...)
- The first value from the query row substitutes the string %001, the second %002, and so on
- We don't want to change the XML structure! Substitution is meant to be a simple string replacement
- Our customers can then edit those XML templates in MS Word. They only have to leave the substitution words
%001, %002, %003... as they are. They can freely edit any other text.
And the problem is this: though we see the string '%001' in the template, it doesn't mean it is stored in the XML as the string
'%001'; it is probably divided by formatting tags. I see 2 solutions:
- When making templates we must check for every substitution word that the resulting XML contains it undivided. If not, we must reformat
it or, better, retype it
- Write a smart replace algorithm which finds and replaces substitution words even if they are divided.
But it must never change the formatting tags.
I can imagine that algorithm without parsing XML, just scanning the XML as a string. I just asked if anybody had encountered the same problem. I see probably not.
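The first option above (checking that every substitution word is stored undivided) can be automated when a template is saved. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes the words are always "%" followed by exactly three digits and that the whole template fits in memory as a string.

```java
import java.util.ArrayList;
import java.util.List;

final class PlaceholderCheck {
    /**
     * Returns those words among %001..%NNN (NNN = count) that do not occur
     * in the template as one contiguous string, i.e. that are missing or
     * divided by tags and need to be retyped in Word.
     */
    static List<String> findDivided(String templateXml, int count) {
        List<String> divided = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            String placeholder = String.format("%%%03d", i); // 1 -> "%001"
            if (!templateXml.contains(placeholder)) {
                divided.add(placeholder);
            }
        }
        return divided;
    }

    public static void main(String[] args) {
        // %001 is stored in one run, %002 is split over two runs.
        String xml = "<w:r><w:t>%001</w:t></w:r>"
                   + "<w:r><w:t>%0</w:t></w:r><w:r><w:t>02</w:t></w:r>";
        System.out.println(findDivided(xml, 2)); // prints [%002]
    }
}
```

This only reports the problem words; it does not repair the template, so the formatting tags are never touched.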
Jiri Nejedly wrote: I can imagine that algorithm without parsing XML, just scanning the XML as a string.
You'd be grossly mistaken. Processing XML as textual data involves LOTS of edge cases. It starts out with tweaking your string operations to take care of a simple oversight, and it ends with writing a complete but buggy XML processor yourself. Do yourself a favor and learn from the mistakes that those before you have made time and again. Never treat XML as text. For every piece of software that you write to handle your templates, I can write a valid template that will break your software.
Having said that, there IS a way to treat the data as text, and that's by treating it as your own custom format in which XML just happens to be embedded. That means you must treat the data as flat text that contains special placeholders, and you replace those with data from the database before you treat the result as XML. It also means you must provide special escape sequences for your placeholders. Finally, it means that you cannot split placeholders over multiple tags.
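A rough sketch of that flat-text approach could look like the following. Everything in it is an assumption made for illustration: the class name, the choice of "%%" as the escape sequence for a literal percent sign, and the decision to leave a placeholder untouched when the row has no value for it. Because the result is later read as XML, the inserted values are escaped for XML text content.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FlatTemplate {
    // Either the (assumed) escape sequence "%%" or a placeholder "%" + 3 digits.
    private static final Pattern TOKEN = Pattern.compile("%%|%(\\d{3})");

    /** Escapes the characters that are not allowed verbatim in XML text content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Replaces %001, %002, ... with row values; "%%" becomes a literal "%". */
    static String fill(String template, List<String> row) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            if (m.group(1) == null) {
                replacement = "%"; // matched the escape sequence
            } else {
                int index = Integer.parseInt(m.group(1)) - 1; // %001 -> first column
                replacement = (index >= 0 && index < row.size())
                        ? escapeXml(row.get(index))
                        : m.group(); // no value for it: keep the placeholder
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, filling `<w:t>%001 is 5%% of %002</w:t>` with the row `["A&B", "10"]` gives `<w:t>A&amp;B is 5% of 10</w:t>`. As stated above, this only works for placeholders that are stored undivided.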
If you DO want to split the placeholders over multiple tags, you must treat the data as your custom format embedded in XML, instead of the other way around, and you must also properly treat the data as XML, including handling for CDATA and other considerations that might have slipped your mind.
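One possible shape of that second approach, using the DOM parser that ships with the JDK, is sketched below. It rests on several assumptions that are not confirmed anywhere in this thread: that paragraphs and text runs are elements with the local names "p" and "t" (in whatever namespace the template uses), that a placeholder never spans two paragraphs, and that paragraphs are not nested. The class name is hypothetical. Only the text inside existing text elements is changed; no element is added, removed or moved.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class SplitPlaceholderFiller {
    private static final Pattern PLACEHOLDER = Pattern.compile("%(\\d{3})");

    static String fill(String xml, List<String> row) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        NodeList paragraphs = doc.getElementsByTagNameNS("*", "p"); // assumed element name
        for (int i = 0; i < paragraphs.getLength(); i++) {
            NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS("*", "t");
            List<Element> elems = new ArrayList<>();
            for (int j = 0; j < texts.getLength(); j++) elems.add((Element) texts.item(j));
            fillParagraph(elems, row);
        }
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Finds placeholders in the joined text of one paragraph and rewrites the text pieces. */
    static void fillParagraph(List<Element> elems, List<String> row) {
        int n = elems.size();
        StringBuilder[] parts = new StringBuilder[n];
        int[] offset = new int[n + 1]; // start of each piece within the joined text
        StringBuilder joined = new StringBuilder();
        for (int i = 0; i < n; i++) {
            parts[i] = new StringBuilder(elems.get(i).getTextContent());
            offset[i] = joined.length();
            joined.append(parts[i]);
        }
        offset[n] = joined.length();
        List<int[]> matches = new ArrayList<>(); // {start, end, row index}
        Matcher m = PLACEHOLDER.matcher(joined);
        while (m.find()) matches.add(new int[] {m.start(), m.end(), Integer.parseInt(m.group(1)) - 1});
        // Work from right to left so that earlier positions stay valid.
        for (int k = matches.size() - 1; k >= 0; k--) {
            int[] mt = matches.get(k);
            if (mt[2] < 0 || mt[2] >= row.size()) continue; // no value: leave as is
            for (int i = n - 1; i >= 0; i--) {
                int from = Math.max(mt[0], offset[i]) - offset[i];
                int to = Math.min(mt[1], offset[i + 1]) - offset[i];
                if (from >= to) continue; // this piece holds no part of the placeholder
                parts[i].delete(from, to);
                // The value goes into the piece in which the placeholder starts.
                if (offset[i] <= mt[0]) parts[i].insert(from, row.get(mt[2]));
            }
        }
        for (int i = 0; i < n; i++) {
            String updated = parts[i].toString();
            if (!updated.equals(elems.get(i).getTextContent())) elems.get(i).setTextContent(updated);
        }
    }
}
```

The value takes the formatting of the run in which the placeholder begins, and the later runs simply lose their share of the placeholder text. Because the value is set through the DOM, characters such as "&" are escaped by the serializer, and text inside CDATA sections is read correctly by the parser instead of by hand-written string code.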