
g tsuji

Ranch Hand
since Jan 18, 2011

Recent posts by g tsuji

There is a fundamental, conceptual difference between <description>String</description> and <description></description>: the number of child nodes (via, say, .getLength()) is 1 for the former and 0 for the latter. If you check the source code of the IgnoreTextAndAttributeValuesDifferenceListener class, you will see that this length is checked and any difference in it is still treated as a Difference. There is good reason for that, but it causes some inconvenience in a use case like yours.
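To see the difference in child-node count for yourself, here is a minimal sketch using only the JDK's DOM parser (the class and method names are made up for illustration and have nothing to do with XMLUnit):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ChildCountDemo {
    /** Parses a small XML string and returns the number of child nodes of its root element. */
    static int rootChildCount(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            // Hypothetical helper: wrap everything so callers need no checked exceptions.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // One text node child.
        System.out.println(rootChildCount("<description>String</description>"));
        // No child nodes at all.
        System.out.println(rootChildCount("<description></description>"));
    }
}
```

The two calls give 1 and 0 respectively, which is the structural difference a text-ignoring listener does not paper over.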

I already used IgnoreTextAndAttributeValuesDifferenceListener, but I still got the problem.


That setting addresses a different functionality. If it were comparing <description>String</description> and <description>Str</description>, it would be fine and a match would be obtained. But that is not the case here, as discussed above.

Some would take the approach of writing one's own Listener. I can point you to an example:
https://stackoverflow.com/questions/28737977/ignore-text-differences-when-comparing-xml-with-xmlunit
The way to write it is to model it on the source code of IgnoreTextAndAttributeValuesDifferenceListener and modify just the getLength check mentioned above. I see no reason to repeat that kind of answer, which has probably been repeated all over.

I can propose my own approach, which is based on preprocessing with xslt. I think this approach is more powerful, more versatile and conceptually clearer. XMLUnit has built-in xslt support, which tacitly aims at resolving a much bigger class of differences that are not easily handled by the built-in Listeners and/or their extension by subclassing. Differences between xml files and/or text files, ... are a difficult problem in general, mathematically like ... (I refrain from naming the topic, at the risk of making people wonder), so vast but not necessarily very rewarding!

Here is how.

First you write a short xslt stylesheet, say _ignoreText.xsl, placed in the right directory of course, like this.


Then you modify your classes, adding an import of org.custommonkey.xmlunit.Transform and also catching javax.xml.transform.TransformerException.

Then you are good to go.
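The stylesheet and the modified class from the post are not reproduced here. As a rough sketch of the preprocessing idea only, the following uses plain JAXP rather than XMLUnit's Transform class, and the embedded stylesheet is a hypothetical stand-in for _ignoreText.xsl (an identity copy that drops every text node). Unlike the Listener, this sketch leaves attribute values alone.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class IgnoreTextDemo {
    // Assumed stylesheet, not the one from the post: identity copy minus all text nodes.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      // Explicit priority so this rule clearly wins over the identity rule for text nodes.
      + "<xsl:template match='text()' priority='1'/>"
      + "</xsl:stylesheet>";

    /** Runs the stylesheet over the given XML and returns the serialized result. */
    static String stripText(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String control = stripText("<description>String</description>");
        String test = stripText("<description></description>");
        // After preprocessing, both sides have the same shape and can be compared.
        System.out.println(control.equals(test));
    }
}
```

Once both documents have gone through the same transformation, the empty and non-empty elements no longer differ in child count, so an ordinary comparison can be run on the results.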
Yes, I would only supplement that by mentioning that sequence is a type introduced in xslt 2. One would need an xslt2 compliant processor to process the transformation.
You're testing for the impossible. It is always false.

Whether this is what you have in mind, I cannot be sure.
@Geet mini
If you want to use variables $EndDate, $ReceiverProductID, ... I don't know what, you have to define them. Otherwise, the whole thing can be done simply like this.

Or is it too simple or simplistic? And be careful not to use the // axis too lightly.
Do you mean you generally succeed in doing that, but this time it has thrown exceptions on that line? If yes, tell the forum more about which packages you are working with and any other related matters that may have a bearing on the problem.
Just to add a remark: using xs:any for the sole purpose announced cannot be done without side effects that were not wanted in the original design. The above solution is the closest one can get as a compromise... nothing forbids you from passing technicaldata to itself, resulting in a recursive construction. But one has to live with it... (as xs:any is a tool with too much built-in power.)

It is working but I want only unit should be passed
Can I give unit element as madatory under any elements


So I am led to understand that you only want to write xs:any under technicaldata, but all the same you want whatever is admitted by xs:any to contain a unit element ("... under any elements", with "under" meaning a descendant of technicaldata and "any" referring to xs:any).

[1] First, you must understand that in order to validate a "unit" element, the engine must actually perform validation on it. Therefore, processContents="skip" cannot be used.

So what to do? You must strengthen it to processContents="lax"; strengthening it to "strict" seems not to be what you want at all.

Validation under lax entails that the element unit must be "defined" as a global element in the schema... So you must first isolate it and define it as global (in case you don't know the concept, it means the xs:element for "unit" must be a direct child of the xs:schema element).

If your ChartLimit8 is a complex type, change the above accordingly.
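To illustrate point [1], here is a small sketch with the JDK's javax.xml.validation API. The schema is a cut-down assumption, not the poster's schema: unit is declared globally with type xs:int (standing in for ChartLimit8, whose definition is not shown), and technicaldata accepts anything through xs:any. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class LaxWildcardDemo {
    /** Builds the assumed schema with the given processContents value on xs:any. */
    static String schema(String processContents) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             // Global declaration: a direct child of xs:schema.
             + "<xs:element name='unit' type='xs:int'/>"
             + "<xs:element name='technicaldata'>"
             + "<xs:complexType><xs:sequence>"
             + "<xs:any processContents='" + processContents
             + "' minOccurs='0' maxOccurs='unbounded'/>"
             + "</xs:sequence></xs:complexType>"
             + "</xs:element>"
             + "</xs:schema>";
    }

    /** Returns true if the instance validates against the schema built with processContents. */
    static boolean isValid(String processContents, String xml) {
        try {
            Schema s = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(schema(processContents))));
            s.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema or instance rejected
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String bad = "<technicaldata><unit>abc</unit></technicaldata>";
        System.out.println(isValid("skip", bad)); // unit is never looked at
        System.out.println(isValid("lax", bad));  // unit is checked against the global declaration
    }
}
```

With skip the bad unit slips through; with lax it is rejected because a global declaration for unit exists, while elements without any declaration are still let through.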

[2] Then you must understand that any other elements, such as co2NefzGas etc., are, I suppose, defined only locally (rather than globally) in the original schema, and you want to let any of them through. That is fine. If by bad luck you have some globally defined xs:element with a name that you want to pass to technicaldata as its descendant, then you have to make sure that every one of those having unit as a descendant declares it with cardinality minOccurs="1", which means exactly "mandatory" in narrative/normative terms. (I am not sure you really understand, but leave it as such for the moment.) Furthermore, those explicitly globally defined elements containing a unit element must be written with a reference to the global element unit so defined. For instance this.

The only rigid thing I want to illustrate is the way to write the "unit" element (minOccurs="1" by default, which means mandatory; if you want multiple unit elements, you add maxOccurs="unbounded" to it, otherwise the default is maxOccurs="1"). All the rest, y and z, their types and cardinality, are for illustration only. With that, x will also be validated, together with its (mandatory) unit.

I hope you understand... otherwise, I can only say that it is not that elementary, as an excuse for not being able to convey to you the proper way of doing it in that approach.
First you said technicalData, but the schema said technicaldata. It is not a good start.
Then the summary of said "initial" schema is not really valid...

Giving all the benefit of the doubt, the schema you said you modified from the initial one would not work, as it results in a non-deterministic schema, and determinism is one of the constraints of the w3c schema (1.0) design. The xs:any so written takes on the default value of the namespace attribute, which is "##any". Hence the schema engine will not be able to determine whether the last element unit is intended to be matched by xs:any or is actually intended to comply with the schema's following xs:element ... hence, non-deterministic.

You can contemplate altering the conceptual design in one of these ways:
1) either you say the element unit comes first, before anything else validated by xs:any, ie, putting xs:element name="unit" before xs:any - but that is, in a sense, not a small design change;
2) or you say the unit element belongs to a different namespace (using xs:element ref to refer to the element unit in another namespace, from a schema other than the one shown) and add an attribute namespace="##targetNamespace" to the xs:any element;
2.1) or, in the same spirit, you say unit belongs to the present targetNamespace but xs:any admits only things outside the present targetNamespace, ie, adding an attribute namespace="##other" to the xs:any element - but 2) or 2.1) is quite a major design change.

With either of those, the non-deterministic issue is then resolved and you can proceed with the schema corrected accordingly.
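A sketch of the determinism issue and of options 1) and 2.1), again with a made-up miniature schema (the target namespace urn:example:td, the string type of unit and the class name are all assumptions). It relies on the JDK's built-in SchemaFactory rejecting a schema that violates the unique particle attribution constraint at compile time, which is my understanding of its default behaviour.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class UpaDemo {
    static final String UNIT = "<xs:element name='unit' type='xs:string'/>";
    // Default namespace attribute, i.e. ##any.
    static final String ANY =
        "<xs:any processContents='lax' minOccurs='0' maxOccurs='unbounded'/>";
    // Option 2.1: the wildcard admits only elements outside the target namespace.
    static final String ANY_OTHER =
        "<xs:any namespace='##other' processContents='lax' minOccurs='0' maxOccurs='unbounded'/>";

    /** Wraps the given particles in a technicaldata declaration (hypothetical schema). */
    static String wrap(String particles) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
             + " targetNamespace='urn:example:td' elementFormDefault='qualified'>"
             + "<xs:element name='technicaldata'><xs:complexType><xs:sequence>"
             + particles
             + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
    }

    /** Returns true if the schema text compiles without error. */
    static boolean compiles(String xsd) {
        try {
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles(wrap(ANY + UNIT)));       // wildcard then unit: ambiguous
        System.out.println(compiles(wrap(UNIT + ANY)));       // option 1: unit first
        System.out.println(compiles(wrap(ANY_OTHER + UNIT))); // option 2.1: ##other
    }
}
```

In the first schema a trailing unit element could be claimed either by the wildcard or by the element particle, which is exactly the ambiguity described above; the other two arrangements remove it.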
2 years ago
If you're 100% certain that it is a text file kind of data, you can do it like this (and I suppose you've caught whatever reported exceptions needed already).


If eventually those are files of binary data like images, office spreadsheets or whatever, you can Base64-encode the data to embed it in the text nodes... In java8+, you have the support natively.

In jdk1.6 or 1.7, you can still look for helpers like javax.xml.bind.DatatypeConverter, or org.apache.commons.codec.binary.Base64, or whatever else you can find.
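For the java8+ case, a minimal sketch of the round trip with java.util.Base64 and a DOM text node. The element name payload and the helper names are hypothetical; reading the bytes from an actual file is left out.

```java
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class Base64TextNodeDemo {
    /** Creates an empty DOM document. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Wraps arbitrary bytes, Base64-encoded, in a text node under a new element. */
    static Element embed(Document doc, String elementName, byte[] data) {
        Element holder = doc.createElement(elementName);
        holder.appendChild(doc.createTextNode(Base64.getEncoder().encodeToString(data)));
        return holder;
    }

    /** Decodes the element's text content back into the original bytes. */
    static byte[] extract(Element holder) {
        return Base64.getDecoder().decode(holder.getTextContent());
    }

    public static void main(String[] args) {
        byte[] binary = {0, -1, '<', '&', '>'}; // bytes that would be awkward as raw text
        Document doc = newDocument();
        Element payload = embed(doc, "payload", binary);
        doc.appendChild(payload);
        System.out.println(payload.getTextContent());
        System.out.println(Arrays.equals(binary, extract(payload)));
    }
}
```

The Base64 alphabet contains no markup characters, so the encoded text can sit in a text node without any escaping concerns.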
Excellent. Thank you for underlining that.

I have kept a copy of the exslt download from when the official exslt site still provided a download link itself. The copy still contains so many bugs that I had to modify it to make it work for some functionality that I needed. It contains bugs even at a basic level, like inconsistent namespace declarations (such as http://exslt.org/dates-and-times wrongly written as http://exslt.org/dates and times and http://exslt.org/Dates and Times etc...) and an xsl:import line written after func:script lines etc.

I have taken a look at the latest version kept available at github; many of these have been rectified, but apparently the "bug" you observed is still there.

To keep the story short, the correction you proposed should work for the present case. But I would rather propose a modification of the original script at the return of the function Duration(), which takes eight arguments, the last one being fraction. To properly take more general cases into account, the modification should be made there... A preliminary modification could be like this at the return line (I break up the line for clarity).

Maybe you are in a position to form a fair opinion about it in comparison with yours. In this version the fractional second is preserved; otherwise it would be lost, obviously affecting the precision of the data.
The error message suggests that you look up the log and tells you its whereabouts. Have you done that?
I have not used exslt for quite a while. If you have an xslt2 processor available, like saxon, you can use the fn:seconds-from-duration() xpath function directly (its namespace is recognized internally, or you can explicitly declare it as http://www.w3.org/2005/xpath-functions with the prefix fn). It takes an argument of type xs:duration. The prefix xs is the usual w3 schema namespace, which you must declare, though.

If you _must_ work with exslt, I could perhaps take a look in that direction. In that case, tell us more about the related namespace declarations and the xsl:import ... structure you have in place.
The print-out of doc being null, in the case where the xml is not well-formed, and its being [#document: null], when well-formedness is guaranteed and the document is parsed, are obviously two different things.

In case you are wondering what [#document: null] is and how to interpret it properly, this is how: where a library implements the org.w3c.dom.Node interface, it can implement toString() to show whatever it decides. It is implementation dependent. What you see reflects the implementation:

where #document is the "node name" of the abstract Document, and the Document's node value is null by design and in compliance with the w3c recommendation. For an element, say canProject, it will show something like this:

it is null there because an element node, by design and in compliance with the recommendation, has a null node value.

Schematically, its output is obtained through a method like the one figuratively shown below:


If you're at a text node, the output will show the node name #text, again by design and by compliance, with the text itself as the second component. You can discover the rest for comment, processing instruction and cdata nodes. There is nothing secretive about it.
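For readers who want to try it, here is a hedged sketch of such a figurative method. The describe helper is hypothetical and merely imitates the "[name: value]" shape; the real toString() of any given DOM implementation may differ. The text content "x" of canProject is an assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class NodeDescribeDemo {
    /** Imitates the "[nodeName: nodeValue]" print-out; not any library's actual toString(). */
    static String describe(Node n) {
        return "[" + n.getNodeName() + ": " + n.getNodeValue() + "]";
    }

    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<canProject>x</canProject>");
        System.out.println(describe(doc));                                    // [#document: null]
        System.out.println(describe(doc.getDocumentElement()));               // [canProject: null]
        System.out.println(describe(doc.getDocumentElement().getFirstChild())); // [#text: x]
    }
}
```

Only the text node carries a non-null node value; the document and the element report null, as the recommendation prescribes.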

That's all about it.
@paul nisset
[1]
I return to the exchanges at the stage where standalone was wrongly written as true instead of yes, as rightly corrected by the invited author.

paul nisset 2019-02-13 4:05:22 AM wrote:Hi Paul,
The assignment to 'doc' doesn't cause the program step into the exception block despite 'doc' being null after the assignment. I would have expected it to throw  an exception as well.

The assignment to fis produces an object without throwing an exception .

I guess my question is for the task of parsing an xml document and assigning the elements to Element variables, am I doing this correctly?  


You certainly would have discovered that an exception was being raised. But look at your code (and compare it with what was shown by the invited author):

paul nisset wrote:


There is a closing curly bracket before catch... if the code even works! But if coded properly, you would have seen the exception thrown with standalone="true", saying:

And you just have to thank the invited author and the matter would be closed... all the getElementsByTagName() etc... would work out correctly. The jaxb stuff is just a diversion. I return to this just to dispel the wrong impressions the exchanges may have left on casual readers.
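For casual readers who want to reproduce the standalone point, a minimal sketch with the JDK parser (class and method names are hypothetical; the exact wording of the parser's message is not reproduced here). The XML declaration only admits yes or no for standalone, so true makes the document not well-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class StandaloneDemo {
    /** Returns true if the XML parses, false if the parser throws a SAXException. */
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // this is the exception the catch block should have seen
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("<?xml version='1.0' standalone='true'?><root/>"));
        System.out.println(parses("<?xml version='1.0' standalone='yes'?><root/>"));
    }
}
```

The first call fails inside parse() and the second succeeds, so a doc left at null is a consequence of the swallowed exception, not of the assignment.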

[2]

paul nisset wrote:When I look at element.getNodeValue() it returns null .
How would I get the value of 'adminappv' I see in the document ?  


You can do it like this.
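The snippet from the post is not preserved. One common way, sketched here under the assumption that adminappv is the text inside some element (the element name status and the helper names are made up), is to read the text node or use getTextContent() instead of the element's own getNodeValue():

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class ElementTextDemo {
    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the text content of the first element with the given tag name, or null if absent. */
    static String textOf(Document doc, String tagName) {
        Element e = (Element) doc.getElementsByTagName(tagName).item(0);
        return e == null ? null : e.getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse("<root><status>adminappv</status></root>");
        Element status = (Element) doc.getElementsByTagName("status").item(0);
        System.out.println(status.getNodeValue());                 // null: elements have no node value
        System.out.println(status.getFirstChild().getNodeValue()); // adminappv, from the text node
        System.out.println(textOf(doc, "status"));                 // adminappv
    }
}
```

getTextContent() concatenates all descendant text, which is usually what is wanted for a simple leaf element; going through getFirstChild() is the more literal route.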
There was a time gap between dom starting to take shape and the need for the namespace concept. And then dom itself continued to develop as well... At the time of dom level 1, namespaces did not exist in their final shape, and dom level 1 is therefore not namespace aware. When dom developed to level 2, namespaces were fully incorporated and many other things were enhanced as well, such as the event model. I can say though: without namespaces, there is no schema validation. Hence, in the area of schema validation, namespace awareness is a must.

That said, we are at dom level 2 minimum for the validation issue. So dbf.setNamespaceAware(true) is good. However, when constructing and/or parsing to a dom tree, it is not at all a good idea to mix level 1 and level 2 methods. It can lead to unpredictable consequences, and this is one.

This correction will take out the consequential mixing of methods of different levels and should rectify the problem.
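The corrected code itself is not preserved here. As a generic sketch of the kind of trouble that mixing levels causes (the namespace urn:example and all names below are made up), compare an element created with the level 1 method createElement with one created with the level 2 method createElementNS:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class DomLevelMixDemo {
    /** Creates an empty document from a namespace-aware factory. */
    static Document newDocument() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element root = doc.createElementNS("urn:example", "root");
        doc.appendChild(root);

        Element level1 = doc.createElement("item");                  // DOM level 1 method
        Element level2 = doc.createElementNS("urn:example", "item"); // DOM level 2 method
        root.appendChild(level1);
        root.appendChild(level2);

        System.out.println(level1.getLocalName());    // null: level 1 nodes carry no local name
        System.out.println(level2.getLocalName());    // item
        System.out.println(level2.getNamespaceURI()); // urn:example
        // The namespace-aware lookup sees only the element created the level 2 way.
        System.out.println(doc.getElementsByTagNameNS("urn:example", "item").getLength());
    }
}
```

Both elements serialize with the same tag name, yet namespace-aware code such as a validator treats them differently, which is why staying with one level of methods throughout avoids surprises.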