Best way to split a single large XML file into multiple XML files with Java

 
Greenhorn
Posts: 5
Hi guys,

I need your help. I need to split large XML files into multiple smaller XML files. Can you please suggest the best way to do this? My requirements:

1) Performance should be good.
2) It must work with multiple threads, because we will receive several large XML files.
3) It must not run out of memory.

For example, my large XML file looks like this:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
............
<CdtTrfTxInf>
................
</CdtTrfTxInf>
<CdtTrfTxInf>
................
</CdtTrfTxInf>
<CdtTrfTxInf>
................
</CdtTrfTxInf>

</PmtInf>
</CstmrCdtTrfInitn>
</Document>

I want to split the above example into three parts.

part 1:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
............
<CdtTrfTxInf>
................
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>

part 2:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
............
<CdtTrfTxInf>
................
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>

part 3:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
............
<CdtTrfTxInf>
................
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>


This is my requirement, so can you please suggest the best way to do this?

Also, if there is any book on this, please let me know.

Thanks in advance.

Regards,
Karthik K
 
Bartender
Posts: 3648
I see that parts 1, 2 and 3 in your desired format all have the same tags. Is it only the content inside the <CdtTrfTxInf> tag that determines which part a transaction goes into?

If so, you can write a function or class that parses the <CdtTrfTxInf> tag and writes it to the appropriate file.
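A minimal sketch of such a class, under several assumptions that are not stated in the thread: it uses the JDK's StAX event API (`javax.xml.stream`), it writes exactly one output file per <CdtTrfTxInf>, every <CdtTrfTxInf> comes after the other children of its <PmtInf>, and the class name `PainSplitter` and the `txn-N.xml` file names are made up for illustration. Treat it as a starting point rather than a finished solution.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: streams the input and writes one file per CdtTrfTxInf.
final class PainSplitter {
    private static final String BULK = "PmtInf";
    private static final String TXN = "CdtTrfTxInf";

    /** Returns the number of files written into outDir. */
    static int split(InputStream in, Path outDir) throws Exception {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLEventReader reader = inputFactory.createXMLEventReader(in);

        List<XMLEvent> prefix = new ArrayList<>();     // everything before the first PmtInf
        List<XMLEvent> bulkHeader = new ArrayList<>(); // current PmtInf up to its first transaction
        List<XMLEvent> txn = null;                     // non-null while inside a CdtTrfTxInf
        List<QName> open = new ArrayList<>();          // open ancestor elements, outermost first
        boolean inBulk = false, txnSeen = false, prefixDone = false;
        int files = 0;

        while (reader.hasNext()) {
            XMLEvent e = reader.nextEvent();
            if (e.isStartDocument() || e.isEndDocument()) {
                continue; // every output file gets its own document start and end
            }
            if (txn != null) { // inside a transaction: buffer it until its end tag
                txn.add(e);
                if (e.isEndElement() && TXN.equals(e.asEndElement().getName().getLocalPart())) {
                    files++;
                    write(outDir.resolve("txn-" + files + ".xml"), prefix, bulkHeader, txn, open);
                    txn = null;
                }
                continue;
            }
            if (e.isStartElement()) {
                QName name = e.asStartElement().getName();
                if (TXN.equals(name.getLocalPart())) {
                    txn = new ArrayList<>();
                    txn.add(e);
                    txnSeen = true;
                    continue;
                }
                open.add(name);
                if (BULK.equals(name.getLocalPart())) {
                    bulkHeader.clear();
                    inBulk = true;
                    txnSeen = false;
                    prefixDone = true;
                }
            } else if (e.isEndElement()) {
                open.remove(open.size() - 1);
                if (BULK.equals(e.asEndElement().getName().getLocalPart())) {
                    inBulk = false;
                    continue;
                }
            }
            if (inBulk) {
                if (!txnSeen) bulkHeader.add(e); // content after the first transaction is dropped
            } else if (!prefixDone) {
                prefix.add(e);
            }
        }
        reader.close();
        return files;
    }

    private static void write(Path file, List<XMLEvent> prefix, List<XMLEvent> bulkHeader,
                              List<XMLEvent> txn, List<QName> open) throws Exception {
        XMLEventFactory events = XMLEventFactory.newInstance();
        try (OutputStream out = Files.newOutputStream(file)) {
            XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out, "UTF-8");
            writer.add(events.createStartDocument("UTF-8", "1.0"));
            for (XMLEvent e : prefix) writer.add(e);
            for (XMLEvent e : bulkHeader) writer.add(e);
            for (XMLEvent e : txn) writer.add(e);
            for (int i = open.size() - 1; i >= 0; i--) { // close PmtInf, CstmrCdtTrfInitn, Document
                QName n = open.get(i);
                writer.add(events.createEndElement(n.getPrefix(), n.getNamespaceURI(), n.getLocalPart()));
            }
            writer.add(events.createEndDocument());
            writer.close();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: PainSplitter <input.xml> <output-dir>");
            return;
        }
        Path outDir = Files.createDirectories(Paths.get(args[1]));
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(split(in, outDir) + " files written");
        }
    }
}
```

Only the group header, one payment header and one transaction are buffered at a time, so memory use should not grow with the size of the input. If several files arrive together, one `split` call per file could probably be handed to an `ExecutorService`, but that part is not shown here.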



 
Karthik Karunanithi
Greenhorn
Posts: 5
Thank you for your quick response, K. Tsang.

Each <CdtTrfTxInf> ... </CdtTrfTxInf> element contains the details of one unique transaction.

To make it clearer, the original file will look like this:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
Bulk 1....
<CdtTrfTxInf>
Txn 1.1....
</CdtTrfTxInf>
<CdtTrfTxInf>
Txn 1.2....
</CdtTrfTxInf>
<CdtTrfTxInf>
Txn 1.3....
</CdtTrfTxInf>

</PmtInf>
<PmtInf>
Bulk 2....
<CdtTrfTxInf>
Txn 2.1....
</CdtTrfTxInf>
<CdtTrfTxInf>
Txn 2.2....
</CdtTrfTxInf>
<CdtTrfTxInf>
Txn 2.3....
</CdtTrfTxInf>

</PmtInf>
</CstmrCdtTrfInitn>
</Document>

I want 6 files from the one original file.

file 1:
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
Bulk 1
<CdtTrfTxInf>
Txn 1.1....
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>

file 2:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
Bulk 1
<CdtTrfTxInf>
Txn 1.2....
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>

file 3:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
Bulk 1
<CdtTrfTxInf>
Txn 1.3....
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>

file 4:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
Bulk 2
<CdtTrfTxInf>
Txn 2.1....
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>

file 5:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
Bulk 2
<CdtTrfTxInf>
Txn 2.2....
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>

file 6:
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.05">
<CstmrCdtTrfInitn>
<GrpHdr>
...............
</GrpHdr>
<PmtInf>
Bulk 2
<CdtTrfTxInf>
Txn 2.3....
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
</Document>


I hope this makes my exact requirement clear.

One more thing: I am trying to do this because the XML transformation took more than an hour for 50K transactions.

That's why I am splitting the file into single-transaction files.

If possible, can you please give me a sample program for this? I have also attached a sample file. (I am very new to this.)


Thank you.
 
Karthik Karunanithi
Greenhorn
Posts: 5
Hi guys,

This is urgent. Could you please help with this?

Thank you
 
Marshal
Posts: 8988
Looking at the subject, it seems you need a solution using Java.

Do you know how to read a file? How to write a new file?

Do you know the exact XML structure up front? Repetition of the same tags doesn't matter, but do they always appear in the same sequence, and are they always the same tags?
If you can answer yes to the questions above, it seems you could write a fairly simple parser to accomplish the job.

Now, if you have never had any experience with Java yourself, you'll probably have a hard time.

As an aside: people here don't work on an urgent basis and don't provide complete solutions, but they are more than happy to help you work through the struggle of finding a solution.

How much Java experience do you have?
 
Liutauras Vilda
Marshal
Posts: 8988
And welcome to the Ranch, Karthik!
 
Karthik Karunanithi
Greenhorn
Posts: 5
Hi Liutauras Vilda,

Thank you so much for your response, and sorry for asking on an urgent basis.

I have been working with jBASE technology for the past 3 years, and I have been learning Java for the past year.

I learnt how to read and write XML files from a website. I tried dom4j, but it took too much time for 100K transactions.

I read that the W3C DOM will also take time, because it loads the entire XML file into memory. That's why I am confused.

Can you please tell me which parser would be good for this?

If you know of any websites, please refer me to them. I will check and get back to you.

Thank you so much for doing a wonderful job.





 
Ranch Hand
Posts: 51
You need to use a SAX parser. It doesn't read the whole XML file into memory.

See this tutorial https://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/ and this answer https://stackoverflow.com/questions/26310595/how-to-parse-big-50-gb-xml-files-in-java
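To give a rough idea of how SAX streams through a document, here is a small sketch that only counts the <CdtTrfTxInf> elements, using the JDK's own `javax.xml.parsers` and `org.xml.sax` classes. The class name `TxnCounter` and the inline sample document are made up for illustration; a real splitter would need more state in the handler than this.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts transactions without building a tree in memory.
final class TxnCounter {

    /** SAX calls startElement once per opening tag as the parser reads the stream. */
    private static final class CountingHandler extends DefaultHandler {
        int total = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("CdtTrfTxInf".equals(localName)) {
                total++;
            }
        }
    }

    static int count(InputStream in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so that localName is filled in
        SAXParser parser = factory.newSAXParser();
        CountingHandler handler = new CountingHandler();
        parser.parse(in, handler);
        return handler.total;
    }

    public static void main(String[] args) throws Exception {
        // Tiny made-up sample; a real input would be a file stream instead.
        String xml = "<Document xmlns=\"urn:iso:std:iso:20022:tech:xsd:pain.001.001.05\">"
                + "<CstmrCdtTrfInitn><PmtInf>"
                + "<CdtTrfTxInf/><CdtTrfTxInf/>"
                + "</PmtInf></CstmrCdtTrfInitn></Document>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(count(in)); // prints 2
    }
}
```

The handler keeps a single counter, so memory use stays flat however many transactions the file holds.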
 
Karthik Karunanithi
Greenhorn
Posts: 5
Thank you, Mark Spencers.

I will try it and get back to you if there is any issue.
 
Greenhorn
Posts: 1
Hi Karthik,

What approach did you use to solve this problem? I am also stuck on something like this.
 
Greenhorn
Posts: 1
Hi Karthik,

How did you fix the issue?
 