Reading a complex file

 
Greenhorn
Posts: 25
Hi,

I have a file (*.txt) in the following format:

<tag1>content1</tag1>
<tag2>content2</tag2>
...Etc till...
<tagN>contentN</tagN>
{1 : Data1}{2 : Data2..
...Etc till....
}

What would be an optimal way of reading it?
Would a plain buffered stream read suffice, or does a better alternative exist?

Cheers,
Amit
[ December 16, 2006: Message edited by: amit bose ]
 
Ranch Hand
Posts: 2308
Hi Amit,

Does the file end with
{1 : Data1}{2 : Data2..
...Etc till....
}

Or is this what you want?

Tag 1 => value : content1, like this.
Your file format looks a lot like XML, so why not use an XML parser?
 
Rancher
Posts: 43028
I'd probably create a lexer (using JFlex), because these little custom file formats have a tendency to become more complex over time, which makes a hand-coded parser harder and more error-prone to maintain.
 
amit bose
Greenhorn
Posts: 25
My input file is not an XML file; it only has some header info in the form of XML tags.
What I need to extract from this file is
content1, content2, ..., contentN (and)
Data1, Data2, ..., DataN

So should I use regex for this? I am not sure about regex performance, given that my file size would not exceed, say, 100 lines. However, the bulk of the input files would be quite enormous.
 
Wanderer
Posts: 18671
Well, Ulf's advice still seems pretty good. But regexes would work too. I think it's too early to worry about imaginary performance problems here - try it and see. Chances are good that the time it takes to read the file will be greater than the time necessary to parse it.

[amit]: ...given that my file size would not exceed say 100 lines. However, the bulk of the input files would be quite enormous.

That didn't really make sense to me. Are you saying that 100 lines is enormous? Are some of the lines extremely long? Are there many, many files? Or something else?
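
For illustration, here is a minimal sketch of the regex approach, not a definitive parser. It assumes each header has the form <name>content</name> with a matching closing tag, and that each data entry is a complete {key : value} pair whose value contains no braces. The class and method names (TagAndBraceParser, parseTags, parseData) are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagAndBraceParser {
    // <name>content</name>; the back-reference \1 forces the closing tag to match the opening one.
    private static final Pattern TAG =
            Pattern.compile("<(\\w+)>(.*?)</\\1>", Pattern.DOTALL);

    // {key : value}; assumption: values never contain '{' or '}'.
    private static final Pattern ENTRY =
            Pattern.compile("\\{\\s*(\\w+)\\s*:\\s*([^{}]*?)\\s*\\}");

    // Collects group 1 -> group 2 for every match, preserving file order.
    private static Map<String, String> extract(Pattern pattern, String text) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    static Map<String, String> parseTags(String text) {
        return extract(TAG, text);
    }

    static Map<String, String> parseData(String text) {
        return extract(ENTRY, text);
    }

    public static void main(String[] args) {
        String sample = "<tag1>content1</tag1>\n"
                + "<tag2>content2</tag2>\n"
                + "{1 : Data1}{2 : Data2}\n";
        System.out.println(parseTags(sample)); // {tag1=content1, tag2=content2}
        System.out.println(parseData(sample)); // {1=Data1, 2=Data2}
    }
}
```

Both patterns are compiled once and reused, which matters more than anything else when the same expressions run over many small files. If the real format allows nested braces or repeated tag names, this sketch would need to change.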
 
amit bose
Greenhorn
Posts: 25
Hi Jim,

By that line I meant that the number of such files would be large.
About a thousand can safely be assumed for now. The volume is sure to go up in the future.

Cheers,
Amit
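
To follow the "try it and see" suggestion for a large number of small files, a rough timing sketch could look like the one below. It reads every *.txt file in one directory through a plain BufferedReader and counts header tags. It assumes UTF-8 input and that each tag sits on a single line. The names BatchReader, countTags and countTagsInDirectory are invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BatchReader {
    // Assumption: a tag and its content never span more than one line.
    private static final Pattern TAG = Pattern.compile("<(\\w+)>(.*?)</\\1>");

    // Counts tag matches in one file, reading it line by line through a buffer.
    static int countTags(Path file) throws IOException {
        int count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = TAG.matcher(line);
                while (m.find()) {
                    count++;
                }
            }
        }
        return count;
    }

    // Sums the counts over every *.txt file directly inside dir (no recursion).
    static int countTagsInDirectory(Path dir) throws IOException {
        int total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path file : files) {
                total += countTags(file);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long start = System.nanoTime();
        int total = countTagsInDirectory(dir);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(total + " tags in " + elapsedMs + " ms");
    }
}
```

Running this against a directory of real input files gives an actual number to compare against, which is more useful than guessing whether the regex or the file I/O dominates.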
 