Parsing Flat files

 
Srinivasa Raghavan
Ranch Hand
Posts: 1228
Hi friends,
I have 4 or 5 flat files, each with its own format.
Can anyone suggest a design for parsing these flat files?
Or does anyone know of a design pattern for doing this?
Please help me out.
Srini
[ October 02, 2004: Message edited by: srini vasan ]
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Well, in my prior post I demonstrated that formal grammars and parsers go right over my head, but here are a few examples from the XML world. I'm using words that make them apply to non-XML flat files, too ...

DOM parser: Reads a file and produces a tree of generic objects representing elements in the file. Generic elements for your text might be a line and a set of words, maybe with a distinction between numbers and strings among the words. Oh, DOM is "Document Object Model" which refers to the tree of generic elements.

SAX parser: Reads a file and calls callback methods on a handler that you write. The handler might have methods like beginFile(), beginRow(), foundWord(), endRow(), endFile() and so on. Your handler could build application-specific objects instead of generic objects.

JAXB and other Java-XML "binding" tools: At design time takes a description of the file structure and generates Java source code for objects that represent the data plus a custom parser. At run time parses the file and populates the Java classes.

Do any of those spark your interest as a way to parse a flat file into Java objects?
 
Srinivasa Raghavan
Ranch Hand
Posts: 1228
Hi Stan James,
DOM and SAX are only for XML files, right?
But I have a flat file with a format like the one below.

Item Start_position Length
Name 1 30
Accno 31 8
Amount 39 15

etc ...

Can DOM be used for the above format?
PS: Every line in the file follows the same format.

The following URL covers something along these lines:
http://www.awprofessional.com/articles/article.asp?p=31698&seqNum=3
But it would be good if I could follow an existing design pattern or use an API that is already available.
Thanks, Srini
[ October 02, 2004: Message edited by: srini vasan ]
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
 
Arnold Reuser
Ranch Hand
Posts: 196
I can answer this question in many different ways, so could you give some more information?
Are the files Java source files, which can be formatted with Jalopy or parsed with JavaCC, ANTLR, ...? Are they just XML files, which can be bound to Java classes with Castor? Or ...?
How are your file and its format defined?
 
Srinivasa Raghavan
Ranch Hand
Posts: 1228
The file is a normal text file whose format is stored in the database, as I mentioned in my second post.
Each line represents a record, which can be split into components according to the format defined in the DB.
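
A minimal sketch of that idea, assuming the format rows (item, start position, length) have already been loaded from the database into a list. `FieldSpec` and `FixedWidthParser` are hypothetical names, and start positions are assumed to be 1-based, as in the format table earlier in the thread. Adding a new format then means adding rows, not code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical description of one field: name, 1-based start position, length.
class FieldSpec {
    final String name;
    final int start;   // 1-based, as in the format table
    final int length;

    FieldSpec(String name, int start, int length) {
        this.name = name;
        this.start = start;
        this.length = length;
    }
}

// Splits a fixed-width line into named, trimmed values using a list of specs.
class FixedWidthParser {
    private final List<FieldSpec> specs;

    FixedWidthParser(List<FieldSpec> specs) {
        this.specs = new ArrayList<>(specs);
    }

    Map<String, String> parseLine(String line) {
        Map<String, String> record = new LinkedHashMap<>();
        for (FieldSpec spec : specs) {
            // Clamp to the line length so short lines yield empty values
            // instead of throwing StringIndexOutOfBoundsException.
            int begin = Math.min(spec.start - 1, line.length());
            int end = Math.min(begin + spec.length, line.length());
            record.put(spec.name, line.substring(begin, end).trim());
        }
        return record;
    }
}
```

With specs for Name (1, 30), Accno (31, 8) and Amount (39, 15), `parseLine` returns a map keyed by item name for each line read from the file.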
 
Arnold Reuser
Ranch Hand
Posts: 196
The pattern you can use is the DAO pattern. It's a J2EE design pattern, and you can find many references on the net. The idea is to treat each record as a model that is independent of the storage mechanism, which in your case is a flat file.

But what is your purpose? I can only see an RDBMS table definition in your file, nothing more. Is your purpose to generate these metadata files? Is it to generate Java files that can access your RDBMS? Or is it just to scan these files, nothing more and nothing less?

In the last case, you don't need an API. Just read the file with FileReader, wrap it inside a BufferedReader, and tokenize the lines with StringTokenizer.
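
Roughly, that reading loop could look like the sketch below. It assumes whitespace-separated fields; `LineTokenizer` and `readTokens` are made-up names for illustration, and the method takes any `Reader` so it can be fed a `FileReader` or a `StringReader`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper: reads every line and splits it into whitespace-separated tokens.
class LineTokenizer {
    static List<List<String>> readTokens(Reader source) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        // try-with-resources closes the reader even if readLine() fails.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                List<String> tokens = new ArrayList<>();
                // Default delimiters are space, tab, newline, carriage return, form feed.
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    tokens.add(tokenizer.nextToken());
                }
                rows.add(tokens);
            }
        }
        return rows;
    }
}
```

A call might look like `LineTokenizer.readTokens(new FileReader("accounts.txt"))`, where the file name is only a placeholder. Note that StringTokenizer splits on delimiters, so it does not suit fixed-width records whose values may contain spaces; those need position-based splitting instead.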
[ October 03, 2004: Message edited by: Arnold Reuser ]
 
Srinivasa Raghavan
Ranch Hand
Posts: 1228
The purpose of the parser is just to read the format from the DB and parse the text files. java.io.* can do this.
The thing is, the code should be generic enough to read as many formats as possible,
so that whenever a new format is added to the DB I shouldn't have to rework the code.
So I was searching for a framework or a pattern that solves this design problem.
Anyway, thanks for your input.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
The examples I gave are XML only, but you were asking about patterns. The examples represent some design ideas you might choose to emulate.

To emulate DOM you would write a parser that could read any file of a given structure, say comma-separated values (CSV), and produce a generic tree of objects, say line objects and field objects. The parser is 100% reusable but the output is very generic.
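
As a rough illustration of the DOM-style approach, here is a small sketch. The class names (`FileDocument`, `LineNode`, `FieldNode`, `CsvDomParser`) are invented for this example, and it assumes plain commas with no quoting or escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Generic node for one field of a line.
class FieldNode {
    final String text;

    FieldNode(String text) {
        this.text = text;
    }
}

// Generic node for one line; its children are the fields.
class LineNode {
    final List<FieldNode> fields = new ArrayList<>();
}

// Root of the tree: the whole file as a list of lines.
class FileDocument {
    final List<LineNode> lines = new ArrayList<>();
}

class CsvDomParser {
    // Builds the whole tree in memory; knows nothing about what the fields mean.
    static FileDocument parse(String content) {
        FileDocument document = new FileDocument();
        if (content.isEmpty()) {
            return document;
        }
        for (String rawLine : content.split("\\r?\\n")) {
            LineNode line = new LineNode();
            // Limit -1 keeps trailing empty fields such as "a,b,".
            for (String value : rawLine.split(",", -1)) {
                line.fields.add(new FieldNode(value.trim()));
            }
            document.lines.add(line);
        }
        return document;
    }
}
```

The caller then walks `document.lines` and each line's `fields` and decides what the values mean, which is where the "very generic" cost shows up.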

To emulate SAX you would write a parser that could read any file of a given structure, say CSV again, and call a handler with "event" methods for start line, start field, end field, end line, etc. The parser is reusable, the handler and the output are custom to the job at hand.
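
A sketch of the SAX-style split between a reusable parser and a custom handler follows. The handler method names echo the ones suggested earlier in the thread; `FlatFileHandler`, `CsvEventParser` and `NameCollector` are hypothetical names, and the parser again assumes plain commas with no quoting.

```java
import java.util.ArrayList;
import java.util.List;

// Callbacks fired by the parser while it reads the file.
interface FlatFileHandler {
    void beginFile();
    void beginRow(int rowNumber);
    void foundWord(String word);
    void endRow();
    void endFile();
}

// Reusable part: knows the file structure, nothing about the application.
class CsvEventParser {
    static void parse(String content, FlatFileHandler handler) {
        handler.beginFile();
        int rowNumber = 0;
        if (!content.isEmpty()) {
            for (String rawLine : content.split("\\r?\\n")) {
                handler.beginRow(++rowNumber);
                for (String value : rawLine.split(",", -1)) {
                    handler.foundWord(value.trim());
                }
                handler.endRow();
            }
        }
        handler.endFile();
    }
}

// Custom part: collects the first field of every row (for example, a name).
class NameCollector implements FlatFileHandler {
    final List<String> names = new ArrayList<>();
    private int column;

    public void beginFile() { names.clear(); }
    public void beginRow(int rowNumber) { column = 0; }
    public void foundWord(String word) {
        if (column++ == 0) {
            names.add(word);
        }
    }
    public void endRow() { }
    public void endFile() { }
}
```

Unlike the tree approach, nothing is kept in memory unless the handler chooses to keep it, so a different job only needs a different handler.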

Another style that I don't have a name for is to write a custom parser that reads a particular file and creates a set of particular objects, maybe classes, instructors, students. The up side is that you get an output perfectly tuned for what you need to do with it. The down side is there is probably zero reusable code.

I like the SAX-kinda thing. I did an object that walks through JDBC result sets and calls a handler for every row & column. Since then I've seen lots of others, so it seems to be a pretty common idea. I like that the handler and the objects the handler builds have zero knowledge of JDBC.
 
Srinivasa Raghavan
Ranch Hand
Posts: 1228
I'm going to emulate a DOM kind of approach.
Anyway, thanks for your input.
Srini
 