Reading more complicated data from text files, and creating objects after?

 
John Saam
Greenhorn
Posts: 3
Hi guys, I am currently trying to teach myself Java (I'm a physics major, so I hope you don't mind my silly questions).

I've gotten to the point where I can read from simple text files, store them in a scanner, and pass these values through a constructor and then add these objects to array lists.

I have grasped the basic concepts of reading the "saved data" from files, but after trying out a few things, I realised that I have run into some problems regarding the picking up of certain data from text files.

For example, when reading a more complicated text file like this,


I realised that by using Scanner.next, I am unable to capture the data under "Description" as a whole string due to the spaces in between. How should I go about this? Also, if I want to add a new item, how do I write it into the middle part of the text file?
(I've tried looking everywhere for tutorials, including books, but very few of them actually go into detail on how to move around a text file, how to pick up certain patterns etc., and being a beginner, I've no clue what exact keywords I should search for on the net.)

Help would be very much appreciated.
 
Henry Wong
author
Sheriff
Posts: 23295
125
John Saam wrote:
I have grasped the basic concepts of reading the "saved data" from files, but after trying out a few things, I realised that I have run into some problems regarding the picking up of certain data from text files.

For example, when reading a more complicated text file like this,


I realised that by using Scanner.next, I am unable to capture the data under "Description" as a whole string due to the spaces in between. How should I go about this?


One possible option... the Scanner class has a nextLine() method. This method advances the read pointer to the next line and returns everything that was skipped as a single string. So, when you want the description, this will work, as the description runs to the end of the line.

On the other hand, you are correct. There is only so much that the Scanner class can do. At a certain point, if it gets too complicated, you will have to read the file line by line and parse the tokens yourself.
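
To illustrate the nextLine() idea, here is a minimal sketch. It assumes each line holds a one-word name followed by a free-text description; that layout, the class name ScannerLineDemo and the sample data are made up for illustration, since the actual file is not shown in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerLineDemo {
    // Reads "name description..." pairs: next() takes the first token,
    // nextLine() returns whatever is left on the current line.
    static List<String[]> readItems(Scanner in) {
        List<String[]> items = new ArrayList<>();
        while (in.hasNext()) {
            String name = in.next();  // first whitespace-delimited token
            // Rest of the line, spaces included; empty if the input ends right after the name.
            String description = in.hasNextLine() ? in.nextLine().trim() : "";
            items.add(new String[] { name, description });
        }
        return items;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the real file.
        Scanner in = new Scanner("Archer Shoots arrows\nKnight Fights on horseback\n");
        for (String[] item : readItems(in)) {
            System.out.println(item[0] + " -> " + item[1]);
        }
    }
}
```

For a real file, the Scanner could be built with new Scanner(new File(...)) instead of a string.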

Henry
 
Tony Docherty
Bartender
Posts: 3271
82
Welcome to the Ranch.

It sounds like you should be using a database or a structured file format such as XML.
I'm not sure what you mean by "unable to capture the data under "Description" as a whole string". Do you mean you want to read the whole line?
 
Henry Wong
author
Sheriff
Posts: 23295
125
John Saam wrote: Also, if I want to add a new item, how do I write it into the middle part of the text file?
(I've tried looking everywhere for tutorials, including books, but very few of them actually go into detail on how to move around a text file, how to pick up certain patterns etc., and being a beginner, I've no clue what exact keywords I should search for on the net.)


Well, technically, you can use the RandomAccessFile class to open a file, copy the bytes to make an opening to insert, and insert the new data into the file... but it is really annoying to do. Personally, I would recommend just writing all the data out as a brand new file. You can also rename the previous file as backup.
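
A rough sketch of the "write everything out as a new file and keep the old one as a backup" approach, using java.nio.file.Files. The ".tmp" and ".bak" suffixes, the file name items.txt and the sample rows are assumptions for illustration, not something from the original post.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RewriteWithBackup {
    // Inserts newLine at the given index by rewriting the whole file.
    // The previous version is kept next to it with a ".bak" suffix (assumed naming).
    static void insertLine(Path file, int index, String newLine) throws IOException {
        List<String> lines = new ArrayList<>(Files.readAllLines(file));
        lines.add(index, newLine);

        Path temp = file.resolveSibling(file.getFileName() + ".tmp");
        Path backup = file.resolveSibling(file.getFileName() + ".bak");

        Files.write(temp, lines);                                       // write the brand new file
        Files.move(file, backup, StandardCopyOption.REPLACE_EXISTING);  // keep the old one as backup
        Files.move(temp, file);                                         // put the new file in place
    }

    // Self-contained demo in a temporary directory; returns the rewritten contents.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("items-demo");
            Path file = dir.resolve("items.txt");  // hypothetical file name
            Files.write(file, Arrays.asList("Archer\tShoots arrows", "Knight\tFights on horseback"));
            insertLine(file, 1, "Mage\tCasts spells");
            return Files.readAllLines(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Writing to a temporary file first means the original is only replaced once the new contents are completely on disk.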

Henry
 
Winston Gutkowski
Bartender
Posts: 10575
66
John Saam wrote:I realised that by using Scanner.next, I am unable to capture the data under "Description" as a whole string due to the spaces in between. How should I go about this? Also, if I want to add a new item, how do I write it into the middle part of the text file?

Well, it very much depends on what the actual format is; and we can only guess by looking at your example. For example, it may be that "fields" (or "columns") are separated by TABs, not spaces.

Either way, I'd echo what the others have said - read the file in a line at a time.

Also: have a look at String.split(), because it allows you to do some very sophisticated "column splitting".
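
As a sketch of reading a line at a time and splitting each line into columns, assuming two TAB-separated fields per line. The Item class, its field names and the sample data are hypothetical, since the thread does not show the actual class or file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineByLineDemo {
    // Hypothetical value class standing in for whatever the real program uses.
    static class Item {
        final String name;
        final String description;

        Item(String name, String description) {
            this.name = name;
            this.description = description;
        }
    }

    // Reads the input one line at a time and builds one Item per well-formed line.
    static List<Item> parse(BufferedReader reader) {
        List<Item> items = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) continue;   // skip blank lines
                String[] fields = line.split("\t");    // columns are assumed to be TAB-separated
                if (fields.length < 2) continue;       // skip lines that lack a second column
                items.add(new Item(fields[0].trim(), fields[1].trim()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return items;
    }

    public static void main(String[] args) {
        String data = "Archer\tShoots arrows\nKnight\tFights on horseback\n";
        for (Item item : parse(new BufferedReader(new StringReader(data)))) {
            System.out.println(item.name + ": " + item.description);
        }
    }
}
```

For a real file, Files.newBufferedReader(Paths.get(...)) would supply the reader instead of the StringReader used here.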

As to updating your file; see above: you'll have to know exactly what format it expects.

HIH

Winston
 
Piet Souris
Master Rancher
Posts: 2045
75
Another possibility is to open the data file in Excel or the like, where you can easily manipulate the file into some easy-to-handle layout. Save it as csv, txt or tsv, and then use Java to process it further.
Maybe you could even record it as a macro, for later re-use.

Greetings,
Piet
 
John Saam
Greenhorn
Posts: 3
Thanks for all the advice, I really appreciate it.

Would it be possible to elaborate more on the read line-by-line method?

Assuming that the Player and Description are separated by a tab, and not just random spacing, do I go about it by




After this, the String line would contain the string "Archer\tShoots arrows", is that right?

So from here, how do I separate the "Archer" and "Shoots arrows" for the purpose of assigning them to different variables?

*Edit* After thinking about Winston Gutkowski's reply regarding the String.split(), I've come to this conclusion.



Is this the right way to do it?
 
Tony Docherty
Bartender
Posts: 3271
82
Well done.

You may want to use "\s+" for your split parameter as that will split on one or more occurrences of any whitespace character ie space, tab, form feed etc. This means it will still work if you have 1, 2 or more tabs or a load of spaces etc.
Also make sure you check the size of the array before trying to access elements 1 and 2.
 
Stuart A. Burkett
Ranch Hand
Posts: 679
Tony Docherty wrote:You may want to use "\s+" for your split parameter as that will split on one or more occurrences of any whitespace character ie space, tab, form feed etc. This means it will still work if you have 1, 2 or more tabs or a load of spaces etc.

But that will also split on any whitespace in the description, which is what the OP was trying to avoid.
 
Campbell Ritchie
Marshal
Posts: 56600
172
You could try something like \s{2,} which will not split on single whitespace.
It will probably not compile either, because I have probably got the syntax wrong.
 
Tony Docherty
Bartender
Posts: 3271
82
Stuart A. Burkett wrote:But that will also split on any whitespace in the description, which is what the OP was trying to avoid.

Well spotted.

Campbell Ritchie wrote:You could try something like \s{2,} which will not split on single whitespace.
It will probably not compile either, because I have probably got the syntax wrong.

That compiles but you would need "\t|\s{2,}" to break on either a single tab or multiple white spaces.
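
To make the difference between the three patterns concrete, here is a small sketch. Note that inside a Java string literal each backslash has to be doubled. The class name, method names and sample line are made up for illustration.

```java
import java.util.Arrays;

class DelimiterDemo {
    // Splits on any run of whitespace, so it also breaks up the description.
    static String[] splitOnAnyWhitespace(String line) {
        return line.split("\\s+");
    }

    // Splits only on runs of two or more whitespace characters; a single tab is not enough.
    static String[] splitOnWideGaps(String line) {
        return line.split("\\s{2,}");
    }

    // Splits on a single tab, or on a run of two or more whitespace characters.
    static String[] splitOnTabOrWideGaps(String line) {
        return line.split("\\t|\\s{2,}");
    }

    public static void main(String[] args) {
        String line = "Archer\tShoots arrows";  // hypothetical sample line
        System.out.println(Arrays.toString(splitOnAnyWhitespace(line)));   // 3 pieces
        System.out.println(Arrays.toString(splitOnWideGaps(line)));        // 1 piece, no split at all
        System.out.println(Arrays.toString(splitOnTabOrWideGaps(line)));   // 2 pieces
    }
}
```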
 
Winston Gutkowski
Bartender
Posts: 10575
66
John Saam wrote:*Edit* After thinking about Winston Gutkowski's reply regarding the String.split(), I've come to this conclusion.
Is this the right way to do it?

Indeed it is and, as Tony said: well done for working it out. You get a +1 from me.

And if columns are TAB-delimited, then I'd stick with exactly what you wrote - ie, don't get too fancy until you know you need to.

Just remember that if a column doesn't have any associated data, it's possible that you might get two TABs in a row, in which case split() will put an empty String in that column (which is probably what you want). If you start mucking about with variable-length delimiters, it won't.

Also: You can add a bit of verification by using split(String, int).
If, for example, each line is supposed to contain 5 columns - or you're not interested in anything after the 5th column - then:
split("\t", 6);
will create at most 6 fields, with any "leftover junk" put in the 6th. You can then also check the length of the array you get to make sure that it is indeed 5 (or at least 5) if you want.

It'll probably work slightly faster too.
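
A short sketch of the split(String, int) check described above, assuming five expected columns as in the example. The class and method names are hypothetical. One detail worth knowing: with a positive limit, trailing empty columns are kept, whereas plain split("\t") silently drops them.

```java
import java.util.Arrays;

class SplitLimitDemo {
    static final int EXPECTED_COLUMNS = 5;  // assumed for illustration

    // Returns the first five columns, or null if the line has fewer than five.
    static String[] parseRow(String line) {
        // At most 6 fields: anything after the 5th TAB lands unsplit in the 6th.
        String[] fields = line.split("\t", EXPECTED_COLUMNS + 1);
        if (fields.length < EXPECTED_COLUMNS) {
            return null;  // not enough columns on this line
        }
        return Arrays.copyOf(fields, EXPECTED_COLUMNS);  // ignore any leftover 6th field
    }

    public static void main(String[] args) {
        // Two TABs in a row give an empty String in that column.
        System.out.println(Arrays.toString(parseRow("a\tb\t\td\te")));
        // Too few columns: parseRow returns null.
        System.out.println(parseRow("a\tb") == null);
    }
}
```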

HIH

Winston
 
Campbell Ritchie
Marshal
Posts: 56600
172
Tony Docherty wrote: . . . That compiles but you would need "\t|\s{2,}" to break on either a single tab or multiple white spaces.
Thank you. I hadn't realised that tabs were permitted.

Aren't there CSV reading libraries? Can you adapt them to read tab‑separated files?
 
John Saam
Greenhorn
Posts: 3
All right, thanks for all the replies. Marking as solved now.

I never knew the programming community was so newbie-friendly.
 
Winston Gutkowski
Bartender
Posts: 10575
66
John Saam wrote:I never knew the programming community was so newbie-friendly

Welcome to JavaRanch.

Winston
 
Campbell Ritchie
Marshal
Posts: 56600
172
You're welcome. Please show us what worked.
 
Robert D. Smith
Ranch Hand
Posts: 221
5
Campbell Ritchie wrote:
Tony Docherty wrote: . . . That compiles but you would need "\t|\s{2,}" to break on either a single tab or multiple white spaces.
Thank you. I hadn't realised that tabs were permitted.

Aren't there CSV reading libraries? Can you adapt them to read tab‑separated files?
I was surprised reading through the thread that no one had suggested this. The CSV file would do exactly what the OP is trying to do. It removes the need to guess whether it's tabs or spaces, or whether something is part of the item name or part of the description. It also allows for easy expandability, and it's reasonably easy to implement. For what it's worth, anyway, that's how I would have approached the problem.
 
Piet Souris
Master Rancher
Posts: 2045
75
Robert D. Smith wrote:(...) The CSV file would do exactly what the OP is trying to do (...)

What CSV file would that be?
The OP's problem was, in the first place, an irregularly formed text file. The real solution would be for the OP to demand a clear specification of the delivered file, just as Winston mentioned. Only then could one determine whether Java is a suitable tool for doing the job, whatever that may be.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Robert D. Smith wrote: I was surprised reading through the thread that no one had suggested this. The CSV file would do exactly what the OP is trying to do. It removes the need to guess whether it's tabs or spaces, or whether something is part of the item name or part of the description.

For a full industrial app you may well be right, since there are many libraries around; but personally I dislike CSV intensely, since commas can often appear as part of a "value", usually requiring some form of 'quoting' and/or escaping.
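
A tiny sketch of the comma problem, using a made-up row with a quoted description. A naive split(",") cuts the value in two, which is why CSV usually calls for a proper parser that understands quoting; the class and method names here are hypothetical.

```java
class CsvCommaDemo {
    // Naive comma splitting: does not understand quotes.
    static int naiveCommaFieldCount(String line) {
        return line.split(",").length;
    }

    // The same row written with a TAB delimiter needs no quoting.
    static int tabFieldCount(String line) {
        return line.split("\t").length;
    }

    public static void main(String[] args) {
        // Two intended columns, but the description itself contains a comma.
        String csvRow = "Archer,\"Shoots arrows, then retreats\"";
        String tabRow = "Archer\tShoots arrows, then retreats";
        System.out.println(naiveCommaFieldCount(csvRow));  // 3, one more than intended
        System.out.println(tabFieldCount(tabRow));         // 2
    }
}
```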

TAB-SV, on the other hand, is extremely simple and also provides visual clarity with most viewers/terminals as well.

I used to use it for all my script-based stuff.

Winston
 
Tony Docherty
Bartender
Posts: 3271
82
Personally I try to avoid tab separated files.
You can't easily tell if a file contains tabs or a load of spaces, and I've used editors which change tabs to spaces when you save the file, which isn't at all helpful when you've just edited a tab-separated file.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Tony Docherty wrote:and I've used editors which change tabs to spaces when you save the file...

A source code editor possibly; but I don't know of any general purpose editor that does it by default.

@Robert - And that's programming for you: Two pretty experienced blokes with completely different opinions. Now it's up to you to decide what you prefer.

Winston
 
Tony Docherty
Bartender
Posts: 3271
82
Winston Gutkowski wrote:
A source code editor possibly;

Yes it was. And as a result I spent hours manually rebuilding a data file that had loads of fields containing a single whitespace character so there was no way of doing a straight search and replace to restore the tabs.
I suppose the lesson I should have learned was "always do a backup before editing anything" rather than "never use tab-separated files", but I guess it's like when you drink so much beer you end up being really ill: it puts you off the last food you had rather than alcohol.

Winston Gutkowski wrote:
@Robert - And that's programming for you: Two pretty experienced blokes with completely different opinions. Now it's up to you to decide what you prefer.

And that's one of the things I love about programming: there is rarely if ever only one opinion on the correct way of doing something (and more often than not both opinions are perfectly valid).

But of course we both know who's right in this case.

 