text file extractor

 
Hernan Tavella
Ranch Hand
Posts: 42
Hello, I would like to know if anyone could help me with this. I have a .txt file, and from every line I need to extract the second part, e.g.:

aa01;Some stuff1
aa02;Some stuff2
aa03;Some stuff3

So, from that file I need each line without the aa0x code, or better yet, just the text after the ";" on each line. The result would look like this:
Some stuff1
Some stuff2
Some stuff3

Thank you.
 
Kemal Sokolovic
Bartender
Posts: 825
What have you done so far? Since the forum is NotACodeMill, nobody is going to give you a complete solution. You will have to start working on your problem and during the process we will all be glad to help with any specific problem you might encounter.

So, what exactly is the issue? Are you having a problem reading the file, extracting the part of each line you want, or something else I didn't think of?
 
fred rosenberger
lowercase baba
Bartender
Posts: 12183
There are many ways to approach this, depending on your specific needs. For example:

Will there always be exactly one semi-colon?
Will there always be exactly five characters to skip?
Will every line have those five characters?
What is the exact pattern of the leading characters? Will it always be the literal "aa" followed by two digits, or will those letters change?

And as Kemal said...without knowing exactly where you are stuck, we can't help. We don't know if you need help installing the JDK, compiling a simple "Hello World", opening the file, reading the file, parsing out individual lines, printing out the results...

you need to be SPECIFIC if you want help.
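
To make those questions concrete: if only the text after the first semicolon matters, a minimal sketch could look like the one below. The class and method names are illustrative and not from this thread, and falling back to the whole line when there is no semicolon is an assumption about what the OP would want.

```java
class AfterSemicolon {
    // Returns the text after the first ';', or the whole line if it has no ';'.
    static String afterFirstSemicolon(String line) {
        int idx = line.indexOf(';');
        return idx < 0 ? line : line.substring(idx + 1);
    }

    public static void main(String[] args) {
        System.out.println(afterFirstSemicolon("aa01;Some stuff1")); // prints "Some stuff1"
    }
}
```

If the prefix really is always exactly five characters, `line.substring(5)` would also work, but it breaks as soon as one code is longer or shorter.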

 
Randall Twede
Ranch Hand
Posts: 4467
i am guessing his problem is with getting the second sentence in each line. it has been so long since i have done that i can't remember how. StringTokenizer comes to mind. i am sure there are tutorials about this. try going to google and typing in java parse text file
 
Phil English
Ranch Hand
Posts: 62
Use of StringTokenizer is advised against (although not strictly deprecated) but the Javadoc suggests alternatives: java.util.StringTokenizer
 
Kemal Sokolovic
Bartender
Posts: 825
Randall Twede wrote:i am guessing his problem is with getting the second sentence in each line. it has been so long since i have done that i can't remember how. StringTokenizer comes to mind. i am sure there are tutorials about this. try going to google and typing in java parse text file

String#split(String regex) would do that job. But we still don't know exactly what problem the OP is facing, so we can only guess.
 
Rob Spoor
Sheriff
Posts: 20606
Am I the only one who's thinking about using a proper CSV library? Because that format looks like a CSV file that uses ; as the column separator.
Our AccessingFileFormats FAQ page mentions several libraries you can use. I rather like opencsv.
 
Hernan Tavella
Ranch Hand
Posts: 42
Hi, I'll show you the file and the code.



The file being read contains this:
AAA;BBB;CCC
AAA1;BBB1;CCC1
AAA2;BBB2;CCC2
AAA3;BBB3;CCC3
AAA4;BBB4;CCC4

So I can't get the second column from each line, e.g., I need the following result from the file:
BBB
BBB1
BBB2
BBB3
BBB4

I have tried several ways but I can't get it. Thank you.


 
Kemal Sokolovic
Bartender
Posts: 825
You were already advised against using StringTokenizer, and I agree with the point.

A much easier and more elegant solution can be achieved with the String#split(String) method I referred you to in my previous post. If you have a String like this:

you can get each value with the code like this:

With that code you will have these values:
data[0] = "value1";
data[1] = "value2";
data[2] = "value3";
data[3] = "value4";
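
A minimal sketch of that idea, assuming a semicolon-delimited string like the OP's file. The variable name `input` and the class and method names are illustrative assumptions; only `data` and the four values come from the post above.

```java
import java.util.Arrays;

class SplitExample {
    // Splits a delimited String into its values.
    // The argument to split is a regular expression; ";" has no special meaning in a regex.
    static String[] splitValues(String input) {
        return input.split(";");
    }

    public static void main(String[] args) {
        String input = "value1;value2;value3;value4";
        String[] data = splitValues(input);
        System.out.println(Arrays.toString(data)); // prints "[value1, value2, value3, value4]"
    }
}
```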

As you can see, the logic is pretty straightforward. For more information you can look at the API or the tutorial, but with the given example you should be able to accomplish your task.
 
Phil English
Ranch Hand
Posts: 62
In what way is the code you have supplied not working? When I run it the second iteration of your while loop seems to return the column you want.

I was going to say that string.split would give you the data in a more accessible format but Kemal beat me to the punch.
 
Hernan Tavella
Ranch Hand
Posts: 42
Kemal Sokolovic wrote:You were already advised against using StringTokenizer, and I agree with the point.

A much easier and more elegant solution can be achieved with the String#split(String) method I referred you to in my previous post. If you have a String like this:

you can get each value with the code like this:

With that code you will have these values:
data[0] = "value1";
data[1] = "value2";
data[2] = "value3";
data[3] = "value4";


As you can see, the logic is pretty straightforward. For more information you can look at the API or the tutorial, but with the given example you should be able to accomplish your task.


Your code works fine, but it doesn't give the result that I need. See the next code, in which I implement your code:



and this is the result:
AAA1
BBB1
CCC1

when what I need is the second column from each line, like this:
BBB
BBB1
BBB2
BBB3
BBB4

where the original file.txt contains:
AAA;BBB;CCC
AAA1;BBB1;CCC1
AAA2;BBB2;CCC2
AAA3;BBB3;CCC3
AAA4;BBB4;CCC4
 
Phil English
Ranch Hand
Posts: 62
Your x variable is incrementing with each line, not with each token. If your text file is always structured as you say, then you know that the second element of a will be the string you require for each line of your file.
 
Hernan Tavella
Ranch Hand
Posts: 42
Phil English wrote:Your x variable is incrementing with each line, not with each token. If your text file is always structured as you say, then you know that the second element of a will be the string you require for each line of your file.


Sorry, I don't understand what you mean. Can you give an example?
 
Phil English
Ranch Hand
Posts: 62
What is happening in your latest code is that you are calling x++ every time you read a new line, so x==2 will only be true on your second line. As a result, your code prints all of the String array a, but only on the second line.

Let's say you run your code just for the first two lines of the file.

On the first iteration your String array a will contain ["AAA", "BBB", "CCC"] so a[0] = "AAA", a[1] = "BBB", a[2] = "CCC"
On the second iteration a is overwritten and now contains ["AAA1", "BBB1", "CCC1"] so a[0] = "AAA1", a[1] = "BBB1", a[2] = "CCC1"

As a suggestion, get rid of the if and the for loop and just try printing individual elements of the array, e.g. System.out.println(a[0])
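
A sketch of that suggestion, printing a[1] for every line with no counter at all. It assumes every line has at least two semicolon-separated fields. The class and method names are illustrative, and the sample lines are hard-coded here rather than read from file.txt so the example runs on its own.

```java
class SecondColumn {
    // Returns the second ';'-separated field of a line, i.e. a[1].
    static String of(String line) {
        String[] a = line.split(";");
        return a[1]; // throws ArrayIndexOutOfBoundsException if the line has fewer than two fields
    }

    public static void main(String[] args) {
        String[] lines = {"AAA;BBB;CCC", "AAA1;BBB1;CCC1", "AAA2;BBB2;CCC2"};
        for (String line : lines) {
            System.out.println(of(line)); // prints BBB, BBB1, BBB2 on separate lines
        }
    }
}
```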
 
Hernan Tavella
Ranch Hand
Posts: 42
Phil English wrote:What is happening in your latest code is that you are calling x++ every time you read a new line, so x==2 will only be true on your second line. As a result, your code prints all of the String array a, but only on the second line.

Let's say you run your code just for the first two lines of the file.

On the first iteration your String array a will contain ["AAA", "BBB", "CCC"] so a[0] = "AAA", a[1] = "BBB", a[2] = "CCC"
On the second iteration a is overwritten and now contains ["AAA1", "BBB1", "CCC1"] so a[0] = "AAA1", a[1] = "BBB1", a[2] = "CCC1"

As a suggestion, get rid of the if and the for loop and just try printing individual elements of the array, e.g. System.out.println(a[0])


OK, I understand now, but when I have a file.txt with more than 100 lines it is very difficult to print each line manually. That's why I need a better method to do it.
 
Phil English
Ranch Hand
Posts: 62
You know that every time you loop (once per line in your file), a[1] contains the string you want. Why not extract that on every iteration into a new variable? That new variable will grow by one element each time you iterate, and when you are finished it will contain all the strings you want.
 
Hernan Tavella
Ranch Hand
Posts: 42
Phil English wrote:You know that every time you loop (once per line in your file), a[1] contains the string you want. Why not extract that on every iteration into a new variable? That new variable will grow by one element each time you iterate, and when you are finished it will contain all the strings you want.


OK, I never thought of that. Thank you, I'm going to test it.
 
Phil English
Ranch Hand
Posts: 62
No problem. Have a look at java.util.ArrayList; it is probably as good a place as any to store these strings, especially if your input file could change in length.
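
A rough sketch of how that could look, assuming the lines have already been read into a List<String> (for example with Files.readAllLines). The class and method names are illustrative, and skipping lines that have no second field is an assumption rather than something the thread specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SecondColumnCollector {
    // Collects a[1] from every line into a list that grows by one element per line.
    static List<String> collect(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String[] a = line.split(";");
            if (a.length > 1) { // skip lines without a second field
                result.add(a[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data in place of the real file contents.
        List<String> lines = Arrays.asList("AAA;BBB;CCC", "AAA1;BBB1;CCC1", "AAA2;BBB2;CCC2");
        System.out.println(collect(lines)); // prints "[BBB, BBB1, BBB2]"
    }
}
```

Because an ArrayList resizes itself, the same code works whether the file has five lines or five hundred.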
 
Hernan Tavella
Ranch Hand
Posts: 42
Thank you all, it works great.
 