Parsing notepad content separated by pipes

 
Edgar Henz
Greenhorn
Posts: 2
Hi there,

I really need your help, guys. I need to read the content of a Notepad file and then parse it. Specifically, it contains a string just like this one:

AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01 | DDD_DDDD.01 | DDD_DDDD.02 | DDD_DDDD.03 | DDD_DDDD.04....and so on,

So far I have only this code:

package tokenizer;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.StringTokenizer;

/**
 * @author Henz
 *
 */
public class ThreadParser {

    /**
     *
     */
    public ThreadParser() {
        // Auto-generated constructor stub
    }

    /*
     * @param args
     */
    public static void main(String[] args) {
        // Auto-generated method stub
        String line;

        try {
            System.out.print("Enter the path: ");
            BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
            String file = stdin.readLine();
            BufferedReader sfile = new BufferedReader(new FileReader(file));

            File myFile = new File(file);
            if (myFile.exists()) {
                System.out.println("File founded, preparing parsing...");
            } else {
                System.out.println("File not founded, verify path...");
            }

            while ((line = sfile.readLine()) != null)
                System.out.println(line);

            StringTokenizer st = new StringTokenizer(line); // Exactly here I need the string so I can parse it but it returns me a NullPointerEx
            while (st.hasMoreTokens())
                System.out.println(st.nextToken());

        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

}


But I get only this output:

Enter the path: C:\Test.txt
File founded, preparing parsing...
Exception in thread "main" java.lang.NullPointerException
at java.util.StringTokenizer.<init>(StringTokenizer.java:182)
at java.util.StringTokenizer.<init>(StringTokenizer.java:219)
at tokenizer.ThreadParser.main(ThreadParser.java:54)

The question is: what else do I need so I can get this output?

AAA_AAAA.01
BBB_BBBB.01 BBB_BBBB.02 BBB_BBBB.03
CCC_CCCC.01
DDD_DDDD.01 DDD_DDDD.02 DDD_DDDD.03 DDD_DDDD.04

The main objective is to parse the string above, eliminating its pipes, so that each output line contains only the words that share the same prefix.

Regards!

[ January 17, 2008: Message edited by: Edgar Henz ]
[ January 17, 2008: Message edited by: Bear Bibeault ]
 
Campbell Ritchie
Marshal
Posts: 52664
Welcome to the Ranch.

Please use "code" tags around quoted code; it makes it easier to read.
Suggest you use a JFileChooser to find the requisite file (if it asks for a Component in any of the methods you can pass "null").
Suggest you use String.split() rather than StringTokenizer; StringTokenizer is legacy code.
Suggest you use java.util.Scanner instead of a BufferedReader for text files; it is much easier.
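
The String.split() suggestion above could look roughly like this. It is only a sketch: the class name SplitOnPipes and the helper splitOnPipes are made up for illustration, and the sample line is a shortened copy of the one in the question. The one detail worth noting is that "|" is a regular-expression operator, so it has to be escaped.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original ThreadParser.
class SplitOnPipes {

    // Splits a line on pipes, dropping the whitespace around each pipe.
    static List<String> splitOnPipes(String line) {
        // "|" means alternation in a regex, so it must be escaped as "\\|".
        return Arrays.asList(line.trim().split("\\s*\\|\\s*"));
    }

    public static void main(String[] args) {
        String line = "AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01";
        for (String token : splitOnPipes(line)) {
            System.out.println(token);
        }
    }
}
```

Tokens that are separated only by a space, such as "BBB_BBBB.01 BBB_BBBB.02", stay together in one piece with this pattern.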

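The real problem, I think, however, is the lack of {} around the body of your "while" loop. Without them, only the println statement belongs to the loop, so by the time the StringTokenizer is created the loop has already finished and "line" is null.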
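
A minimal sketch of the loop with the braces in place, assuming nothing else changes. The class name BracedLoop and the tokenize helper are invented for this example, and a StringReader stands in for the FileReader so the snippet runs without a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class for illustration only.
class BracedLoop {

    // Reads every line and tokenizes it while "line" is still non-null.
    static List<String> tokenize(BufferedReader reader) throws IOException {
        List<String> tokens = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // Everything inside these braces runs once per line read.
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
                tokens.add(st.nextToken());
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // The StringReader replaces the file from the original post.
        BufferedReader reader =
                new BufferedReader(new StringReader("AAA_AAAA.01 | BBB_BBBB.01\n"));
        for (String token : tokenize(reader)) {
            System.out.println(token);
        }
    }
}
```

Because no delimiter is given, the StringTokenizer still splits on whitespace, so the pipes come out as tokens of their own.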

This is what happened with your code unchanged, then with the appropriate {}:
[Campbell@queeg java]$ java tokenizer.ThreadParser
Enter the path: /home/Campbell/java/Henz
File founded, preparing parsing...
AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01 | DDD_DDDD.01 | DDD_DDDD.02 | DDD_DDDD.03 | DDD_DDDD.04....and so on,
Exception in thread "main" java.lang.NullPointerException
at java.util.StringTokenizer.<init>(StringTokenizer.java:182)
at java.util.StringTokenizer.<init>(StringTokenizer.java:219)
at tokenizer.ThreadParser.main(ThreadParser.java:54)
[Campbell@queeg java]$ javac -d . ThreadParser.java
[Campbell@queeg java]$ java tokenizer.ThreadParser
Enter the path: /home/Campbell/java/Henz
File founded, preparing parsing...
AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01 | DDD_DDDD.01 | DDD_DDDD.02 | DDD_DDDD.03 | DDD_DDDD.04....and so on,
AAA_AAAA.01
|
BBB_BBBB.01
BBB_BBBB.02
|
BBB_BBBB.03
|
CCC_CCCC.01
|
DDD_DDDD.01
|
DDD_DDDD.02
|
DDD_DDDD.03
|
DDD_DDDD.04....and
so
on,
[Campbell@queeg java]$
You will still need a regular expression to split the input on the pipes; you have not specified a delimiter, so whitespace is being used as the default.
You can actually set a Scanner to use a regular expression, so it will read the file and split the tokens in one operation!
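
One way the Scanner idea might be sketched, assuming Java 8 or later and assuming the prefix is everything before the first "." in a token. The class name PrefixGrouper and the method groupByPrefix are hypothetical. The delimiter treats any run of pipes and whitespace as a separator, and a LinkedHashMap keeps the prefixes in the order they first appear.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Hypothetical class for illustration only.
class PrefixGrouper {

    // Reads tokens from the Scanner and groups them by the text before the first '.'.
    static Map<String, List<String>> groupByPrefix(Scanner in) {
        // Any run of pipes and/or whitespace separates two tokens.
        in.useDelimiter("[|\\s]+");
        Map<String, List<String>> groups = new LinkedHashMap<>();
        while (in.hasNext()) {
            String token = in.next();
            int dot = token.indexOf('.');
            // Assumption: a token without a '.' is its own prefix.
            String prefix = dot < 0 ? token : token.substring(0, dot);
            groups.computeIfAbsent(prefix, k -> new ArrayList<>()).add(token);
        }
        return groups;
    }

    public static void main(String[] args) {
        String sample = "AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01";
        Map<String, List<String>> groups = groupByPrefix(new Scanner(sample));
        // One output line per prefix, tokens separated by a single space.
        for (List<String> group : groups.values()) {
            System.out.println(String.join(" ", group));
        }
    }
}
```

To read from a file instead of a string, the Scanner could be built with new Scanner(new File(path)), which throws FileNotFoundException if the path is wrong.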

CR
 
Bear Bibeault
Author and ninkuma
Marshal
Posts: 65542
Also, please read this.
 
Edgar Henz
Greenhorn
Posts: 2
Thanks a lot for your advice and your time. This is my first time in a forum, so I will try to improve the way I formulate my questions and posts, and I'll try not to distract or conflict with anyone.

Have a nice day, thanks again
 
Campbell Ritchie
Marshal
Posts: 52664
You're welcome.

Have you got it to work? Please post what changes you made.
 