StringTokenizer

 
Ranch Hand
Posts: 89
Oracle Java
Hi,

I am reading lines from a text file and splitting them into words using StringTokenizer. My code is below:

File:


BufferedReader in = new BufferedReader(new FileReader(FILENAME));
String line = in.readLine();
while (line != null) {
    StringTokenizer tk = new StringTokenizer(line);
    String first = tk.nextToken(),
           second = tk.nextToken(),
           third = tk.nextToken(),
           fourth = tk.nextToken();

    ------
    ------
    line = in.readLine();
}

Line 1 has four words (aaa, bbb, ccc, ddd), which get assigned to the variables first, second, third and fourth.
Line 2 has only three words; the word that should be assigned to the variable third is missing.

For Line 2, I want the variable 'third' to be left unassigned. But StringTokenizer treats every space as a delimiter, so the gap left by the missing word is skipped.

How can this be done?

Thanks,
Nidhi

[ EJFH: Added "CODE" tags to preserve formatting in data file. ]
[ August 27, 2007: Message edited by: Ernest Friedman-Hill ]
 
Bartender
Posts: 3323
How do you know that it's the third word that's missing and not the first, second or fourth? From your post, there doesn't appear to be a double space or any other marker to show which word is missing.

If there is a double space, then you can split the line successfully using the String.split(...) method (which is actually the preferred way of splitting strings these days).
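As a rough illustration of the split approach, here is a minimal sketch. It assumes the words are separated by exactly one space, so that a missing word leaves two consecutive spaces in the line. The class name SplitDemo, the method name splitFields and the sample line are made up for this example; they are not from the original post.

```java
import java.util.Arrays;

class SplitDemo {
    // Split on single spaces. The limit of -1 keeps trailing empty fields,
    // so a missing last word is also reported as an empty string.
    static String[] splitFields(String line) {
        return line.split(" ", -1);
    }

    public static void main(String[] args) {
        // Hypothetical data line: the third word is missing, leaving a double space.
        String[] fields = splitFields("aaa bbb  ddd");
        System.out.println(Arrays.toString(fields)); // prints [aaa, bbb, , ddd]
    }
}
```

An empty string in the resulting array marks the position of the missing word, so `fields[2].isEmpty()` can be used to decide not to assign 'third'.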
 
Ranch Hand
Posts: 39
MyEclipse IDE Spring Java
Try something like this:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Arrays;
import java.util.StringTokenizer;

public class Ex1 {
    public static void main(String[] args) throws Exception {
        // Read the whole of file1.txt into one string, line by line.
        StringBuilder contents = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader("file1.txt"))) {
            String s;
            while ((s = br.readLine()) != null) {
                contents.append(s).append('\n');
            }
        }

        // Store the whitespace-separated tokens in the string array ss.
        StringTokenizer st = new StringTokenizer(contents.toString());
        String[] ss = new String[st.countTokens()];
        int i = 0;
        while (st.hasMoreTokens()) {
            ss[i++] = st.nextToken();
        }

        // Sort the tokens into ascending order.
        Arrays.sort(ss);

        // Print one token per line.
        for (String token : ss) {
            System.out.println(token);
        }
    }
}
 
Ranch Hand
Posts: 234
I don't know if it is the best solution, but take a look at StringTokenizer's constructors.
One of them lets you specify whether the delimiters (here, the spaces) should also be returned as tokens. You can then check what it returns: if it returns two consecutive spaces, a word is missing.
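The idea above could be sketched roughly as follows, using the three-argument constructor StringTokenizer(String str, String delim, boolean returnDelims). This assumes a single space between words, so two spaces in a row mean a word is missing. The class name DelimDemo, the method name fields and the sample line are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimDemo {
    // Returns the fields of a line, with "" standing in for each missing word.
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        // returnDelims = true: every space comes back as its own one-character token.
        StringTokenizer tk = new StringTokenizer(line, " ", true);
        // Start as if a delimiter had just been seen, so a leading space
        // counts as a missing first word.
        boolean lastWasDelim = true;
        while (tk.hasMoreTokens()) {
            String token = tk.nextToken();
            if (token.equals(" ")) {
                if (lastWasDelim) {
                    out.add(""); // two delimiters in a row: a word is missing
                }
                lastWasDelim = true;
            } else {
                out.add(token);
                lastWasDelim = false;
            }
        }
        if (lastWasDelim) {
            out.add(""); // line ended on a delimiter: the last word is missing
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical data line with the third word missing.
        System.out.println(fields("aaa bbb  ddd")); // prints [aaa, bbb, , ddd]
    }
}
```

This is more bookkeeping than String.split needs for the same result, but it shows how the returned delimiters reveal the gap.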
 
Ranch Hand
Posts: 103
If your text file is the kind of fixed format that it looks like (meaning that each word position is fixed at a specified index; e.g. firstName starts at position 1 in the line, lastName starts at position 10, middleInitial starts at position 25, etc.), then you might want to look up the substring methods of class String; they let you specify the beginning index and, optionally, the ending index of the positions in which you expect to find words. If you were to split each line into substrings by the appropriate beginning index, you could then look for a word or the absence of a word in each of the resulting substrings.

If your words are all the same length when they exist, then the option mentioned earlier of looking for contiguous spaces would probably be easier and less memory-intensive.
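A minimal sketch of the substring approach follows. The column layout here is purely an assumption for illustration (four fields starting at indexes 0, 4, 8 and 12, each three characters wide); the real positions depend on the actual data file. The names FixedWidthDemo and field are hypothetical.

```java
class FixedWidthDemo {
    // Extracts the text in columns [start, end) of the line, trimmed.
    // Returns "" when the column is blank or the line is too short to reach it.
    static String field(String line, int start, int end) {
        if (start >= line.length()) {
            return "";
        }
        return line.substring(start, Math.min(end, line.length())).trim();
    }

    public static void main(String[] args) {
        // Hypothetical fixed-width line: the third column (indexes 8 to 10) is blank.
        String line = "aaa bbb     ddd";
        String first = field(line, 0, 3);
        String second = field(line, 4, 7);
        String third = field(line, 8, 11);
        String fourth = field(line, 12, 15);
        System.out.println(first + "|" + second + "|" + third + "|" + fourth);
        // prints aaa|bbb||ddd
    }
}
```

Clamping the end index with Math.min avoids a StringIndexOutOfBoundsException on lines that are shorter than the full record width.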
 