
Regex help

 
Ranch Hand
Posts: 30
I'm trying to print out the 1st word from the 1st line of a text file, the 2nd word from the 2nd line, and so on. If the nth line has fewer than n words, then print out the last word on the line.

My code so far is this:



To do what I want, I know I should use the split method found in the String class, but I'm unsure how to write the regex for it.
 
author
Posts: 23958

To do what I want, I know I should use the split method found in the String class, but I'm unsure how to write the regex for it.



The regex passed to the split() method simply defines the delimiter that is used to separate the words. For example...

If all the words are separated by one space, then you can use " ".

If all the words are separated by one or more spaces, then you can use " +".

If all the words are separated by any single white space character, then you can use "\\s" (the regex is \s; the backslash has to be doubled inside a Java string literal).

If all the words are separated by one or more of any white space character, then you can use "\\s+".

Etc... well, you get the point.
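To see how these delimiters differ, here is a minimal sketch that splits the same line with each of them. The class name, the helper method, and the sample text are made up for illustration; only String.split and Arrays.toString come from the standard library.

```java
import java.util.Arrays;

class SplitDelimiterDemo {
    // Splits the text with the given regex and renders the pieces,
    // so that any empty strings in the result are visible.
    static String show(String text, String regex) {
        return Arrays.toString(text.split(regex));
    }

    public static void main(String[] args) {
        // Sample line: two spaces, then a tab, then a single space between words.
        String text = "the  quick\tbrown fox";

        System.out.println(show(text, " "));    // exactly one space
        System.out.println(show(text, " +"));   // one or more spaces
        System.out.println(show(text, "\\s"));  // any single white space character
        System.out.println(show(text, "\\s+")); // one or more white space characters
    }
}
```

Note that a line starting with white space yields an empty first element with any of these patterns, so it can help to call trim() on the line before splitting.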

Henry
 
Rancher
Posts: 43081
You may not have to use regexps at all. Depending on how the "words" of a line are separated from each other, a StringTokenizer may do. But then, a regexp for String.split doesn't have to be complicated either - e.g., a regexp of " " will split at each space character.
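A minimal sketch of the StringTokenizer route, assuming the words are separated by ordinary white space (the class and method names are hypothetical). With no delimiter argument, StringTokenizer splits on space, tab, newline, carriage return, and form feed, and it never returns empty tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects the white-space-separated tokens of a line into a list.
    static List<String> words(String line) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("  the  quick\tbrown fox "));
    }
}
```

The Javadoc describes StringTokenizer as a legacy class and recommends String.split or java.util.regex for new code, so this is mainly worth knowing for reading older programs.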
 
Gary Goldsmith
Ranch Hand
Posts: 30
Ok I'm getting somewhere:



This works fine up to a point. When the next line has only, for example, 3 words but num is greater than 3, I get the "File input error" message. How do I print out the last word of that line?
 
Ulf Dittmer
Rancher
Posts: 43081
Well, as it is, the code expects that line [n] has at least [n+1] words. You can get the length of the returned array through "line.split(...).length".
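One way to use that length is to clamp the requested word number to the number of words actually on the line. This is only a sketch under assumptions: the method name nthOrLast is made up, n is taken to be 1-based, and words are assumed to be separated by white space.

```java
class NthWord {
    // Returns the n-th word (counting from 1) of the line,
    // or the last word if the line has fewer than n words.
    // For an empty line this returns an empty string.
    static String nthOrLast(String line, int n) {
        String[] words = line.trim().split("\\s+");
        // Clamp n to the word count, then convert to a 0-based array index.
        int index = Math.min(n, words.length) - 1;
        return words[index];
    }

    public static void main(String[] args) {
        System.out.println(nthOrLast("one two three", 2));
        System.out.println(nthOrLast("one two three", 6));
    }
}
```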
 
Gary Goldsmith
Ranch Hand
Posts: 30
Sorry, but this has me stumped. I've tried the following:


I've tried other if statements but still get "File input error".
 
Ulf Dittmer
Rancher
Posts: 43081
I don't really understand what the code is supposed to do. E.g., "num" seems to be the same as "i" since it is incremented for each line - is that by design?

In general, the code should proceed until there are no more lines in the file, so you might have code like:
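A common shape for such a loop, sketched here under assumptions (the class and method names are hypothetical, and a StringReader stands in for the file so the example runs on its own): BufferedReader.readLine() returns null once the input is exhausted, which makes a natural loop condition.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadAllLines {
    // Reads lines until readLine() returns null, which signals end of input.
    static List<String> readLines(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // With a real file you would wrap a FileReader instead of a StringReader.
        String sample = "first line\nsecond line\nthird";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(readLines(reader));
        }
    }
}
```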

[ October 13, 2007: Message edited by: Ulf Dittmer ]
 
Gary Goldsmith
Ranch Hand
Posts: 30


numOfWords is the number of lines the code should go through, so the first 20 lines only.

"num" is the line number. The idea of the code is to print out the 1st word from the 1st line of a text file, the 2nd word from the 2nd line, and so on. So I increment num on each loop, which gives me the next line, and "num" is then used as the word position counted from the start of the line, i.e. if num is 3 then it's line 3 and word 3 of that line. The problem is that if line 6 has 3 words, then reading the 6th word gives an error, and I'm trying to get it to print the last word of the line instead.
 
Gary Goldsmith
Ranch Hand
Posts: 30
I've worked it out now. The problem was that I was using the number of words on the line as the index of the last word! I forgot that indexing starts at 0, so if there are 5 words on a line, they are actually numbered 0, 1, 2, 3, 4, and I was asking for word number 5, which didn't exist! Thank you for your help.
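The off-by-one can be reproduced in a few lines. This is an illustrative sketch with made-up names, not the code from the thread: the valid indexes of an array run from 0 to length - 1, so the last word sits at length - 1, and asking for index length throws an ArrayIndexOutOfBoundsException.

```java
class LastWordIndex {
    // Returns the last element of a non-empty array of words.
    static String lastWord(String[] words) {
        return words[words.length - 1];
    }

    public static void main(String[] args) {
        String[] words = "one two three four five".split(" ");
        System.out.println(words.length);    // number of words on the line
        System.out.println(lastWord(words)); // element at index length - 1
        try {
            System.out.println(words[words.length]); // one past the end
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("No word at index " + words.length);
        }
    }
}
```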
 
Henry Wong
author
Posts: 23958
Your code is incredibly inefficient. You execute a split(), to get an array result, just so that you can find out how many fields there are. Then you run a split() again, to get another array result, for each index of the array.

Wouldn't it be better to call split() once, store the result array, and then iterate over that array?
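A sketch of that suggestion, with hypothetical names and a single space assumed as the delimiter: the array returned by split() is stored once and then serves both the length check and every element lookup.

```java
class SplitOnce {
    // Numbers the words of a line, e.g. "1:one 2:two", calling split() only once.
    static String numberWords(String line) {
        String[] words = line.split(" "); // the only call to split()
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(i + 1).append(':').append(words[i]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(numberWords("one two three"));
    }
}
```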

Henry
 
Wanderer
Posts: 18671
I think this here is a big problem:

The problem is that "File input error" is not informative about what sort of error has occurred. You've lost important information here: the name of the Exception, the error message, and the stack trace. All three of these can be printed easily with the following:
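A minimal sketch of that idea, assuming the original catch block only printed "File input error". The class, method, and file names are hypothetical; Throwable.printStackTrace() is the standard call that writes the exception class, its message, and the stack trace to System.err.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

class ReportError {
    // Tries to read the first line of a file; returns null if reading fails.
    static String firstLine(String fileName) {
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            return reader.readLine();
        } catch (IOException e) {
            // Prints the exception name, the error message, and the stack trace.
            e.printStackTrace();
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical file name, chosen so that the catch block is exercised.
        System.out.println(firstLine("no-such-file.txt"));
    }
}
```

Catching only IOException here also means that an unrelated failure, such as an ArrayIndexOutOfBoundsException, is not reported as a file problem.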

In the future, debugging will be much easier with this sort of info at hand.
 