
Reading a file off the internet and parsing RSS feed

 
Greenhorn
Posts: 9
Hi Guys

I need to do all of this in Java

A quick question: I have the URL of a text file on the web.

Some lines begin with #, and I need to ignore these. Other lines have numbers and some useful information on them, e.g. 1232432 21 #


This is what the file looks like (with several entries):

# this is a sample file only to show you the format
# lines like this need to be ignored
#
12343453533    23   #  1 Jan 2012
12232324223    33    # 2 Jan 2013
12343434344    44   # 7 Jun 2016
#
# and more lines to be ignored

I only need the two numbers, i.e. 12343453533 23, 12232324223 33 and 12343434344 44. How do I a) open the text file whose URL I have and b) extract these numbers, i.e. the first and second columns of the non-# rows?

Million thanks!

Farrukh
 
author & internet detective
Posts: 38909
This tutorial page shows how to read a file off the internet.

For parsing, you can use String#startsWith to skip the comment lines, and split() to separate the individual fields.

Try putting these together and post a reply when you are successful or stuck!
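Putting those two hints together might look something like the sketch below. It is only one way to do it, not necessarily what the tutorial shows: the class name UrlColumns and the helper parseLine are made up for illustration, the URL is taken from the command line, and the file is assumed to be UTF-8 with whitespace-separated columns as in the sample above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical class name; nothing in the thread prescribes it.
class UrlColumns {

    // Returns the first two whitespace-separated fields of a data line,
    // or an empty Optional for blank lines, '#' comment lines and short lines.
    static Optional<String[]> parseLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return Optional.empty();
        }
        String[] fields = trimmed.split("\\s+");
        if (fields.length < 2) {
            return Optional.empty();
        }
        return Optional.of(new String[] { fields[0], fields[1] });
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java UrlColumns <url-of-text-file>");
            return;
        }
        // Assumption: the resource is plain text encoded as UTF-8.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                URI.create(args[0]).toURL().openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                parseLine(line).ifPresent(f -> System.out.println(f[0] + " " + f[1]));
            }
        }
    }
}
```

The fields are kept as Strings here. Note that the first column in the sample is too large for an int, so Long.parseLong would be the safer choice if you need numeric values.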
 
Ranch Hand
Posts: 70
To parse a text file you can use:

 
Vasyl Lyashkevych
Ranch Hand
Posts: 70
Using a file chooser:

 
Marshal
Posts: 61727
Sorry for being late, but what is “so” in line 6? I hope it isn't a String. No, I don't think Strings have add methods; maybe it is a List, which is better.
Don't write an if block all on one line; indent it correctly.
Why are you not using a Scanner for a text file?
Why are you declaring so many exception types?
If you are using a buffered reader, are you familiar with this idiom:-
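One widely used BufferedReader idiom is the assign-and-test loop around readLine(); the sketch below assumes that is the kind of loop meant. The class name ReadLineIdiom and the method dataLines are invented for the example, and the sample text in main stands in for the real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; only BufferedReader#readLine is the real API.
class ReadLineIdiom {

    // Collects every line that is neither blank nor a '#' comment.
    static List<String> dataLines(BufferedReader reader) throws IOException {
        List<String> kept = new ArrayList<>();
        String line;
        // readLine() returns null at end of input, so assign and test in one step.
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                kept.add(trimmed);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        String sample = "# comment\n12343453533    23   #  1 Jan 2012\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            dataLines(reader).forEach(System.out::println);
        }
    }
}
```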
I would suggest that you can use a Stream nowadays.
Please check that I have got the arguments for String#split right; I meant to use a regex comprising an escaped hash sign, and to divide once only, so the right half of the String becomes element 1 of the array. I suspect you will suffer an out-of-bounds exception whichever way you try to split if the String doesn't contain a hash sign.
The BufferedReader#lines() method creates a Stream<String> which handles each String returned from the reader in order. It first maps each String to something else, which happens to be another String, using the well‑known String#split method and keeping the right half of the text (as unnamedArray[1]), using a regex which I have probably written wrongly.
It just happens that in this case we are mapping from a String to another String; in other cases you can map to a different type. So the second line creates a second Stream<String>, which can be collected into a List with the collect() method; there are examples of usage in that link. You will notice that collect() takes a Collector as its argument, but you may have guessed from the examples that you can use the Collectors utility class. Yes, it has a method which returns a Collector creating a List.
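A Stream version along those lines could be sketched as follows. This is an illustration under a few assumptions rather than the exact pipeline described: the names StreamColumns and firstTwoColumns are hypothetical, comment lines are filtered out before splitting, and the left half of the split (element 0) is kept because the original question asks for the two leading numbers. Element 0 always exists, so a line without a hash sign does not cause an out-of-bounds exception.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical names; the real APIs used are BufferedReader#lines,
// String#split(String, int) and Collectors.toList().
class StreamColumns {

    // Returns "first second" for every data line read from the reader.
    static List<String> firstTwoColumns(BufferedReader reader) {
        return reader.lines()
                .map(String::trim)
                // Drop blank lines and whole-line comments.
                .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                // Split once on the hash sign and keep the part to its left.
                .map(line -> line.split("#", 2)[0].trim())
                .map(left -> left.split("\\s+"))
                .filter(fields -> fields.length >= 2)
                .map(fields -> fields[0] + " " + fields[1])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String sample = "# sample\n12343453533    23   #  1 Jan 2012\n#\n";
        BufferedReader reader = new BufferedReader(new StringReader(sample));
        firstTwoColumns(reader).forEach(System.out::println);
    }
}
```

To read from a URL instead of the sample text, wrap the stream from the URL in an InputStreamReader and a BufferedReader and pass that in; the pipeline itself stays the same.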
 