
Writing Big Vector to file

 
T sandya
Ranch Hand
Posts: 64
Dear Friends,
I have an input text file of around 800 MB that contains both capital-letter words and small-letter words. I read the file and collect all the capital-letter words into a Vector (obviously this Vector is big; let us assume 700 MB).

Now my problem is that when I remove the duplicates from the Vector and sort it, processing takes a very long time because of the sheer size. If I try with a small file, everything is OK.

Once that succeeds, I need to write this big Vector to a flat file, but that step also takes a long time.
Please suggest the best approach for my scenario.

Thanks in advance.

Sandya.
 
Rob Spoor
Sheriff
Posts: 20893
Do you need the original Vector? If not, use a LinkedHashSet instead. Because it is a Set, it cannot contain duplicate elements. It also retains your original order, and the Hash part should enable fast checks for existence of elements when adding.
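
A minimal sketch of that idea, in case it helps. The class name `DedupKeepOrder` and the method `dedupe` are made up for illustration; only `LinkedHashSet` itself comes from the suggestion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, named only for this example.
public class DedupKeepOrder {

    // Returns the distinct words in the order they were first seen.
    static List<String> dedupe(Iterable<String> words) {
        Set<String> seen = new LinkedHashSet<>();
        for (String w : words) {
            seen.add(w); // add() leaves the set unchanged if w is already present
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("ZETA", "ALPHA", "ZETA", "BETA", "ALPHA");
        System.out.println(dedupe(input)); // prints [ZETA, ALPHA, BETA]
    }
}
```

In the real program you would probably add each word to the set as you read it, rather than filling a Vector first and deduplicating afterwards.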
 
Vilmantas Baranauskas
Ranch Hand
Posts: 89
If you need sorting, use a TreeSet. It gives you a sorted collection without duplicates.

Try:
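
A minimal sketch of the TreeSet approach, under some assumptions that are not from the thread: a "capital word" is taken to be a whitespace-separated token made up only of uppercase letters, the files are UTF-8, and the names `SortedUniqueWords`, `collect`, `write` and `isCapitalWord` are invented for illustration. The file is read line by line and each word goes straight into the set, so no intermediate Vector is needed.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class, named only for this example.
public class SortedUniqueWords {

    // Assumption: a "capital word" consists only of uppercase letters.
    static boolean isCapitalWord(String token) {
        if (token.isEmpty()) {
            return false;
        }
        for (int i = 0; i < token.length(); i++) {
            if (!Character.isUpperCase(token.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Reads the input line by line and collects capital words into a TreeSet,
    // which keeps them sorted and silently drops duplicates.
    static Set<String> collect(Path input) throws IOException {
        Set<String> words = new TreeSet<>();
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                for (String token : line.split("\\s+")) {
                    if (isCapitalWord(token)) {
                        words.add(token);
                    }
                }
            }
        }
        return words;
    }

    // Writes one word per line through a buffered writer.
    static void write(Set<String> words, Path output) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (String w : words) {
                out.write(w);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] = input file, args[1] = output file (both supplied by the caller)
        write(collect(Paths.get(args[0])), Paths.get(args[1]));
    }
}
```

The set only ever holds the distinct words, so if the input has many repeats it may need far less memory than a Vector of every occurrence. If sorted output is all you need, the separate remove-duplicates and sort passes go away entirely.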
 