Writing Big Vector to file.

 
Ranch Hand
Posts: 64
Dear Friends,
I have one input text file of around 800MB, which contains words in capital letters and words in small letters. I am reading the text file and collecting all the capital-letter words into a Vector (obviously this Vector is big; let us assume 700MB).

Now my problem is that when I remove the duplicates from the Vector and then sort it, it is very slow because of the vast size.
It takes a very long time to process. If I try with a small file, everything is OK!

Once that succeeds, I need to write this big Vector to a flat file,
but this step also takes a very long time.
So please suggest the best way to handle my scenario.

Thanks in advance.

Sandya.
 
Sheriff
Posts: 22649
Do you need the original Vector? If not, use a LinkedHashSet instead. Because it is a Set, it cannot contain duplicate elements. It also retains your original order, and the Hash part should enable fast checks for existence of elements when adding.
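
As a minimal sketch of this suggestion: the words go straight into a LinkedHashSet instead of a Vector, so duplicates are dropped as they are added. The class name `DedupKeepOrder`, the method `uniqueInOrder`, and the sample words are made up for illustration and are not from the original post.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class DedupKeepOrder {
    // Collects words into a LinkedHashSet: duplicates are rejected by add(),
    // and iteration follows the order in which words were first added.
    static Set<String> uniqueInOrder(Iterable<String> words) {
        Set<String> unique = new LinkedHashSet<>();
        for (String w : words) {
            unique.add(w); // returns false and changes nothing if w is already present
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("ZEBRA", "APPLE", "ZEBRA", "MANGO", "APPLE");
        System.out.println(uniqueInOrder(sample)); // prints [ZEBRA, APPLE, MANGO]
    }
}
```

If the data already sits in a Vector, `new LinkedHashSet<>(vector)` gives the same result in one call, since Vector is a Collection.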
 
Ranch Hand
Posts: 89
If you need sorting, then use a TreeSet. This will give you a sorted collection without duplicates.

Try:
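
A minimal sketch of the TreeSet approach (not the poster's original snippet): read the input line by line, add each capital-letter word to a TreeSet, then write the set out through a buffered writer. The class `SortedUniqueWords`, its method names, and the rule used in `isCapitalWord` are assumptions for illustration; the thread does not say exactly what counts as a capital-letter word.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class SortedUniqueWords {
    // Assumed rule: a "capital letter word" contains at least one letter
    // and has no lowercase letters. Adjust to the real definition.
    static boolean isCapitalWord(String w) {
        return w.equals(w.toUpperCase(Locale.ROOT)) && !w.equals(w.toLowerCase(Locale.ROOT));
    }

    // Reads line by line so the whole file is never held in memory;
    // the TreeSet keeps only the unique words, in sorted order.
    static Set<String> collect(Reader in) throws IOException {
        Set<String> words = new TreeSet<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            for (String w : line.split("\\s+")) {
                if (isCapitalWord(w)) {
                    words.add(w);
                }
            }
        }
        return words;
    }

    // Writes one word per line through a buffer to avoid many small writes.
    static void write(Set<String> words, Writer out) throws IOException {
        BufferedWriter writer = new BufferedWriter(out);
        for (String w : words) {
            writer.write(w);
            writer.write('\n');
        }
        writer.flush();
    }

    // Usage: java SortedUniqueWords <input file> <output file>
    public static void main(String[] args) throws IOException {
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]));
             Writer out = Files.newBufferedWriter(Paths.get(args[1]))) {
            write(collect(in), out);
        }
    }
}
```

The set still has to fit in memory, but it holds each distinct word only once, which is usually far smaller than a Vector containing every occurrence.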
 