performance while executing loops

 
moor
Greenhorn
Posts: 3
In our application (a web-based application using JSPs, Servlets, and JavaBeans), there are instances where we have to execute a "while" loop for more than 40,000 iterations. This poses a major performance problem. Since our application is a web application, we cannot afford even a 6-second delay. We have even tried using the HotSpot compiler to overcome this problem, but nothing has worked.
Can anybody help us?
 
Jim Yingst
Wanderer
Posts: 18671
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well, what are you doing in the while loop? An empty loop would take almost no time - I just tested and it takes less than 10 milliseconds to increment a counter 40000 times, on my system. Calculating a sine function 40000 times takes 180 msec. So you really need to look at what operations are performed inside the loop - they make all the difference.
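For reference, a minimal timing harness along these lines (a sketch only - the class and variable names are illustrative, and the numbers will vary by JVM and machine):

public class LoopTiming {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        int counter = 0;
        for (int i = 0; i < 40000; i++) {
            counter++;                       // near-empty loop: just an increment
        }
        long mid = System.currentTimeMillis();
        double sum = 0.0;
        for (int i = 0; i < 40000; i++) {
            sum += Math.sin(i);              // 40000 sine calculations
        }
        long end = System.currentTimeMillis();
        // Printing the results keeps the JIT from discarding the loops entirely.
        System.out.println("increment: " + (mid - start) + " ms (counter=" + counter + ")");
        System.out.println("sin:       " + (end - mid) + " ms (sum=" + sum + ")");
    }
}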
 
moor
Greenhorn
Posts: 3

Originally posted by Jim Yingst:
Well, what are you doing in the while loop? An empty loop would take almost no time - I just tested and it takes less than 10 milliseconds to increment a counter 40000 times, on my system. Calculating a sine function 40000 times takes 180 msec. So you really need to look at what operations are performed inside the loop - they make all the difference.


Sorry! I should have explained what we are doing inside the loop. I apologize. Basically, we are retrieving a particular value from the ResultSet object and adding it into a Vector object.
Here is some sample code:
while (rs.next()) {
    y = rs.getInt("x");
    // z is a Vector
    z.addElement(new Integer(y));
}
 
Ilja Preuss
author
Posts: 14112

Originally posted by moor:
while (rs.next()) {
    y = rs.getInt("x");
    // z is a Vector
    z.addElement(new Integer(y));
}


So you are reading 40000 rows from a database??? You don't *really* *need* all of them, do you?
 
moor
Greenhorn
Posts: 3

Originally posted by Ilja Preuss:
So you are reading 40000 rows from a database??? You don't *really* *need* all of them, do you?


Yes! I need to add 40,000 elements into my Vector.
 
Jim Yingst
Wanderer
Posts: 18671
Urg. Well, that's the problem then. It's going to take some time to process 40000 records. There may be ways you can speed this up, but chances are good that your best bet is to find some way to eliminate the need to call 40000 records at once. Perhaps there is a way to do some or most of the required work before the user submits their request, and cache the result until it is needed? It's hard to offer good suggestions for this without knowing more about your application. What do the 40000 integers represent? When or why did their values last change? What will the Vector of 40000 results be used for?
Incidentally you should probably replace the Vector with an ArrayList or LinkedList - I suspect the latter will be fastest. But this is probably insignificant next to the time to access 40000 rows of a ResultSet.
Also, I overlooked your user name when I first responded. Please take a look at our user name policy and re-register with a valid user name (one with a first and last name). Thanks.
 
Peter den Haan
author
Posts: 3252

Originally posted by moor:
Yes!. I need to add 40,000 elements in to my vector.

To be honest, my knee-jerk reaction is "I don't believe you". If you are shifting that kind of data to the client, your users wouldn't be upset by a mere 6-second delay. If you're not shifting that kind of data, needing to create 40,000 objects to generate a single response looks like an architectural problem rather than a mere performance problem.
I may well be mistaken though. It might be helpful if you would tell us a little bit more.
- Peter
 
Peter Haggar
author
Posts: 106
I agree with what's been said so far. Here are some other things to think about. If you are adding 40K things to a Vector or ArrayList, are you presizing it to that size?
By default a Vector and ArrayList are sized to hold 10 elements. If you are adding 40K things, you are doing a whole bunch of new allocations. When you add more things than will fit in a Vector or ArrayList, they create a new underlying array (by default it doubles with a Vector and increases by 50% with an ArrayList), then copy all of the elements from the old array to the new array. Then the old array becomes garbage. For a Vector this will be 12 allocations and 12 copies, for an ArrayList, it will be 21 allocations and 21 copies. Also, each copy copies more data. If you size it at 40K, you don't do any additional allocations or copying.
In addition, you are creating 40K Integer objects. That's a lot of allocations and a lot of objects for the collector to track. Can you take those 40K ints, put them in an int array and add that array to the vector? Then you are only adding 1 object to the vector vs. 40K.
Peter Haggar
------------------
Senior Software Engineer, IBM
author of: Practical Java
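A minimal sketch of both suggestions above - presizing, and accumulating primitives in an array - assuming the same hypothetical ResultSet column "x" as in the earlier code (the expectedRows parameter is illustrative; you'd need to know or estimate the row count up front):

import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Vector;

public class PresizedLoad {
    // Presize the Vector so no incremental re-allocations occur while loading.
    static Vector loadPresized(ResultSet rs, int expectedRows) throws SQLException {
        Vector z = new Vector(expectedRows);  // one allocation up front
        while (rs.next()) {
            z.addElement(new Integer(rs.getInt("x")));
        }
        return z;
    }

    // Alternative: accumulate primitives in an int[], avoiding 40K Integer objects.
    static int[] loadAsIntArray(ResultSet rs, int expectedRows) throws SQLException {
        int[] values = new int[expectedRows];
        int count = 0;
        while (rs.next() && count < values.length) {
            values[count++] = rs.getInt("x");
        }
        return values;  // one object for the collector to track instead of 40K
    }
}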
 
Jim Yingst
Wanderer
Posts: 18671
Some numbers on my machine: to create 40K different Integer objects and put them in a Vector takes 60 ms. For an ArrayList, 61 ms. And for a LinkedList, 80-90 ms. With presizing, Vector and ArrayList go down to 40 ms. (Presizing is not possible for LinkedList.) So, I'm surprised LinkedList didn't do better - I guess it's really only good for doing insertions and deletions internal to the List. If I modify the code to insert the Integer objects at the beginning of the list rather than the end, LinkedList performance remains the same, but Vector and ArrayList skyrocket to over 8000 ms. (Since they must recopy the entire internal array to shift positions by one, on each insert.) Worth remembering if you ever need to insert or delete somewhere other than the end of a list.
Anyway though, assuming that moor is not inserting entries at the beginning of the list, the times here are all pretty negligible compared to 6 seconds - so the problem is probably elsewhere, as expected. Focus on eliminating the need for the huge ResultSet if possible.
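A sketch of the kind of comparison described above (a hypothetical harness, not Jim's original code; absolute numbers will differ per machine):

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListInsertTiming {
    static long time(List list, boolean atBeginning) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < 40000; i++) {
            Integer value = new Integer(i);
            if (atBeginning) {
                list.add(0, value);  // array-backed lists must shift every element
            } else {
                list.add(value);     // appending at the end is cheap for all of them
            }
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        System.out.println("ArrayList append:   " + time(new ArrayList(40000), false) + " ms");
        System.out.println("LinkedList append:  " + time(new LinkedList(), false) + " ms");
        System.out.println("ArrayList prepend:  " + time(new ArrayList(40000), true) + " ms");
        System.out.println("LinkedList prepend: " + time(new LinkedList(), true) + " ms");
    }
}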
 
Mr. C Lamont Gilbert
Ranch Hand
Posts: 1170

Originally posted by Jim Yingst:
Some numbers on my machine: to create 40K different Integer objects and put them in a Vector takes 60 ms. For an ArrayList, 61 ms. And for a LinkedList, 80-90 ms. With presizing, Vector and ArrayList go down to 40 ms. (Presizing is not possible for LinkedList.) So, I'm surprised LinkedList didn't do better - I guess it's really only good for doing insertions and deletions internal to the List. If I modify the code to insert the Integer objects at the beginning of the list rather than the end, LinkedList performance remains the same, but Vector and ArrayList skyrocket to over 8000 ms. (Since they must recopy the entire internal array to shift positions by one, on each insert.) Worth remembering if you ever need to insert or delete somewhere other than the end of a list.
Anyway though, assuming that moor is not inserting entries at the beginning of the list, the times here are all pretty negligible compared to 6 seconds - so the problem is probably elsewhere, as expected. Focus on eliminating the need for the huge ResultSet if possible.



ArrayList is not synchronized. Why does it take longer to perform its operations on your computer than Vector (synchronized) does?
 
Ilja Preuss
author
Posts: 14112

Originally posted by CL Gilbert:

ArrayList is not synchronized. Why does it take longer to perform its operations on your computer than Vector (synchronized) does?


As Peter Haggar wrote:


When you add more things than will fit in a Vector or ArrayList, they create a new underlying array (by default it doubles with a Vector and increases by 50% with an ArrayList), then copy all of the elements from the old array to the new array.


So an ArrayList will have to resize more often.
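A quick way to check this, assuming the default capacities and growth policies Peter described (initial capacity 10, Vector doubling, ArrayList growing by roughly 50%):

public class ResizeCount {
    // Counts the re-allocations each growth policy needs to reach 40000 elements.
    static int resizes(int capacity, double growthFactor, int target) {
        int count = 0;
        while (capacity < target) {
            capacity = (int) (capacity * growthFactor);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Vector    (x2.0): " + resizes(10, 2.0, 40000)); // prints 12
        System.out.println("ArrayList (x1.5): " + resizes(10, 1.5, 40000)); // prints 21
    }
}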
 
Mr. C Lamont Gilbert
Ranch Hand
Posts: 1170

Originally posted by Ilja Preuss:
So, ArrayList will more often have to resize.


True, but the synchronization should be quite a significant hit. The resize only happens every so often, and each one takes longer than the one before. The Vector will have to resize too, though only about half as often (12 resizes versus 21, going by Peter's numbers). He must be at the magic number of additions where the cost of the extra resizes outweighs the delay due to synchronization. It's hard to believe.
 
Ilja Preuss
author
Posts: 14112

Originally posted by CL Gilbert:
[...] It's hard to believe.


That is why it is better to optimize by measuring than to optimize by 'knowing'...
 
moor krish
Greenhorn
Posts: 11
Hi all!
This is moor again. I was actually kicked out of the forum because of the naming convention, so I re-registered.
We have found that the real problem is in accessing the huge ResultSet, which may have 40K-100K rows. Can you give any suggestions on how to reduce the time spent in accessing the ResultSet?
 
Ilja Preuss
author
Posts: 14112

Originally posted by moor krish:
Can you give any suggestions on how to reduce the time spent in accessing the ResultSet?


Optimize your queries so that you don't get result sets of this size.
 
moor krish
Greenhorn
Posts: 11
To be exact, the 6 seconds I am talking about is after we start accessing the result set. The query itself takes less than a second. The problem is really in accessing the huge result set.
 
Bartender
Posts: 783
moor,
A couple of things you can do to help speed up processing the result set.
1. Add a 'WHERE' condition to filter the number of records returned (a sketch of this follows below). Do you really need all that data? As Jim mentioned before, can you cache some of the result for future access and only pay the penalty once? Can you retrieve the result set before the user actually hits the webserver (e.g. on startup)?
2. What type of JDBC driver are you using? What database are you using? If you're using a type I driver, forget it; it's not meant for production use (i.e. it's not fast). The best performance will come from a type IV driver.
-Peter
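A sketch of suggestion 1, pushing the filtering into SQL so far fewer rows ever cross the wire (the table and column names here are hypothetical - substitute your own schema):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class FilteredQuery {
    static ResultSet loadForUser(Connection con, int userId, int maxRows)
            throws SQLException {
        PreparedStatement ps = con.prepareStatement(
            "SELECT x FROM my_table WHERE user_id = ?");  // filter on the server
        ps.setInt(1, userId);
        ps.setMaxRows(maxRows);  // hard cap on rows returned to the client
        return ps.executeQuery();
    }
}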
 
moor krish
Greenhorn
Posts: 11
Hi,
We can pull the records only based on the user id of whoever logs in to the site, hence we cannot pull records at startup. Also, we have used as many "where" conditions in the query as possible. We are using the Oracle thin driver that comes with the Oracle software, which is a type 4 driver.
 
Peter den Haan
author
Posts: 3252
Presumably you are performing some calculation on the data? Can't you let the database do it? Use SQL expressions or, if need be, stored procedures; they are very efficient at data access.
Moving around that kind of data will always be slow; I won't even start on scalability. If you can get the database to digest it into a manageable volume, that would be so much better. If you could change your schema and precalculate as much as possible (possibly using triggers), it would be better still.
- Peter

[This message has been edited by Peter den Haan (edited October 19, 2001).]
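A sketch of what letting the database do the work might look like, replacing the 40,000-row loop with a single aggregate (the table, column, and the choice of AVG are hypothetical placeholders):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class AggregateInDb {
    // Instead of pulling 40000 values of "x" and crunching them in Java,
    // let the database return the one digested number.
    static double averageForUser(Connection con, int userId) throws SQLException {
        PreparedStatement ps = con.prepareStatement(
            "SELECT AVG(x) FROM my_table WHERE user_id = ?");
        ps.setInt(1, userId);
        ResultSet rs = ps.executeQuery();
        rs.next();  // an aggregate query always returns exactly one row
        return rs.getDouble(1);
    }
}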
 
moor krish
Greenhorn
Posts: 11
Peter!
Please note that the time consumption is taking place in accessing the data from the ResultSet. We don't face any bottleneck at the DB level.
 
Jim Yingst
Wanderer
Posts: 18671
Actually, I wouldn't be too sure of that. I think some drivers are able to return ResultSets as soon as some rows are available - you can start reading results well before they've all been retrieved. You can't get ahead of the retrieval process though; you will find you are waiting on the next() method of the ResultSet a lot while the next row is still being retrieved. I may be completely mistaken here - I seem to remember reading this somewhere, but don't have time right now to track it down. One way to tell if this is happening on your machine, though: look at the memory usage (using the freeMemory() and totalMemory() methods of Runtime) immediately before and after retrieving the ResultSet. You should be able to make some sort of estimate of the (minimum) amount of memory 40000 rows would take up - if you don't see that much increase in memory used, that tells you the rows haven't really all been retrieved yet. Good luck...
[This message has been edited by Jim Yingst (edited October 23, 2001).]
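A sketch of the memory check Jim describes, using the Runtime methods he names (gc() is only a hint to the VM, so treat the result as a rough estimate):

public class MemoryEstimate {
    // Rough used-heap snapshot; call before and after materializing the ResultSet.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();  // a hint only, but it makes the estimate less noisy
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedMemory();
        // ... execute the query and read the ResultSet here ...
        long after = usedMemory();
        System.out.println("ResultSet appears to occupy ~" + (after - before)
                + " bytes; compare with a minimum estimate for 40000 rows");
    }
}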
 
Jamie Robertson
Ranch Hand
Posts: 1879
Oracle does come with some Oracle-specific performance extensions:
http://otn.oracle.com/sample_code/tech/java/sqlj_jdbc/files/advanced/advanced.htm
At this link you can find sample code for these performance extension packages, specifically:
1. Row Prefetch Performance Extension
2. Column Type Performance Extension
If you are doing any inserts/updates/deletes:
3. Batch Update Performance Extension
Also, there is a hint to the driver that can improve performance:
1. Statement.setFetchSize(int) - play with this until you find a number that maximizes performance (the default is 25, I think); see the sketch below.
2. I have also found that setting autocommit to false (even for queries) seemed to increase performance nominally.
The only drawback to using the Oracle performance extensions is that they are not portable to a different database, so you'll have to weigh the advantages and disadvantages of losing portability.
Here is an informative link regarding the Oracle performance extensions: http://otn.oracle.com/docs/products/oracle8i/doc_library/817_doc/java.817/a83724/oraperf2.htm
Jamie
[This message has been edited by Jamie Robertson (edited November 08, 2001).]
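A sketch of the standard (non-Oracle-specific) fetch-size hint mentioned in point 1, plus the autocommit tweak from point 2. The value 500 is a placeholder to experiment with, and the query is hypothetical:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeTuning {
    static void readWithFetchHint(Connection con) throws SQLException {
        con.setAutoCommit(false);  // reported above to give a nominal gain, even for queries
        Statement stmt = con.createStatement();
        stmt.setFetchSize(500);    // rows fetched per round trip; tune this number
        ResultSet rs = stmt.executeQuery("SELECT x FROM my_table");
        while (rs.next()) {
            int x = rs.getInt("x");
            // ... process x ...
        }
        rs.close();
        stmt.close();
    }
}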
 
Ranch Hand
Posts: 68
Moor,
please could you say what you are actually trying to do at a higher level? It might be helpful to understand what the data is, what processing you do on it (both in the stored proc and on the created Vector), and what result and output you require, just on the off-chance that someone can see a way of cutting down the number of rows you are iterating through...
cheers, dan.
 
Ranch Hand
Posts: 40
Moor
As far as I am aware, JDBC does not pull all the records back in one go. It basically pulls a number of records based on the fetch size and then pulls another batch when required. You could try changing the fetch size and see if that helps.
I have to say I agree with most people here that you do not want to fetch 40,000 records, as the network traffic is going to be a killer performance-wise.
Phil
 
Andy Brookfield
Greenhorn
Posts: 10

Originally posted by Ilja Preuss:
So you are reading 40000 rows from a database??? You don't *really* *need* all of them, do you?


I once wrote a piece of code to display "heart rate variability" (HRV)... basically I would retrieve a whole day's or week's worth of heart-beat information, then dynamically generate a GIF image of the heartbeats... 100K+ records per individual per day! Was it worth the hit on the database (1+ minute of processing time)? Of course it was, because I could determine bad heart rates MUCH faster than some doctor somewhere physically sitting down and measuring the HRV by hand! This was also made available via the net and was greeted very warmly by the practitioners involved.
 
moor krish
Greenhorn
Posts: 11
Thanks for your suggestions. I will check them out.
 
Ilja Preuss
author
Posts: 14112

Originally posted by Andy Brookfield:
I once wrote a piece of code to display "heart rate variability" (HRV)... basically I would retrieve a whole day's or week's worth of heart-beat information, then dynamically generate a GIF image of the heartbeats... 100K+ records per individual per day! ...


I wonder whether an RDBMS really would be the best storage for this type of data. What if you had used flat files (one per individual per day), for example (assuming a free choice of storage)?
 