• Post Reply Bookmark Topic Watch Topic
  • New Topic

Database Query versus ArrayList  RSS feed

 
Claude Cundiff
Ranch Hand
Posts: 78
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hello everyone,

I have a database of intraday stocks prices (open, high, low, close, etc.). The database is rather large. There are millions of lines. Here are some questions:
1.) Is it even possible to store ALL of these in an ArrayList AND if so, would that even be a good idea?
2.) Am I correct in saying that querying the database will take more time than retrieving it from an ArrayList?
3.) Given that the ArrayList will not hold all of the data, how could I Optimally alternate between loading an ArrayList and Querying the database?

Thanks!
 
Nitesh Kant
Bartender
Posts: 1638
IntelliJ IDE Java MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Claude Cundiff wrote:
1.) Is it even possible to store ALL of these in an ArrayList AND if so, would that even be a good idea?


It all depends on how much memory does an entry in the arraylist take and how much memory is available in the JVM. So, if your data is only a few bytes and you have a 4GB JVM heap, then you can probably store it. But, it all depends on the size of the data and size of the heap.

Claude Cundiff wrote:
2.) Am I correct in saying that querying the database will take more time than retrieving it from an ArrayList?


Well if with query you mean picking up a particular entry from the arraylist, then it depends on the size of the list. If it is huge then you would have a hard time locating an entry if it is near the end. On the other hand, if you indexed your DB properly then it *may* be faster to get it from DB. Of course, it will depend on the network latency and DB load but it is a possibility.

Claude Cundiff wrote:
3.) Given that the ArrayList will not hold all of the data, how could I Optimally alternate between loading an ArrayList and Querying the database?


It will be better if you can use a hashmap instead or may be use a LinkedHashMap as an LRU cache. (See javadocs for details). If you do not find it in the map, lookup the DB and put the result in the map.
This is a very raw form of a cache that people typically use to quicken the DB interactions. There are professional products available for cache like memcache and ehcache.
 
Tim Holloway
Bartender
Posts: 18531
61
Android Eclipse IDE Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Memory is typically thousands of times faster than disk for information access. So it's hard to beat a memory-only solution.

It's not impossible, however. If your memory-accessing algorithms are poor, and especially if you're using virtual memory and using it badly, you can see memory underperform a DBMS.

DBMS's are designed to give reasonably high performance for general-purpose use. For specific cases, you can almost always create something faster by leaving out the features that the DBMS supplies that you don't need and tweaking the access algorithms. However, the bigger the problem, the more likely that the various optimizations that the DBMS has will begin to narrow or even overcome that gap. Also the bigger the problem, the more raw programming work you'll have to create and maintain in order to out-perform a DBMS. And, since data has a way of changing its shape over time, you'd have to constantly go back and revisit your solution and perhaps make radical changes to it. A DBMS may not be optimal for a specific case, but by virtue of its being able to bear multiple solutions and select the optimal one in its tool kit, it may make sense to sacrifice a little performance in exchange for a LOT of your time and peace of mind.

Ultimately, of course, pretty much all data access is going to be done in RAM - the exception being IBM mainframe disks that have hardware indexing capabilities. So the real question is whether you think your custom usage of RAM will be worth the extra effort.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13078
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
A two stage approach could drastically improve average performance, assuming that most stock prices will be accessed seldom if at all and practically all queries will be for commonly held stocks. I am, of course, talking about caching the most frequently requested quotes in memory. There are plenty of open source Java caching toolkits - Whirlycache for example.

Bill
 
Claude Cundiff
Ranch Hand
Posts: 78
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks everyone! Your responses are greatly appreciated!
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!