• Post Reply Bookmark Topic Watch Topic
  • New Topic

How to Access an Array in Multiple Ways?  RSS feed

 
Marshall Veach
Greenhorn
Posts: 2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
So, this is a question about best practices.

Basically, I have a medium-sized ArrayList of custom objects (about 300,000 of them). The objects
each have a time stamp (a LocalDateTime). I pull them in form a file and when I do they're in
chronological order.

Most of the work I am doing on this data involves looking at subsets based on time. So, for example,
i often need to display an hour's worth of the data. Or sometimes I need to count how many objects
have a time stamp from a particular date.

Now, I could search the array every time I need to extract a subset but it felt wasteful to me; I figured
I should be able to search it once and build a few indices that would let me access it in these ways
more efficiently.

And this where i really started winging it and kept thinking that there must be e cleaner way to do this...

But what I ended up doing is building a set of maps that relate granularities of of time  to indices of the array.
So if I want to quickly access one day's worth of objects then I use my "ByDay" map where the key is a
string that represents a day and the value is the index where that day starts in my array.

And I have a few of these maps. One for every major division of time: BYMonth, ByWeek, ByDay and ByHour.

My question is; is there some data structure that maybe I'm not aware of that solves this kind of
problem more elegantly? Or more efficiently? Have I stumbled squarely into the purview of databases?
Should I be using a lightweight database to do this for me?

Anyway, I hope that's clear. I feel like I'm missing some vocabulary that would help me talk about
this more clearly.
 
Junilu Lacar
Sheriff
Posts: 11494
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
1. Developers are notoriously bad at judging performance issues by gut feelings alone. Use a profiler to see if you really have a problem that needs to be dealt with.

2. Yes, you have stumbled into the purview of some kind of database. You could also use streams. mongoDb is fast, in-memory, and works with streams. There are other DB options, too.
 
Marshall Veach
Greenhorn
Posts: 2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks for the reply. And yes, I probably should have made it clear that I'm not asking out of a sense that I have a practical need to optimize; I'm just curious/interested in the different ways one might approach this kind of thing. I've not been exposed to streams yet but that's exactly the kind of keyword that I was hoping to get so I can go learn about it. Thanks!
 
Campbell Ritchie
Marshal
Posts: 56584
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Welcome to the Ranch

You say the objects are in chronological order in the List, in which case you have already sorted the List, in which case you can do a binary search with a Comparator which looks at the timestamp. If the timestamp isn't found, negating the result of binary search will lead you to a division point where everything to the left is earlier and everything to the right is later than your timestamp.
 
Campbell Ritchie
Marshal
Posts: 56584
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Marshall Veach wrote:. . . I could search the array every time I need to extract a subset but it felt wasteful to me . . . .
Don't call it an array; it is a List. Since binary search runs in logarithmic time, I would have thought its performance would be very fast.

Junliu is right that you can put everything into a database and SELECT xyz FROM tableABC WHERE date > '2016‑07‑29' AND date < '2016‑08‑19';
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Marshall Veach wrote:But what I ended up doing is building a set of maps that relate granularities of of time  to indices of the array.

Sounds like overkill to me.

There's also a basic problem with Maps: Each key can only have one "value" - ie, each 'timestamp' has to be unique.

If this is actually the case, then a single TreeMap<Date, Integer> might work (assuming your timestamp is a Date), and would allow you to retrieve any granularity of 'grouping' you like (have a look at its subMap() methods).

Alternatively, and if you don't need to update your List while you're "trawling" it, my suggestion would be to simply sort it in timestamp order, and then use a binary chop to get the first index of any "group" you want. After that, it's simply a case of reading forward until you reach a timestamp that's outside the range. In fact, you could probably create a specialized Iterator to return all the elements of a "group".

And just in case you don't know, Collections.binarySearch() has an extremely useful feature (although Stephan doesn't agree with me on this ) - if it doesn't find the requested key, it returns '−(insertion point) - 1', where 'insertion point' is the index where the key would be inserted.

HIH

Winston
 
Knute Snortum
Sheriff
Posts: 4287
127
Chrome Eclipse IDE Java Postgres Database VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I've not been exposed to streams yet but that's exactly the kind of keyword that I was hoping to get so I can go learn about it. 

A good book on this subject is Manning's Java 8 In Action. If you Google java 8 stream tutorial you'll get a bunch of tutorials.
 
Knute Snortum
Sheriff
Posts: 4287
127
Chrome Eclipse IDE Java Postgres Database VI Editor
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I'm not a streams expert, but I think your problem could be solved as easily as:
 
Campbell Ritchie
Marshal
Posts: 56584
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Knute Snortum wrote:. . . I think your problem could be solved as easily as: . . .
Not tried it, but I think you are correct  .
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Knute Snortum wrote:I'm not a streams expert, but I think your problem could be solved as easily as:

Actually, assuming Marshall's ArrayList is sorted, I think the problem could be solved with:
  list.subList(firstIndexOf(starttime), firstIndexAfter(endTime));
and both methods could be built around Collections.binarysearch().

Winston
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!