• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Tim Cooke
  • paul wheaton
  • Paul Clapham
  • Ron McLeod
Sheriffs:
  • Jeanne Boyarsky
  • Liutauras Vilda
Saloon Keepers:
  • Tim Holloway
  • Carey Brown
  • Roland Mueller
  • Piet Souris
Bartenders:

Selective element update for large sized list

 
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hi All,

I am solving a puzzle, its description is as given below:

we are given a list of size N on which we have to perform M operations and return max value available in the list.

For each M operation, we are given 3 values a, b and k. a is the start index, b is the end index and k is the value that needs to be added to all elements in the range {a,b}.

I have used array as the data structure for simplicity and ease of use. The solution fails on performance aspect as it takes very long time to solve problem for large input set.

Here is my solution:


The input dataset is huge, not sure if it will come in this space. There are 100000 entries in the set.

Here is the small snippet of the dataset:
My question is how to reduce the time lag in this solution. I do not know how to speed up a for loop.

I was thinking of using some functional solution like :  But I have not worked it out completely. It might be possible that I need to use java 8 methods with some parallel operation to speed this up.

how should I resolve this?
 
Sheriff
Posts: 17734
302
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I see nothing in that code you posted that would read anything from this so-called HUGE dataset of yours. You only have a Scanner that reads from the console and you have no prompts. What you perceive as a for loop taking so long might just be the program silently and patiently waiting for you to enter something at the keyboard.

By the way, do not close a Scanner when it is taking input from System.in. Disregard any warnings your IDE may have about not closing the resource; System.in should not be closed.
 
Junilu Lacar
Sheriff
Posts: 17734
302
Mac Android IntelliJ IDE Eclipse IDE Spring Debian Java Ubuntu Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Also, you don't need Java 8 and streams to make your program faster. Pre-Java 8 style code can handle this task fine and if written properly, the program you need can probably process that much data in just a few seconds, if it even takes more than a couple.
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks for the reply. The input parameter I posted is the one I give as input to scanner.

What I was doing was putting all the input values in console together in place of passing 10000 entries one at a time. Logic is that scanner will reach 3 values at a time and for each iteration while loop will take care of processing it.

My solution passed basic tests but failed in performance testing. Hence I thought that I might need to have faster processing.

I have closed the scanner after while loop. I do not intend to input any other value after that. Is there any other reason for not closing scanner?
 
Saloon Keeper
Posts: 5654
214
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
If the average distance between end index and start index = 2.000.000, and there are 100.000 lines, then in total you are doing 2 * 10^11 additions. If java does 10^7 additions per second, then the whole operation will take 20.000 seconds...

Suppose I have the index-segments [a. b] and [c, d], and I know that for index i array[i] is maximal for all indices in [a, b], and likewise j is the index in [c, d] where array[j] is maximal, then what can I say about the maximum when I add X to all array elements with indices in [a, b] and Y to all elements with index in [c, d]? Beware that [a, b] and[c, d] may overlap.
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Piet Souris wrote:If the average distance between end index and start index = 2.000.000, and there are 100.000 lines, then in total you are doing 2 * 10^11 additions. If java does 10^7 additions per second, then the whole operation will take 20.000 seconds...

Suppose I have the index-segments [a. b] and [c, d], and I know that for index i array[i] is maximal for all indices in [a, b], and likewise j is the index in [c, d] where array[j] is maximal, then what can I say about the maximum when I add X to all array elements with indices in [a, b] and Y to all elements with index in [c, d]? Beware that [a, b] and[c, d] may overlap.


Sorry for the delayed response. Was on vacation.

The issue I found was that the index range is quite big for the operation to perform. From what you suggest, I get that whatever index is the previous maximum, it would stay as maximum if the fixed weight is added to that particular index, else I can do the usual iterate and find maximal operation. For this I would need to preserve the previous start and end index.

Does this seem more optimal?
 
Piet Souris
Saloon Keeper
Posts: 5654
214
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Since my reply I had some time to think a bit deeper about this problem. First of all, I think this problem is slightly easier than I thought. I assumed that the 10.000.000 sized array would be input as well, so that we had to find the maximum to start with But now I assume that the array contains only zeroes at the start. That makes it, as said, slightly easier.

The idea is that we have a range given, the size of the array, and that we have a series of non overlapping segments (i.e. closed intervals) such that they exactly cover the given range. Each segment has a value. Now, suppose that each segment belongs to a class that implements Comparable<Segment>, where the comparison is simply between the starting indices of the two comparants. If we have these segments stored in a TreeSet, then it is  simple matter to find each wanted segment.

For instance, say we have this class:


(Note: I have not testd this code, so it may contain one to many errors)

In the main class, set up the TreeSet, consisting of the initial Segment with a max of 0, read the first line, use the subSet method of a TreeSet to find all Segments that are hit, call the 'process' method of each of those Segments, and replace each hit Segment by the returned Segments (if the Optional contains a value, that is).

I have not done this myself, I hope it is fast enough. Let me know if you have problems implementing this, or if you have a better idea.
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks for the explanation.

I have a query regarding the segment implementation. We will create segment for a fixed index range interval or will it have index range from the input?

I got the point of using treeset, if the comparing attribute is max value, then the tree would always have the highest segment at top.

Will evaluate this further.
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I re- read your solution. Understood the logic this time. I was trying a different logic using just the treemap but it doesn't look like working.

Still unclear about process() but I will work on this solution and hopefully I will get more clarity on handing cases with overlapping segment indexes.
 
Piet Souris
Saloon Keeper
Posts: 5654
214
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I have made a simple demo:
demo click here
so that you can have an idea of what I mean, in case you need a little push in the back. It doesn't do the value updating, but that should be the easy part. I'll leave it there untill tomorrow morning (about eight hours from writing this).
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks for the response. Although I did see the link, I have an idea of what I need to do.  I am bit busy with my official work. will try to work on it tomorrow.
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Okay I have started working on this problem.

Here is my understanding of the problem solution using Segment.

We mainly have additional operation in this problem, but to speed up the operation for large data set, we have to breakup the structure.

What I am doing right now is break up the structure to a range of 5 indexes ( initial test value). I will create as many segments as needed to cover a given data set.

if we have a collection of segments :

Considering a scenario where the input range overlaps many segments( most probable case), we have to get subset of segments which come under this start and end index range. For this range, we just have to add data to the max value and return it. Here we are not adding up value to each index, but we designate max value of each segment and add input value to this as maxNum will be the maximum value in the given segment range. So, the add operation is limited to the number of segments applicable to a test scenario.

We have to then iterate the result set and find the max value among them.

Here is my initial code:


Will keep working on adding the remaining logic to this code.
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Updated solution which works for single segment.

I have come to conclusion that just using a max variable, I cannot handle segment max value as within same segment the max will change across indexes.

I am proceeding in sequence as the solution to the problem becomes more clear.

Updated Code:


Three issues remain:
1) what to use as key for the subset so that I get the segments in that range.
2) how to determine index interval to handle index range  between 3 - 10^9.
3) how to update data value within a segment using divide and conquer.
 
Piet Souris
Saloon Keeper
Posts: 5654
214
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
The idea is that we have a series of non overlapping segments, such that the union is exactly our initial range. So, if we have that our array contains 10_000_000 elements, then we start with an initial Segment(0, 10_000_000 - 1, 0);

Now, each line of the input we are given a newStart and a newEnd, and some value.
For all existing Segments in our TreeSet we have to find the Set of all Segments that somehow overlap the inputted newStart and newEnd values. The way I did that im my demo was:

For this to work, you must not issue:

since you need a NavigableSet for the floor-method to work.

The idea is that each overlapped existing Segment is subdivided on the basis of the inputted newStart and newEnd, such that this new Set is again a true partion of the overlapped segment.
To make this work, you need to be a bit careful.

Say, we have a Segment S(10, 17), and we are given as input, 9 as startValue and 16 as endValue. Into what Segments must we divide our given Segment, such that these new Segments do not overlap but their union covers still our original Segment (10, 17)? And what does that mean for the values of our new Segments?

In my demo I deleted all overlapped existing segments, to be replaced by the newly formed Segments.

But be carefull with that deleting: the subSet of overlapped Segments is a view of the given Set 'segments', so this code will get you an error:

Your questions:

1) the 'key' for the Comparable and for the equals is the start of each Segment. That will keep our 'segments' TreeSet sorted based on the startValue.
2) to find all overlapped segments: see above
3) as you see, I do a replace/add sequence, so no divide and conquer. I let the TreeSet do all the hard work.

Given that we need to process 100_000 new Segments, and say that on average 5 segments are hit and 10 newly are formed, we do 500_000 deletes and 1_000_000 additions. That seems favorable to the 2 * 10^11 additions, but I am curious how fast all these deletions and adds take.

Oh yeah: after all processing, don't forget to get the maximum of the values of our Segments, since that was the question.
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks for the  explanation. After reading this few days, I am getting the logic now.

Will implement this and see how it goes.
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hi,

Does this solution logically look like what you intended?

 
Piet Souris
Saloon Keeper
Posts: 5654
214
  • Likes 1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Your implementation is lacking. For instance, the overlap method is incorrect, and as I said before, the subset is a view of the original, and trying to remove an element from such a view gives  an exception.
Have you tested your code?

But it doesn't matter. My TreeSet/Segment implementation turned out to be very, very slow. In your opening post you asked for some java 8 routine to do the job. Well, this turned out to be much much faster:

With 10_000_000 as maximum end of the Segments, and with 100_000 segments, this code runs in about 8 minutes on my humble laptop.

So, time for something completely different.

Say, we are given these Segments (since a Segment is a handy way to describe these 100_000 lines of input, I'll stick to using Segments)
[3, 8]: 5
[5, 7]: 3
[1. 3]: 4
Can you figure out what the max will be?
Now, suppose we transform these three Segments into these data: (I've used a plain old Point for this, good enough)
[3, 5], [8, -5], [5, 3], [7, -3], [1, 4], [3, -4]

Then I made a special Comparator<Point> that sorted the above Points:
[1, 4], [3, 5], [3, -4], [5, 3], [7, -3], [8, -5]

And from this sequence I derived the following two sequences:
4 9 5 8 5 0
4 9 9 9 9 9

Can you see what I'm doig here, and what it has to do with the assignment?

The code I wrote is just a nice elegant Java 8 stream, of just a few lines, and it gives the max of 100_000 overlapping Segments in less that a second.

I did use a helper class 'Summary' in this java 8 stream, and it may be a clue:
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thanks for the explanation. I too initially tried using IntStream but did not reach anywhere( handling max value). The second half of your solution looks complex. I will get back once I understand this part.
 
Marshal
Posts: 80867
505
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Junilu Lacar wrote:. . . you don't need Java 8 and streams . . .

A Stream is intended to operate on the whole of a source (that source might be a List), so a Stream might not work at all well if you are only dealing with a part of the List. Another thing is that the Stream will create a new List which you will later have to incorporate in the old List (or array or whatever), so Junilu is right that a loop is probably easier to use.
For simple additions, I agree with him: you shou‍ld be able to update as many ints as will fit into the average heap space in a second or two.
 
Piet Souris
Saloon Keeper
Posts: 5654
214
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well, let me try to be a bit less cryptic.

Suppose we are given the segments:

[2, 5]: 3
[8, 9]: 12
[1, 4]: 4


Let us consider these as rectangles. For intance, the first one as a rectangle with width 5 - 2 = 3, and height (also) 3. Set these rectangles out on the x-axis, this x-axis starting from 0:

__4_7___3_0____12_0__0
 1 2   4 5    8  9 10

Under the x-axis, we see the segments, above the x-axis we see the accumulated values.
Starting from x = 0, we have the initial value of 0. Next, we encounter our first segment: [1, 4]: 3. So, starting from x = 1 we now have a value of 3.

When x = 2, we encounter our second segment: [2, 5]: 3.
Our current value now increases to 4 + 3 = 7.
Going further to the right, when x = 4, we see that our first segments ends. So, our current value now drops to 7 - 4 = 3. But this does not influence our global maximum of 7, so far!

Going further to the right, we now arrive at x = 5. Our second segment ends here, and our current value drops to 3 - 3 = 0. Again, our global-max-so-far is not hit.

Then we arrive at our third segment, at x = 8. Our current value increases to 0 + 12 = 12. And hey, that is above our maximum-so-far! Therefore, max-so-far becomes 12.

Next we arrive at x = 9, the end of our third (and last) segment. Current value drops to 0, but again, our max-so-far remains what is was.

And finally, we arrive at the end of our journey (in this case: x = 10). We end with a current value of 0, as it should be, and with a max-so-far of 12.

Does it become a little less cryptic what I was describing?

If you generate (or read as input) 100_000 segments, and you follow the procedure I described, then you will find that programming this is surprisingly simple, that the code by using a java 8 stream is very elegant and short, and that it takes, on my humble laptop, around three quarters of a second.

It is one of those problems that, once knowing the solution, make you think "Oh of course, why didn't I think of that before" (the famous "Aha-Erlebnis"), but after witnessing the very dissapointingly slow TreeSet way, it kept me awake for half a night...

I hope it is a little bit more clear, now.

Finally, and in response to Campbell, this is actually the first time that I witnessed that a Java 8 stream was much faster than the equivalent Java 7 for-loop. Thr java 8 version was about four times as fast!

Note that a Segment is a class with int-fields: start, end, value. (start and end inclusive). I tested it with 10_000 inputSegments given, and where each Segment could be a wide as 10_000_000)

java 7:


java 8:
 
Creator of Enthuware JWS+ V6
Posts: 3412
320
Android Eclipse IDE Chrome
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Congratulations s ravi chandran, your question was selected for this month's Journal  

Have a Cow!
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Frits Walraven wrote:Congratulations s ravi chandran, your question was selected for this month's Journal  

Have a Cow!


Thanks Frits!
 
s ravi chandran
Ranch Hand
Posts: 595
6
jQuery Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Piet Souris wrote:Well, let me try to be a bit less cryptic.


Extremely sorry for no response from my end. I am having internet connectivity issue at my new place. Just moved in to a new location.

Will work on this as soon as I get my broadband working.
 
With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.
reply
    Bookmark Topic Watch Topic
  • New Topic