• Post Reply Bookmark Topic Watch Topic
  • New Topic

How to count how many duplicates are in an ArrayList?  RSS feed

 
Joshua Harris
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hello,

So, as the title says I'm trying to count the number of elements in an ArrayList which also have duplicates. So for example, in an ArrayList of strings which contains cat, cat, dog, horse, zebra, zebra, the answer should be two.

If an element is found to be a duplicate, that element should then be exempt from the search so if that element is found again it should not increase the duplicate count.

Here is my code:



I know it's wrong because right now it's still increasing the duplicate count for elements that have already been detected as duplicates. How can I make it stop doing this?
 
Junilu Lacar
Sheriff
Posts: 11494
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
How would you do this by hand, without the help of a computer?

Here's another hint: histogram
 
Tapas Chand
Ranch Hand
Posts: 614
9
BSD Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Before going for the solution please check your if condition. There is a semicolon after the if condition, that means it is not doing anything.
The statements after the if condition will always be executed no matter what are the elements of the list.
And there is no braces for the if condition.
you should always have braces for if/else/while/for as a good coding practice.
 
Junilu Lacar
Sheriff
Posts: 11494
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Besides the coding error mentioned by Tapas Chand, check if your algorithm will work with a list like this: {cat, dog, cat, dog, cow, dog, duck, duck, goose} - note that 'dog' appears three times in the list.
 
Kat Rollo
Ranch Hand
Posts: 62
Eclipse IDE Java MySQL Database
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

So for example, in an ArrayList of strings which contains cat, cat, dog, horse, zebra, zebra, the answer should be two.

For example if your list contains:
cat
cat
dog
dog
horse
zebra
zebra

The output is 3 because cat, dog, and zebra are duplicated?
 
Karthick Pattabiraman
Greenhorn
Posts: 18
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Joshua Harris wrote:Hello,

So, as the title says I'm trying to count the number of elements in an ArrayList which also have duplicates. So for example, in an ArrayList of strings which contains cat, cat, dog, horse, zebra, zebra, the answer should be two.


Give this a try

 
Liutauras Vilda
Sheriff
Posts: 4925
334
BSD
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Karthick Pattabiraman,

I'm afraid you didn't help.
 
Kat Rollo
Ranch Hand
Posts: 62
Eclipse IDE Java MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
@Liutauras Vilda:
Why? Was his answer incorrect?
 
Liutauras Vilda
Sheriff
Posts: 4925
334
BSD
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Kat Rollo wrote:@Liutauras Vilda:
Why? Was his answer incorrect?

It is not about that.
Hopefully moderators will comment on this.
 
Kat Rollo
Ranch Hand
Posts: 62
Eclipse IDE Java MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I thought the solution was wrong. We have the same answer.
But I get what you're saying.
 
Campbell Ritchie
Marshal
Posts: 56576
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The reason is that handing out a complete answer prevents OP learning from the exercise. This is what it says on the contents page for this forum:
We're all here to learn, so when responding to others, please focus on helping them discover their own solutions, instead of simply providing answers.
So, don't be annoyed, but I have pulled rank and deleted the post (it remains in memory so I might restore it later).

What happens if an entry is triplicated? What if you had a list like that which Junilu Lacar posted earlier today? Does Junilu's list count as 3 or 4 duplications?
 
Carey Brown
Saloon Keeper
Posts: 3328
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Joshua Harris wrote:So, as the title says I'm trying to count the number of elements in an ArrayList which also have duplicates. So for example, in an ArrayList of strings which contains cat, cat, dog, horse, zebra, zebra, the answer should be two.

If an element is found to be a duplicate, that element should then be exempt from the search so if that element is found again it should not increase the duplicate count.

There is a difference between your list here and those proposed by others: your list is already in sorted order, which should give you some insights on how to create an algorithm for this calculation. Hint: you only need one loop. If the list you are given is not sorted, then you can sort it yourself, or you'll need some sort of collection to keep track of which animals you've already seen.
I hope this helps without giving too much away.
 
Karthick Pattabiraman
Greenhorn
Posts: 18
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Campbell Ritchie wrote:The reason is that handing out a complete answer prevents OP learning from the exercise. This is what it says on the contents page for this forum:
We're all here to learn, so when responding to others, please focus on helping them discover their own solutions, instead of simply providing answers.
So, don't be annoyed, but I have pulled rank and deleted the post (it remains in memory so I might restore it later).


No problem. I am also in the process of learning. So thought of sharing my approach which would attract some comments if there is something good or bad in it.
 
Campbell Ritchie
Marshal
Posts: 56576
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Apologies accepted .
Please consider what to do if an element is triplicated. What should you give as output for Junilu's List which contains "dog" thrice. If the output of that list is supposed to be 4, then there is a very simple way to achieve that with a Set. But that appears not to be what you want.

Another method is to put all the elements into a counting application (if you search for Java® Tutorials Map interface) you will find one there. Iterate the set of “K”s (or Entries) and count how often the “V” is greater than 1.
If you use Java8 there is a straightforward way to do that with a Stream.
 
Campbell Ritchie
Marshal
Posts: 56576
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Your technique would have worked nicely, and would have given the value for triplicates which the OP wanted.
 
Joshua Harris
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Tapas Chand wrote:Before going for the solution please check your if condition. There is a semicolon after the if condition, that means it is not doing anything.
The statements after the if condition will always be executed no matter what are the elements of the list.
And there is no braces for the if condition.
You should always have braces for if/else/while/for as a good coding practice.


Yeah damn that really was some horrible coding for me... thanks for pointing that out.
 
Joshua Harris
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Kat Rollo wrote:

So for example, in an ArrayList of strings which contains cat, cat, dog, horse, zebra, zebra, the answer should be two.

For example if your list contains:
cat
cat
dog
dog
horse
zebra
zebra

The output is 3 because cat, dog, and zebra are duplicated?


Yes. No matter how many duplicates there are of a word, it should only increase the duplicate count by 1. It's counting how many words there are which have duplicates, not how many duplicates of a word there are.
 
Guillermo Ishi
Ranch Hand
Posts: 789
C++ Linux Python
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The j = i+1 is a great way of avoiding comparing twice, which should be in the back of everybody's mind.
 
Joshua Harris
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Junilu Lacar wrote:How would you do this by hand, without the help of a computer?

Here's another hint: histogram


Well by hand I would just choose to simply ignore elements which I've already recognised as a duplicate.

I was thinking of creating a new ArrayList and filling it with elements which are known to be duplicates, then maybe making another if statement which checks if the element which is found to be a duplicate is already within that new ArrayList, and then not bothering to increase the duplicate count if it is (I hope that makes sense, I'm pretty terrible at explaining things). But I just thought there must be a more simple way which I'm just not thinking of.
 
Carey Brown
Saloon Keeper
Posts: 3328
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Guillermo Ishi wrote:The j = i+1 is a great way of avoiding comparing twice, which should be in the back of everybody's mind.

What about: { cat, cat, horse, dog, cat, cat, zebra } ?
 
Joshua Harris
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Junilu Lacar wrote:Besides the coding error mentioned by Tapas Chand, check if your algorithm will work with a list like this: {cat, dog, cat, dog, cow, dog, duck, duck, goose} - note that 'dog' appears three times in the list.


Exactly, it won't. When it checks to see if there are any duplicates of index 1 (dog in your list), it will find index 3 and then increase the duplicate count. When the loop reaches index 3 it will then search for a duplicate again and find index 5, and then increase the duplicate count yet again, even though the string "dog" has already been recognised as a duplicate. I'm sure there must be some simple way to stop it from doing this but for some reason I just can't seem to figure it out -_-
 
Liutauras Vilda
Sheriff
Posts: 4925
334
BSD
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Joshua Harris wrote:I was thinking of creating a new ArrayList and filling it with elements which are known to be duplicates, then maybe making another if statement which checks if the element which is found to be a duplicate is already within that new ArrayList, and then not bothering to increase the duplicate count if it is (I hope that makes sense, I'm pretty terrible at explaining things). But I just thought there must be a more simple way which I'm just not thinking of.

Sounds good. If it matches with your requirements, this way is quite simple, try to implement this, see if it works, post it here.
 
Kat Rollo
Ranch Hand
Posts: 62
Eclipse IDE Java MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Joshua Harris wrote:
Kat Rollo wrote:

So for example, in an ArrayList of strings which contains cat, cat, dog, horse, zebra, zebra, the answer should be two.

For example if your list contains:
cat
cat
dog
dog
horse
zebra
zebra

The output is 3 because cat, dog, and zebra are duplicated?


Yes. No matter how many duplicates there are of a word, it should only increase the duplicate count by 1. It's counting how many words there are which have duplicates, not how many duplicates of a word there are.

Thanks for the clarification.

Karthick Pattabiraman and I have the same solution (though I didn't post mine). Here is how the algorithm went:

1) Get all unique elements from the List.
2) Check the occurrence of each unique element in the List.
3) If an element occurred more than once, increment your counter.

Since you seem to be studying Collections, you may want to read up on the Java Collections Framework and Collections API.

Goodluck!
 
Joshua Harris
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks everyone, with your help I managed to work it out ^_^
 
Campbell Ritchie
Marshal
Posts: 56576
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Well done
Please show us how.
 
Joshua Harris
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Campbell Ritchie wrote:Well done
Please show us how.


Haha well, to be 100% honest, I only managed to get it working enough so that the website grader marked it correct. If there was a list which contained more than one duplicate of a word, so for example something like "cat, cat, horse, dog, cat, cat, zebra", I think my code would count 3 duplicates even though in reality there's only 1. My code works because the lists which the website uses as tests only contain a maximum of 1 duplicate per word.

Here is my new code:


To be honest it barely differs from the code I originally posted, and as I said it isn't a 100% working solution but I feel like I just spent way too long on it to no avail so I just moved on.
 
Campbell Ritchie
Marshal
Posts: 56576
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You're lucky the test site is not pickier. I have restored the solution which I deleted this morning, so you can compare the two.
 
Guillermo Ishi
Ranch Hand
Posts: 789
C++ Linux Python
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Carey Brown wrote:
Guillermo Ishi wrote:The j = i+1 is a great way of avoiding comparing twice, which should be in the back of everybody's mind.

What about: { cat, cat, horse, dog, cat, cat, zebra } ?

What I meant was j = i+1 avoids unnecessary comparisons if you're comparing every element of an array with every other element.

 
Campbell Ritchie
Marshal
Posts: 56576
172
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
This version uses the counting technique I alluded to earlier. It correctly counts GI's cat as one duplicate.There are ways to do that with much less code using a Stream.
 
Joshua Harris
Ranch Hand
Posts: 42
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Campbell Ritchie wrote:You're lucky the test site is not pickier. I have restored the solution which I deleted this morning, so you can compare the two.


Is that the one written by Karthick Pattabiraman? If so, I appreciate his effort but I don't really understand it as I've never seen some of the things he uses such as "HashSet", "frequency" and "Collections." I presume the website I'm using will cover them in the next chapter or something.
 
Carey Brown
Saloon Keeper
Posts: 3328
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
My solution (eluded to earlier)
 
Campbell Ritchie
Marshal
Posts: 56576
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Joshua Harris wrote: . . .
Is that the one written by Karthick Pattabiraman? . . .
Yes.

A HashSet is an implementation of the Set interface; sets do not support duplicates. You put the contents of the List into a Set and the Set contains one instance of each (using the equals method to distinguish differences). So the Set contains one copy of each word in the List, whether it is duplicated or not.
Then you look at every element in the array and go back to the original List. If the frequency method reports more than 1, you count it.
 
Carey Brown
Saloon Keeper
Posts: 3328
46
Eclipse IDE Firefox Browser Java MySQL Database VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Carey Brown wrote:My solution (eluded to earlier)
;

Should have been
 
nick woodward
Ranch Hand
Posts: 382
12
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
here's my (probably buggy) effort, if it helps. if not, i enjoyed that


 
Kat Rollo
Ranch Hand
Posts: 62
Eclipse IDE Java MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Sample Execution:

Here's my solution with comments and sysouts so you see what's happening. It also works for unsorted lists and multiple occurrences of elements (duplicate, triplicate, and so on).

As a bonus (if you're not familiar with it yet), we're using a foreach-loop here. It's an enchanced for-loop that's great for traversing collections forward.
 
Junilu Lacar
Sheriff
Posts: 11494
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Kat Rollo wrote:Here's my solution with comments and sysouts so you see what's happening.

Rather than add comments and System.outs, prefer to refactor for clarity. With a more judicious choice of names, you can make the code more readable and self-explanatory. Extracting implementation details to methods like unique() distinct() and hasMultiple() hides complexity behind intention-revealing names.

EDIT: refactored from "unique" to "distinct"

 
Kat Rollo
Ranch Hand
Posts: 62
Eclipse IDE Java MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The comments and sysouts were just for instructional purposes. They're not really meant to be there. It's also intended to be nearer the OP's original code:

 
Guillermo Ishi
Ranch Hand
Posts: 789
C++ Linux Python
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Joshua Harris wrote:How can I make it stop doing this?

Here is your original code adapted to work. I think it's more readable than some.

 
Campbell Ritchie
Marshal
Posts: 56576
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I hope you never write == false anywhere else.
 
Junilu Lacar
Sheriff
Posts: 11494
180
Android Debian Eclipse IDE IntelliJ IDE Java Linux Mac Spring Ubuntu
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Kat Rollo wrote:It's also intended to be nearer the OP's original code:


Easy enough to do:

As for other versions offered, watch out for the Arrow Anti-pattern
 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!