
Standard Deviation in MyMathClass

 
Scott Wolf
Greenhorn
Posts: 4
I'm supposed to write a generic class MyMathClass with a type parameter T, where T is a numeric object.

There are no compiler errors or anything. It's just that my standard deviation doesn't match the expected value.

With this program, I get 2.8722813232690143, but it is supposed to be 3.0276503540974917. I'm not sure whether the problem is in my calculation or whether I forgot to add something to the program.



 
Paul Clapham
Sheriff
Posts: 22509
I'm no statistician, but I seem to recall that there are two kinds of standard deviation: one divides by N and the other divides by N-1. The latter produces a larger result, but you've used the former, so perhaps that's the source of your discrepancy.

(Any real statisticians out there, it's okay if you're still laughing at me.)
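
For readers following along, here is a minimal sketch of the two formulas mentioned above. It is not the original poster's code (which is not shown in the thread); the class and method names are made up for illustration, and the data set 0 to 9 is taken from a later reply.

```java
// Hypothetical sketch: population vs. sample standard deviation.
class StandardDeviationSketch {

    /** Sum of squared deviations from the mean. */
    static double sumOfSquaredDeviations(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;

        double squares = 0.0;
        for (double v : values) {
            double d = v - mean;
            squares += d * d;
        }
        return squares;
    }

    /** Divides by N: treats the data as the whole population. */
    static double populationSd(double[] values) {
        return Math.sqrt(sumOfSquaredDeviations(values) / values.length);
    }

    /** Divides by N-1: treats the data as a sample (needs at least 2 values). */
    static double sampleSd(double[] values) {
        return Math.sqrt(sumOfSquaredDeviations(values) / (values.length - 1));
    }

    public static void main(String[] args) {
        double[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        System.out.println(populationSd(data)); // about 2.8723
        System.out.println(sampleSd(data));     // about 3.0277
    }
}
```

For 0 to 9 the sum of squared deviations is 82.5, so the N version gives sqrt(8.25) and the N-1 version gives sqrt(82.5 / 9), which accounts for the two numbers in the question.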
 
Scott Wolf
Greenhorn
Posts: 4
Paul Clapham wrote: I'm no statistician, but I seem to recall that there are two kinds of standard deviation: one divides by N and the other divides by N-1. The latter produces a larger result, but you've used the former, so perhaps that's the source of your discrepancy.

(Any real statisticians out there, it's okay if you're still laughing at me.)


Oh, right. I solved the issue. Thanks for reminding me that there are two kinds of standard deviation. :)
 
Campbell Ritchie
Marshal
Posts: 55772
There must be an elegant way to populate those Lists using a Stream rather than a loop. I have got a solution but I shall let other people produce a better one first.
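
One possible way to do that, sketched under assumptions: the original lists are not shown in the thread, so this guesses at one `List<Integer>` and one `List<Double>` holding 0 to 9. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical sketch: filling lists from a Stream instead of a for loop.
class StreamPopulateSketch {

    /** Returns [0, 1, ..., count-1] as boxed Integers. */
    static List<Integer> integers(int count) {
        return IntStream.range(0, count)
                .boxed()
                .collect(Collectors.toList());
    }

    /** Returns [0.0, 1.0, ..., count-1.0] as boxed Doubles. */
    static List<Double> doubles(int count) {
        return IntStream.range(0, count)
                .asDoubleStream()
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(integers(10));
        System.out.println(doubles(10));
    }
}
```

`IntStream.range` excludes its upper bound, so `range(0, 10)` yields ten values.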
 
Paweł Baczyński
Bartender
Posts: 2054
Also, don't declare your method to take an ArrayList.
What if I had an instance of LinkedList or whatever type Arrays.asList() returns or MyFancyCustomBetterThanAnyOtherList? I wouldn't be able to use your class easily.
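
A rough illustration of that point, assuming a generic class bounded by `Number` as described in the question. The class name and the `average` method are invented for the example, since the original code is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch: the parameter type is the List interface,
// so any List implementation can be passed in.
class ListParameterSketch<T extends Number> {

    double average(List<T> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double sum = 0.0;
        for (T v : values) {
            sum += v.doubleValue();
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        ListParameterSketch<Integer> math = new ListParameterSketch<>();
        List<Integer> fixed = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);

        // All three calls compile because each argument is a List<Integer>.
        System.out.println(math.average(fixed));
        System.out.println(math.average(new ArrayList<>(fixed)));
        System.out.println(math.average(new LinkedList<>(fixed)));
    }
}
```

Had `average` been declared with an `ArrayList<T>` parameter, only the second call would compile.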
 
Winston Gutkowski
Bartender
Posts: 10573
Paul Clapham wrote: I'm no statistician, but I seem to recall that there are two kinds of standard deviation: one divides by N and the other divides by N-1. The latter produces a larger result, but you've used the former, so perhaps that's the source of your discrepancy.

It's been a few geological epochs for me, but I seem to remember you use n-1 for sample variances (and therefore SDs). I'm sure Piet'll enlighten us if he's around.

Winston
 
Campbell Ritchie
Marshal
Posts: 55772
But 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 is not a sample. It is the entire population. So maybe the model answer was calculated with the wrong formula.
 
Tim Moores
Saloon Keeper
Posts: 3893
You might also consider using the BigDecimal class instead of doubles, depending on how much accuracy is desired. It's not going to make much difference for 10 elements, but for much larger datasets, the floating point inaccuracies will eventually add up to noticeable amounts.
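
As a rough idea of what that could look like, here is a sketch of the population standard deviation with `BigDecimal`. The class name and the choice of `MathContext.DECIMAL64` are assumptions for illustration, and `BigDecimal.sqrt(MathContext)` requires Java 9 or later.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: population standard deviation using BigDecimal.
class BigDecimalSdSketch {

    static BigDecimal populationSd(List<BigDecimal> values, MathContext mc) {
        BigDecimal n = BigDecimal.valueOf(values.size());

        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);
        }
        // Division needs a MathContext; otherwise a non-terminating
        // quotient such as 1/3 throws ArithmeticException.
        BigDecimal mean = sum.divide(n, mc);

        BigDecimal squares = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            BigDecimal d = v.subtract(mean);
            squares = squares.add(d.multiply(d));
        }
        return squares.divide(n, mc).sqrt(mc); // sqrt(MathContext): Java 9+
    }

    public static void main(String[] args) {
        List<BigDecimal> data = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            data.add(BigDecimal.valueOf(i));
        }
        System.out.println(populationSd(data, MathContext.DECIMAL64));
    }
}
```

The precision is whatever the `MathContext` says it is, so the result is still rounded; it is just rounded in decimal and to a number of digits you choose.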
 
Winston Gutkowski
Bartender
Posts: 10573
Tim Moores wrote: You might also consider using the BigDecimal class instead of doubles, depending on how much accuracy is desired. It's not going to make much difference for 10 elements, but for much larger datasets, the floating point inaccuracies will eventually add up to noticeable amounts.

Hmmm. Maybe, but since SD is a value in the same unit as the input, and the variance sum is divided by n (or n-1) anyway (and (x - x̄) reduces the likelihood of inaccuracy due to the magnitude of the values), I doubt it'll make too much difference as long as you don't assume that the value is accurate to more than a few decimal places.

Winston
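
The remark about (x - x̄) can be illustrated with a small experiment. This sketch compares the two-pass calculation (subtract the mean first) with the algebraically equivalent shortcut E[x²] - (E[x])², which subtracts two huge, nearly equal numbers. The class name, method names, and the offset of 1e9 are made up for the demonstration.

```java
// Hypothetical sketch: why subtracting the mean first helps with doubles.
class VarianceStabilitySketch {

    /** Two-pass population variance: mean first, then squared deviations. */
    static double twoPassVariance(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        double mean = sum / x.length;

        double squares = 0.0;
        for (double v : x) {
            double d = v - mean; // small number, whatever the magnitude of v
            squares += d * d;
        }
        return squares / x.length;
    }

    /** One-pass shortcut: mean of squares minus square of mean. */
    static double naiveVariance(double[] x) {
        double sum = 0.0;
        double sumOfSquares = 0.0;
        for (double v : x) {
            sum += v;
            sumOfSquares += v * v; // around 1e18 for the data below
        }
        double mean = sum / x.length;
        return sumOfSquares / x.length - mean * mean;
    }

    public static void main(String[] args) {
        // 0..9 shifted by one billion; the true population variance is still 8.25.
        double[] data = new double[10];
        for (int i = 0; i < data.length; i++) {
            data[i] = 1e9 + i;
        }
        System.out.println("two-pass: " + twoPassVariance(data));
        System.out.println("naive:    " + naiveVariance(data));
    }
}
```

With values this large the squares exceed what a double can represent exactly, so the shortcut loses the digits that matter, while the two-pass version only ever squares small deviations.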
 