# Crazy Fast Square Root

posted 5 years ago

Ok, so I have a need for a super fast square root function - common enough thing. Math.sqrt just is not fast enough - I need every bit of performance while doing some N body simulations. So I found this article which lists different ways along with their performance and accuracy. So I'd like to use #14, which it says is 375% faster than the C Math.sqrt, but doing it is a bit beyond me.

It would make a nice little Java library if someone could make a JNA or JNI for this code! (Uses inline method for Assembly)

It would make a nice little Java library if someone could make a JNA or JNI for this code! (Uses inline method for Assembly)

Tim Moores

Saloon Keeper

Posts: 3263

54

posted 5 years ago

JNI isn't hard to get started with, and there are a lot of easy-to-follow tutorials out there on the web, like this one: http://electrofriends.com/articles/jni/jni-part1-java-native-interface/. Have you tried to do it yourself?

posted 5 years ago

Well, first: it looks like the method might be doing some pointer manipulation, which ain't allowed in Java.

Second: if speed is paramount, why aren't you doing the whole thing in assembler?

Third: 9ns isn't fast enough for you? That's what Math.sqrt() takes on my 4-year old clunker Dell (averaged over 100,000,000 calls).

Fourth: Math.sqrt() is likely to be optimized as and when new algorithms turn up; so its performance is only likely to get better. I compared it with the method from this article, written 2 years ago, which seemed quite neat. The author claims that it's 3 times faster (although not as accurate); I found it to be more than 3 times

Winston

PS: If you're really interested in this stuff, you might want to check out Paul Hsieh's page, which has been around for a while now.

Jeff Gent wrote:Ok, so I have a need for a super fast square root function - common enough thing. Math.sqrt just is not fast enough - I need every bit of performance while doing some N body simulations. So I found this article which lists different ways along with their performance and accuracy. So I'd like to use #14, which it says is 375% faster than the C Math.sqrt, but doing it is a bit beyond me.

Well, first: it looks like the method might be doing some pointer manipulation, which ain't allowed in Java.

Second: if speed is paramount, why aren't you doing the whole thing in assembler?

Third: 9ns isn't fast enough for you? That's what Math.sqrt() takes on my 4-year old clunker Dell (averaged over 100,000,000 calls).

Fourth: Math.sqrt() is likely to be optimized as and when new algorithms turn up; so its performance is only likely to get better. I compared it with the method from this article, written 2 years ago, which seemed quite neat. The author claims that it's 3 times faster (although not as accurate); I found it to be more than 3 times

*slower*with no repetitions, and four times slower with 1.

Winston

PS: If you're really interested in this stuff, you might want to check out Paul Hsieh's page, which has been around for a while now.

"Leadership is nature's way of removing morons from the productive flow" - Dogbert

Articles by Winston can be found here

posted 5 years ago

I tried a little while back, but couldn't get it to work, perhaps because of the assembler components. I've made a very basic JNI once, but this one is more difficult.

Tim Moores wrote:JNI isn't hard to get started with, and there are a lot of easy-to-follow tutorials out there on the web, like this one: http://electrofriends.com/articles/jni/jni-part1-java-native-interface/. Have you tried to do it yourself?

I tried a little while back, but couldn't get it to work, perhaps because of the assembler components. I've made a very basic JNI once, but this one is more difficult.

posted 5 years ago

Even with JNA or JNI?

I don't know assembler. I'm familiar with Java.

It does distance calculations, so take that * itself * 3, and iterate it 100 times. My Java Profiler shows that the vast majority of the time is spent on sqrt.

So because it might be faster in two years, I shouldn't mess it now? How about when it gets faster, I'll switch back.

Winston Gutkowski wrote:Well, first: it looks like the method might be doing some pointer manipulation, which ain't allowed in Java.

Even with JNA or JNI?

Winston Gutkowski wrote:Second: if speed is paramount, why aren't you doing the whole thing in assembler?

I don't know assembler. I'm familiar with Java.

Winston Gutkowski wrote:Third: 9ns isn't fast enough for you? That's what Math.sqrt() takes on my 4-year old clunker Dell (averaged over 100,000,000 calls).

It does distance calculations, so take that * itself * 3, and iterate it 100 times. My Java Profiler shows that the vast majority of the time is spent on sqrt.

Winston Gutkowski wrote:Fourth: Math.sqrt() is likely to be optimized as and when new algorithms turn up; so its performance is only likely to get better. I compared it with the method from this article, written 2 years ago, which seemed quite neat. The author claims that it's 3 times faster (although not as accurate); I found it to be more than 3 timesslowerwith no repetitions, and four times slower with 1.

So because it might be faster in two years, I shouldn't mess it now? How about when it gets faster, I'll switch back.

posted 5 years ago

It'll never happen; believe me, I've been there before. You may just be the exception though.

Anyway, I wish you luck.

Winston

Jeff Gent wrote:How about when it gets faster, I'll switch back...

It'll never happen; believe me, I've been there before. You may just be the exception though.

Anyway, I wish you luck.

Winston

Articles by Winston can be found here

posted 5 years ago

- 2

Just a diversion: are you sure you need to calculate any square roots?

For example, if you wanted to know which of a set of points was closest to a given point, you might at first compute the distance of each of the points to the given point, thus calling the sqrt function once for each of the points. But it's sufficient to work with the squares of the distances; you don't need to call sqrt in this case. Perhaps your algorithm can be similarly adjusted to call sqrt fewer times?

For example, if you wanted to know which of a set of points was closest to a given point, you might at first compute the distance of each of the points to the given point, thus calling the sqrt function once for each of the points. But it's sufficient to work with the squares of the distances; you don't need to call sqrt in this case. Perhaps your algorithm can be similarly adjusted to call sqrt fewer times?

posted 5 years ago

Thanks, that's a very good point. In places where I could eliminate the sqrt, I have. So some of my distance calculations are just sq(x2 - x1) + sq(y2 - y1) + sq(z2 - z1), others are an approximate distance. In fact, if I could find a more efficient sq function, that would be even better.. haha. My sq() is just (a * a).

Paul Clapham wrote:Just a diversion: are you sure you need to calculate any square roots?

For example, if you wanted to know which of a set of points was closest to a given point, you might at first compute the distance of each of the points to the given point, thus calling the sqrt function once for each of the points. But it's sufficient to work with the squares of the distances; you don't need to call sqrt in this case. Perhaps your algorithm can be similarly adjusted to call sqrt fewer times?

Thanks, that's a very good point. In places where I could eliminate the sqrt, I have. So some of my distance calculations are just sq(x2 - x1) + sq(y2 - y1) + sq(z2 - z1), others are an approximate distance. In fact, if I could find a more efficient sq function, that would be even better.. haha. My sq() is just (a * a).