
MPI Send / Receive Comm Time Discrepancy with IO Streams

 
Michael Byrne
Greenhorn
Posts: 2
I'm currently benchmarking MPI-like software on a cluster, and I ran into a strange discrepancy in communication time.
For simplicity, I'll refer here to simple synchronous send / receive commands, each of which sends or receives 8 bytes of data.

In a two-compartment case, the compartments are connected via a Socket, and the input/output streams are obtained from that socket. In the simplest case, compartment 1 sends a double value to compartment 2, and this repeats 1e6 times. Here the average communication time per send / receive operation is about 1 microsecond.

In the next, more complex case, compartment 1 sends a double value to compartment 2, then compartment 2 sends that double value back to compartment 1.
This is where communication time increases to 10 microseconds (on the same compute node) or up to 100 microseconds (between different compute nodes).

At first I was doing the send and receive on the same socket using its input and output streams. To make sure that wasn't a bottleneck, I split the two-way communication across two different sockets, but the large communication time remained (no significant change).

With one-way communication, even with the blocking send/receive methods, communication is still remarkably fast (1 us). So what I'm trying to figure out is: why does this increase by up to 100 times when a send & receive is followed by the reverse operation?
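
For reference, the two cases described above can be reproduced with a minimal sketch like the following. It is not the original benchmark code: the class and method names (`PingPongSketch`, `averageNanos`) are hypothetical, it runs both compartments as threads over the loopback interface rather than on a cluster, and it assumes plain `DataInputStream` / `DataOutputStream` wrappers around the socket streams.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch: one-way streaming vs. send-then-echo over a loopback socket.
final class PingPongSketch {

    /** Average nanoseconds per iteration; echo=false is one-way, echo=true is a round trip. */
    static double averageNanos(int iterations, boolean echo) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // Compartment 2: reads every double and, in echo mode, sends it straight back.
            Thread c2 = new Thread(() -> {
                try (Socket s = server.accept();
                     DataInputStream in = new DataInputStream(s.getInputStream());
                     DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                    for (int i = 0; i < iterations; i++) {
                        double value = in.readDouble();
                        if (echo) {
                            out.writeDouble(value);
                            out.flush();
                        }
                    }
                    out.writeByte(1); // final acknowledgement once everything has arrived
                    out.flush();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            c2.start();

            // Compartment 1: sends the doubles and, in echo mode, waits for each reply.
            try (Socket s = new Socket(loopback, server.getLocalPort());
                 DataOutputStream out = new DataOutputStream(s.getOutputStream());
                 DataInputStream in = new DataInputStream(s.getInputStream())) {
                long start = System.nanoTime();
                for (int i = 0; i < iterations; i++) {
                    out.writeDouble(i);
                    out.flush();
                    if (echo) {
                        in.readDouble(); // blocks until compartment 2 has replied
                    }
                }
                in.readByte(); // wait for the acknowledgement so delivery is included
                long elapsed = System.nanoTime() - start;
                c2.join();
                return (double) elapsed / iterations;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int n = 100_000;
        System.out.printf("one-way:    %.0f ns per operation%n", averageNanos(n, false));
        System.out.printf("round trip: %.0f ns per operation%n", averageNanos(n, true));
    }
}
```

Numbers from a loopback run will not match the cluster figures above. The structural difference is what matters: in the one-way loop compartment 1 never waits for compartment 2, whereas in the echo loop every iteration waits for a full round trip before the next send can start.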
 
Michael Byrne
Greenhorn
Posts: 2
Another thing I should mention...

Instead of measuring total communication time, I measure send time and receive time separately.
In the two compartment case, the avg send time for each compartment is only 2us, but the avg receive time is 88us.

C1 -2us-> value -80us-> C2
C1 <-80us- value <-2us- C2

So after C1 sends its value to C2, it's waiting at the receive for a long time. But C2 is waiting a long time as well. And the blocking send is very fast, so what could the system be getting hung up on?

With one-way communication by itself:
C1 --1us--> value --1us--> C2
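
The split measurement described above can be sketched as follows. This is an illustration under assumptions, not the code that produced the numbers: `SplitTimingSketch` and `measure` are hypothetical names, and both compartments run as threads over loopback. One thing the sketch makes visible is where the clock sits: a blocking write on a socket stream can return once the bytes have been handed to the operating system, while a blocking read cannot return until the peer's reply has actually arrived, so the time charged to "receive" on C1 also contains C2's receive, its send, and the transit in both directions.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch: time the send and the receive separately on compartment 1.
final class SplitTimingSketch {

    /** Returns {average send ns, average receive ns} as seen by compartment 1. */
    static double[] measure(int iterations) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // Compartment 2: echoes every double back to compartment 1.
            Thread c2 = new Thread(() -> {
                try (Socket s = server.accept();
                     DataInputStream in = new DataInputStream(s.getInputStream());
                     DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                    for (int i = 0; i < iterations; i++) {
                        out.writeDouble(in.readDouble());
                        out.flush();
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            c2.start();

            long sendNanos = 0;
            long receiveNanos = 0;
            try (Socket s = new Socket(loopback, server.getLocalPort());
                 DataOutputStream out = new DataOutputStream(s.getOutputStream());
                 DataInputStream in = new DataInputStream(s.getInputStream())) {
                for (int i = 0; i < iterations; i++) {
                    long t0 = System.nanoTime();
                    out.writeDouble(i);
                    out.flush();               // returns once the OS has accepted the bytes
                    long t1 = System.nanoTime();
                    in.readDouble();           // returns only after the echoed value arrives
                    long t2 = System.nanoTime();
                    sendNanos += t1 - t0;
                    receiveNanos += t2 - t1;   // includes the peer's turnaround and both transits
                }
            }
            c2.join();
            return new double[] {
                (double) sendNanos / iterations,
                (double) receiveNanos / iterations
            };
        }
    }

    public static void main(String[] args) throws Exception {
        double[] t = measure(100_000);
        System.out.printf("avg send: %.0f ns, avg receive: %.0f ns%n", t[0], t[1]);
    }
}
```

Read this way, a small send time next to a large receive time does not by itself show that the receive call is slow; it may only show that most of each round trip is spent outside the sender's write call.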
 