TCP? Who cares about TCP in HPC?
More and more people, actually. With the commoditization of HPC, lots of newbie HPC users are intimidated by special, one-off, traditional HPC types of networks and opt for the simplicity and universality of Ethernet.
And it turns out that TCP doesn't suck nearly as much as most (HPC) people think, particularly on modern servers, Ethernet fabrics, and powerful Ethernet NICs.
I'll cut to the chase: I surprised myself by being able to get ~10 µs half-round-trip ping-pong MPI latency over TCP (using NetPIPE). The slide deck below discusses how that works.
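If "half-round-trip ping-pong latency" is an unfamiliar term, here's a rough sketch of the idea in plain Python sockets. To be clear about what this is not: it's not NetPIPE, it's not MPI, and it only runs over loopback on one machine, so the numbers it prints say nothing about the results above. The function names (`recv_exact`, `echo_server`, `half_round_trip_us`) and the default message size and iteration counts are all made up for illustration. I'm also assuming you'd want `TCP_NODELAY` set, so that Nagle's algorithm doesn't delay tiny messages.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from a stream socket (TCP may return partial reads)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_server(listener, msg_size, count):
    """Accept one connection and send every message straight back (the "pong")."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            conn.sendall(recv_exact(conn, msg_size))


def half_round_trip_us(msg_size=1, iterations=1000, warmup=100):
    """Return the average half-round-trip time in microseconds over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    total = warmup + iterations
    server = threading.Thread(target=echo_server, args=(listener, msg_size, total))
    server.start()

    payload = b"x" * msg_size
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

        # Untimed warm-up exchanges so connection setup doesn't skew the result.
        for _ in range(warmup):
            client.sendall(payload)
            recv_exact(client, msg_size)

        start = time.perf_counter()
        for _ in range(iterations):
            client.sendall(payload)       # ping
            recv_exact(client, msg_size)  # pong
        elapsed = time.perf_counter() - start

    server.join()
    listener.close()

    # One iteration is a full round trip; half of it is the one-way estimate.
    return elapsed / iterations / 2 * 1e6


if __name__ == "__main__":
    print(f"half-round-trip latency: {half_round_trip_us():.1f} us")
```

The point is the arithmetic on the last line of `half_round_trip_us`: time a lot of ping-pong exchanges, divide by the number of exchanges, then divide by two. Expect interpreter overhead to dominate here; a real benchmark would be written in C and run between two separate servers.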
A little background: I've posted several times about Cisco's forthcoming ultra-low latency Ethernet product. While working on that code, I was doing some performance testing last week, and wanted to compare to the best performance that TCP could give me. I discovered many things:
Check out these slides explaining what I found:
Just to be complete, here's the hardware I ran these tests on (all of which is available today):
Are these results definitive? Absolutely not - you can see the tradeoffs listed at the end of the slide deck. As with any HPC application, YMMV. But it certainly is interesting, and definitively shows how the rest of the world is benefiting from the trickle-down effect of HPC.