Recently I was asked to help investigate the performance of a fancy bit of hardware. The device in question was an xCelor XPM3, an ultra low-latency Layer 1 switch. Layer 1 switches are often used by trading firms to replicate data from one source out to many other sources. The exciting thing about these kinds of switches is that they can take network packets in one port and redirect them back out another port in 3 billionths of a second. That is fast. This may be no surprise, but that is so fast that it is pretty hard to measure.
To measure something in nanoseconds, billionths of a second, you need some equally exotic gear. I happened to have an FPGA based packet capture card with a clock disciplined by a high-end GPS receiver, some optical taps, and a pile of Twinax cables. Oh boy, let the fun begin. Even with toys like these, the minimum resolution of my packet capture system was 8 nanoseconds, nearly 3 times slower than the XPM3 can move packets. To get around this problem, I replicated each packet through every port on the XPM3, bouncing it all the way down the switch and back. Physically this meant that every port was diagonally connected with Twinax cables like this:
And inside the XPM3 it was moving data between ports like this:
The problem now is that I have two variables. Sending a packet down the switch this way means that it moves through 32 replication ports (r) and 30 Twinax cables (t). After running 10 million packets through this test setup, I knew that 32r + 30t + 35 = 212.93259 nanoseconds on average. The ‘35’ is the number of nanoseconds it took for the packet capture system to timestamp the arriving packets. But how could I determine the time for replication and the time in the Twinax cables? The answer was to get a second equation so that I could substitute variables.
I ran a second trial using just ports 1-8 instead of the full 32 ports. This gave me 8r + 6t + 35 = 75.958503 nanoseconds. Now with two variables and two equations I could simply substitute them to calculate that a replication port took 3.35 nanoseconds per hop and Twinax cables took 2.34 nanoseconds for each .5-meter length.