Measuring Billionths of Seconds

Recently I was asked to help investigate the performance of a fancy bit of hardware. The device in question was an xCelor XPM3, an ultra-low-latency Layer 1 switch. Layer 1 switches are often used by trading firms to replicate data from one source out to many destinations. The exciting thing about these switches is that they can take network packets in one port and redirect them back out another port in 3 billionths of a second. That is fast. It may come as no surprise that something that fast is also pretty hard to measure.

To measure something in nanoseconds, billionths of a second, you need some equally exotic gear. I happened to have an FPGA-based packet capture card with a clock disciplined by a high-end GPS receiver, some optical taps, and a pile of Twinax cables. Oh boy, let the fun begin. Even with toys like these, the minimum resolution of my packet capture system was 8 nanoseconds, nearly three times longer than it takes the XPM3 to move a packet. To get around this problem, I replicated each packet through every port on the XPM3, bouncing it all the way down the switch and back. Physically this meant that every port was diagonally connected to the next with Twinax cables, like this:

And inside the XPM3 it was moving data between ports like this:

The problem now is that there are two unknowns. Sending a packet down the switch this way means that it moves through 32 replication ports (r) and 30 Twinax cables (t). After running 10 million packets through this test setup, I knew that 32r + 30t + 35 = 212.93259 nanoseconds on average. The ‘35’ is the number of nanoseconds it took for the packet capture system to timestamp the arriving packets. But how could I separate the time spent in replication from the time spent in the Twinax cables? The answer was to get a second equation so that I could solve for both unknowns.

I ran a second trial using just ports 1-8 instead of the full 32 ports. This gave me 8r + 6t + 35 = 75.958503 nanoseconds. Now, with two equations and two unknowns, I could solve the system to find that a replication port took 3.35 nanoseconds per hop and the Twinax cables took 2.34 nanoseconds per 0.5-meter length.
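For anyone who wants to check the arithmetic, here is a minimal sketch of that solve in Python. It assumes NumPy is available; the variable names are mine and not part of the actual test harness.

```python
# Solve the two-trial system for the per-hop and per-cable times.
#   32r + 30t + 35 = 212.932590 ns   (full 32-port loop)
#    8r +  6t + 35 =  75.958503 ns   (8-port loop)
# where r = replication time per port, t = time per Twinax cable,
# and 35 ns is the capture system's timestamping overhead.
import numpy as np

A = np.array([[32.0, 30.0],
              [8.0,  6.0]])
b = np.array([212.93259 - 35.0,
              75.958503 - 35.0])

r, t = np.linalg.solve(A, b)
print(f"replication: {r:.2f} ns per hop, Twinax: {t:.2f} ns per 0.5 m cable")
# Prints values in line with the per-hop and per-cable figures quoted above.
```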

Divvy Bike Shares in Chicago

The Chicago-based bike-sharing company Divvy hosted a contest this past winter. They released anonymized data on over 750,000 rides taken in 2013. The contest had several categories to see who could draw the most meaning from the data and who could design the most beautiful representation of the rides. I entered the contest as a way to learn about D3.js, a new data visualization tool that is amazingly powerful. And complicated.

I thought it would be fun to see where most people were coming from and going to. When I start a play project like this, I reach for my two favorite data analysis machetes, Postgres and Python. Cleaning and loading the data into Postgres was pretty straightforward, which led to the fun part: trying to derive a meaningful framework with which to examine the ride data.

Pretty quickly it became apparent that breaking down the day into small time slices and aggregating the top departure points would yield interesting insights. It became even more interesting when the top departure points were categorized by their corresponding top destinations.

2pm
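To give a flavor of that aggregation, here is a rough Python sketch. The connection string and the table and column names (rides, from_station, to_station, start_time) are illustrative stand-ins for the real Divvy schema, and psycopg2 is assumed for the Postgres connection.

```python
# Group rides by hour of day, count departures per station, and track
# where riders leaving each station tend to end up.
from collections import Counter, defaultdict
import psycopg2

conn = psycopg2.connect("dbname=divvy")
cur = conn.cursor()
cur.execute("""
    SELECT extract(hour FROM start_time)::int AS hour,
           from_station, to_station
    FROM rides
""")

departures = defaultdict(Counter)     # hour -> Counter of departure stations
destinations = defaultdict(Counter)   # (hour, station) -> Counter of destinations
for hour, from_station, to_station in cur:
    departures[hour][from_station] += 1
    destinations[(hour, from_station)][to_station] += 1

# Top 10 departure points per hour, each with its top 3 destinations.
for hour in sorted(departures):
    for station, count in departures[hour].most_common(10):
        print(hour, station, count, destinations[(hour, station)].most_common(3))
```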

At different times of the day the pattern of rides looks wildly different. Early in the morning a massive influx of riders use Divvy bikes near the city's two main train stations. In the middle of the day, bike usage centers around the primary tourist attractions, with everyone coming and going to the same places. And in the small hours of the morning the bikes serve as cab replacements in the neighborhoods with lots of bars.

3am

With the ride data extracted, I used D3 to make it beautiful. D3 allows shapes to move and change color in seemingly magical ways inside a web browser, so each departure point can be linked to its top destinations and they will arrange themselves. Crain’s Chicago Business saw my entry and is running a special print version of the graphic in an upcoming issue. You can see the online edition here.

Predicting the future is for suckers

I spent the weekend thinking through a trading strategy dubbed by a Wall St. Journal reporter as the “Common Sense” (CS) trading strategy (see Part I). It turns out that common sense was a disaster when tested against historical data. The original formula was to buy or sell 5% of your cash or share value whenever the market moved 5%. I will refer to these levels as the market threshold and the aggression level. Using that formula, the CS strategy faithfully sold shares at market peaks and bought in the market valleys, but it sold off too many shares over time and paid out too much money in taxes. It was substantially worse than a “Buy and Hold” (B&H) strategy that bought on the low points and never sold.

Knowing that the shortcomings of the original CS trading strategy are incurring taxes and holding too much money in cash, can it be improved? Two things come to mind: search out a better set of values for when and how much to buy and sell, and alter the algorithm to buy more aggressively than it sells. Raising the market threshold will cause the strategy to trade less, and lowering the percentage of assets bought and sold will mean lower taxable income. Computing power is cheap, so I plowed through hundreds of combinations of inputs to find the most profitable settings, and to prove that the results are not a fluke of timing, I ran the search over four different historical time periods (a rough sketch of the sweep follows the table below).

Start Date   End Date   Length     Significance
1/1/06       12/31/12   7 years    Short term
8/11/87      12/31/12   25 years   Before the 1987 crash
10/19/87     12/31/12   25 years   After the 1987 crash
1/1/50       12/31/12   63 years   Long term
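Here is a rough sketch of what that sweep looked like. It is not the original script: backtest() is a stand-in for the simulator from the previous post (the real code is linked at the end), and the grids of thresholds and aggression levels are illustrative.

```python
# Sweep market threshold / aggression level combinations over each
# historical period and keep the most profitable configuration.
import itertools

market_thresholds = [0.02, 0.05, 0.10, 0.15, 0.20]         # market move that triggers a trade
aggression_levels = [0.01, 0.05, 0.10, 0.25, 0.50, 1.00]   # fraction of cash/shares traded
periods = [("1/1/06", "12/31/12"), ("8/11/87", "12/31/12"),
           ("10/19/87", "12/31/12"), ("1/1/50", "12/31/12")]

def backtest(threshold, aggression, start, end):
    """Placeholder: run the CS simulator and return the final portfolio value."""
    return 0.0  # plug in the simulator from the source download at the end of the post

best = {}
for start, end in periods:
    best[(start, end)] = max(
        itertools.product(market_thresholds, aggression_levels),
        key=lambda combo: backtest(combo[0], combo[1], start, end),
    )
```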

Buy and Hold is hard to beat. Starting with the longest term, 1/1950 through 12/2012, the B&H strategy earned $855,902. The best CS configuration returned 45% less money, or $472,129. Ouch. The best-performing version of the CS strategy was to trade only when the market moved wildly, by 20%, and to buy or sell very small percentages of your holdings each time, around 1%. This meant that a minimal amount of value was lost to taxes. The best B&H, on the other hand, spent 100% of its free cash the first time the market moved 2% and never traded again.

Graph.1950.Total

For fun I chose two medium-length periods of time on either side of the great 1987 crash. The Before graph starts in August of 1987 and the After graph starts in October of 1987. The results are very similar, except, as you would expect, buying at a low point right after a market crash earned more money overall. The best-performing inputs were nearly identical: B&H performed best when it bought aggressively after the first market move and never traded again, and CS performed best when it bought or sold 1% when the market moved 20%.

Before the 1987 crash (August): Graph.1987B.Total

After the 1987 crash (October): Graph.1987A.Total

Now for the short term, where things get interesting. The tables turn thanks to the most recent market crash. The CS strategy came out on top, but the result reinforces why this is a poor trading strategy for most people. I will just say that the CS strategy gets lucky here:

Graph.2006.Total

Unlike the longer time periods, the best-performing versions of the CS strategy in the short run were very aggressive. When the market moved 20%, they bought and sold 100% of their positions. This is what you would do if you had a crystal ball: sell everything when the market is high, and buy back when the market is low. The pink-shaded sections are the times when the CS strategy owned zero shares of stock. So why is this bad? It is a strategy that counts on huge volatility to be successful, and historically the markets just don’t fluctuate that much. If the market entered a period of calm, sustainable growth, you would be caught with your money on the sidelines earning no return. That is why, historically, the CS strategy works best when it makes very small moves. So unless you can predict the future and *know* that there is a major market crash coming, you can’t win by timing the market with this strategy.

At the start of this post I mentioned that there might be a second way to improve the CS strategy, by buying and selling at different rates. That might get around the tax issue while preventing too much money from sitting around in cash. But this post is getting long, so I will come back to that another day.

Here is the source code and data if you want to play with it yourself: CS.Trading.Strat.Source

Debunking the “Common Sense” trading strategy

Most small investors are not that good at predicting the financial future. I am certainly awful at it and I worked at a trading company for five years. I had a front row seat watching how the big guys run a trading firm today and I can tell you that it takes a lot of specialist knowledge, technology, and money. My job was to run the technology. It is a world that is far removed from the tools and timescale of everyday people.

For investors who don’t have supercomputers and 10Gb links to the New York Stock Exchange, how do you know when to make trading decisions? A writer for the Wall St. Journal wrote an article a year or two back claiming that a “common sense” trading strategy was the right move for the average Joe. He claimed that investors should buy whenever the market fell by 5% and sell whenever it rose by 5%. The intuition was clear: this model would force people to buy low and sell high.

But does it work? I built a model to test this trading strategy. The original article was a little fuzzy on some of the key details, such as how much of your wealth you should move when the market crosses the 5% threshold, so I assumed that to be 5% as well.

Here are the key assumptions for the “Common Sense” (CS) trading strategy:

  • Buy or sell when the market falls/rises by 5%
  • Every sale results in 15% of your gains going to long-term capital gains taxes
  • Every sale is 5% of the value of held shares
  • Every purchase uses 5% of free cash
  • With an initial $10,000, $7K is invested on day 1 and $3K is held in cash
  • Investment was started on 1/3/2006

To compare this against a “Buy and Hold” (B&H) trading strategy, here are its assumptions (a rough sketch of how the CS rules translate into code follows both lists):

  • Buy when the market falls by 5% from the last high
  • Every purchase uses 5% of free cash
  • Never sell
  • With an initial $10,000, $7K is invested on day 1 and $3K is held in cash
  • Investment was started on 1/3/2006
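As mentioned above, here is a minimal sketch of how the CS rules translate into code. It assumes a plain list of daily S&P 500 closing prices and measures each 5% move from the price at the previous trade; the names and that reference-point choice are mine, and the full script linked at the end of the post handles the dates, the B&H variant, and the bookkeeping properly.

```python
# A minimal CS simulation: trade 5% of cash/shares on every 5% market move,
# paying 15% capital gains tax on every sale.
def simulate_cs(closes, threshold=0.05, fraction=0.05,
                cash=3_000.0, invested=7_000.0, tax_rate=0.15):
    shares = invested / closes[0]      # $7K invested on day 1
    cost_basis = closes[0]             # average purchase price per share
    reference = closes[0]              # price at the last trade
    for price in closes[1:]:
        move = (price - reference) / reference
        if move <= -threshold and cash > 0:         # market fell: buy with 5% of cash
            spend = cash * fraction
            new_shares = spend / price
            cost_basis = (shares * cost_basis + spend) / (shares + new_shares)
            shares += new_shares
            cash -= spend
            reference = price
        elif move >= threshold and shares > 0:      # market rose: sell 5% of shares
            sold = shares * fraction
            gain = sold * max(price - cost_basis, 0.0)
            cash += sold * price - gain * tax_rate  # 15% of gains to taxes
            shares -= sold
            reference = price
    return cash + shares * closes[-1]
```

The B&H variant simply drops the sell branch and buys on 5% dips from the last high instead.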

Here are two graphs of the S&P 500 index with the transactions of the two strategies overlaid. The CS strategy has both purchases and sales; the B&H has only purchases. Both strategies make purchases when the market has fallen by 5%. A glance at the graphs confirms that both strategies are doing pretty well at buying during market low points.

First the CS graph. Notice that it Buys (blue) when the market falls and Sells (red) on the way up. It does a great job of hitting all the peaks on the graph:

Graph.2006.CS.1

The B&H has the same pattern of purchases, but never sells any shares:

Graph.2006.BH.1

So how did they do? The “common sense” trading strategy is a disaster. Buy and Hold finished with $11,586.81, while “common sense” actually lost money, ending with only $9,711.43 of the original $10,000. That is $1,875, or 16%, worse than the B&H strategy over the seven years.

Final Earnings

What happened? Three things are going wrong. First, long-term capital gains taxes suck out a ton of your profits. At 15% of your gains, every sale nibbles away at your purchasing power. The CS strategy paid $1,021 in taxes over the seven years, which accounts for a majority of the performance difference between the two strategies. Over those seven years, that $1,021 would have grown by 6.5% to $1,088. Over longer periods, both the money you don’t pay in taxes and the growth on it really add up.

Second, the CS algorithm is too risk-averse: it moves too much money out of investments and into cash. Since the market has generally risen for the last 70+ years, there are more selling events than buying events, so the number of shares owned decreases over time as cash grows to equal the value of the shares you hold. You can see in this graph how the B&H strategy ends up with 2.6x more shares. When your money sits in cash, it does not grow.

Graph.2006.Shares.Owned

Lastly, as you purchase new shares over time, you increase the average purchase price of the shares you hold. Since stock prices have continued to rise over time, when the market dips by 5% in the future, shares will still cost more than they did on day one. Buying the small dips does not help because they are rarely deep enough to lower the average cost of your purchases. What seems like a perfectly good idea turns out to be horrible in real life. But are there ways to save the CS strategy? I will give this some more thought and follow up with another post.

Here is the Python source code and data if you want to play. CS.Trading.Strat.Source

Python pipeline with gap detection and replay using ZeroMQ

I needed a way to process a ton of information recently. I had a bunch of systems that I could use, each with wildly different levels of resources. I needed to find a way to distribute work to all of these CPUs and gather results back without missing any data. The answer that I came up with was to create a processing pipeline using ZeroMQ as a message broker. I abstracted the process into three parts: task distribution, processing, and collection.

Since Python has a global interpreter lock, I needed to distribute my process using something other than threads. I turned to ZeroMQ since it has a robust messaging system. One of the ZeroMQ message patterns, the pipeline, was a great jumping-off point for my needs. But the pipeline doesn’t come with any sort of tracking or flow control, so I added a feedback loop to keep track of the total items sent and to detect gaps. The idea is that every item of work has a sequential key. There are three pieces of the app: Start, Middle, and Finish. Start and Finish use the sequential keys to understand where they are in the process and to replay any missing items.
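To give a flavor of the approach, here is a stripped-down sketch of the Start piece using pyzmq. The port numbers, the JSON message shape, and the five-second replay timeout are illustrative choices rather than the exact values in the pipeline files below.

```python
# Start: push work items out to workers, collect acknowledgements from
# Finish, and replay any sequence numbers that never come back.
import zmq

context = zmq.Context()

sender = context.socket(zmq.PUSH)       # fans work out to the Middle workers
sender.bind("tcp://*:5557")

feedback = context.socket(zmq.PULL)     # acks flow back from Finish
feedback.bind("tcp://*:5559")

work_items = {seq: f"job-{seq}" for seq in range(1000)}
for seq, payload in work_items.items():
    sender.send_json({"seq": seq, "payload": payload})

poller = zmq.Poller()
poller.register(feedback, zmq.POLLIN)

outstanding = set(work_items)
while outstanding:
    events = dict(poller.poll(timeout=5000))    # milliseconds
    if feedback in events:
        ack = feedback.recv_json()
        outstanding.discard(ack["seq"])
    else:
        # Timed out: replay every sequence number not yet acknowledged.
        for seq in outstanding:
            sender.send_json({"seq": seq, "payload": work_items[seq]})
```

In this sketch the Middle workers PULL from 5557 and push their results on to Finish, which reports each sequence number it has seen back to the feedback socket so Start can spot the gaps.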

There can be gotchas. If you don’t set the timing correctly on the apps, you might start replaying long processing jobs too quickly and create a process storm. But for the type of work I was doing it was fairly easy to choose a reasonable set of timeouts.

Pipeline Files


Visual analysis of building activity in Chicago 2006-2012

In 2008 the property bubble burst in Chicago. It is hard to gauge a recession without some hard numbers, and in this case a visual representation gives a powerful view into the scale of the decline in building activity, measured by the total value of building permits pulled by large builders. A big thank you to Chicago’s Open Data Portal for providing the data to work with.

The Data Portal has all of Chicago’s building permits available online, and they are a great metric for building activity. I narrowed the permits down to construction activity (elevator repair and fire alarm systems didn’t count) and used Python and Gephi to graph the connections. Take a look at the result:

Yearly building activity of the largest builders in Chicago, 2006-2012

It was important to filter out smaller builders to keep the image clear. The threshold for a builder to make the graph was at least 100 building permits or a total permit value of over two million dollars. Each year is scaled to the total value of the building permits for that year, ranging from $8.3 billion in 2006 down to $752 million in 2009 and back up to $4 billion in 2011. Look at what happened to John C. Hanna’s activity. In 2006 Hanna’s firm was the most active by number of properties. In 2007 and 2008 its activity was significantly reduced, and it failed to make the graph in 2009. By 2010 Hanna was back on the graph, and by 2011 the firm was growing again.
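For reference, the filtering step amounts to something like the Python sketch below. The CSV filename and the column names (contractor, issue_date, estimated_cost) are illustrative; the actual Data Portal export uses its own field names.

```python
# Keep only builders with 100+ permits or more than $2M in permit value
# for a given year.
import csv
from collections import defaultdict

permit_counts = defaultdict(int)      # (builder, year) -> number of permits
permit_values = defaultdict(float)    # (builder, year) -> total permit value

with open("building_permits.csv") as f:               # construction permits only
    for row in csv.DictReader(f):
        key = (row["contractor"], row["issue_date"][:4])
        permit_counts[key] += 1
        permit_values[key] += float(row["estimated_cost"] or 0)

large_builders = {key for key in permit_counts
                  if permit_counts[key] >= 100 or permit_values[key] > 2_000_000}
```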

If you would like the higher resolution version or a PDF of the image, contact me.