Misapplied Math

Trading, Data Science, CS.

Why I Always Hit The Crosswalk Button

There’s a major crosswalk bisecting one of the busiest street in Chicago; I stop there on a near daily basis as said crosswalk also leads to the best running trail downtown. On sunny weekends, crowds grow fifty pedestrians large before the light turns red, alleviating the congestion. It’s a great place to people watch, especially if you’re interested in crowd dynamics or game theory. The size of the crowd aside, a few things make this particular crosswalk special:

  • Cross traffic is sparse, and in the absence of a car, the light will never turn without a pedestrian hitting the walk button.
  • The crowds are large enough to form human traffic jams at times. When this happens, there’s ambiguity as to who arrived first, and who should hit the button.
  • It’s a tourist heavy spot, so most pedestrians aren’t aware of how long the wait is, or the aforementioned rules of the game.

Having studied it casually for a year, I feel justified in my habit of always hitting the crosswalk button.

The Volunteer’s Dilemma refers to a broad set of game theoretic problems applicable to everything from wireless network design to employee incentive packages. When a stranger murdered Kitty Genovese outside of her Brooklyn apartment while thirty seven bystanders watched without so much as calling the police, the Volunteer’s Dilemma went mainstream. Economists demonstrated an unintuitive result: the size of the crowd worked to her detriment, and not to her favor.

Assume that individuals witnessing a crime will act independently and call the police with probability $p$. For Kitty, the probability of no one calling, $(1 - p)^{37}$ (a minuscule quantity even for depressingly small values of $p$) isn’t congruous with the outcome. Instead, assume that an individual will always call the cops if they know with certainty that no one else will. However, no one wants to deal with paperwork at 3am. Given the option, witnesses will bank on another member of the group rising to the occasion. It’s a simple but intuitive model of how self motivated agents act in a group.

To make this notion concrete, consider the following payout scheme. Each witness has the option of calling the police. If no one calls, all players receive a payout of $0$. If at least one player calls, all players who didn’t call receive a payout of $1$, and those who did receive a payout of $\frac{1}{2}$. It’s easy to see that there are no pure-strategy Nash Equilibrium (NE) for this game. However, we can find a mixed-strategy NE quite easily.

Players will randomize iif the payout from each of $k$ actions is equal. We denote the probability of an individual calling in the mixed-strategy NE as $p^*$. Thus:

It follows immediately that the probability of no one calling is $(1-p^*)^n = \left(\frac{1}{2}\right)^\frac{n}{n - 1}$. That function is monotonically increasing in $n$, asymptotically approaching a max of $\frac{1}{2}$. As $n$ increases, the marginal importance of any one player is diminished. The above holds as NE requires that it’s impossible for a player to alter their strategy in a manner that ensures a uniformly superior outcome. Given that all players know the equilibrium probability, they’re happy to use a weighted coin as the basis of their decision – there’s no way to outperform. When $n = 2$, our equilibrium probability is $\frac{1}{2}$. In the case of Kitty, where $n = 37$, $p \approx .02$, and the probability of no one calling is $\approx .49$.

As is commonplace in game theory, the equilibrium for a repeated play game looks quite different. In fact, it’s relatively easy to show that a strategy in which all $n$ players take turns volunteering is both utility maximizing and a NE. So there you have it, the age old adage ``play nice with others’’ has a mathematical justification, as long as your play group stays the same.

Testing Collections With Guava-Testlib and JUnit 4

Testing Custom Java Collections

Over the years I’ve built up a suite of unit tests for various types of java collections. When I started doing so, I didn’t know of any good off-the-shelf means of generic collections testing. The JDK itself has some unit/regression tests, but not as many as you would think, and that aside they rely on a custom build system/testing framework.

I’ve used google-guava in a few projects but recently discovered their not so well advertised test support library: guava-testlib. I tried it out out on my most recent project (a custom random access list with $O(1)$ deque operations) and I’m very impressed with the test coverage and flexibility. The documentation is sparse but once you get it working you’ll have a test suite with hundreds to thousands of unit tests generated for your list, map, or set. Guava-testlib also generates tests for the non-standard collections in google-guava.

The library relies on JUnit, and most of what I found online detailing its use pertains to JUnit 3. There was a substantial API break between JUnit 3 and JUnit 4, as JUnit 3 relies on naming conventions and inheritance whereas JUnit 4 uses less intrusive annotations. I struggled for a bit trying to figure out how to get things working without resorting to the old JUnit 3 APIs. I arrived at something that works, but it’s still a little dirty as it indirectly relies on a magic static method named suite() from the JUnit 3 days.

package com.indigo.collections;
import com.google.common.collect.testing.*;
import com.google.common.collect.testing.features.CollectionFeature;
import com.google.common.collect.testing.features.CollectionSize;
import com.google.common.collect.testing.features.ListFeature;
import junit.framework.TestSuite;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Suite;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
* Your test class must be annotated with [email protected] RunWith} to specify that it's a
* test suite and not a single test.
* We need to use static inner classes as JUnit only allows for empty "holder"
* suite classes.
public class ArrayListTest {
* Add your additional test cases here.
public static class AdditionalTests {
public void testFoo() {
* This class will generate the guava test suite. It needs a public static
* magic method called [email protected] GuavaTests#suite()} to do so.
public static class GuavaTests {
* The method responsible for returning the [email protected] TestSuite} generated
* by one of the [email protected] FeatureSpecificTestSuiteBuilder} classes.
* This method must be public static method with no arguments named
* "suite()".
* @return An instance of [email protected] TestSuite} for collection testing.
public static TestSuite suite() {
* guava-testlib has a host of test suite builders available,
* all descending from [email protected] FeatureSpecificTestSuiteBuilder}.
* The
* [email protected] FeatureSpecificTestSuiteBuilder#usingGenerator(Object)}
* is the main entry point in that collections are tested by
* implementing [email protected] TestCollectionGenerator} and providing an
* instance of the interface to the test suite builder via the
* #usingGenerator(Object) method. Most of suite builder classes
* provide a convenience method such as
* [email protected] ListTestSuiteBuilder.using()} that streamline the process
* of creating a builder.
return ListTestSuiteBuilder
// The create method is called with an array of elements
// that should populate the collection.
.using(new TestStringListGenerator() {
protected List<String> create(String[] elements) {
return new ArrayList<>(Arrays.asList(elements));
// You can optionally give a name to your test suite. This
// name is used by JUnit and other tools during report
// generation.
.named("ArrayList tests")
// Guava has a host of "features" in the
// com.google.common.collect.testing.features package. Use
// them to specify how the collection should behave, and
// what operations are supported.

Twelve Days 2013: The Bellman-Ford Algorithm

Day Twelve: The Bellman-Ford Algorithm


If you need to find the shortest path for a graph with negative weights, the Bellman-Ford algorithm can do so. Dijkstra’s algorithm has better time complexity, but Bellman-Ford can handle graphs with negative weights, and even be used to detect negative cycles.

Detecting Negative Cycles

Today’s post is short and sweet – I might add code later but the algorithm is quite simple. For graphs with negative weights, Dijkstra’s algorithm fails. Negative weights introduce the potential for negative cycles, in which case a shortest path might not exist as a shorter path could always be constructed by transversing the cycle once again. That breaks Dijkstra. The Bellman-Ford algorithm can detect this when it happens. Unfortunately, the generality comes at the cost of time complexity. Dijkstra’s algorithm runs in $O(|E| + |V|\log(|E|))$. Bellman-Ford takes $O(|E| \cdot |V|))$, where $V$ and $E$ are the sizes of the vertex and edge sets, respectively. There are modifications to Bellman-Ford with slightly better time complexity, but Bellman-Ford is quite simple to implement.

Finding Negative Cycles

Once you know that there is a negative cycle, you still need to find it. There are fast algorithms to do so, but depth-first search (DFS) is arguably the simplest route. You can run DFS on a subgraph that’s produced as a bi product running Bellman-Ford (the shortest path edge set that’s generated along the way) for slightly better time complexity. There’s a break even point between just running depth-first search on the full graph or starting off with Bellman-Ford to check if there’s a negative cycle in the first place. The run-time complexity of DFS is exponential for graphs with a branching factor greater than one so it’s not cheap. For really small graphs you’re almost certainly better off with a vectorized brute force search.

Why do we Care?

Finding a path via DFS, or the shortest path via Bellman-Ford/Dijkstra’s algorithm/A-star has lots of applications in optimization and logistics. Here’s one nice example. Graphs with negative cycles are a special case of the shortest path problem, and once upon a time, currencies were inefficient enough that triangular arbitrage persisted for some time. These days, it’s a purely high frequency game.