Misapplied Math

Trading, Data Science, CS

Visualization is Everything

And with that in mind…

I've introduced a new visualization gallery to showcase my graphics and interactive data; some pieces will be post-specific, and some will stand on their own. Exploratory data analysis helps me see features that don't show up well in summary statistics alone: fat tails, outliers, skew, clustering, fit issues, bias, and patterns. I use R, Python, and Matlab for static visualizations, and I've started playing around with d3.js as a means of generating interactive graphs.

Until now I always wrote tools for interactive data visualization using Java and C++. I need a few new toys, and given the progress made towards closing the gap between native and browser performance, web-based development seems like the way to go. JavaScript got fast; development takes a fraction of the time, the frameworks are awesome, and the results are pretty. I'm setting out to rewrite one of my most data-intensive visualizations (an L2/L3 limit order book viewer) using pure HTML5 and JavaScript. At some point I'll put up a screenshot of the result, or a post mortem on the project if I, like Zuckerberg, put too much faith in HTML5.

As an initial experiment with d3 I decided to write a tool that shows a quick overview of volume and open interest on the CME. You can check out the end-of-day snapshot here. It's a simple adaptation of mbostock's tree map visualization. Tree maps are a great way to visualize hierarchical data, and given the nested structure of product sectors and subsectors, they're a perfect fit.
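To make the "nested structure" concrete, here is a minimal Python sketch of the name/children layout and of the roll-up that sizes each rectangle, where a parent's value is the sum of its leaves. The sector and product names and all the numbers are made-up placeholders rather than CME data, and `rollup` is a hypothetical helper, not part of d3.

```python
# Toy hierarchy in a name/children layout.
# Every name and number here is a placeholder, not real CME data.
tree = {
    "name": "root",
    "children": [
        {"name": "Sector A", "children": [
            {"name": "Subsector A1", "children": [
                {"name": "P1", "volume": 300},
                {"name": "P2", "volume": 100},
            ]},
        ]},
        {"name": "Sector B", "children": [
            {"name": "Subsector B1", "children": [
                {"name": "P3", "volume": 600},
            ]},
        ]},
    ],
}


def rollup(node, key="volume"):
    """Return the total of `key` over all leaves beneath `node`."""
    if "children" not in node:
        return node[key]
    return sum(rollup(child, key) for child in node["children"])


# Each sector's share of the total is the fraction of the area its cell would get.
total = rollup(tree)
shares = {child["name"]: rollup(child) / float(total)
          for child in tree["children"]}
print(shares)
```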

The version that I use is geared towards real-time surveillance across a subset of markets, but the general idea is the same. One notes that E-mini (ES) volume is remarkably high relative to open interest, a pattern made obvious by switching between the two modes and keeping an eye on ratios.
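If you'd rather compute those ratios than eyeball them, something like the following Python sketch would do it. It assumes the name/children JSON layout with `volume` and `open_interest` on each leaf; the helper names `leaves` and `turnover_ratios` are hypothetical, and the sample tree uses placeholder names and made-up numbers.

```python
def leaves(node):
    """Yield every leaf (a node without a "children" key) beneath `node`."""
    if "children" not in node:
        yield node
        return
    for child in node["children"]:
        for leaf in leaves(child):
            yield leaf


def turnover_ratios(root):
    """Map leaf name -> volume / open interest, skipping zero open interest."""
    return {
        leaf["name"]: leaf["volume"] / float(leaf["open_interest"])
        for leaf in leaves(root)
        if leaf["open_interest"] > 0
    }


# Placeholder data for illustration only.
sample = {
    "name": "root",
    "children": [
        {"name": "Sector A", "children": [
            {"name": "Subsector A1", "children": [
                {"name": "P1", "volume": 500, "open_interest": 1000},
                {"name": "P2", "volume": 300, "open_interest": 100},
                {"name": "P3", "volume": 50, "open_interest": 0},
            ]},
        ]},
    ],
}

# Highest ratio first: products that trade heavily relative to open positions.
ranked = sorted(turnover_ratios(sample).items(),
                key=lambda item: item[1], reverse=True)
print(ranked)
```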

If you want to play around on your own, or create daily updates, the CME offers publicly available end-of-day data on volume and open interest. Feel free to take my code (here's the JavaScript) and use an adaptation of this quick and dirty parser to convert your data to JSON. Note that the parse_row function in the code below isn't implemented; it's simple but depends on your input format.

encode_json.py
#!/usr/bin/env python

import sys
import json

class Node(object):
    def __init__(self, parent, name):
        self.parent = parent
        self.name = name
        self.children = []        

class Leaf(Node):
    """A single product; carries the values that size the tree map cells."""
    def __init__(self, parent, name, full_name, volume, open_interest):
        super(Leaf, self).__init__(parent, name)
        self.full_name = full_name
        self.volume = volume
        self.open_interest = open_interest

class NodeJSONEncoder(json.JSONEncoder):
    def default(self, node):
        # Check Leaf first: it subclasses Node, so isinstance(node, Node)
        # would also match leaves.
        if isinstance(node, Leaf):
            return {"name": node.name, "full_name": node.full_name,
                    "volume": node.volume, "open_interest": node.open_interest}
        if isinstance(node, Node):
            # The parent link is left out on purpose; it would be circular.
            return {"name": node.name, "children": node.children}
        raise TypeError("{!r} is not an instance of Node".format(node))

def get_or_create_node(node_dictionary, parent, name):
    if name in node_dictionary:
        return node_dictionary[name]
    else:
        node = Node(parent, name)
        node_dictionary[name] = node
        parent.children.append(node)
        return node


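if __name__ == "__main__":

    sectors = dict()
    # Subsectors are keyed per sector so that a subsector name shared by two
    # sectors doesn't attach to whichever sector appeared first.
    subsectors = dict()

    root = Node(None, "CME Products")

    with open(sys.argv[1]) as f:
        for row in f:
            # parse_row isn't defined here; supply one for your input format.
            p = parse_row(row)
            sector = get_or_create_node(sectors, root, p.sector_name)
            subsector = get_or_create_node(
                subsectors.setdefault(p.sector_name, dict()),
                sector, p.subsector_name)
            subsector.children.append(Leaf(subsector, p.name, p.full_name,
                p.volume, p.open_interest))

    # print() with one argument behaves the same under Python 2 and 3.
    print(NodeJSONEncoder().encode(root))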
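
For reference, here is one way parse_row might look. This is only a sketch under an assumed input format: one comma-separated line per product with the columns sector, subsector, symbol, full name, volume, open interest, in that order. That layout is my invention for illustration, not the CME's actual file format, and the `Product` tuple is a hypothetical container whose field names match the attributes the script reads.

```python
import csv
from collections import namedtuple

# Hypothetical record type; field names match what the encoder script accesses.
Product = namedtuple("Product", [
    "sector_name", "subsector_name", "name", "full_name",
    "volume", "open_interest",
])


def to_int(text):
    """Convert a count such as '1,200' to an int, dropping thousands separators."""
    return int(text.replace(",", ""))


def parse_row(row):
    """Parse one line of the assumed six-column, comma-separated format."""
    # csv.reader handles quoted fields, so '"1,200"' stays a single column.
    fields = [field.strip() for field in next(csv.reader([row]))]
    sector, subsector, name, full_name, volume, open_interest = fields
    return Product(sector, subsector, name, full_name,
                   to_int(volume), to_int(open_interest))


# Placeholder line, not real market data.
example = parse_row('Sector A,Subsector A1,P1,Placeholder Product One,"1,200",3400\n')
print(example)
```

If your file has a header row or a different column order, skip or remap accordingly; a row with the wrong number of columns raises a ValueError at the unpacking step.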
