Getting Started

If it still doesn’t fit get a bigger hammer …

The “Website Development” section of this site reviews the evaluation process behind the selection of the technology tools used to generate, publish and maintain the website that will host the results of this “Market Analysis” effort. The process now turns to the question of which tools will be used to develop and deploy the actual computations that will make up the analysis.

As before, it is best to start with the “Functional Requirements” and the “Known Knowns”. On that basis, the software framework that is selected will need to:

  • Support a wide variety of computational capabilities
  • Support the generation of data sets and extraction of data from a wide variety of sources
  • Support a robust presentation framework for generating visualizations of the results

There are many tools and resources available to implement these requirements, and I spent several weeks reviewing the options to determine which combination would best meet my needs. Rather than spend a great deal of time going over the options here, I will take up the mantle of “Trail Blazer” and go straight to the results of this investigation, in order to move more quickly towards the goal of actually doing data analysis.

The first choice was to do the majority of the coding in Python.

One of the challenges I have faced before when working with Python is the need to track the dependencies and versions of the various packages that, when combined, provide the functionality needed to complete an analysis. The Anaconda framework appears to provide an easy-to-use solution to this problem.

Once installed, Anaconda Navigator allows for the generation of task-specific ‘environments’ that hold the specific versions of the packages required to support a particular application task. Once Anaconda Navigator is running you can go to the Environments tab, add a new environment and begin to install the packages that your application requires. If one application needs a specific set of package versions, you can establish an environment dedicated to that application and know that activating that environment in Anaconda Navigator will quickly and easily switch you to the versions that the application requires. For the purposes of this analysis, I have started by generating an environment that has the dependencies needed to support TDA-API, a package that jumpstarts access to the data feed API associated with the TD Ameritrade thinkorswim framework.
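
As a quick sanity check after activating an environment, something like the following can confirm from inside Python that the expected package versions are the ones actually installed. This is only a minimal sketch using the standard library: the package names and version numbers in REQUIRED are hypothetical placeholders, not the actual pins used for this project.

```python
from importlib import metadata

# Hypothetical version pins, for illustration only; substitute the
# packages and versions that your own environment is meant to contain.
REQUIRED = {"pandas": "1.3", "numpy": "1.21"}


def installed_version(package):
    """Return the installed version string for a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(required):
    """Map each package name to (installed_version, matches_wanted_prefix)."""
    report = {}
    for package, wanted in required.items():
        found = installed_version(package)
        # Accept an exact match ("1.3") or any release under it ("1.3.5").
        ok = found is not None and (found == wanted or found.startswith(wanted + "."))
        report[package] = (found, ok)
    return report


for package, (found, ok) in check_environment(REQUIRED).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{package}: installed={found} wanted={REQUIRED[package]} -> {status}")
```

Running this in each environment makes it obvious when the wrong environment is active, since the reported versions will not line up with what the application expects.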

For decades I have been using Mathematica for analysis of this type, but of late the open-source software community appears to have developed suitable alternatives to this package based on the Python programming language.

One of the nice features of Mathematica is the ability to build an analysis incrementally by executing individual cells in the context of a computational “Notebook”. This approach lends itself well to trial and error, and its rich formatting and text cells let one describe what one is doing while doing it, leaving a well-documented codebase as the end result.

In the Python world there is a similar capability in the form of Jupyter Notebooks, along with a development infrastructure called JupyterLab that provides this type of support for project development activities. Anaconda Navigator provides the hooks to get JupyterLab operating, and this will be the primary development interface used for this work.

With the environment established, a variety of packages are available to perform the basic functions required: data access and manipulation, analysis and visualization.

The basics needed to operate start with:

  • Selenium - A package that automates web access. It is used in initializing authentication and can also automate data extraction from a variety of sources available on the web, such as CBOE, where a complete listing of the options instruments associated with specific tickers is available
  • Pandas - A package that supports the generation and processing of datasets in a format called ‘DataFrames’. These are in-memory sets of data that can be sliced, diced and otherwise combined and modified to achieve a desired result
  • NumPy - A package that supports numerical computations and operations on N-dimensional arrays
  • HoloViews - A package that supports organizing visualizations for output and display
  • Bokeh - A package for generating charts and images from data for visualization
  • Matplotlib - A cousin to Bokeh that performs the same basic function, but sometimes with slightly different and incremental capabilities
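
To give a feel for how Pandas and NumPy work together, here is a small sketch of the “slice and dice” idea described above. The tickers AAA and BBB and their prices are made up purely for illustration; nothing here is real market data.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for two hypothetical tickers (illustration only).
prices = pd.DataFrame(
    {"AAA": [100.0, 102.0, 101.0, 105.0], "BBB": [50.0, 49.0, 51.0, 52.0]},
    index=pd.date_range("2021-01-04", periods=4, freq="D"),
)

# NumPy functions apply element-wise to the whole DataFrame:
# daily log returns, dropping the first row, which has no prior day.
log_returns = np.log(prices / prices.shift(1)).dropna()

# Slice: keep only the days on which AAA rose.
aaa_up_days = log_returns[log_returns["AAA"] > 0]

# Combine: add a derived column with the difference between the two series.
log_returns["spread"] = log_returns["AAA"] - log_returns["BBB"]

print(log_returns)
print(aaa_up_days)
```

The resulting DataFrame can be handed to a plotting package such as Bokeh or Matplotlib when it is time to visualize the result.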

In addition to these workhorses of the open-source Python world, I thought it would be useful to find an open-source package that supports the construction of portfolios and executes the numerical calculations associated with projecting the price of Options using the Black-Scholes methodology.

Here I located a public repository on GitHub by Gabriele Pompa that implements an object-oriented Portfolio construction and Options pricing capability, which I decided to use to model these capabilities in this project.
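
For readers who have not seen the Black-Scholes calculation before, the sketch below shows the textbook closed-form price for a European option on a non-dividend-paying asset, using only the Python standard library. It is not taken from the repository mentioned above and does not reflect its object-oriented design; the function and parameter names are my own and are only illustrative.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_price(spot, strike, years, rate, vol, kind="call"):
    """Textbook Black-Scholes price of a European option (no dividends).

    spot   : current price of the underlying
    strike : option strike price
    years  : time to expiration, in years (must be > 0)
    rate   : continuously compounded risk-free rate (0.05 means 5%)
    vol    : annualized volatility (0.20 means 20%, must be > 0)
    kind   : "call" or "put"
    """
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * sqrt(years))
    d2 = d1 - vol * sqrt(years)
    discounted_strike = strike * exp(-rate * years)
    if kind == "call":
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    if kind == "put":
        return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    raise ValueError("kind must be 'call' or 'put'")


# Illustrative inputs only: at-the-money option, one year out.
call = black_scholes_price(100.0, 100.0, 1.0, 0.05, 0.20, "call")
put = black_scholes_price(100.0, 100.0, 1.0, 0.05, 0.20, "put")
print(f"call={call:.4f} put={put:.4f}")
```

A useful check on any implementation is put-call parity: the call price minus the put price should equal the spot price minus the discounted strike.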

So with this set of choices made, the business of analysis can begin. The first two areas of investigation will revolve around the question of what “Force” is in a market environment, and specifically around the idea that useful and possibly actionable information can be found if one considers various measures of “Market Breadth” to be “Force”. As a follow-up, Options activity and the actions of market makers around that activity will be reviewed as another possible “Force” candidate.
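
To make “Market Breadth” a little more concrete, one commonly used breadth measure is the advance-decline line: a running total of the number of issues that rose minus the number that fell in each session. The sketch below is only an assumption about the kind of measure that might be examined, not the method this analysis will settle on, and the tickers and prices are made up.

```python
from itertools import accumulate


def net_advances(prev_close, close):
    """Advancing issues minus declining issues for one session."""
    common = [ticker for ticker in close if ticker in prev_close]
    advancers = sum(1 for ticker in common if close[ticker] > prev_close[ticker])
    decliners = sum(1 for ticker in common if close[ticker] < prev_close[ticker])
    return advancers - decliners


def advance_decline_line(sessions):
    """Running total of net advances across consecutive sessions."""
    daily = [net_advances(prev, cur) for prev, cur in zip(sessions, sessions[1:])]
    return list(accumulate(daily))


# Made-up closing prices for three hypothetical tickers over three sessions.
sessions = [
    {"AAA": 10.0, "BBB": 20.0, "CCC": 30.0},
    {"AAA": 10.5, "BBB": 19.5, "CCC": 30.5},  # two up, one down
    {"AAA": 10.4, "BBB": 19.0, "CCC": 30.5},  # none up, two down, one flat
]
ad_line = advance_decline_line(sessions)
print(ad_line)
```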

“See you on the flip side”