Functional programming concepts such as the State monad and the Traversable type class can be powerful tools for thinking about common software engineering problems. As with any abstractions, these ideas may seem unfamiliar or opaque in the absence of concrete examples. The goal of this post is therefore to motivate and contextualize a “stateful traversal” design pattern by working through the specific problem of generating synthetic data points drawn from a simple statistical model.

**AR(p) modeling**

The p-th order linear autoregressive model AR(p) assumes that data point $$y_t$$ is generated according to the equation

$$y_t = \sum_{i=1}^{p} a_i\, y_{t-i} + \epsilon_t$$

where the $$a_i$$ are the fixed autoregressive coefficients and the $$\epsilon_t$$ are iid Gaussian noise terms. This is a foundational model in econometrics and time series analysis.

**Data generation**

Say we have some model *estimation* code which we would like to sanity test against synthetic data points generated by a known model. First, we note that the $$\epsilon$$ terms are, by definition, totally independent of anything else. We can therefore simply sample $$N$$ of these to populate an $$\epsilon$$ vector.

The next step is a bit more involved. While we could compute each $$y_t$$ independently using only the previous noise terms $$\epsilon$$, this calculation would be $$O(N^2)$$ and a bit awkward. A more intuitive approach might be to sweep a pointer over the $$\epsilon$$ array and our (to-be-populated) $$y$$ array, using pointer/index arithmetic to select the appropriate previous terms for our calculation of each $$y_t$$.
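A minimal sketch of this sweep (all names here are illustrative, and out-of-range lags $$t - i < 0$$ are treated as zero, i.e. a zero-initialized history):

```scala
// Sketch of the pointer-sweep approach: one pass over eps, using index
// arithmetic to pick out the previous p outputs for each y(t).
// Out-of-range lags (t - i < 0) are treated as zero.
def generateNaive(a: Vector[Double], eps: Vector[Double]): Vector[Double] = {
  val p = a.length
  val y = Array.ofDim[Double](eps.length)
  for (t <- eps.indices) {
    var acc = eps(t)
    for (i <- 1 to p if t - i >= 0)
      acc += a(i - 1) * y(t - i)
    y(t) = acc
  }
  y.toVector
}
```

For example, with `a = Vector(0.5)` and constant "noise" `Vector(1.0, 1.0, 1.0)` this yields `Vector(1.0, 1.5, 1.75)`.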

This simple diagram shows how we can slide a function $$f$$ with fixed coefficients $$a$$ over our previously generated $$y$$ and iid $$\epsilon$$.

An unfortunate aspect of this design is that it mixes inputs and outputs in a way that could easily lead to programming errors. Also, the pointer arithmetic trick couples us to this *exact* problem and somewhat obscures what we’re really doing: threading the dynamic evolution of some *state* (here, the trailing window) through the course of our data-generating process. To make the latter point more concrete, consider the bookkeeping complexity that would be entailed by extending this approach to models with a richer notion of state, such as an ARIMA model or an Infinite Hidden Markov Model.

This motivates an alternative implementation which *explicitly* models the evolution of some *state* as we generate our data.

**State-threading approach**

The idea here is that the AR(p) equation above tells us how to get a mapping from some *state* of type `S` (window $$[y_{t-p},\ldots,y_{t-1}]$$) and an input of type `I` (noise $$\epsilon_t$$) to a new output of type `O` (output $$y_t$$) and a new *state* of type `S` (new window $$[y_{t-(p-1)},\ldots,y_t]$$). As a Scala type signature this would look something like `(S,I) => (S,O)`.
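As a concrete sketch of such a step function, with the window represented as a `Vector[Double]` (these names and the window encoding are illustrative choices, not the post's exact code):

```scala
type Window = Vector[Double] // the state S: the last p outputs, newest first

// One step of the AR(p) recurrence, of shape (S, I) => (S, O):
// window(i) holds y(t-1-i), so it lines up with coefficient a(i).
def g(a: Vector[Double])(window: Window, epsilon: Double): (Window, Double) = {
  val y = a.zip(window).map { case (ai, yi) => ai * yi }.sum + epsilon
  ((y +: window).take(a.length), y) // slide the window forward
}
```

Note that `g` is written curried in the coefficients `a`, which will come in handy shortly.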

The above diagram shows the simplest version of this idea, where $$g$$ is nearly identical to $$f$$, except with an additional output for the new state `S`.

We can simplify this a bit via the concept of *partial application*. We note that $$a$$ is fixed across all evaluations of our function, so we can fix that parameter to get $$g(y,\epsilon_t)$$, as shown in this diagram.

Finally, we combine partial application with the *functor* property of our $$\epsilon$$ sequence by *mapping* partial application of our function over $$\epsilon$$ to get separate functions $$g_t(y)$$ *for each individual position* $$t$$. Our function now simply maps from a previous window `S` to a new window `S` and an output value `O`, as shown in this diagram.
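This mapping step might look like the following sketch (the step function `g` is repeated so the snippet stands alone; the coefficient and noise values are illustrative):

```scala
type Window = Vector[Double]

// One AR(p) step, curried in the coefficients a.
def g(a: Vector[Double])(window: Window, epsilon: Double): (Window, Double) = {
  val y = a.zip(window).map { case (ai, yi) => ai * yi }.sum + epsilon
  ((y +: window).take(a.length), y)
}

val a = Vector(0.5)           // fixed AR(1) coefficient
val eps = List(1.0, 1.0, 1.0) // sampled iid noise (constants here for clarity)

// Mapping partial application over eps: one function of type
// Window => (Window, Double), i.e. S => (S, O), per position t.
val gs: List[Window => (Window, Double)] = eps.map(e => (w: Window) => g(a)(w, e))
```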

This re-arrangement has stripped our sequential data-generating computation down to its essence: given a sequence of iid noise terms $$\epsilon$$, we almost directly encode our mathematical definition as a function which takes a sliding window state `S` as its input, and returns a computed value `O` and a new sliding window state `S` as output, therefore having type signature `S => (S,O)`.

**Plug-and-play**

All that remains is to construct the plumbing necessary to put it all together. As luck would have it, there is a common functional programming idiom for traversing a data structure while simultaneously accumulating state and transforming individual elements based on that state. This allows us to simply supply our sequence of transformed $$g_t$$ functions along with an initial window state (say, all zeros).

This is somewhat remarkable. The AR(p) equation tells us how to go from a state (the window of previous $$y_t$$) and an input ($$\epsilon_t$$) to a new state (the new window) and an output (the new $$y_t$$), and we can directly plug exactly this information into some generic machinery to achieve our desired result. So how does it work?

**State monad**

Clearly we’ve buried some crucial mechanics – let’s take a look at `StatefulGeneration.generate()`:
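The post's actual implementation uses scalaz; the following is a self-contained, hand-rolled sketch with the same shape (a minimal `State` standing in for `scalaz.State`, and `List` fixed as the container instead of a type-class-bounded `G[_]`):

```scala
type Window = Vector[Double]

// Minimal stand-in for scalaz's State: wraps a function S => (S, A).
final case class State[S, A](run: S => (S, A)) {
  def flatMap[B](f: A => State[S, B]): State[S, B] =
    State { s =>
      val (s2, a) = run(s)
      f(a).run(s2)
    }
  def map[B](f: A => B): State[S, B] = flatMap(a => State(s => (s, f(a))))
}

object StatefulGeneration {
  // Wrap each g(t) in State (the scalaz version does this via
  // functor.map(gs)(State.apply)), sequence the per-position stateful
  // computations into a single one, then run it from the initial window.
  def generate(initial: Window, gs: List[Window => (Window, Double)]): (Window, List[Double]) = {
    val states: List[State[Window, Double]] = gs.map(f => State(f))
    val sequenced: State[Window, List[Double]] =
      states.foldRight(State[Window, List[Double]](s => (s, Nil))) { (st, acc) =>
        st.flatMap(y => acc.map(y :: _))
      }
    sequenced.run(initial)
  }
}
```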

The context bounds assert the availability of utility type class instances for our container type `G[_]` (here, `List`), which are supplied by `scalaz.std.list.listInstance` and retrieved via the `implicitly` statements.

The first piece of machinery invoked in the above example is `functor.map(gs)(State.apply)`. Recall that `gs` has type `List[Window => (Window,Double)]`, that is, a sequence of functions that each map an input state to a new output state along with an output value. This simple and abstract definition of a *stateful computation* occupies a special place in the functional programming world, known as the State Monad.

There exists a vast amount of instructional content on this topic which we shall not recapitulate here (you can find plenty on the web; for a nice example see “Learn You A Haskell”, aka LYAH). Suffice it to say that for our purposes a *State* instance is a wrapper for a function that takes a state of type `S` as input, and returns some new state of type `S` along with an “output” value of type `A`, something like:
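In Scala a minimal version of this wrapper might be sketched as follows (scalaz's actual `State` is defined more generally, but the essence is the same):

```scala
// A minimal sketch: State just wraps a function from an input state
// to a (new state, output value) pair.
final case class State[S, A](run: S => (S, A))
```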

The “Monad” half of the name means that there exists a `flatMap` function over State which, loosely speaking, “chains” two of these computations, using the output `S` of one State as the input to the next and returning a new stateful computation. An example implementation and simple diagram are below, where `State.run` takes an initial state `S` as input and then executes the wrapped function:
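A self-contained sketch of such a `flatMap` over a minimal `State` wrapper:

```scala
final case class State[S, A](run: S => (S, A)) {
  // Chain two stateful computations: run this one, feed its output state
  // into the computation produced by f, and return that result.
  def flatMap[B](f: A => State[S, B]): State[S, B] =
    State { s =>
      val (s2, a) = run(s)
      f(a).run(s2)
    }
}
```

For example, chaining a "tick" computation `s => (s + 1, s)` with itself and running from `0` threads the state through both steps, ending in state `2` with output `1`.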

Returning to our `generate()` function above, the snippet simply wraps each of our $$g_t$$ functions `Window => (Window,Double)` into a scalaz State, for which `flatMap` and a variety of other nice helper functions are already defined.

**Traverse/Sequence**

What about `traverse.sequenceS[Window,Double]`? Let’s inspect the type signature of `sequenceS`, substituting in the actual concrete types we’re using:
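Schematically, the specialized signature is `List[State[Window, Double]] => State[Window, List[Double]]`. Unwrapping `State` to bare functions, the transformation it performs can be sketched as (an illustrative stand-in, not scalaz's implementation):

```scala
// sequenceS sketched over bare functions S => (S, A): it turns a List of
// stateful computations into a single stateful computation that threads
// the state left-to-right and collects the individual outputs.
def sequenceS[S, A](gs: List[S => (S, A)]): S => (S, List[A]) =
  s0 => {
    val (sFinal, reversed) =
      gs.foldLeft((s0, List.empty[A])) { case ((s, acc), g) =>
        val (s2, a) = g(s)
        (s2, a :: acc)
      }
    (sFinal, reversed.reverse)
  }
```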

Verbally, this translates to transforming a `List` of stateful computations with `Window` state and `Double` output into a *single* stateful computation with `Window` state and `List[Double]` output.

Informally, `sequenceS` is able to accomplish this by using the `flatMap` machinery defined in the previous section to chain all of the individual stateful computations into one. Furthermore, `sequenceS` also transforms the “output” variable of the resulting stateful computation from `Double` to `List[Double]`, a list of all the individual outputs, which is *exactly the final output we originally wanted*. In general, Sequence allows us to “commute” two higher-order types, going from `G[F[A]]` to `F[G[A]]`. In this specific case we are transforming `List[State[Window,Double]]` to `State[Window, List[Double]]`.

**The End**

Finally, we run it by supplying an all-zeros window as the initial state. The resulting value is of type `(Window,List[Double])`, containing both the final window state and the list of all output values. We retrieve the latter via `._2` and declare victory!

What have we gained by all this? First, we have pushed the work of incidental wiring into well-defined library functions, leaving our custom code focused solely on the particulars of our problem: the $$g()$$ function which emits a value and updates the state. This compact surface area for custom logic should be easier to both understand and test. Second, notice that `StatefulGeneration.generate()` is polymorphic in the type of the data container `G[_]`, subject to the availability of Traverse and Functor type class instances. Finally, the stateful traversal and transformation of sequential data structures is ubiquitous in software, making this design pattern a valuable addition to one’s “vocabulary” for understanding and reasoning about code.

**References**

- This example was briefly discussed in my talk “Economical machine learning via functional programming” (slides, video) at the Big Data Scala by the Bay conference in Oakland.
- The code examples use the scalaz library, for which the “Learning Scalaz” blog series is an invaluable companion.
- The functional programming concepts discussed here are heavily influenced by the excellent book Functional Programming in Scala.
- My reference for the AR(p) process is Applied Econometric Time Series.
- A more thorough look at some of these ideas can be found in Eric Torreborre’s epic blog post on the “Essence of the Iterator Pattern” paper by Jeremy Gibbons and Bruno C. d. S. Oliveira. Also see other blog discussion here, here, and here.

Any mistakes are of course my own, and I’d love to hear about them!