#!/usr/bin/env python # coding: utf-8 # (sec:ssm-intro)= # # What are State Space Models? # # # A state space model or SSM # is a partially observed Markov model, # in which the hidden state, $\hidden_t$, # evolves over time according to a Markov process, # possibly conditional on external inputs or controls $\input_t$, # and each hidden state generates some # observations $\obs_t$ at each time step. # (In this book, we mostly focus on discrete time systems, # although we consider the continuous-time case in XXX.) # We get to see the observations, but not the hidden state. # Our main goal is to infer the hidden state given the observations. # However, we can also use the model to predict future observations, # by first predicting future hidden states, and then predicting # what observations they might generate. # By using a hidden state $\hidden_t$ # to represent the past observations, $\obs_{1:t-1}$, # the model can have ``infinite'' memory, # unlike a standard Markov model. # # ```{figure} /figures/SSM-AR-inputs.png # :height: 150px # :name: fig:ssm-ar # # Illustration of an SSM as a graphical model. # ``` # # # Formally we can define an SSM # as the following joint distribution: # ```{math} # :label: eq:SSM-ar # p(\obs_{1:T},\hidden_{1:T}|\inputs_{1:T}) # = \left[ p(\hidden_1|\inputs_1) \prod_{t=2}^{T} # p(\hidden_t|\hidden_{t-1},\inputs_t) \right] # \left[ \prod_{t=1}^T p(\obs_t|\hidden_t, \inputs_t, \obs_{t-1}) \right] # ``` # where $p(\hidden_t|\hidden_{t-1},\inputs_t)$ is the # transition model, # $p(\obs_t|\hidden_t, \inputs_t, \obs_{t-1})$ is the # observation model, # and $\inputs_{t}$ is an optional input or action. # See {numref}`fig:ssm-ar` # for an illustration of the corresponding graphical model. # # # We often consider a simpler setting in which the # observations are conditionally independent of each other # (rather than having Markovian dependencies) given the hidden state. # In this case the joint simplifies to # ```{math} # :label: eq:SSM-input # p(\obs_{1:T},\hidden_{1:T}|\inputs_{1:T}) # = \left[ p(\hidden_1|\inputs_1) \prod_{t=2}^{T} # p(\hidden_t|\hidden_{t-1},\inputs_t) \right] # \left[ \prod_{t=1}^T p(\obs_t|\hidden_t, \inputs_t) \right] # ``` # Sometimes there are no external inputs, so the model further # simplifies to the following unconditional generative model: # ```{math} # :label: eq:SSM-no-input # p(\obs_{1:T},\hidden_{1:T}) # = \left[ p(\hidden_1) \prod_{t=2}^{T} # p(\hidden_t|\hidden_{t-1}) \right] # \left[ \prod_{t=1}^T p(\obs_t|\hidden_t) \right] # ``` # See {numref}`ssm-simplified` # for an illustration of the corresponding graphical model. # # # ```{figure} /figures/SSM-simplified.png # :height: 150px # :name: ssm-simplified # # Illustration of a simplified SSM. # ``` # # SSMs are widely used in many areas of science, engineering, finance, economics, etc. # The main applications are state estimation (i.e., inferring the underlying hidden state of the system given the observation), # forecasting (i.e., predicting future states and observations), and control (i.e., inferring the sequence of inputs that will # give rise to a desired target state). We will discuss these applications in later chapters. #