#!/usr/bin/env python
# coding: utf-8
# (sec:ssm-intro)=
# # What are State Space Models?
#
#
# A state space model or SSM
# is a partially observed Markov model,
# in which the hidden state, $\hidden_t$,
# evolves over time according to a Markov process,
# possibly conditioned on external inputs or controls $\inputs_t$,
# and each hidden state generates some
# observations $\obs_t$ at each time step.
# (In this book, we mostly focus on discrete-time systems,
# although we consider the continuous-time case in XXX.)
# We get to see the observations, but not the hidden state.
# Our main goal is to infer the hidden state given the observations.
# However, we can also use the model to predict future observations,
# by first predicting future hidden states, and then predicting
# what observations they might generate.
# By using a hidden state $\hidden_t$
# to represent the past observations, $\obs_{1:t-1}$,
# the model can have "infinite" memory,
# unlike a standard Markov model.
#
# ```{figure} /figures/SSM-AR-inputs.png
# :height: 150px
# :name: fig:ssm-ar
#
# Illustration of an SSM as a graphical model.
# ```
#
#
# Formally we can define an SSM
# as the following joint distribution:
# ```{math}
# :label: eq:SSM-ar
# p(\obs_{1:T},\hidden_{1:T}|\inputs_{1:T})
# = \left[ p(\hidden_1|\inputs_1) \prod_{t=2}^{T}
# p(\hidden_t|\hidden_{t-1},\inputs_t) \right]
# \left[ \prod_{t=1}^T p(\obs_t|\hidden_t, \inputs_t, \obs_{t-1}) \right]
# ```
# where $p(\hidden_t|\hidden_{t-1},\inputs_t)$ is the
# transition model,
# $p(\obs_t|\hidden_t, \inputs_t, \obs_{t-1})$ is the
# observation model,
# and $\inputs_{t}$ is an optional input or action.
# See {numref}`fig:ssm-ar`
# for an illustration of the corresponding graphical model.
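#
# To make the factorization in {eq}`eq:SSM-ar` concrete, here is a minimal
# sampling sketch (not from the book): a scalar linear-Gaussian SSM with
# inputs and an autoregressive observation term. All parameter values are
# illustrative assumptions, chosen only to show the order of ancestral sampling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameters for a scalar linear-Gaussian SSM:
# transition:  z_t = A z_{t-1} + B u_t + N(0, Q)
# observation: y_t = C z_t + D u_t + E y_{t-1} + N(0, R)
A, B, Q = 0.9, 0.5, 0.1
C, D, E, R = 1.0, 0.0, 0.3, 0.2

T = 100
u = np.sin(0.1 * np.arange(T))  # external inputs u_{1:T}
z = np.empty(T)
y = np.empty(T)

# Ancestral sampling follows the factorization of the joint:
z[0] = B * u[0] + np.sqrt(Q) * rng.standard_normal()             # z_1 ~ p(z_1 | u_1)
y[0] = C * z[0] + D * u[0] + np.sqrt(R) * rng.standard_normal()  # no y_0 term at t=1
for t in range(1, T):
    z[t] = A * z[t - 1] + B * u[t] + np.sqrt(Q) * rng.standard_normal()
    y[t] = C * z[t] + D * u[t] + E * y[t - 1] + np.sqrt(R) * rng.standard_normal()
```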
#
#
# We often consider a simpler setting in which the
# observations are conditionally independent of each other
# (rather than having Markovian dependencies) given the hidden state.
# In this case the joint simplifies to
# ```{math}
# :label: eq:SSM-input
# p(\obs_{1:T},\hidden_{1:T}|\inputs_{1:T})
# = \left[ p(\hidden_1|\inputs_1) \prod_{t=2}^{T}
# p(\hidden_t|\hidden_{t-1},\inputs_t) \right]
# \left[ \prod_{t=1}^T p(\obs_t|\hidden_t, \inputs_t) \right]
# ```
# Sometimes there are no external inputs, so the model further
# simplifies to the following unconditional generative model:
# ```{math}
# :label: eq:SSM-no-input
# p(\obs_{1:T},\hidden_{1:T})
# = \left[ p(\hidden_1) \prod_{t=2}^{T}
# p(\hidden_t|\hidden_{t-1}) \right]
# \left[ \prod_{t=1}^T p(\obs_t|\hidden_t) \right]
# ```
# See {numref}`ssm-simplified`
# for an illustration of the corresponding graphical model.
#
#
# ```{figure} /figures/SSM-simplified.png
# :height: 150px
# :name: ssm-simplified
#
# Illustration of a simplified SSM.
# ```
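#
# As a concrete instance of the unconditional model {eq}`eq:SSM-no-input`,
# the following sketch samples from a two-state hidden Markov model, a
# special case in which the hidden state is discrete. The parameter values
# are illustrative assumptions, not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative (assumed) parameters:
pi = np.array([0.6, 0.4])            # initial distribution p(z_1)
Amat = np.array([[0.95, 0.05],
                 [0.10, 0.90]])      # transition matrix p(z_t | z_{t-1})
mu = np.array([-1.0, 2.0])           # emission means
sigma = 0.5                          # emission std: p(y_t | z_t) = N(mu[z_t], sigma^2)

T = 200
z = np.empty(T, dtype=int)
y = np.empty(T)
z[0] = rng.choice(2, p=pi)
y[0] = mu[z[0]] + sigma * rng.standard_normal()
for t in range(1, T):
    z[t] = rng.choice(2, p=Amat[z[t - 1]])           # Markov transition
    y[t] = mu[z[t]] + sigma * rng.standard_normal()  # conditionally independent emission
```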
#
# SSMs are widely used in many areas of science and engineering, as well as in finance and economics.
# The main applications are state estimation (i.e., inferring the underlying hidden state of the system given the observations),
# forecasting (i.e., predicting future states and observations), and control (i.e., inferring the sequence of inputs that will
# give rise to a desired target state). We will discuss these applications in later chapters.
#