
update intro chapter with LDS

Kevin P Murphy 3 years ago
parent
commit
c2c2e1ff1c
36 changed files with 2705 additions and 183 deletions
  1. BIN        _build/.doctrees/chapters/hmm/hmm_filter.doctree
  2. BIN        _build/.doctrees/chapters/ssm/ssm.doctree
  3. BIN        _build/.doctrees/environment.pickle
  4. BIN        _build/html/_images/pendulum.png
  5. BIN        _build/html/_images/ssm_15_0.png
  6. BIN        _build/html/_images/ssm_16_1.png
  7. BIN        _build/html/_images/ssm_21_1.png
  8. BIN        _build/html/_images/ssm_24_1.png
  9. BIN        _build/html/_images/ssm_25_1.png
 10. BIN        _build/html/_images/ssm_29_1.png
 11. BIN        _build/html/_images/ssm_30_1.png
 12. BIN        _build/html/_images/ssm_31_1.png
 13. + 23 - 0   _build/html/_sources/chapters/hmm/hmm_filter.ipynb
 14. + 582 - 15 _build/html/_sources/chapters/ssm/ssm.ipynb
 15. + 1 - 1    _build/html/chapters/hmm/hmm.html
 16. + 16 - 0   _build/html/chapters/hmm/hmm_filter.html
 17. + 562 - 41 _build/html/chapters/ssm/ssm.html
 18. BIN        _build/html/objects.inv
 19. + 42 - 12  _build/html/reports/ssm.log
 20. + 1 - 0    _build/html/root.html
 21. + 1 - 1    _build/html/searchindex.js
 22. + 31 - 0   _build/jupyter_execute/chapters/hmm/hmm_filter.ipynb
 23. + 12 - 0   _build/jupyter_execute/chapters/hmm/hmm_filter.py
 24. + 619 - 64 _build/jupyter_execute/chapters/ssm/ssm.ipynb
 25. + 456 - 22 _build/jupyter_execute/chapters/ssm/ssm.py
 26. BIN        _build/jupyter_execute/chapters/ssm/ssm_15_0.png
 27. BIN        _build/jupyter_execute/chapters/ssm/ssm_16_1.png
 28. BIN        _build/jupyter_execute/chapters/ssm/ssm_21_1.png
 29. BIN        _build/jupyter_execute/chapters/ssm/ssm_24_1.png
 30. BIN        _build/jupyter_execute/chapters/ssm/ssm_25_1.png
 31. BIN        _build/jupyter_execute/chapters/ssm/ssm_29_1.png
 32. BIN        _build/jupyter_execute/chapters/ssm/ssm_30_1.png
 33. BIN        _build/jupyter_execute/chapters/ssm/ssm_31_1.png
 34. + 23 - 0   chapters/hmm/hmm_filter.ipynb
 35. + 336 - 27 chapters/ssm/ssm.ipynb
 36. BIN        figures/pendulum.png

BIN
_build/.doctrees/chapters/hmm/hmm_filter.doctree


BIN
_build/.doctrees/chapters/ssm/ssm.doctree


BIN
_build/.doctrees/environment.pickle


BIN
_build/html/_images/pendulum.png


BIN
_build/html/_images/ssm_15_0.png


BIN
_build/html/_images/ssm_16_1.png


BIN
_build/html/_images/ssm_21_1.png


BIN
_build/html/_images/ssm_24_1.png


BIN
_build/html/_images/ssm_25_1.png


BIN
_build/html/_images/ssm_29_1.png


BIN
_build/html/_images/ssm_30_1.png


BIN
_build/html/_images/ssm_31_1.png


+ 23 - 0
_build/html/_sources/chapters/hmm/hmm_filter.ipynb

@@ -6,6 +6,29 @@
    "source": [
     "# HMM filtering (forwards algorithm)\n"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "```{math}\n",
+    "\n",
+    "\\newcommand\\floor[1]{\\lfloor#1\\rfloor}\n",
+    "```\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(42)"
+   ]
   }
  ],
  "metadata": {

File diff suppressed because it is too large
+ 582 - 15
_build/html/_sources/chapters/ssm/ssm.ipynb


+ 1 - 1
_build/html/chapters/hmm/hmm.html

@@ -646,7 +646,7 @@ const thebe_selector_output = ".output, .cell_output"
 <p>We first create the “Occasionally dishonest casino” model from <span id="id1">[<a class="reference internal" href="../../bib.html#id3" title="R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.">DEKM98</a>]</span>.</p>
 <div class="figure align-default" id="casino-fig">
 <a class="reference internal image-reference" href="../../_images/casino.png"><img alt="../../_images/casino.png" src="../../_images/casino.png" style="width: 208.5px; height: 142.5px;" /></a>
-<p class="caption"><span class="caption-number">Fig. 7 </span><span class="caption-text">Illustration of the casino HMM.</span><a class="headerlink" href="#casino-fig" title="Permalink to this image">¶</a></p>
+<p class="caption"><span class="caption-number">Fig. 8 </span><span class="caption-text">Illustration of the casino HMM.</span><a class="headerlink" href="#casino-fig" title="Permalink to this image">¶</a></p>
 </div>
 <p>There are 2 hidden states, each of which emit 6 possible observations.</p>
 <div class="cell docutils container">

+ 16 - 0
_build/html/chapters/hmm/hmm_filter.html

@@ -51,6 +51,8 @@ const thebe_selector_input = "pre"
 const thebe_selector_output = ".output, .cell_output"
 </script>
     <script async="async" src="../../_static/sphinx-thebe.js"></script>
+    <script>window.MathJax = {"options": {"processHtmlClass": "tex2jax_process|mathjax_process|math|output_area"}}</script>
+    <script defer="defer" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
     <link rel="index" title="Index" href="../../genindex.html" />
     <link rel="search" title="Search" href="../../search.html" />
     <link rel="next" title="HMM smoothing (forwards-backwards algorithm)" href="hmm_smoother.html" />
@@ -458,6 +460,20 @@ const thebe_selector_output = ".output, .cell_output"
                 
   <div class="tex2jax_ignore mathjax_ignore section" id="hmm-filtering-forwards-algorithm">
 <h1>HMM filtering (forwards algorithm)<a class="headerlink" href="#hmm-filtering-forwards-algorithm" title="Permalink to this headline">¶</a></h1>
+<div class="math notranslate nohighlight">
+\[\newcommand\floor[1]{\lfloor#1\rfloor}\]</div>
+<div class="cell docutils container">
+<div class="cell_input docutils container">
+<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
+</pre></div>
+</div>
+</div>
+<div class="cell_output docutils container">
+<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>42
+</pre></div>
+</div>
+</div>
+</div>
 </div>
 
     <script type="text/x-thebe-config">

File diff suppressed because it is too large
+ 562 - 41
_build/html/chapters/ssm/ssm.html


BIN
_build/html/objects.inv


+ 42 - 12
_build/html/reports/ssm.log

@@ -17,23 +17,53 @@ Traceback (most recent call last):
     raise CellExecutionError.from_cell_and_msg(cell, exec_reply['content'])
 nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
 ------------------
-# MAP estimation
+ # Filtering
 fig, ax = plt.subplots()
-plot_inference(z_map, z_hist, ax, map_estimate=True)
-ax.set_ylabel("MAP state")
-ax.set_title("Viterbi")
+plot_inference(filtered_dist, z_hist, ax)
+ax.set_ylabel("p(loaded)")
+ax.set_title("Filtered")
+ 
 
+  
 ------------------
 
 ---------------------------------------------------------------------------
-NameError                                 Traceback (most recent call last)
-<ipython-input-12-d20416120056> in <module>
-      1 # MAP estimation
+ValueError                                Traceback (most recent call last)
+<ipython-input-17-a3a9c11b46bd> in <module>
+      1 # Filtering
       2 fig, ax = plt.subplots()
-----> 3 plot_inference(z_map, z_hist, ax, map_estimate=True)
-      4 ax.set_ylabel("MAP state")
-      5 ax.set_title("Viterbi")
+----> 3 plot_inference(filtered_dist, z_hist, ax)
+      4 ax.set_ylabel("p(loaded)")
+      5 ax.set_title("Filtered")
 
-NameError: name 'z_map' is not defined
-NameError: name 'z_map' is not defined
+<ipython-input-16-9c8ddad3f57c> in plot_inference(inference_values, z_hist, ax, state, map_estimate)
+      3     n_samples = len(inference_values)
+      4     xspan = np.arange(1, n_samples + 1)
+----> 5     spans = find_dishonest_intervals(z_hist)
+      6     if map_estimate:
+      7         ax.step(xspan, inference_values, where="post")
+
+<ipython-input-15-4606c615e17a> in find_dishonest_intervals(z_hist)
+      4     x_init = 0
+      5     for t, _ in enumerate(z_hist[:-1]):
+----> 6         if z_hist[t + 1] == 0 and z_hist[t] == 1:
+      7             x_end = t
+      8             spans.append((x_init, x_end))
+
+/opt/anaconda3/lib/python3.8/functools.py in _method(cls_or_self, *args, **keywords)
+    397         def _method(cls_or_self, /, *args, **keywords):
+    398             keywords = {**self.keywords, **keywords}
+--> 399             return self.func(cls_or_self, *self.args, *args, **keywords)
+    400         _method.__isabstractmethod__ = self.__isabstractmethod__
+    401         _method._partialmethod = self
+
+/opt/anaconda3/lib/python3.8/site-packages/jax/_src/device_array.py in _forward_method(attrname, self, fun, *args)
+     39 
+     40 def _forward_method(attrname, self, fun, *args):
+---> 41   return fun(getattr(self, attrname), *args)
+     42 _forward_to_value = partial(_forward_method, "_value")
+     43 
+
+ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
+ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
 

+ 1 - 0
_build/html/root.html

@@ -449,6 +449,7 @@ in automatic differentiation and parallel computing.</p>
 <li class="toctree-l1"><a class="reference internal" href="chapters/ssm/ssm.html">What are State Space Models?</a></li>
 <li class="toctree-l1"><a class="reference internal" href="chapters/ssm/ssm.html#hidden-markov-models">Hidden Markov Models</a></li>
 <li class="toctree-l1"><a class="reference internal" href="chapters/ssm/ssm.html#linear-gaussian-ssms">Linear Gaussian SSMs</a></li>
+<li class="toctree-l1"><a class="reference internal" href="chapters/ssm/ssm.html#nonlinear-gaussian-ssms">Nonlinear Gaussian SSMs</a></li>
 <li class="toctree-l1"><a class="reference internal" href="chapters/ssm/ssm.html#inferential-goals">Inferential goals</a></li>
 <li class="toctree-l1"><a class="reference internal" href="chapters/hmm/hmm_index.html">Inference in discrete SSMs</a><ul>
 <li class="toctree-l2"><a class="reference internal" href="chapters/hmm/hmm.html">Hidden Markov Models</a></li>

File diff suppressed because it is too large
+ 1 - 1
_build/html/searchindex.js


+ 31 - 0
_build/jupyter_execute/chapters/hmm/hmm_filter.ipynb

@@ -6,6 +6,37 @@
    "source": [
     "# HMM filtering (forwards algorithm)\n"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "```{math}\n",
+    "\n",
+    "\\newcommand\\floor[1]{\\lfloor#1\\rfloor}\n",
+    "```\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "42\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(42)"
+   ]
   }
  ],
  "metadata": {

+ 12 - 0
_build/jupyter_execute/chapters/hmm/hmm_filter.py

@@ -3,3 +3,15 @@
 
 # # HMM filtering (forwards algorithm)
 # 
+
+# ```{math}
+# 
+# \newcommand\floor[1]{\lfloor#1\rfloor}
+# ```
+# 
+
+# In[1]:
+
+
+print(42)
+

File diff suppressed because it is too large
+ 619 - 64
_build/jupyter_execute/chapters/ssm/ssm.ipynb


+ 456 - 22
_build/jupyter_execute/chapters/ssm/ssm.py

@@ -233,8 +233,14 @@ def print_source(fname):
 # \newcommand{\ldsDynNoise}{\vQ}
 # \newcommand{\ldsObsNoise}{\vR}
 # 
-# \newcommand{\ssmDyn}{f}
-# \newcommand{\ssmObs}{h}
+# \newcommand{\ssmDynFn}{f}
+# \newcommand{\ssmObsFn}{h}
+# 
+# 
+# %%%
+# \newcommand{\gauss}{\mathcal{N}}
+# 
+# \newcommand{\diag}{\mathrm{diag}}
 # ```
 # 
 
@@ -264,7 +270,7 @@ def print_source(fname):
 # Formally we can define an SSM 
 # as the following joint distribution:
 # ```{math}
-# :label: SSMfull
+# :label: eq:SSM-ar
 # p(\hmmobs_{1:T},\hmmhid_{1:T}|\inputs_{1:T})
 #  = \left[ p(\hmmhid_1|\inputs_1) \prod_{t=2}^{T}
 #  p(\hmmhid_t|\hmmhid_{t-1},\inputs_t) \right]
@@ -286,21 +292,29 @@ def print_source(fname):
 # Illustration of an SSM as a graphical model.
 # ```
 # 
-# We often consider a simpler setting in which there
-# are no external inputs,
-# and the observations are conditionally independent of each other
+# 
+# We often consider a simpler setting in which the
+#  observations are conditionally independent of each other
 # (rather than having Markovian dependencies) given the hidden state.
 # In this case the joint simplifies to 
 # ```{math}
-# :label: SSMsimplified
+# :label: eq:SSM-input
+# p(\hmmobs_{1:T},\hmmhid_{1:T}|\inputs_{1:T})
+#  = \left[ p(\hmmhid_1|\inputs_1) \prod_{t=2}^{T}
+#  p(\hmmhid_t|\hmmhid_{t-1},\inputs_t) \right]
+#  \left[ \prod_{t=1}^T p(\hmmobs_t|\hmmhid_t, \inputs_t) \right]
+# ```
+# Sometimes there are no external inputs, so the model further
+# simplifies to the following unconditional generative model: 
+# ```{math}
+# :label: eq:SSM-no-input
 # p(\hmmobs_{1:T},\hmmhid_{1:T})
 #  = \left[ p(\hmmhid_1) \prod_{t=2}^{T}
 #  p(\hmmhid_t|\hmmhid_{t-1}) \right]
-#  \left[ \prod_{t=1}^T p(\hmmobs_t|\hmmhid_t \right]
+#  \left[ \prod_{t=1}^T p(\hmmobs_t|\hmmhid_t) \right]
 # ```
 # See {numref}`Figure %s <ssm-simplified>` 
 # for an illustration of the corresponding graphical model.
-# Compare {eq}`SSMfull` and {eq}`SSMsimplified`.
 # 
 # 
 # ```{figure} /figures/SSM-simplified.png
@@ -445,21 +459,441 @@ print_source(hmm.sample)
 # affect how we choose to gamble our money.
 # We discuss various ways to perform this inference below.
 
+# (sec:lillypad)=
+# ## Example: Lillypad HMM
+# 
+# 
+# If $\obs_t$ is continuous, it is common to use a Gaussian
+# observation model:
+# ```{math}
+# p(\obs_t|\hidden_t=j) = \gauss(\obs_t|\vmu_j,\vSigma_j)
+# ```
+# This is sometimes called a Gaussian HMM.
+# 
+# As a simple example, suppose we have an HMM with 3 hidden states,
+# each of which generates a 2d Gaussian.
+# We can represent these Gaussian distributions as 2d ellipses,
+# as we show below.
+# We call these "lily pads", because of their shape.
+# We can imagine a frog hopping from one lily pad to another.
+# (This analogy is due to the late Sam Roweis.)
+# The frog will stay on a pad for a while (corresponding to remaining in the same
+# discrete state $\hidden_t$), and then jump to a new pad
+# (corresponding to a transition to a new state).
+# The data we see are just the 2d points (e.g., water droplets)
+# coming from near the pad that the frog is currently on.
+# Thus this model is like a Gaussian mixture model,
+# in that it generates clusters of observations,
+# except now there is temporal correlation between the data points.
+# 
+# Let us now illustrate this model in code.
+# 
+# 
+
+# In[7]:
+
+
+# Let us create the model
+
+initial_probs = jnp.array([0.3, 0.2, 0.5])
+
+# transition matrix
+A = jnp.array([
+[0.3, 0.4, 0.3],
+[0.1, 0.6, 0.3],
+[0.2, 0.3, 0.5]
+])
+
+# Observation model
+mu_collection = jnp.array([
+[0.3, 0.3],
+[0.8, 0.5],
+[0.3, 0.8]
+])
+
+S1 = jnp.array([[1.1, 0], [0, 0.3]])
+S2 = jnp.array([[0.3, -0.5], [-0.5, 1.3]])
+S3 = jnp.array([[0.8, 0.4], [0.4, 0.5]])
+cov_collection = jnp.array([S1, S2, S3]) / 60
+
+
+import tensorflow_probability as tfp
+
+if False:
+    hmm = HMM(trans_dist=distrax.Categorical(probs=A),
+            init_dist=distrax.Categorical(probs=initial_probs),
+            obs_dist=distrax.MultivariateNormalFullCovariance(
+                loc=mu_collection, covariance_matrix=cov_collection))
+else:
+    hmm = HMM(trans_dist=distrax.Categorical(probs=A),
+              init_dist=distrax.Categorical(probs=initial_probs),
+              obs_dist=distrax.as_distribution(
+                  tfp.substrates.jax.distributions.MultivariateNormalFullCovariance(loc=mu_collection,
+                                                                                    covariance_matrix=cov_collection)))
+
+print(hmm)
+
+
+# In[8]:
+
+
+
+n_samples, seed = 50, 10
+samples_state, samples_obs = hmm.sample(seed=PRNGKey(seed), seq_len=n_samples)
+
+print(samples_state.shape)
+print(samples_obs.shape)
+
+
+# In[9]:
+
+
+
+# Let's plot the observed data in 2d
+xmin, xmax = 0, 1
+ymin, ymax = 0, 1.2
+colors = ["tab:green", "tab:blue", "tab:red"]
+
+def plot_2dhmm(hmm, samples_obs, samples_state, colors, ax, xmin, xmax, ymin, ymax, step=1e-2):
+    obs_dist = hmm.obs_dist
+    color_sample = [colors[i] for i in samples_state]
+
+    xs = jnp.arange(xmin, xmax, step)
+    ys = jnp.arange(ymin, ymax, step)
+
+    v_prob = vmap(lambda x, y: obs_dist.prob(jnp.array([x, y])), in_axes=(None, 0))
+    z = vmap(v_prob, in_axes=(0, None))(xs, ys)
+
+    grid = np.mgrid[xmin:xmax:step, ymin:ymax:step]
+
+    for k, color in enumerate(colors):
+        ax.contour(*grid, z[:, :, k], levels=[1], colors=color, linewidths=3)
+        ax.text(*(obs_dist.mean()[k] + 0.13), f"$k$={k + 1}", fontsize=13, horizontalalignment="right")
+
+    ax.plot(*samples_obs.T, c="black", alpha=0.3, zorder=1)
+    ax.scatter(*samples_obs.T, c=color_sample, s=30, zorder=2, alpha=0.8)
+
+    return ax, color_sample
+
+
+fig, ax = plt.subplots()
+_, color_sample = plot_2dhmm(hmm, samples_obs, samples_state, colors, ax, xmin, xmax, ymin, ymax)
+
+
+# In[10]:
+
+
+# Let's plot the hidden state sequence
+
+fig, ax = plt.subplots()
+ax.step(range(n_samples), samples_state, where="post", c="black", linewidth=1, alpha=0.3)
+ax.scatter(range(n_samples), samples_state, c=color_sample, zorder=3)
+
+
+# (sec:lds-intro)=
 # # Linear Gaussian SSMs
 # 
-# Blah blah
+# 
+# Consider the state space model in 
+# {eq}`eq:SSM-ar`
+# where we assume the observations are conditionally iid given the
+# hidden states and inputs (i.e. there are no auto-regressive dependencies
+# between the observables).
+# We can rewrite this model as 
+# a stochastic nonlinear dynamical system (NLDS)
+# by defining the next hidden state
+# as a deterministic function of the past state
+# plus random process noise $\vepsilon_t$:
+# \begin{align}
+# \hmmhid_t &= \ssmDynFn(\hmmhid_{t-1}, \inputs_t, \vepsilon_t)
+# \end{align}
+# where $\vepsilon_t$ is drawn from a distribution such
+# that the induced distribution
+# on $\hmmhid_t$ matches $p(\hmmhid_t|\hmmhid_{t-1}, \inputs_t)$.
+# Similarly we can rewrite the observations
+# as a deterministic function of the hidden state
+# plus observation noise $\veta_t$:
+# \begin{align}
+# \hmmobs_t &= \ssmObsFn(\hmmhid_{t}, \inputs_t, \veta_t)
+# \end{align}
+# 
+# 
+# If we assume additive Gaussian noise,
+# the model becomes
+# \begin{align}
+# \hmmhid_t &= \ssmDynFn(\hmmhid_{t-1}, \inputs_t) +  \vepsilon_t  \\
+# \hmmobs_t &= \ssmObsFn(\hmmhid_{t}, \inputs_t) + \veta_t
+# \end{align}
+# where $\vepsilon_t \sim \gauss(\vzero,\vQ_t)$
+# and $\veta_t \sim \gauss(\vzero,\vR_t)$.
+# We will call these Gaussian SSMs.
+# 
+# If we additionally assume
+# the transition function $\ssmDynFn$
+# and the observation function $\ssmObsFn$ are both linear,
+# then we can rewrite the model as follows:
+# \begin{align}
+# p(\hmmhid_t|\hmmhid_{t-1},\inputs_t) &= \gauss(\hmmhid_t|\ldsDyn_t \hmmhid_{t-1}
+# + \ldsDynIn_t \inputs_t, \vQ_t)
+# \\
+# p(\hmmobs_t|\hmmhid_t,\inputs_t) &= \gauss(\hmmobs_t|\ldsObs_t \hmmhid_{t}
+# + \ldsObsIn_t \inputs_t, \vR_t)
+# \end{align}
+# This is called a 
+# linear-Gaussian state space model
+# (LG-SSM),
+# or a
+# linear dynamical system (LDS).
+# We usually assume the parameters are independent of time, in which case
+# the model is said to be time-invariant or homogeneous.
+# 
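+# As a minimal sketch of how one generative step of an LG-SSM could be
+# simulated (an illustration assuming only `jax`; the names `A`, `B`, `C`,
+# `D`, `Q`, `R` stand in for $\ldsDyn, \ldsDynIn, \ldsObs, \ldsObsIn,
+# \vQ, \vR$, and are not the `jsl` API):
+# 
+# ```python
+# import jax.numpy as jnp
+# from jax import random
+# 
+# def lgssm_step(key, z_prev, u, A, B, C, D, Q, R):
+#     """Sample (z_t, y_t) given z_{t-1} and input u_t."""
+#     key_z, key_y = random.split(key)
+#     # Transition: z_t = A z_{t-1} + B u_t + eps_t, with eps_t ~ N(0, Q)
+#     z = A @ z_prev + B @ u + random.multivariate_normal(key_z, jnp.zeros(len(Q)), Q)
+#     # Observation: y_t = C z_t + D u_t + eta_t, with eta_t ~ N(0, R)
+#     y = C @ z + D @ u + random.multivariate_normal(key_y, jnp.zeros(len(R)), R)
+#     return z, y
+# ```
+# 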
 
 # (sec:tracking-lds)=
-# ## Example: model for 2d tracking
+# (sec:kalman-tracking)=
+# ## Example: tracking a 2d point
+# 
+# 
+# 
+# % Sarkka p43
+# Consider an object moving in $\real^2$.
+# Let the state be
+# the position and velocity of the object,
+# $$\vz_t =\begin{pmatrix} u_t & \dot{u}_t & v_t & \dot{v}_t \end{pmatrix}$$
+# (We use $u$ and $v$ for the two coordinates,
+# to avoid confusion with the state and observation variables.)
+# If we use Euler discretization,
+# the dynamics become
+# \begin{align}
+# \underbrace{\begin{pmatrix} u_t\\ \dot{u}_t \\ v_t \\ \dot{v}_t \end{pmatrix}}_{\vz_t}
+#   = 
+# \underbrace{
+# \begin{pmatrix}
+# 1 & 0 & \Delta & 0 \\
+# 0 & 1 & 0 & \Delta\\
+# 0 & 0 & 1 & 0 \\
+# 0 & 0 & 0 & 1
+# \end{pmatrix}
+# }_{\ldsDyn}
+# \
+# \underbrace{\begin{pmatrix} u_{t-1} \\ \dot{u}_{t-1} \\ v_{t-1} \\ \dot{v}_{t-1} \end{pmatrix}}_{\vz_{t-1}}
+# + \vepsilon_t
+# \end{align}
+# where $\vepsilon_t \sim \gauss(\vzero,\vQ)$ is
+# the process noise.
+# 
+# Let us assume
+# that the process noise is 
+# a white noise process added to the velocity components
+# of the state, but not to the location.
+# (This is known as a random accelerations model.)
+# We can approximate the resulting process in discrete time by assuming
+# $\vQ = \diag(0, q, 0, q)$.
+# (See  {cite}`Sarkka13` p60 for a more accurate way
+# to convert the continuous time process to discrete time.)
+# 
+# 
+# Now suppose that at each discrete time point we
+# observe the location,
+# corrupted by  Gaussian noise.
+# Thus the observation model becomes
+# \begin{align}
+# \underbrace{\begin{pmatrix}  y_{1,t} \\  y_{2,t} \end{pmatrix}}_{\vy_t}
+#   &=
+#     \underbrace{
+#     \begin{pmatrix}
+# 1 & 0 & 0 & 0 \\
+# 0 & 0 & 1 & 0
+#     \end{pmatrix}
+#     }_{\ldsObs}
+#     \
+# \underbrace{\begin{pmatrix} u_t\\ \dot{u}_t \\ v_t \\ \dot{v}_t \end{pmatrix}}_{\vz_t}    
+#  + \veta_t
+# \end{align}
+# where $\veta_t \sim \gauss(\vzero,\vR)$ is the \keywordDef{observation noise}.
+# We see that the observation matrix $\ldsObs$ simply "extracts" the
+# relevant parts  of the state vector.
+# 
+# Suppose we sample a trajectory and corresponding set
+# of noisy observations from this model,
+# $(\vz_{1:T}, \vy_{1:T}) \sim p(\vz,\vy|\vtheta)$.
+# (We use diagonal observation noise,
+# $\vR = \diag(\sigma_1^2, \sigma_2^2)$.)
+# The results are shown below. 
+# 
+
+# In[11]:
+
+
+key = jax.random.PRNGKey(314)
+timesteps = 15
+delta = 1.0
+A = jnp.array([
+    [1, 0, delta, 0],
+    [0, 1, 0, delta],
+    [0, 0, 1, 0],
+    [0, 0, 0, 1]
+])
+
+C = jnp.array([
+    [1, 0, 0, 0],
+    [0, 1, 0, 0]
+])
+
+state_size, _ = A.shape
+observation_size, _ = C.shape
+
+Q = jnp.eye(state_size) * 0.001
+R = jnp.eye(observation_size) * 1.0
+# Prior parameter distribution
+mu0 = jnp.array([8, 10, 1, 0]).astype(float)
+Sigma0 = jnp.eye(state_size) * 1.0
+
+from jsl.lds.kalman_filter import LDS, smooth, filter
+
+lds = LDS(A, C, Q, R, mu0, Sigma0)
+print(lds)
+
+
+# In[12]:
+
+
+from jsl.demos.plot_utils import plot_ellipse
+
+def plot_tracking_values(observed, filtered, cov_hist, signal_label, ax):
+    timesteps, _ = observed.shape
+    ax.plot(observed[:, 0], observed[:, 1], marker="o", linewidth=0,
+            markerfacecolor="none", markeredgewidth=2, markersize=8, label="observed", c="tab:green")
+    ax.plot(*filtered[:, :2].T, label=signal_label, c="tab:red", marker="x", linewidth=2)
+    for t in range(0, timesteps, 1):
+        covn = cov_hist[t][:2, :2]
+        plot_ellipse(covn, filtered[t, :2], ax, n_std=2.0, plot_center=False)
+    ax.axis("equal")
+    ax.legend()
+
+
+# In[13]:
+
+
+
+z_hist, x_hist = lds.sample(key, timesteps)
+
+fig_truth, axs = plt.subplots()
+axs.plot(x_hist[:, 0], x_hist[:, 1],
+        marker="o", linewidth=0, markerfacecolor="none",
+        markeredgewidth=2, markersize=8,
+        label="observed", c="tab:green")
+
+axs.plot(z_hist[:, 0], z_hist[:, 1],
+        linewidth=2, label="truth",
+        marker="s", markersize=8)
+axs.legend()
+axs.axis("equal")
+
+
+# The main task is to infer the hidden states given the noisy
+# observations, i.e., $p(\vz|\vy,\vtheta)$. We discuss the topic of inference in {ref}`sec:inference`.
+
+# (sec:nlds-intro)=
+# # Nonlinear Gaussian SSMs
+# 
+# In this section, we consider SSMs in which the dynamics and/or observation models are nonlinear,
+# but the process noise and observation noise are Gaussian.
+# That is, 
+# \begin{align}
+# \hmmhid_t &= \ssmDynFn(\hmmhid_{t-1}, \inputs_t) +  \vepsilon_t  \\
+# \hmmobs_t &= \ssmObsFn(\hmmhid_{t}, \inputs_t) + \veta_t
+# \end{align}
+# where $\vepsilon_t \sim \gauss(\vzero,\vQ_t)$
+# and $\veta_t \sim \gauss(\vzero,\vR_t)$.
+# This is a very widely used model class. We give some examples below.
+
+# (sec:pendulum)=
+# ## Example: tracking a 1d pendulum
+# 
+# ```{figure} /figures/pendulum.png
+# :scale: 100%
+# :name: fig:pendulum
+# 
+# Illustration of a pendulum swinging.
+# $g$ is the force of gravity,
+# $w(t)$ is a random external force,
+# and $\alpha$ is the angle wrt the vertical.
+# Based on {cite}`Sarkka13` fig 3.10.
+# 
+# ```
+# 
+# 
+# % Sarkka p45, p74
+# Consider a simple pendulum of unit mass and length swinging from
+# a fixed attachment, as in {ref}`fig:pendulum`.
+# Such an object is in principle entirely deterministic in its behavior.
+# However, in the real world, there are often unknown forces at work
+# (e.g., air turbulence, friction).
+# We will model these by a continuous time random Gaussian noise process $w(t)$.
+# This gives rise to the following differential equation:
+# \begin{align}
+# \frac{d^2 \alpha}{d t^2}
+# = -g \sin(\alpha) + w(t)
+# \end{align}
+# We can write this as a nonlinear SSM by defining the state to be
+# $z_1(t) = \alpha(t)$ and $z_2(t) = d\alpha(t)/dt$.
+# Thus
+# \begin{align}
+# \frac{d \vz}{dt}
+# = \begin{pmatrix} z_2 \\ -g \sin(z_1) \end{pmatrix}
+# + \begin{pmatrix} 0 \\ 1 \end{pmatrix} w(t)
+# \end{align}
+# If we discretize this with step size $\Delta$,
+# we get the following
+# formulation ({cite}`Sarkka13` p74):
+# \begin{align}
+# \underbrace{
+#   \begin{pmatrix} z_{1,t} \\ z_{2,t} \end{pmatrix}
+#   }_{\hmmhid_t}
+# =
+# \underbrace{
+#   \begin{pmatrix} z_{1,t-1} + z_{2,t-1} \Delta  \\
+#     z_{2,t-1} -g \sin(z_{1,t-1}) \Delta  \end{pmatrix}
+#   }_{\vf(\hmmhid_{t-1})}
+# +\vq_{t-1}
+# \end{align}
+# where $\vq_{t-1} \sim \gauss(\vzero,\vQ)$ with
+# \begin{align}
+# \vQ = q^c \begin{pmatrix}
+#   \frac{\Delta^3}{3} &   \frac{\Delta^2}{2} \\
+#   \frac{\Delta^2}{2} & \Delta
+#   \end{pmatrix}
+#   \end{align}
+# and $q^c$ is the spectral density (continuous-time variance)
+# of the continuous-time noise process.
+# 
+# 
+# If we observe the angular position, we
+# get the linear observation model
+# \begin{align}
+# y_t = \alpha_t + r_t =  h(\hmmhid_t) + r_t
+# \end{align}
+# where $h(\hmmhid_t) = z_{1,t}$
+# and $r_t$ is the observation noise.
+# If we only observe  the horizontal position,
+# we get the nonlinear observation model
+# \begin{align}
+# y_t = \sin(\alpha_t) + r_t =  h(\hmmhid_t) + r_t
+# \end{align}
+# where $h(\hmmhid_t) = \sin(z_{1,t})$.
+# 
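+# As a hedged sketch (assuming only `jax`; the names `g`, `delta`, and `qc`
+# are hypothetical parameters matching $g$, $\Delta$, and $q^c$ above),
+# the discretized dynamics and the two observation functions could be
+# written as:
+# 
+# ```python
+# import jax.numpy as jnp
+# 
+# def pendulum_dynamics(z, g=9.8, delta=0.1):
+#     # f(z_{t-1}) = (z1 + z2 * Delta, z2 - g * sin(z1) * Delta)
+#     z1, z2 = z
+#     return jnp.array([z1 + z2 * delta, z2 - g * jnp.sin(z1) * delta])
+# 
+# def pendulum_process_cov(qc, delta=0.1):
+#     # Q = qc * [[Delta^3/3, Delta^2/2], [Delta^2/2, Delta]]
+#     return qc * jnp.array([[delta**3 / 3, delta**2 / 2],
+#                            [delta**2 / 2, delta]])
+# 
+# def observe_angle(z):
+#     # linear observation model: h(z) = z1 = alpha
+#     return z[0]
+# 
+# def observe_horizontal(z):
+#     # nonlinear observation model: h(z) = sin(z1)
+#     return jnp.sin(z[0])
+# ```
+# 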
 # 
-# Blah blah
 
 # (sec:inference)=
 # # Inferential goals
 # 
-# ```{figure} /figures/dbn-inference-problems.png
+# ```{figure} /figures/dbn-inference-problems-tikz.png
 # :scale: 100%
-# :name: dbn-inference
+# :name: fig:dbn-inference
 # 
 # Illustration of the different kinds of inference in an SSM.
 #  The main kinds of inference for state-space models.
@@ -503,7 +937,7 @@ print_source(hmm.sample)
 # \cdots
 #  p(\hmmhid_{t+h}|\hmmhid_{t+h-1})
 # \end{align}
-# See \cref{fig:dbn_inf_problems} for a summary of these distributions.
+# See {ref}`fig:dbn-inference` for a summary of these distributions.
 # 
 # In addition to computing posterior marginals,
 # we may want to compute the most probable hidden sequence,
@@ -530,7 +964,7 @@ print_source(hmm.sample)
 # to the casino HMM from {ref}`sec:casino`. 
 # 
 
-# In[7]:
+# In[14]:
 
 
 # Call inference engine
@@ -539,7 +973,7 @@ filtered_dist, _, smoothed_dist, loglik = hmm.forward_backward(x_hist)
 map_path = hmm.viterbi(x_hist)
 
 
-# In[8]:
+# In[15]:
 
 
 # Find the spans of timesteps during which the simulated system is in state 1
@@ -555,7 +989,7 @@ def find_dishonest_intervals(z_hist):
     return spans
 
 
-# In[9]:
+# In[16]:
 
 
 # Plot posterior
@@ -576,7 +1010,7 @@ def plot_inference(inference_values, z_hist, ax, state=1, map_estimate=False):
     ax.set_xlabel("Observation number")
 
 
-# In[10]:
+# In[17]:
 
 
 # Filtering
@@ -589,7 +1023,7 @@ ax.set_title("Filtered")
  
 
 
-# In[11]:
+# In[12]:
 
 
 # Smoothing
@@ -602,7 +1036,7 @@ ax.set_title("Smoothed")
  
 
 
-# In[12]:
+# In[ ]:
 
 
 # MAP estimation
@@ -612,7 +1046,7 @@ ax.set_ylabel("MAP state")
 ax.set_title("Viterbi")
 
 
-# In[13]:
+# In[ ]:
 
 
 # TODO: posterior samples

BIN
_build/jupyter_execute/chapters/ssm/ssm_15_0.png


BIN
_build/jupyter_execute/chapters/ssm/ssm_16_1.png


BIN
_build/jupyter_execute/chapters/ssm/ssm_21_1.png


BIN
_build/jupyter_execute/chapters/ssm/ssm_24_1.png


BIN
_build/jupyter_execute/chapters/ssm/ssm_25_1.png


BIN
_build/jupyter_execute/chapters/ssm/ssm_29_1.png


BIN
_build/jupyter_execute/chapters/ssm/ssm_30_1.png


BIN
_build/jupyter_execute/chapters/ssm/ssm_31_1.png


+ 23 - 0
chapters/hmm/hmm_filter.ipynb

@@ -6,6 +6,29 @@
    "source": [
     "# HMM filtering (forwards algorithm)\n"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "```{math}\n",
+    "\n",
+    "\\newcommand\\floor[1]{\\lfloor#1\\rfloor}\n",
+    "```\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "print(42)"
+   ]
   }
  ],
  "metadata": {

File diff suppressed because it is too large
+ 336 - 27
chapters/ssm/ssm.ipynb


BIN
figures/pendulum.png