{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "

\n", " \n", "\n", "

\n", "\n", "## Interactive Spatial Aggregate Uncertianty Demonstration\n", "\n", "\n", "### Michael Pyrcz, Associate Professor, University of Texas at Austin \n", "\n", "##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)\n", "\n", "\n", "### The Aggregate Uncertainty Workflow\n", "\n", "Here's a simple, interactive workflow for calculating the aggregate uncertainty over multiple unsampled spatial locations.\n", "\n", "* critical for calculating the joint uncertainty over multiple outcomes\n", "\n", "* accounts for spatial correlation with sequential Gaussian simulation\n", "\n", "* assumes a stationary Gaussian distribution for the uncertainty at each spatial location\n", "\n", "That accounts for spatial conditioning:\n", "\n", "* we use a 'toy problem' with only 3 data for speed and interpretability of the results\n", "\n", "This workflow could be readily expanded:\n", "\n", "* to account for deterministic trends, nonstationarity\n", "\n", "* to integrate other local distribution of uncertainty shapes\n", "\n", "* transform the simulation results to a target distribution\n", "\n", "* to integrate secondary information, multivariate relationships\n", "\n", "The workflow proceeds as follows:\n", "\n", "* sequential application of kriging to build the local uncertainty model\n", "\n", "* Monte Carlo simulation to draw a local realization\n", "\n", "* sequential incorporation of the simulated values as data\n", "\n", "* proceed to the next well\n", "\n", "* calculate a aggregate measure over the wells\n", "\n", "* repeat over multiple realizations and summarize the result as aggregate uncertainty\n", "\n", "#### Observations\n", "\n", "General observations:\n", "\n", "1. as the correlation between the unsampled spatial locations increases the uncertainty in the aggregate will increase\n", "\n", "2. given no spatial correlation between usampled spatial locations, the uncertainty may be calculated with the assumption of independence, e.g. standard error in the mean:\n", "\n", "\\begin{equation}\n", "\\sigma^2_{\\overline{x}} = \\frac{\\sigma^2_{x}}{n}\n", "\\end{equation}\n", "\n", "Let's review the basic building blocks:\n", "\n", "#### Spatial Estimation\n", "\n", "Consider the case of making an estimate at some unsampled location, $š‘§(\\bf{u}_0)$, where $z$ is the property of interest (e.g. porosity etc.) and $š®_0$ is a location vector describing the unsampled location.\n", "\n", "How would you do this given data, $š‘§(\\bf{š®}_1)$, $š‘§(\\bf{š®}_2)$, and $š‘§(\\bf{š®}_3)$?\n", "\n", "It would be natural to use a set of linear weights to formulate the estimator given the available data.\n", "\n", "\\begin{equation}\n", "z^{*}(\\bf{u}) = \\sum^{n}_{\\alpha = 1} \\lambda_{\\alpha} z(\\bf{u}_{\\alpha})\n", "\\end{equation}\n", "\n", "We could add an unbiasedness constraint to impose the sum of the weights equal to one. 
"\n",
"Let's review the basic building blocks:\n",
"\n",
"#### Spatial Estimation\n",
"\n",
"Consider the case of making an estimate at some unsampled location, $z(\bf{u}_0)$, where $z$ is the property of interest (e.g. porosity etc.) and $\bf{u}_0$ is a location vector describing the unsampled location.\n",
"\n",
"How would you do this given data, $z(\bf{u}_1)$, $z(\bf{u}_2)$, and $z(\bf{u}_3)$?\n",
"\n",
"It would be natural to use a set of linear weights to formulate the estimator given the available data.\n",
"\n",
"\begin{equation}\n",
"z^{*}(\bf{u}) = \sum^{n}_{\alpha = 1} \lambda_{\alpha} z(\bf{u}_{\alpha})\n",
"\end{equation}\n",
"\n",
"We could add an unbiasedness constraint to impose that the sum of the weights equals one. What we will do instead is assign the remainder of the weight (one minus the sum of weights) to the global average; therefore, if we have no informative data we will estimate with the global average of the property of interest.\n",
"\n",
"\begin{equation}\n",
"z^{*}(\bf{u}) = \sum^{n}_{\alpha = 1} \lambda_{\alpha} z(\bf{u}_{\alpha}) + \left(1-\sum^{n}_{\alpha = 1} \lambda_{\alpha} \right) \overline{z}\n",
"\end{equation}\n",
"\n",
"We will make a stationarity assumption, so let's assume that we are working with residuals, $y$. \n",
"\n",
"\begin{equation}\n",
"y^{*}(\bf{u}) = z^{*}(\bf{u}) - \overline{z}(\bf{u})\n",
"\end{equation}\n",
"\n",
"If we substitute this form into our estimator, the estimator simplifies, since the mean of the residual is zero,\n",
"\n",
"\begin{equation}\n",
"y^{*}(\bf{u}) = \sum^{n}_{\alpha = 1} \lambda_{\alpha} y(\bf{u}_{\alpha})\n",
"\end{equation}\n",
"\n",
"while satisfying the unbiasedness constraint. \n",
"\n",
"#### Kriging\n",
"\n",
"Now the next question is what weights should we use? \n",
"\n",
"We could use equal weighting, $\lambda = \frac{1}{n}$, and the estimator would be the average of the local data applied for the spatial estimate. This would not be very informative.\n",
"\n",
"We could instead assign weights considering the spatial context of the data and the estimate:\n",
"\n",
"* **spatial continuity** as quantified by the variogram (and covariance function)\n",
"* **redundancy** the degree of spatial continuity between all of the available data with themselves \n",
"* **closeness** the degree of spatial continuity between the available data and the estimation location\n",
"\n",
"The kriging approach accomplishes this, calculating the best linear unbiased weights for the local data to estimate at the unknown location. The derivation of the kriging system and the resulting linear set of equations is available in the lecture notes. Furthermore, kriging provides a measure of the accuracy of the estimate! This is the kriging estimation variance (sometimes just called the kriging variance).\n",
"\n",
"\begin{equation}\n",
"\sigma^{2}_{E}(\bf{u}) = C(0) - \sum^{n}_{\alpha = 1} \lambda_{\alpha} C(\bf{u}_0 - \bf{u}_{\alpha})\n",
"\end{equation}\n",
"\n",
"What is 'best' about this estimate? Kriging estimates are best in that they minimize the above estimation variance. \n",
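"\n",
"Here is a minimal NumPy sketch of simple kriging for three 1D data (a toy illustration under an assumed isotropic exponential covariance; `cov`, `lam` and the values are hypothetical, not the GeostatsPy implementation used later):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def cov(h, sill=1.0, arange=300.0): # assumed exponential covariance model\n",
"    return sill*np.exp(-3.0*h/arange)\n",
"\n",
"x = np.array([100.0,500.0,900.0]); y = np.array([-0.05,0.0,0.05]) # data locations and residuals\n",
"x0 = 400.0 # unsampled location\n",
"C = cov(np.abs(x[:,None]-x[None,:])) # data-to-data covariance (redundancy)\n",
"c = cov(np.abs(x-x0)) # data-to-estimate covariance (closeness)\n",
"lam = np.linalg.solve(C,c) # simple kriging weights\n",
"est = np.dot(lam,y) # simple kriging estimate of the residual\n",
"var = cov(0.0) - np.dot(lam,c) # kriging estimation variance\n",
"print(lam,est,var)\n",
"```\n",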
"\n",
"#### Properties of Kriging\n",
"\n",
"Here are some important properties of kriging:\n",
"\n",
"* **Exact interpolator** - kriging estimates with the data values at the data locations\n",
"* **Kriging variance** can be calculated before getting the sample information, as the kriging estimation variance is not dependent on the values of the data nor the kriging estimate, i.e. the kriging estimator is homoscedastic. \n",
"* **Spatial context** - in addition to the statements on spatial continuity, closeness and redundancy above, kriging accounts for the configuration of the data and the structural continuity of the variable being estimated.\n",
"* **Scale** - kriging may be generalized to account for the support volume of the data and estimate. We will cover this later.\n",
"* **Multivariate** - kriging may be generalized to account for multiple secondary data in the spatial estimate with the cokriging system. We will cover this later.\n",
"* **Smoothing effect** of kriging can be forecast. We will use this to build stochastic simulations later.\n",
"\n",
"#### Spatial Continuity \n",
"\n",
"**Spatial Continuity** is the correlation between values over distance.\n",
"\n",
"* No spatial continuity - no correlation between values over distance, random values at each location in space regardless of separation distance.\n",
"\n",
"* Homogeneous phenomena have perfect spatial continuity; since all values are the same (or very similar), they are correlated. \n",
"\n",
"We need a statistic to quantify spatial continuity! A convenient method is the semivariogram.\n",
"\n",
"#### The Semivariogram\n",
"\n",
"The semivariogram is a function of difference over distance.\n",
"\n",
"* The expected (average) squared difference between values separated by a lag distance vector (distance and direction), $h$:\n",
"\n",
"\begin{equation}\n",
"\gamma(\bf{h}) = \frac{1}{2 N(\bf{h})} \sum^{N(\bf{h})}_{\alpha=1} (z(\bf{u}_\alpha) - z(\bf{u}_\alpha + \bf{h}))^2 \n",
"\end{equation}\n",
"\n",
"where $z(\bf{u}_\alpha)$ and $z(\bf{u}_\alpha + \bf{h})$ are the spatial sample values at the tail and head locations of the lag vector, respectively.\n",
"\n",
"* Calculated over a suite of lag distances to obtain a continuous function.\n",
"\n",
"* The $\frac{1}{2}$ term converts a variogram into a semivariogram, but in practice the term variogram is used instead of semivariogram.\n",
"* We prefer the semivariogram because it relates directly to the covariance function, $C_x(\bf{h})$, and the univariate variance, $\sigma^2_x$:\n",
"\n",
"\begin{equation}\n",
"C_x(\bf{h}) = \sigma^2_x - \gamma(\bf{h})\n",
"\end{equation}\n",
"\n",
"Note the correlogram is related to the covariance function as:\n",
"\n",
"\begin{equation}\n",
"\rho_x(\bf{h}) = \frac{C_x(\bf{h})}{\sigma^2_x}\n",
"\end{equation}\n",
"\n",
"The correlogram is the correlation of the $\bf{h}$ scatter plot as a function of the lag offset, $\bf{h}$, and is bounded:\n",
"\n",
"\begin{equation}\n",
"-1.0 \le \rho_x(\bf{h}) \le 1.0\n",
"\end{equation}\n",
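"\n",
"For concreteness, here is a minimal sketch of calculating an experimental semivariogram for regularly spaced 1D data (`semivariogram_1d` is a hypothetical helper for illustration, not a GeostatsPy function):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def semivariogram_1d(z,nlag): # experimental semivariogram, unit lag spacing\n",
"    gamma = np.zeros(nlag)\n",
"    for ilag in range(1,nlag+1):\n",
"        diffs = z[ilag:] - z[:-ilag] # all pairs separated by ilag\n",
"        gamma[ilag-1] = 0.5*np.mean(diffs**2) # half the average squared difference\n",
"    return gamma\n",
"\n",
"np.random.seed(73073)\n",
"z = np.cumsum(np.random.normal(size=100)) # a spatially correlated toy series\n",
"print(semivariogram_1d(z,nlag=5)) # the semivariogram rises with lag distance\n",
"```\n",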
"\n",
"#### Sequential Gaussian Simulation\n",
"\n",
"With sequential Gaussian simulation we build on kriging by:\n",
"\n",
"* adding a random residual with the missing variance\n",
"\n",
"* sequentially adding the simulated values as data to correct the covariance between the simulated values\n",
"\n",
"I have more on this topic at [Simulation YouTube Lecture](https://www.youtube.com/watch?v=3cLqK3lR56Y&list=PLG19vXLQHvSB-D4XKYieEku9GQMQyAzjJ&index=45&t=813s).\n",
"\n",
"#### Objective \n",
"\n",
"In the PGE 383: Stochastic Subsurface Modeling class I want to provide hands-on experience with building subsurface modeling workflows. Python provides an excellent vehicle to accomplish this. I have coded a package called GeostatsPy with GSLIB: Geostatistical Library (Deutsch and Journel, 1998) functionality that provides basic building blocks for building subsurface modeling workflows. \n",
"\n",
"The objective is to remove the hurdles of subsurface modeling workflow construction by providing building blocks and sufficient examples. This is not a coding class per se, but we need the ability to 'script' workflows working with numerical methods. \n",
"\n",
"#### Getting Started\n",
"\n",
"Here are the steps to get set up in Python with the GeostatsPy package:\n",
"\n",
"1. Install Anaconda 3 on your machine (https://www.anaconda.com/download/). \n",
"2. From Anaconda Navigator (within the Anaconda3 group), go to the environment tab, click on the base (root) green arrow and open a terminal. \n",
"3. In the terminal type: pip install geostatspy. \n",
"4. Open Jupyter and in the top block get started by copying and pasting the code block below from this Jupyter Notebook to start using the geostatspy functionality. \n",
"\n",
"You will need to copy the data file to your working directory. It is available here:\n",
"\n",
"* Tabular data - sample_data.csv at https://git.io/fh4gm.\n",
"\n",
"There are examples below with these functions. You can go here to see a list of the available functions, https://git.io/fh4eX, other example workflows and source code. \n",
"\n",
"#### Load the required libraries\n",
"\n",
"The following code loads the required libraries." ] },
{ "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [
"import geostatspy.GSLIB as GSLIB # GSLIB utilities, visualization and wrapper\n",
"import geostatspy.geostats as geostats # GSLIB methods converted to Python " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We will also need some standard packages. These should have been installed with Anaconda 3." ] },
{ "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [
"%matplotlib inline\n",
"import os # to set current working directory \n",
"import sys # suppress output to screen for interactive variogram modeling\n",
"import io\n",
"import numpy as np # arrays and matrix math\n",
"import pandas as pd # DataFrames\n",
"import matplotlib.pyplot as plt # plotting\n",
"from matplotlib.pyplot import cm # color maps\n",
"from matplotlib.patches import Ellipse # plot an ellipse\n",
"import math # sqrt operator\n",
"import random # random simulation locations\n",
"from copy import copy # copy a colormap\n",
"from scipy.stats import norm # Gaussian distribution\n",
"from ipywidgets import interactive # widgets and interactivity\n",
"from ipywidgets import widgets \n",
"from ipywidgets import Layout\n",
"from ipywidgets import Label\n",
"from ipywidgets import VBox, HBox" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "If you get a package import error, you may have to first install some of these packages. This can usually be accomplished by opening up a command window on Windows and then typing 'python -m pip install [package-name]'. More assistance is available with the respective package docs. " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [
"#### Simple, Simple Kriging Function\n",
"\n",
"Let's write a fast Python function to take data points and an unknown location and provide the:\n",
"\n",
"* **simple kriging estimate**\n",
"\n",
"* **simple kriging variance / estimation variance**\n",
"\n",
"* **simple kriging weights**\n",
"\n",
"This provides a fast method for small datasets, with fewer parameters (no search parameters) and the ability to see the simple kriging weights.\n",
"\n",
"* we use it here for fast, flexible application of sequential simulation\n",
"\n",
"* the method will not work with only one simulation location, so we send 2 and only use the first result (the 2nd is always a dummy location in the workflow below)."
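,
"\n",
"To make the inputs concrete, here is a hypothetical call of the function defined in the next cell (the data values and variogram parameters are illustrative only):\n",
"\n",
"```python\n",
"vario = GSLIB.make_variogram(0.0,1,1,1.0,45.0,1000.0,500.0) # nug,nst,it1,cc1,azi1,hmaj1,hmin1\n",
"df = pd.DataFrame({'X':[100.0,500.0,900.0],'Y':[100.0,800.0,200.0],'Value':[0.1,0.15,0.2]}) # data\n",
"dfl = pd.DataFrame({'X':[300.0,-9999.0],'Y':[300.0,-9999.0],'Value':[-9999.0,-9999.0]}) # target + dummy\n",
"est,var,weights = simple_simple_krige(df,'X','Y','Value',dfl,'X','Y',vario,skmean=0.15)\n",
"print(est[0],var[0],weights[0,:]) # use only the first result; the 2nd location is the dummy\n",
"```"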
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def simple_simple_krige(df,xcol,ycol,vcol,dfl,xlcol,ylcol,vario,skmean):\n", "# load the variogram\n", " nst = vario['nst']; pmx = 9999.9\n", " cc = np.zeros(nst); aa = np.zeros(nst); it = np.zeros(nst)\n", " ang = np.zeros(nst); anis = np.zeros(nst)\n", " nug = vario['nug']; sill = nug \n", " cc[0] = vario['cc1']; sill = sill + cc[0]\n", " it[0] = vario['it1']; ang[0] = vario['azi1']; \n", " aa[0] = vario['hmaj1']; anis[0] = vario['hmin1']/vario['hmaj1'];\n", " if nst == 2:\n", " cc[1] = vario['cc2']; sill = sill + cc[1]\n", " it[1] = vario['it2']; ang[1] = vario['azi2']; \n", " aa[1] = vario['hmaj2']; anis[1] = vario['hmin2']/vario['hmaj2']; \n", "\n", "# set up the required matrices\n", " rotmat, maxcov = geostats.setup_rotmat(nug,nst,it,cc,ang,pmx) \n", " ndata = len(df); a = np.zeros([ndata,ndata]); r = np.zeros(ndata); s = np.zeros(ndata); rr = np.zeros(ndata)\n", " nest = len(dfl)\n", "\n", " est = np.zeros(nest); var = np.full(nest,sill); weights = np.zeros([nest,ndata])\n", "\n", "# Make and solve the kriging matrix, calculate the kriging estimate and variance \n", " for iest in range(0,nest):\n", " for idata in range(0,ndata):\n", " for jdata in range(0,ndata):\n", " a[idata,jdata] = geostats.cova2(df[xcol].values[idata],df[ycol].values[idata],df[xcol].values[jdata],df[ycol].values[jdata],\n", " nst,nug,pmx,cc,aa,it,ang,anis,rotmat,maxcov)\n", " r[idata] = geostats.cova2(df[xcol].values[idata],df[ycol].values[idata],dfl[xlcol].values[iest],dfl[ylcol].values[iest],\n", " nst,nug,pmx,cc,aa,it,ang,anis,rotmat,maxcov)\n", " rr[idata] = r[idata]\n", " \n", " s = geostats.ksol_numpy(ndata,a,r) \n", " sumw = 0.0\n", " for idata in range(0,ndata): \n", " sumw = sumw + s[idata]\n", " weights[iest,idata] = s[idata]\n", " est[iest] = est[iest] + s[idata]*df[vcol].values[idata]\n", " var[iest] = var[iest] - s[idata]*rr[idata]\n", " est[iest] = est[iest] + (1.0-sumw)*skmean\n", " return est,var,weights " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Interactive Aggregate Unceratin Over Random Unsampled Spatial Locations\n", "\n", "For this first interactive method we will perform sequential simulation:\n", "\n", "* at **nsim** random point locations in the area of interest\n", "\n", "* over **L** realizations\n", "\n", "The following code includes, dsahboard with:\n", "\n", "* **number of simulation locations** randomly selected, could also be provided as a table with limited modification\n", "\n", "* **number of realizations** to constrain runtimes\n", "\n", "* **variogram model** singled structure with major direction, major and minor ranges and nugget effect\n", "\n", "* **data locations** to condition the uncertainty model \n", "\n", "The summary plots include: \n", "\n", "* **variogram model** over major and minor and interpolated directions\n", "\n", "* **data locations** color coded to matched the slider bars controlling their location\n", "\n", "* **average simulated values** at the unsampled spatial locations \n", "\n", "* **aggregate uncertainty distribution** as the histogram of the **L** realizations of the aggregate metric\n", "\n", "Let's first set up the model area of interest.\n", "\n", "* **xmin**, **xmax**, **ymin**, and **ymax** determine the mode extent\n", "\n", "* **csiz** is the cell size, **nloc** random unsampled locations are selected from this regular grid" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "X extents [0.0,1000.0] and Y entents [0.0,1000.0]\n" ] } ], "source": [ "csiz = 100; xmn = csiz * 0.5; nx = 10; ymn = csiz * 0.5; ny = 10 \n", "xmin = xmn - csiz * 0.5; xmax = xmin + nx * csiz\n", "ymin = ymn - csiz * 0.5; ymax = ymin + ny * csiz\n", "print('X extents [' + str(xmin) + ',' + str(xmax) + '] and Y entents [' + str(ymin) + ',' + str(ymax) + ']')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Assume the Global Distribution\n", "\n", "We are assuming a global stationary Gaussian distribution for simplicity\n", "\n", "* this could be easily replaced with a target reference distribution or other distribution\n", "\n", "We specify the mean and standard deviation below" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [], "source": [ "tmean = 0.15\n", "tstd = 0.03\n", "tvar = tstd**2.0\n", "vmin = 0.1; vmax = 0.2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's set up our dash board." ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [], "source": [ "import warnings; warnings.simplefilter('ignore')\n", "\n", "# dashboard: number of simulation locations and variogram parameters\n", "style = {'description_width': 'initial'}\n", "l = widgets.Text(value=' Sequential Simulation, Michael Pyrcz, Associate Professor, The University of Texas at Austin',layout=Layout(width='950px', height='30px'))\n", "nreal = widgets.IntSlider(min = 0, max = 99, value = 50, step = 1, description = 'nreal',orientation='vertical',\n", " layout=Layout(width='50px', height='200px'))\n", "nreal.style.handle_color = 'gray'\n", "\n", "nsim = widgets.IntSlider(min = 1, max = 99, value = 1, step = 1, description = 'nloc',orientation='vertical',\n", " layout=Layout(width='50px', height='200px'))\n", "nsim.style.handle_color = 'gray'\n", "nug = widgets.FloatSlider(min = 0, max = 1.0, value = 0.0, step = 0.1, description = 'nug',orientation='vertical',\n", " layout=Layout(width='25px', height='200px'))\n", "nug.style.handle_color = 'gray'\n", "it1 = widgets.Dropdown(options=['Spherical', 'Exponential', 'Gaussian'],value='Spherical',\n", " description='Type1:',disabled=False,layout=Layout(width='180px', height='30px'), style=style)\n", "\n", "azi = widgets.FloatSlider(min=0, max = 360, value = 45, step = 22.5, description = 'azi',\n", " orientation='vertical',layout=Layout(width='40px', height='200px'))\n", "azi.style.handle_color = 'gray'\n", "hmaj1 = widgets.FloatSlider(min=0.01, max = 10000.0, value = 1000.0, step = 25.0, description = 'hmaj1',\n", " orientation='vertical',layout=Layout(width='40px', height='200px'))\n", "hmaj1.style.handle_color = 'gray'\n", "hmin1 = widgets.FloatSlider(min = 0.01, max = 10000.0, value = 500.0, step = 25.0, description = 'hmin1',\n", " orientation='vertical',layout=Layout(width='40px', height='200px'))\n", "hmin1.style.handle_color = 'gray'\n", "uikvar = widgets.HBox([nreal,nsim,nug,it1,azi,hmaj1,hmin1],) \n", "\n", "# dashboard: data locations \n", "x1 = widgets.FloatSlider(min=0.0, max = 1000.0, value = 100.0, step = 1.0, description = 'x1',orientation='horizontal',\n", " layout=Layout(width='180px', height='30px'),readout_format = '.0f',style=style)\n", "x1.style.handle_color = 'blue'\n", "y1 = widgets.FloatSlider(min=0.0, max = 1000.0, value = 100.0, step = 1.0, description = 'y1',orientation='vertical',\n", " layout=Layout(width='90px', height='180px'),readout_format = '.0f',style=style)\n", "y1.style.handle_color = 'blue'\n", "uik1 = 
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now let's set up our dashboard." ] },
{ "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [], "source": [
"import warnings; warnings.simplefilter('ignore')\n",
"\n",
"# dashboard: number of simulation locations and variogram parameters\n",
"style = {'description_width': 'initial'}\n",
"l = widgets.Text(value=' Sequential Simulation, Michael Pyrcz, Associate Professor, The University of Texas at Austin',layout=Layout(width='950px', height='30px'))\n",
"nreal = widgets.IntSlider(min = 1, max = 99, value = 50, step = 1, description = 'nreal',orientation='vertical',\n",
"                          layout=Layout(width='50px', height='200px'))\n",
"nreal.style.handle_color = 'gray'\n",
"\n",
"nsim = widgets.IntSlider(min = 1, max = 99, value = 1, step = 1, description = 'nloc',orientation='vertical',\n",
"                         layout=Layout(width='50px', height='200px'))\n",
"nsim.style.handle_color = 'gray'\n",
"nug = widgets.FloatSlider(min = 0, max = 1.0, value = 0.0, step = 0.1, description = 'nug',orientation='vertical',\n",
"                          layout=Layout(width='25px', height='200px'))\n",
"nug.style.handle_color = 'gray'\n",
"it1 = widgets.Dropdown(options=['Spherical', 'Exponential', 'Gaussian'],value='Spherical',\n",
"                       description='Type1:',disabled=False,layout=Layout(width='180px', height='30px'), style=style)\n",
"\n",
"azi = widgets.FloatSlider(min=0, max = 360, value = 45, step = 22.5, description = 'azi',\n",
"                          orientation='vertical',layout=Layout(width='40px', height='200px'))\n",
"azi.style.handle_color = 'gray'\n",
"hmaj1 = widgets.FloatSlider(min=0.01, max = 10000.0, value = 1000.0, step = 25.0, description = 'hmaj1',\n",
"                            orientation='vertical',layout=Layout(width='40px', height='200px'))\n",
"hmaj1.style.handle_color = 'gray'\n",
"hmin1 = widgets.FloatSlider(min = 0.01, max = 10000.0, value = 500.0, step = 25.0, description = 'hmin1',\n",
"                            orientation='vertical',layout=Layout(width='40px', height='200px'))\n",
"hmin1.style.handle_color = 'gray'\n",
"uikvar = widgets.HBox([nreal,nsim,nug,it1,azi,hmaj1,hmin1],)\n",
"\n",
"# dashboard: data locations \n",
"x1 = widgets.FloatSlider(min=0.0, max = 1000.0, value = 100.0, step = 1.0, description = 'x1',orientation='horizontal',\n",
"                         layout=Layout(width='180px', height='30px'),readout_format = '.0f',style=style)\n",
"x1.style.handle_color = 'blue'\n",
"y1 = widgets.FloatSlider(min=0.0, max = 1000.0, value = 100.0, step = 1.0, description = 'y1',orientation='vertical',\n",
"                         layout=Layout(width='90px', height='180px'),readout_format = '.0f',style=style)\n",
"y1.style.handle_color = 'blue'\n",
"uik1 = widgets.VBox([x1,y1],)\n",
"\n",
"x2 = widgets.FloatSlider(min=0.0, max = 1000.0, value = 500.0, step = 1.0, description = 'x2',orientation='horizontal',\n",
"                         layout=Layout(width='180px', height='30px'),readout_format = '.0f',style=style)\n",
"x2.style.handle_color = 'red'\n",
"y2 = widgets.FloatSlider(min=0.0, max = 1000.0, value = 800.0, step = 1.0, description = 'y2',orientation='vertical',\n",
"                         layout=Layout(width='90px', height='180px'),readout_format = '.0f',style=style)\n",
"y2.style.handle_color = 'red'\n",
"uik2 = widgets.VBox([x2,y2],)\n",
"\n",
"x3 = widgets.FloatSlider(min=0.0, max = 1000.0, value = 900.0, step = 1.0, description = 'x3',orientation='horizontal',\n",
"                         layout=Layout(width='180px', height='30px'),readout_format = '.0f',style=style)\n",
"x3.style.handle_color = 'green'\n",
"y3 = widgets.FloatSlider(min=0.0, max = 1000.0, value = 200.0, step = 1.0, description = 'y3',orientation='vertical',\n",
"                         layout=Layout(width='90px', height='180px'),readout_format = '.0f',style=style)\n",
"y3.style.handle_color = 'green'\n",
"uik3 = widgets.VBox([x3,y3],)\n",
"\n",
"uipars = widgets.HBox([uikvar,uik1,uik2,uik3],)\n",
"uik = widgets.VBox([l,uipars],)\n",
"\n",
"def convert_type(it): # map the dropdown label to the GSLIB structure type integer\n",
"    if it == 'Spherical':\n",
"        return 1\n",
"    elif it == 'Exponential':\n",
"        return 2\n",
"    else:\n",
"        return 3\n",
"\n",
"def f_make_krige(nreal,nsim,nug,it1,azi,hmaj1,hmin1,x1,y1,x2,y2,x3,y3): # function to take parameters, make sample and plot\n",
"    text_trap = io.StringIO() # suppress all text function output to dashboard to avoid clutter \n",
"    sys.stdout = text_trap\n",
"    cmap = cm.inferno\n",
"    np.random.seed(seed = 73073) # ensure same results for all runs\n",
"    it1 = convert_type(it1)\n",
"    nst = 1; xlag = 10; nlag = int(hmaj1/xlag); c1 = 1.0-nug\n",
"    vario = GSLIB.make_variogram(nug,nst,it1,c1,azi,hmaj1,hmin1) # make model object\n",
"    index_maj,h_maj,gam_maj,cov_maj,ro_maj = geostats.vmodel(nlag,xlag,azi,vario) # project the model in the major azimuth\n",
"    index_min,h_min,gam_min,cov_min,ro_min = geostats.vmodel(nlag,xlag,azi+90.0,vario) # project the model in the minor azimuth\n",
"\n",
"    seed = 73073\n",
"\n",
"# make hard data dataframe and hard code the data values\n",
"    x = [x1,x2,x3]; y = [y1,y2,y3]; value = [0.1,0.15,0.2]\n",
"    df = pd.DataFrame({'X':x,'Y':y,'Value':value})\n",
"    ndata = len(df); skmean = np.average(df['Value'].values)\n",
"\n",
"# make simulation locations dataframe\n",
"    random.seed(a = seed)\n",
"    xl = random.sample(range(0, 1000), nsim)\n",
"    random.seed(a = seed+1)\n",
"    yl = random.sample(range(0, 1000), nsim); valuel = np.full(nsim,-9999)\n",
"    dfl = pd.DataFrame({'X':xl,'Y':yl, 'Value':valuel},dtype=np.single)\n",
"\n",
"# set up ndarrays to store all outputs\n",
"    sim = np.zeros([len(dfl),nreal]); sk_est = np.zeros([len(dfl),nreal])\n",
"    sk_var = np.zeros([len(dfl),nreal]); sk_std = np.zeros([len(dfl),nreal])\n",
"    sk_weights = np.zeros([ndata,len(dfl),nreal])\n",
"    metric = np.zeros(nreal)\n",
"\n",
"# loop over realizations\n",
"    for ireal in range(0,nreal):\n",
"        dfl_temp = pd.DataFrame({'X':[-9999,9999],'Y':[-9999,9999], 'Value':[-9999,-9999]},dtype=np.single)\n",
"        dfl_copy = dfl.copy(deep = True)\n",
"        df_copy = df.copy(deep = True)\n",
"\n",
"# perform sequential simulation\n",
"        for isim in range(0,len(dfl)):\n",
"            dfl_temp.at[0,'X'] = dfl_copy.at[isim,'X']; dfl_temp.at[0,'Y'] = dfl_copy.at[isim,'Y'] # copy the current location to the first row / the method needs at least 2 locations\n",
"            sk_est_temp, sk_var_temp, sk_weights_temp = simple_simple_krige(df_copy,'X','Y','Value',dfl_temp,'X','Y',vario,skmean=skmean)\n",
"            sk_est[isim,ireal] = sk_est_temp[0]\n",
"            sk_var[isim,ireal] = sk_var_temp[0] * tvar # rescale the standardized variance to the global variance\n",
"            sk_weights[:,isim,ireal] = sk_weights_temp[0,:ndata]\n",
"            if sk_var[isim,ireal] == 0:\n",
"                sk_std[isim,ireal] = 0.0\n",
"            else:\n",
"                sk_std[isim,ireal] = math.sqrt(sk_var[isim,ireal])\n",
"            sim[isim,ireal] = norm.rvs(loc=sk_est[isim,ireal], scale=sk_std[isim,ireal], size=1)[0] # random seed set at the start\n",
"            df_copy = pd.concat([df_copy,pd.DataFrame([{'X': dfl_copy.at[isim,'X'],'Y': dfl_copy.at[isim,'Y'],'Value': sim[isim,ireal]}])], ignore_index=True) # add the simulated value as data\n",
"            dfl_copy.at[isim,'Value'] = float(sim[isim,ireal])\n",
"\n",
"    metric = np.average(sim,axis = 0) # aggregate measure: average over unsampled locations, one per realization\n",
"    average = np.average(sim,axis = 1) # average simulated value at each unsampled location over the realizations\n",
"# add the unsampled locations to the data with the average simulated value\n",
"    for isim in range(0, len(dfl)):\n",
"        df = pd.concat([df,pd.DataFrame([{'X': dfl.at[isim,'X'],'Y': dfl.at[isim,'Y'],'Value': average[isim]}])], ignore_index=True)\n",
"    dfl['Value'] = average\n",
"\n",
"# plot the variogram model\n",
"    xlag = 10.0; nlag = int(hmaj1/xlag)\n",
"    plt.subplot(1,3,1)\n",
"    plt.plot([0,hmaj1*1.5],[1.0,1.0],color = 'black')\n",
"    plt.plot(h_maj,gam_maj,color = 'black',label = 'Major ' + str(azi))\n",
"    plt.plot(h_min,gam_min,color = 'black',label = 'Minor ' + str(azi+90.0))\n",
"    deltas = [22.5, 45, 67.5]\n",
"    ndelta = len(deltas); hd = np.zeros(ndelta); gamd = np.zeros(ndelta)\n",
"    color=iter(cm.plasma(np.linspace(0,1,ndelta)))\n",
"    for delta in deltas:\n",
"        index,hd,gamd,cov,ro = geostats.vmodel(nlag,xlag,azi+delta,vario)\n",
"        c=next(color)\n",
"        plt.plot(hd,gamd,color = c,label = 'Azimuth ' + str(azi+delta))\n",
"    plt.xlabel(r'Lag Distance $\bf(h)$, (m)')\n",
"    plt.ylabel(r'$\gamma \bf(h)$')\n",
"    plt.title('Interpolated NSCORE Porosity Variogram Models')\n",
"    plt.xlim([0,hmaj1*1.5])\n",
"    plt.ylim([0,1.4])\n",
"    plt.legend(loc='upper left')\n",
"\n",
"# plot the data and simulated values on a scatter plot \n",
"    plt.subplot(1,3,2)\n",
"    for idata in range(0,len(df)):\n",
"        if idata < ndata:\n",
"            plt.scatter([df.at[idata,'X']],[df.at[idata,'Y']],marker='^',\n",
"                        c = [df.at[idata,'Value']], cmap = cmap, vmin = vmin, vmax = vmax, edgecolors = 'black',\n",
"                        s = 100,label = 'Original Data')\n",
"        else:\n",
"            plt.scatter([df.at[idata,'X']],[df.at[idata,'Y']],\n",
"                        c = [df.at[idata,'Value']], cmap = cmap, vmin = vmin, vmax = vmax, edgecolors = 'black',\n",
"                        label = 'Expectation')\n",
"\n",
"    ax = plt.gca()\n",
"    plt.xlabel('X(m)'); plt.ylabel('Y(m)')\n",
"    plt.title('Data and Unsampled Locations')\n",
"    plt.xlim([0,1000])\n",
"    plt.ylim([0,1000])\n",
"    plt.colorbar()\n",
"\n",
"    if nsim < 10: # annotate the average simulated values when uncluttered\n",
"        for i, txt in enumerate(np.round(dfl['Value'].values,2)):\n",
"            plt.annotate(txt, (dfl.at[i,'X']-40, dfl.at[i,'Y']-40))\n",
"\n",
"    ellipse = Ellipse((500, 500),width=hmin1*2.0,height=hmaj1*2.0,angle = 360-azi,facecolor='gray',alpha = 0.1) # variogram range ellipse\n",
"    ax = plt.gca()\n",
"    ax.add_patch(ellipse)\n",
"\n",
"# plot the distribution of the aggregate measure over the realizations\n",
"    plt.subplot(1,3,3)\n",
"    plt.hist(metric,bins = np.linspace(0,.30,100),alpha=0.2,color=\"red\",edgecolor=\"black\", density = False)\n",
"    plt.xlim([.0,.30]); plt.ylim([0,nreal/2])\n",
"    plt.title('Spatial Aggregate Uncertainty')\n",
"    plt.xlabel('Spatial Aggregate'); plt.ylabel('Frequency')\n",
"\n",
"    ax = plt.gca()\n",
"    ax.annotate('Spatial Expectation: Mean = ' + str(np.round(np.average(metric),2)), (0.01, nreal*0.47))\n",
"    ax.annotate('Spatial: Standard Deviation = ' + str(np.round(np.std(metric),4)), (0.01, nreal*0.44))\n",
"    ax.annotate('Spatial: P90 = ' + str(np.round(np.percentile(metric,90),2)), (0.01, nreal*0.41))\n",
"    ax.annotate('Spatial: P10 = ' + str(np.round(np.percentile(metric,10),2)), (0.01, nreal*0.38))\n",
"    plt.subplots_adjust(left=0.0, bottom=0.0, right=2.2, top=0.9, wspace=0.3, hspace=0.3)\n",
"    plt.show()\n",
"\n",
"# connect the function to make the samples and plot to the widgets \n",
"interactive_plot = widgets.interactive_output(f_make_krige, {'nreal':nreal,'nsim':nsim,'nug':nug, 'it1':it1, 'azi':azi, 'hmaj1':hmaj1, 'hmin1':hmin1,\n",
"                                                             'x1':x1, 'y1':y1, 'x2':x2, 'y2':y2, 'x3':x3, 'y3':y3,})\n",
"#interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [
"### Interactive Aggregate Uncertainty Calculation at Random Points Demonstration\n",
"\n",
"* select the variogram model and the data locations and observe the aggregate uncertainty over unsampled spatial locations \n",
"\n",
"#### Michael Pyrcz, Associate Professor, University of Texas at Austin \n",
"\n",
"##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1) | [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy)\n",
"\n",
"### The Inputs\n",
"\n",
"Select the variogram model and the data locations:\n",
"\n",
"* **nug**: nugget effect\n",
"\n",
"* **c1**: contribution of the sill (set to 1 - nug)\n",
"\n",
"* **azi**: azimuth of the major direction\n",
"\n",
"* **hmaj1 / hmin1**: range in the major and minor directions\n",
"\n",
"* **(x1, y1),...,(x3, y3)**: spatial data locations " ] },
{ "cell_type": "code", "execution_count": 79, "metadata": { "scrolled": false }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "11a6f829b3e042cb843da3b150485d34", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(Text(value=' Sequential Simulation, Michael Pyrcz,…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ba9105aa18814c95904e89e6fc8a33dd", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output(outputs=({'output_type': 'display_data', 'data': {'text/plain': '
', 'i…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(uik, interactive_plot) # display the interactive plot" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [
"#### Comments\n",
"\n",
"This was an interactive demonstration of spatial aggregate uncertainty. Much more could be done; I have other demonstrations on the basics of working with DataFrames, ndarrays, univariate statistics, plotting data, declustering, data transformations and many other workflows available at https://github.com/GeostatsGuy/PythonNumericalDemos and https://github.com/GeostatsGuy/GeostatsPy. \n",
"\n",
"#### The Author:\n",
"\n",
"### Michael Pyrcz, Associate Professor, University of Texas at Austin \n",
"*Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions*\n",
"\n",
"With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development. \n",
"\n",
"For more about Michael check out these links:\n",
"\n",
"#### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)\n",
"\n",
"#### Want to Work Together?\n",
"\n",
"I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.\n",
"\n",
"* Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I'd be happy to drop by and work with you! \n",
"\n",
"* Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!\n",
"\n",
"* I can be reached at mpyrcz@austin.utexas.edu.\n",
"\n",
"I'm always happy to discuss,\n",
"\n",
"*Michael*\n",
"\n",
"Michael Pyrcz, Ph.D., P.Eng., 
Associate Professor, The Hildebrand Department of Petroleum and Geosystems Engineering, Bureau of Economic Geology, The Jackson School of Geosciences, The University of Texas at Austin\n",
"\n",
"#### More Resources Available at: [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1) \n",
" " ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 2 }