\n",
"\n",
"## Spatial Data Analytics \n",
"\n",
"### Interactive Demonstration of the Variogram Nugget Effect \n",
"\n",
"#### Michael Pyrcz, Associate Professor, The University of Texas at Austin \n",
"\n",
"##### Contacts: [Twitter/@GeostatsGuy](https://twitter.com/geostatsguy) | [GitHub/GeostatsGuy](https://github.com/GeostatsGuy) | [www.michaelpyrcz.com](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446)\n",
"\n",
"This a simple demonstration of the variogram nugget effect structure for a 1D datasets with variable spatial continuity and visualization.\n",
"\n",
"* we will see that the nugget effect results from random error\n",
"\n",
"* we will perform the calculations in 1D for fast run times and ease of visualization.\n",
"\n",
"#### Load the required libraries\n",
"\n",
"The following code loads the required libraries.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os # to set current working directory \n",
"import numpy as np # arrays and matrix math\n",
"import matplotlib.pyplot as plt # for plotting\n",
"from matplotlib.gridspec import GridSpec # custom matrix plots\n",
"plt.rc('axes', axisbelow=True) # set axes and grids in the background for all plots\n",
"from ipywidgets import interactive # widgets and interactivity\n",
"from ipywidgets import widgets \n",
"from ipywidgets import Layout\n",
"from ipywidgets import Label\n",
"from ipywidgets import VBox, HBox\n",
"import math # for square root\n",
"from geostatspy import GSLIB # affine correction\n",
"seed = 73073"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you get a package import error, you may have to first install some of these packages. This can usually be accomplished by opening up a command window on Windows and then typing 'python -m pip install [package-name]'. More assistance is available with the respective package docs. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Set the working directory\n",
"\n",
"I always like to do this so I don't lose files and to simplify subsequent read and writes (avoid including the full address each time). Also, in this case make sure to place the required (see below) data file in this working directory. "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"#os.chdir(\"C:\\PGE337\") # set the working directory"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Declare Functions\n",
"\n",
"We need a variogram calculator that is fast and works well with 1D.\n",
"\n",
"* I have modified the gam function from GeostatsPy below.\n",
"\n",
"References:\n",
"\n",
"Pyrcz, M.J., Jo. H., Kupenko, A., Liu, W., Gigliotti, A.E., Salomaki, T., and Santos, J., 2021, GeostatsPy Python Package, PyPI, Python Package Index, https://pypi.org/project/geostatspy/."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def gam(array, tmin, tmax, xsiz, ysiz, ixd, iyd, nlag, isill):\n",
" \"\"\"GSLIB's GAM program (Deutsch and Journel, 1998) converted from the\n",
" original Fortran to Python by Michael Pyrcz, the University of Texas at\n",
" Austin (Jan, 2019).\n",
" :param array: 2D gridded data / model\n",
" :param tmin: property trimming limit\n",
" :param tmax: property trimming limit\n",
" :param xsiz: grid cell extents in x direction\n",
" :param ysiz: grid cell extents in y direction\n",
" :param ixd: lag offset in grid cells\n",
" :param iyd: lag offset in grid cells\n",
" :param nlag: number of lags to calculate\n",
" :param isill: 1 for standardize sill\n",
" :return: TODO\n",
" \"\"\"\n",
" if array.ndim == 2:\n",
" ny, nx = array.shape\n",
" elif array.ndim == 1:\n",
" ny, nx = len(array),1\n",
" array = array.reshape((ny,1))\n",
"\n",
" nvarg = 1 # for multiple variograms repeat the program\n",
" nxy = nx * ny # TODO: not used\n",
" mxdlv = nlag\n",
"\n",
" # Allocate the needed memory\n",
" lag = np.zeros(mxdlv)\n",
" vario = np.zeros(mxdlv)\n",
" hm = np.zeros(mxdlv)\n",
" tm = np.zeros(mxdlv)\n",
" hv = np.zeros(mxdlv) # TODO: not used\n",
" npp = np.zeros(mxdlv)\n",
" ivtail = np.zeros(nvarg + 2)\n",
" ivhead = np.zeros(nvarg + 2)\n",
" ivtype = np.zeros(nvarg + 2)\n",
" ivtail[0] = 0\n",
" ivhead[0] = 0\n",
" ivtype[0] = 0\n",
"\n",
" # Summary statistics for the data after trimming\n",
" inside = (array > tmin) & (array < tmax)\n",
" avg = array[(array > tmin) & (array < tmax)].mean() # TODO: not used\n",
" stdev = array[(array > tmin) & (array < tmax)].std()\n",
" var = stdev ** 2.0\n",
" vrmin = array[(array > tmin) & (array < tmax)].min() # TODO: not used\n",
" vrmax = array[(array > tmin) & (array < tmax)].max() # TODO: not used\n",
" num = ((array > tmin) & (array < tmax)).sum() # TODO: not used\n",
"\n",
" # For the fixed seed point, loop through all directions\n",
" for iy in range(0, ny):\n",
" for ix in range(0, nx):\n",
" if inside[iy, ix]:\n",
" vrt = array[iy, ix]\n",
" ixinc = ixd\n",
" iyinc = iyd\n",
" ix1 = ix\n",
" iy1 = iy\n",
" for il in range(0, nlag):\n",
" ix1 = ix1 + ixinc\n",
" if 0 <= ix1 < nx:\n",
" iy1 = iy1 + iyinc\n",
" if 1 <= iy1 < ny:\n",
" if inside[iy1, ix1]:\n",
" vrh = array[iy1, ix1]\n",
" npp[il] = npp[il] + 1\n",
" tm[il] = tm[il] + vrt\n",
" hm[il] = hm[il] + vrh\n",
" vario[il] = vario[il] + ((vrh - vrt) ** 2.0)\n",
"\n",
" # Get average values for gam, hm, tm, hv, and tv, then compute the correct\n",
" # \"variogram\" measure\n",
" for il in range(0, nlag):\n",
" if npp[il] > 0:\n",
" rnum = npp[il]\n",
" lag[il] = np.sqrt((ixd * xsiz * il) ** 2 + (iyd * ysiz * il) ** 2)\n",
" vario[il] = vario[il] / float(rnum)\n",
" hm[il] = hm[il] / float(rnum)\n",
" tm[il] = tm[il] / float(rnum)\n",
"\n",
" # Standardize by the sill\n",
" if isill == 1:\n",
" vario[il] = vario[il] / var\n",
"\n",
" # Semivariogram\n",
" vario[il] = 0.5 * vario[il]\n",
" return lag, vario, npp"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Interactive Interface\n",
"\n",
"Here's the interactive interface. I make a correlated 1D data set, add noise and then calculate the histogram and variogram with and without noise. \n",
"\n",
"* the user specifies the proportion of noise and the spatial continuity range of the original data."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"\n",
"n = 200; mean = 0.20; stdev = 0.03; nlag = 100; pnoise = 0.5; \n",
"\n",
"l = widgets.Text(value=' Variogram Nugger Effect Demonstration, Prof. Michael Pyrcz, The University of Texas at Austin',\n",
" layout=Layout(width='930px', height='30px'))\n",
"\n",
"pnoise = widgets.FloatSlider(min=0.0,max = 1.0,value=0.0,step = 0.05,description = 'Noise %',orientation='horizontal',style = {'description_width': 'initial'},layout=Layout(width='500px',height='30px'),continuous_update=False)\n",
"vrange = widgets.IntSlider(min=1,max = 100,value=30,step = 5,description = 'Spatial Continuity Range',orientation='horizontal',style = {'description_width': 'initial'},layout=Layout(width='500px',height='30px'),continuous_update=False)\n",
"\n",
"ui = widgets.HBox([pnoise,vrange],)\n",
"ui2 = widgets.VBox([l,ui],)\n",
"\n",
"def run_plot(pnoise,vrange):\n",
"\n",
" psignal = 1 - pnoise\n",
"\n",
" np.random.seed(seed = seed)\n",
" data0 = np.random.normal(loc=0.20,scale=0.03,size=n+1000)\n",
" \n",
" kern1 = np.ones(vrange)\n",
" data1 = np.convolve(data0,kern1,mode='same')\n",
" data1_sub = GSLIB.affine(data1[500:n+500],mean,stdev)\n",
" \n",
" data1_sub_rescale = GSLIB.affine(data1[500:n+500],mean,stdev*math.sqrt(psignal))\n",
" data1_sub_noise = data1_sub_rescale + np.random.normal(loc=0.0,scale = stdev*math.sqrt(pnoise),size=n)\n",
" data1_sub_noise = GSLIB.affine(data1_sub_noise,mean,stdev)\n",
" \n",
" #fig, axs = plt.subplots(2,3, gridspec_kw={'width_ratios': [2, 1, 1, 1]})\n",
" \n",
" fig = plt.figure()\n",
" spec = fig.add_gridspec(2, 3)\n",
" \n",
" ax1 = fig.add_subplot(spec[0, :])\n",
" plt.plot(np.arange(1,n+1),data1_sub,color='blue',alpha=0.3,lw=3,label='Original')\n",
" plt.plot(np.arange(1,n+1),data1_sub_noise,color='red',alpha=0.3,lw=3,label='Original + Noise')\n",
" plt.xlim([0,n]); plt.ylim([mean-4*stdev,mean+4*stdev])\n",
" plt.xlabel('Location (m)'); plt.ylabel('Porosity (%)'); plt.title('Porosity Over Location, Original and with Random Noise')\n",
" plt.grid(); plt.legend(loc='upper right')\n",
" \n",
" ax2 = fig.add_subplot(spec[1, 0])\n",
" plt.hist(data1_sub,color='blue',alpha=0.3,edgecolor='black',bins=np.linspace(mean-4*stdev,mean+4*stdev,30),\n",
" label='Original')\n",
" plt.hist(data1_sub_noise,color='red',alpha=0.3,edgecolor='black',bins=np.linspace(mean-4*stdev,mean+4*stdev,30),\n",
" label='Original + Noise')\n",
" plt.xlim([mean-4*stdev,mean+4*stdev]); plt.ylim([0,30])\n",
" plt.xlabel('Porosity (%)'); plt.ylabel('Frequency'); plt.title('Histogram')\n",
" plt.grid(); plt.legend(loc='upper right')\n",
" \n",
" ax3 = fig.add_subplot(spec[1, 1])\n",
" labels = ['Signal','Noise',]\n",
" plt.pie([psignal, pnoise,],radius = 1, autopct='%1.1f%%', \n",
" colors = ['#0000FF','#FF0000'], explode = [.02,.02],wedgeprops = {\"edgecolor\":\"k\",'linewidth':1,\"alpha\":0.3},)\n",
" plt.title('Variance of Signal and Noise')\n",
" plt.legend(labels,loc='lower left')\n",
" \n",
" ax4 = fig.add_subplot(spec[1, 2])\n",
" data1_sub_reshape = data1_sub.reshape((n,1))\n",
" lag,gamma,npp = gam(data1_sub,-9999,9999,1.0,1.0,0,1,nlag,1)\n",
" _,gamma_noise,_ = gam(data1_sub_noise,-9999,9999,1.0,1.0,0,1,nlag,1)\n",
" plt.scatter(lag,gamma,s=30,color='blue',alpha=0.3,edgecolor='black',label='Original')\n",
" plt.scatter(lag,gamma_noise,s=30,color='red',alpha=0.3,edgecolor='black',label='Original + Noise')\n",
" plt.plot([0,nlag],[1.0,1.0],color='black',ls='--')\n",
" plt.xlim([0,nlag]); plt.ylim([0,2.0]); plt.grid(); plt.legend(loc='upper right')\n",
" plt.xlabel('Lag Distance (h)'); plt.ylabel('Variogram'); plt.title('Experimental Variogram')\n",
"\n",
" plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.6, wspace=0.1, hspace=0.3); plt.show()\n",
"\n",
"# connect the function to make the samples and plot to the widgets \n",
"interactive_plot = widgets.interactive_output(run_plot, {'pnoise':pnoise,'vrange':vrange})\n",
"interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Take some time to observe a random phenomenon. \n",
"\n",
"* see any patterns, e.g., strings of low or high values, increasing or decreasing trends?\n",
"\n",
"#### Add Spatial Correlation\n",
"\n",
"We can use convolution to add spatial continuity to a random set of values\n",
"\n",
"* we won't go into the details, but the convolution kernel can actually be related to the variogram in sequential Gaussian simulation.\n",
"\n",
"* we apply an affine correction to ensure that we don't change the mean or standard deviation with the convolution, we just change the spatial continuity\n",
"\n",
"* since we are using convolution, it is likely that there will be edge artifacts, so we have 'cut off' the edges of the model (500 m on each side)."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "100c9cb7acbd4ae7b830fc4c82f8bf54",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(Text(value=' Variogram Nugger Effect Demons…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "836cee2e2cd84072b1db197fe1481830",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output(outputs=({'output_type': 'display_data', 'data': {'text/plain': '', 'i…"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(ui2, interactive_plot) # display the interactive plot"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Comments\n",
"\n",
"This was an interactive demonstration of the variogram nugget effect structure resulting from the addition of random noise to spatial data. \n",
"\n",
"I have many other demonstrations on simulation to build spatial models with spatial continuity and many other workflows available [here](https://github.com/GeostatsGuy/PythonNumericalDemos), along with a package for geostatistics in Python called [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy). \n",
" \n",
"We hope this was helpful,\n",
"\n",
"*Michael*\n",
"\n",
"***\n",
"\n",
"#### More on Michael Pyrcz and the Texas Center for Geostatistics:\n",
"\n",
"### Michael Pyrcz, Associate Professor, University of Texas at Austin \n",
"*Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions*\n",
"\n",
"With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development. \n",
"\n",
"For more about Michael check out these links:\n",
"\n",
"#### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)\n",
"\n",
"#### Want to Work Together?\n",
"\n",
"I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.\n",
"\n",
"* Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I'd be happy to drop by and work with you! \n",
"\n",
"* Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!\n",
"\n",
"* I can be reached at mpyrcz@austin.utexas.edu.\n",
"\n",
"I'm always happy to discuss,\n",
"\n",
"*Michael*\n",
"\n",
"Michael Pyrcz, Ph.D., P.Eng. Associate Professor The Hildebrand Department of Petroleum and Geosystems Engineering, Bureau of Economic Geology, The Jackson School of Geosciences, The University of Texas at Austin\n",
"\n",
"#### More Resources Available at: [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}