{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", " \n", "\n", "

\n", "\n", "## Subsurface Data Analytics \n", "\n", "## Introduction to Artificial Neural Networks: Single Layer ANN\n", "\n", "#### John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin\n", "\n", "##### [LinkedIn](https://www.linkedin.com/in/john-mccarthy2)\n", "\n", "#### Supervised by:\n", "\n", "#### Michael Pyrcz, Professor, The University of Texas at Austin \n", "\n", "##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1) | [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy)\n", "\n", "\n", "#### Neural Networks\n", "\n", "Artificial neural networks or ANNs are modeled after a human's neural network in the sense that they receive inputs, interpret those inputs, and output the result of that interpretation.\n", "There are many different variants of ANNs. For this workflow, we will be examining the different aspects of a single layer neural network:\n", "\n", "* Input layer \n", "* Hidden layer\n", "* Output layer\n", "* Weights\n", "* Learning Rate\n", "\n", "#### Real Life Application\n", "\n", "Today, ANNs are used to solve problems and make predictions by finding complex relationships between inputs and outputs. Forecasting, customer research, data validation, and risk management are some of the many problems that businesses use ANNs to solve. Oil and gas companies have been optimizing their operations with machine learning for many years. Machine learning has been commonly used for enhanced reservoir modeling, allowing users to predict how formations will react to certain drilling techniques.\n", "\n", "#### Workflow Goals\n", "\n", "Learn the basics of machine learning in python to predict subsurface features. This includes:\n", "\n", "* Loading and visualizing sample data\n", "* Developing a basic understanding of neural networks \n", "\n", "#### Getting Started\n", "\n", "The following data files are available for you to download into your working directory. 
The raw data files and images are already built into the workflow, but they are also available [here](https://github.com/GeostatsGuy/GeoDataSets):\n", "\n", "* Tabular data - [Stochastic_1D_por_perm_demo.csv](https://github.com/GeostatsGuy/GeoDataSets/blob/master/Stochastic_1D_por_perm_demo.csv)\n", "* Tabular data - [Random_Parabola.csv](https://github.com/GeostatsGuy/GeoDataSets/blob/master/Random_Parabola.csv)\n", "\n", "These datasets are available in the folder: https://github.com/GeostatsGuy/GeoDataSets.\n", "\n", "And the following images:\n", "\n", "* [AnimalClassification.JPG](https://github.com/GeostatsGuy/Resources/blob/master/AnimalClassification.JPG)\n", "* [Backpropagation.JPG](https://github.com/GeostatsGuy/Resources/blob/master/BackPropagation.JPG)\n", "* [EmptyNeuralNet1.png](https://github.com/GeostatsGuy/Resources/blob/master/EmptyNeuralNet1.png)\n", "* [NeuralNet1.jpg](https://github.com/GeostatsGuy/Resources/blob/master/NeuralNet1.jpg)\n", "* [NeuralNet2.jpg](https://github.com/GeostatsGuy/Resources/blob/master/NeuralNet2.jpg)\n", "\n", "The images are available in the folder: https://github.com/GeostatsGuy/Resources.\n", "\n", "#### Import Required Packages\n", "\n", "We will also need some standard packages. These should have been installed with Anaconda 3." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np # ndarrays for gridded data\n", "import pandas as pd # DataFrames for tabular data\n", "import os # set working directory, run executables\n", "import matplotlib.pyplot as plt # for plotting\n", "import matplotlib.image as mpimg # for displaying images\n", "import seaborn as sns # for plotting\n", "import warnings # suppress warnings from seaborn pairplot\n", "from sklearn.model_selection import train_test_split # train / test DataFrame split\n", "from sklearn.pipeline import Pipeline # for polynomial regression\n", "from sklearn.preprocessing import PolynomialFeatures\n", "from sklearn.linear_model import LinearRegression\n", "from sklearn.model_selection import cross_val_score\n", "from IPython.core.display import display, Javascript, clear_output\n", "from IPython.display import Markdown as md\n", "%matplotlib inline " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following workflow has been made interactive through the use of ipywidgets. \n", "\n", "Instructions for installation of this package can be found [here](https://ipywidgets.readthedocs.io/en/latest/user_install.html)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from ipywidgets import * # import shortcut for widgets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will also need the following packages to train and test our artificial neural nets:\n", "\n", "* TensorFlow - open source machine learning platform\n", "\n", "* Keras - high level application programming interface (API) to build and train models\n", "\n", "More information is available at [tensorflow install](https://www.tensorflow.org/install).\n", "\n", "You can find great instructions for installation with Anaconda [here](https://www.youtube.com/watch?v=O8yye2AHCOk&list=LLXbFTH5dgall9uMesDFL02w&index=10&t=0s)!\n", "\n", "* This workflow was designed with TensorFlow version 2.1.0\n", "\n", "To check your current version of TensorFlow, run the next block of code.
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'2.1.0'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import tensorflow as tf\n", "from tensorflow.keras.models import Sequential, load_model\n", "from tensorflow.keras.layers import Dense, Dropout, Activation\n", "import tensorflow.keras as keras\n", "from tensorflow.python.keras import backend as k\n", "tf.__version__ # check the installed version of tensorflow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Set the working directory\n", "\n", "I always like to do this so I don't lose files and to simplify subsequent read and writes (avoid including the full address each time). \n", "\n", "by running the following code you will be able to see the current directory that you are working in\n", "\n", "```python \n", "os.getcwd() \n", "```\n", "make sure that all of the files listed above are saved there. Otherwise, specify the file location with:\n", "\n", "```python\n", "os.chdir() \n", "```" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'C:\\\\Users\\\\jmcca\\\\PGE383'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "os.getcwd() # get the current working directory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will now set the working directory. In the cell below, replace ```os.getcwd()``` with the file path if you do not wish to save the images and the .csv files in the same folder as this workflow. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "os.chdir(os.getcwd()) # set the working directory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Loading Data\n", "Let's load the provided multivariate, spatial dataset 'Stochastic_1D_por_perm_demo.csv'. It is a comma delimited file with: \n", "\n", "* Porosity\n", "* Permeability\n", "\n", "It is common to transform properties to a standard normal for geostatistical workflows.\n", "\n", "We load it with the pandas 'read_csv' function into a data frame we called 'df' and then preview it to make sure it loaded correctly.\n", "\n", "**Python Tip: using functions from a package** just type the label for the package that we declared at the beginning:\n", "\n", "```python\n", "import pandas as pd\n", "```\n", "\n", "so we can access the pandas function 'read_csv' with the command: \n", "\n", "```python\n", "pd.read_csv()\n", "```\n", "\n", "but read csv has required input parameters. The essential one is the name of the file. For our circumstance all the other default parameters are fine. If you want to see all the possible parameters for this function, just go to the docs [here](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html). \n", "\n", "* The docs are always helpful\n", "* There is often a lot of flexibility for Python functions, possible through using various inputs parameters\n", "\n", "also, the program has an output, a pandas DataFrame loaded from the data. So we have to specficy the name / variable representing that new object.\n", "\n", "```python\n", "df = pd.read_csv('Stochastic_1D_por_perm_demo.csv') \n", "```\n", "\n", "Let's run this command to load the data and then look at the resulting DataFrame to ensure that we loaded it. " ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0PorosityPermeability
0013.746408193.721529
119.608479105.718666
2211.664361138.539297
338.37533893.719985
4413.183358169.738825
\n", "
" ], "text/plain": [ " Unnamed: 0 Porosity Permeability\n", "0 0 13.746408 193.721529\n", "1 1 9.608479 105.718666\n", "2 2 11.664361 138.539297\n", "3 3 8.375338 93.719985\n", "4 4 13.183358 169.738825" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(r'https://raw.githubusercontent.com/GeostatsGuy/GeoDataSets/master/Stochastic_1D_por_perm_demo.csv') # read a .csv file in as a DataFrame\n", "df.head() # we could also use this command for a table preview " ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Unnamed: 0\n", "Porosity\n", "Permeability\n" ] } ], "source": [ "for x in df.columns: # find the names of the columns!\n", " print(x) # other option is: list(df.columns) " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PorosityPermeability
013.746408193.721529
19.608479105.718666
211.664361138.539297
38.37533893.719985
413.183358169.738825
\n", "
" ], "text/plain": [ " Porosity Permeability\n", "0 13.746408 193.721529\n", "1 9.608479 105.718666\n", "2 11.664361 138.539297\n", "3 8.375338 93.719985\n", "4 13.183358 169.738825" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = df.drop(\"Unnamed: 0\", axis=1) #drop the unnecessary column\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Data Normalization\n", "\n", "We must normalize the features before we apply them in an artificial neural network model. These are the motivations for this normalization:\n", "\n", "* remove the impact of scale when using different types of data\n", "\n", "* activation functions in artificial neural networks are designed to be more sensitive to values of nodes closer to 0.0 (i.e., results in higher gradient and improves backpropagation in training)\n", "\n", "Lets get the minimums and maximums of our data first:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The minimum value in the Porosity column is: 3.2587377609366293\n", "The maximum value in the Porosity column is: 21.68483990762835\n", "The minimum value in the Permeability column is: 44.50587461913492\n", "The maximum value in the Permeability column is: 605.7101403233376\n" ] } ], "source": [ "# find the minimums and maximums of the data\n", "\n", "por_min = df['Porosity'].values.min()\n", "print(\"The minimum value in the Porosity column is: \" + str(por_min))\n", "por_max = df['Porosity'].values.max()\n", "print(\"The maximum value in the Porosity column is: \" + str(por_max))\n", "perm_min = df['Permeability'].values.min()\n", "print(\"The minimum value in the Permeability column is: \" + str(perm_min))\n", "perm_max = df['Permeability'].values.max()\n", "print(\"The maximum value in the Permeability column is: \" + str(perm_max))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Normalization Formula\n", "\\begin{equation}\n", "X_{normalized} =\\frac{X - X_{minimum}}{X_{maximum}-X_{minimum}}\n", "\\end{equation}\n", "\n", "Let's normalize each feature. \n", "\n", "* We apply the min max normalization by-hand to force both the predictor and response features to be bound $[-1,1]$.\n", "\n", "* It is easy to backtransform given we keep track of the original min and max values" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PorosityPermeabilitynorm_Porositynorm_Permeabilityn_Perm_zero_to_one
013.746408193.7215290.138349-0.4682310.265885
19.608479105.718666-0.310788-0.7818520.109074
211.664361138.539297-0.087640-0.6648870.167557
38.37533893.719985-0.444636-0.8246120.087694
413.183358169.7388250.077235-0.5536990.223150
\n", "
" ], "text/plain": [ " Porosity Permeability norm_Porosity norm_Permeability \\\n", "0 13.746408 193.721529 0.138349 -0.468231 \n", "1 9.608479 105.718666 -0.310788 -0.781852 \n", "2 11.664361 138.539297 -0.087640 -0.664887 \n", "3 8.375338 93.719985 -0.444636 -0.824612 \n", "4 13.183358 169.738825 0.077235 -0.553699 \n", "\n", " n_Perm_zero_to_one \n", "0 0.265885 \n", "1 0.109074 \n", "2 0.167557 \n", "3 0.087694 \n", "4 0.223150 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['norm_Porosity'] = (df['Porosity'] - por_min)/(por_max - por_min) * 2 - 1 # normalize porosity to range from 0 to 1\n", "df['norm_Permeability'] = (df['Permeability'] - perm_min)/(perm_max - perm_min) * 2 - 1 # normalize permeability to range from -1 to 1\n", "df['n_Perm_zero_to_one'] = (df['Permeability'] - perm_min)/(perm_max - perm_min) # normalize permeability to range from 0 to 1\n", "\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also a good idea to check the summary statistics. \n", "\n", "* All normalized features should now range from -1 to 1 or 0 to 1" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countmeanstdmin25%50%75%max
Porosity105.011.9964023.6204283.2587389.57274211.66436114.20985421.68484
Permeability105.0170.01031095.37803744.505875104.656351139.628789200.101407605.71014
norm_Porosity105.0-0.0515990.392967-1.000000-0.314667-0.0876400.1886531.00000
norm_Permeability105.0-0.5527320.339905-1.000000-0.785638-0.661004-0.4454941.00000
n_Perm_zero_to_one105.00.2236340.1699520.0000000.1071810.1694980.2772531.00000
\n", "
" ], "text/plain": [ " count mean std min 25% \\\n", "Porosity 105.0 11.996402 3.620428 3.258738 9.572742 \n", "Permeability 105.0 170.010310 95.378037 44.505875 104.656351 \n", "norm_Porosity 105.0 -0.051599 0.392967 -1.000000 -0.314667 \n", "norm_Permeability 105.0 -0.552732 0.339905 -1.000000 -0.785638 \n", "n_Perm_zero_to_one 105.0 0.223634 0.169952 0.000000 0.107181 \n", "\n", " 50% 75% max \n", "Porosity 11.664361 14.209854 21.68484 \n", "Permeability 139.628789 200.101407 605.71014 \n", "norm_Porosity -0.087640 0.188653 1.00000 \n", "norm_Permeability -0.661004 -0.445494 1.00000 \n", "n_Perm_zero_to_one 0.169498 0.277253 1.00000 " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe().transpose()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Activation Functions\n", "\n", "The purpose of an activation function is to add a **non-linear property** to the function. Every node in a neural network has two jobs:\n", "\n", "* Linear Summation\n", "* Activation Computation\n", "\n", "The activation function continuously fires while the ANN is in use. This is done by calculating the **weighted sum** of the inputs, adding a bias, and introducing non-linearity to the ouput. Without the activation function, our model will not be able to learn and will be limited in complexity.\n", "\n", "#### Sigmoid Function\n", "\n", "A sigmoid function has an \"S\" shaped curve that ranges from 0 to 1. This function is commonly used in models that need to predict probability as an output.\n", "\n", "## \\begin{equation}\n", "f(x) = \\frac{1}{1 + e^{-x}}\n", "\\end{equation}\n", "\n", "#### Tanh Function\n", "\n", "The hyperbolic tangent (tanh) function is similar to the sigmoid function. The advantage in using tanh is that it ranges from -1 to 1. This means that strong negative and postive outputs will be mapped as such while inputs with a value of zero will remain near zero.\n", "\n", "## \\begin{equation}\n", "f(x) = \\frac{e^{x} - {e^{-x}}}{e^{x} + e^{-x}}\n", "\\end{equation}\n", "\n", "#### ReLU Function\n", "\n", "The rectified linear units function (ReLU) is the most popular activation function. This function ranges from 0 to infinity. If the input is positive, the ouput is the same as the input. Otherwise, the output is 0.\n", "\n", "## \\begin{equation}\n", "f(x) = max(0, x)\n", "\\end{equation}\n", "\n", "Below we will create plots for each function and widget sliders that control the inputs." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# create title\n", "\n", "l = widgets.Text(value=' Artificial Neural Networks Activation Function Demo, John Eric McCarthy II and Michael Pyrcz, The University of Texas at Austin',\n", " layout=Layout(width='950px', height='30px'))\n", "\n", "# create slider for each activation function\n", "\n", "xs = widgets.FloatSlider(min=-10.0, max = 10.0, step = .1, value=0, description = 'X',orientation='horizontal', continuous_update=True)\n", "xs.style.handle_color = 'red'\n", "xt = widgets.FloatSlider(min=-10.0, max = 10.0, value=0, description = 'X',orientation='horizontal', continuous_update=True)\n", "xt.style.handle_color = 'yellow'\n", "xr = widgets.FloatSlider(min=-1.0, max = 1.0, value=0, description = 'X',orientation='horizontal', continuous_update=True)\n", "xr.style.handle_color = 'orange'\n", "\n", "# create activation function plots\n", "\n", "def sigmoid(xs):\n", "\n", " y = 1/(1 + np.exp(-xs)) # function for dot connected to slider\n", " plt.figure(figsize=(7.5,5))\n", " plt.title('Sigmoid Function', fontsize=20) \n", " plt.xlabel(\"Input\") \n", " plt.ylabel(\"Output\") \n", " plt.ylim(-1.1,1.1)\n", " plt.xlim(-10,10)\n", " x1 = np.linspace(-10, 10, 100)\n", " y1 = 1/(1 + np.exp(-x1)) # function for line plot\n", " plt.plot(x1,y1,'-',color='black')\n", " plt.plot(xs,y,'o', markerfacecolor='red', markeredgecolor='black', markersize=20, alpha=1)\n", " plt.grid(True)\n", " plt.show()\n", " print('\\033[1m' + 'f({}) = {}'.format(xs, 1/(1 + np.exp(-xs))))\n", " \n", "def tanh(xt):\n", " y = np.tanh(xt) # function for dot connected to slider\n", " plt.figure(figsize=(7.5,5))\n", " plt.title('Tanh Function', fontsize=20) \n", " plt.xlabel(\"Input\") \n", " plt.ylabel(\"Output\") \n", " plt.ylim(-1.1,1.1)\n", " plt.xlim(-10,10)\n", " x1 = np.linspace(-10, 10, 100)\n", " y1 = np.tanh(x1) # function for line plot\n", " plt.plot(x1,y1,'-',color='black')\n", " plt.plot(xt,y,'o', markerfacecolor='yellow', markeredgecolor='black', markersize=20, alpha=1)\n", " plt.grid(True)\n", " plt.show()\n", " print('\\033[1m' + 'f({}) = {}'.format(xt, np.tanh(xt)))\n", " \n", "def relu(xr):\n", " if xr < 0: # function for dot connected to slider\n", " y = 0\n", " if xr >= 0:\n", " y = xr\n", " plt.figure(figsize=(7.5,5))\n", " plt.title('ReLU Function', fontsize=20) \n", " plt.xlabel(\"Input\") \n", " plt.ylabel(\"Output\") \n", " plt.ylim(-1,1)\n", " plt.xlim(-1,1)\n", " x1 = np.linspace(-1.1, 1.1, 1000)\n", " zero = np.zeros(len(x1))\n", " y1 = np.max([zero, x1], axis=0) # function for line plot\n", " plt.plot(x1,y1,'-',color='black')\n", " plt.plot(xr,y,'o', markerfacecolor='orange', markeredgecolor='black', markersize=20, alpha=1)\n", " plt.grid(True)\n", " plt.show()\n", " if xr < 0:\n", " print('\\033[1m' + 'f({}) = {}'.format(xr, 0))\n", " if xr >= 0:\n", " print('\\033[1m' + 'f({}) = {}'.format(xr, xr))\n", " \n", " \n", "interactive_plot1 = widgets.interactive_output(sigmoid, {'xs': xs})\n", "interactive_plot1.clear_output(wait = True) # reduce flickering by delaying plot updating\n", "\n", "interactive_plot2 = widgets.interactive_output(tanh, {'xt': xt})\n", "interactive_plot2.clear_output(wait = True) # reduce flickering by delaying plot updating\n", "\n", "interactive_plot3 = widgets.interactive_output(relu, {'xr': xr})\n", "interactive_plot3.clear_output(wait = True) # reduce flickering by delaying plot updating\n", "\n", "# create dashboard/formatting\n", "\n", "uia = 
widgets.HBox([interactive_plot1],)\n", "uia2 = widgets.VBox([xs, uia],)\n", "\n", "uib = widgets.HBox([interactive_plot2],)\n", "uib2 = widgets.VBox([xt, uib],)\n", "\n", "uic = widgets.HBox([interactive_plot3],)\n", "uic2 = widgets.VBox([xr, uic],)\n", "\n", "uid = widgets.HBox([uia2,uib2,uic2],)\n", "uid2 = widgets.VBox([l, uid],)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Interactive Activation Function Demonstration\n", "\n", "* select the inputs and observe the outputs of the activation functions\n", "* interactive plot demonstration with ipywidgets, matplotlib packages\n", "\n", "#### John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin\n", "\n", "##### [LinkedIn](https://www.linkedin.com/in/john-mccarthy2)\n", "\n", "#### Michael Pyrcz, Associate Professor, University of Texas at Austin \n", "\n", "##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1) | [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy)\n", "\n", "### The Inputs\n", "\n", "Select the inputs for each function:\n", "\n", "* **X** is the input to the activation function\n", "* **f(x)** is the output of the activation function" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bc1844e2562d43d6a1827ef74a8b8bd7", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(Text(value=' Artificial Neural Networks Activation Function Demo…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(uid2) # display the interactive plots" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Interactive Neural Network\n", "\n", "Neural networks have 3 main layers known as the input, hidden, and output layers. Each layer is responsible for executing a task and holds a collection of nodes. A single node in a layer is connected to every node in the next layer. Each connection has a weight applied to it. The output of a node is multiplied by a weight before it is passed on as an input. The **input layer** contains the predictor features and is the very beginning of the workflow. The **hidden layers** are any layers between the input and output layers. **Deep Learning** is the use of more than one hidden layer. You will be able to experiment with this at the end of this workflow. Nodes located in the hidden layers take in a weighted sum of the outputs from the previous layer and produce an output through an activation function. The **output layer** is the last layer of nodes. This layer receives the output(s) of the hidden layer(s) and produces the final prediction.\n", "\n", "\n", "\n", "\n", "\n", "#### Putting it all together\n", "\n", "Below you will be tasked with becoming a neural network. Try your best to fit the data by adjusting the weights and bias of the ANN.\n", "\n", "**Notice:**\n", "* The importance of normalizing the dataset beforehand\n", "* Through trial and error, it becomes easier to fit the data\n", "* The weights and bias that work with one activation function won't work with another\n", "* Some activation functions work better than others with this dataset\n", "\n", "Below we will create the widgets and plots that change depending on the activation function and inputs."
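, "\n", "For reference, here is a minimal sketch of the forward pass you will be performing by hand, assuming the ReLU activation and the default slider values (the weight and bias values are illustrative):\n", "\n", "```python\n", "# forward pass through the simple two-node network with ReLU (illustrative values)\n", "w1, b1, w2 = 1.2, 1.2, 0.41 # weight 1, bias 1, weight 2\n", "x = df['norm_Porosity'].values # normalized predictor feature\n", "hidden = np.maximum(0, x * w1 + b1) # hidden node: weighted sum plus bias, then ReLU\n", "output = np.maximum(0, hidden * w2) # output node: apply weight 2, then ReLU\n", "print(output[:5]) # predicted (normalized) permeability values\n", "```"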
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# create title\n", "\n", "l = widgets.Text(value=' Simple Artificial Neural Network Demo, John Eric McCarthy II and Michael Pyrcz, The University of Texas at Austin',\n", " layout=Layout(width='950px', height='30px'))\n", "\n", "# create dropdown menu\n", "\n", "act_func = widgets.Dropdown(options=['Sigmoid', 'Tanh', 'ReLU'], value='ReLU', description='Act. Function:', disabled=False, layout=Layout(width='200px', height='30px'), style = {'description_width': 'initial'})\n", "\n", "# sliders for the model parameters (weights and bias)\n", "\n", "w1 = widgets.FloatSlider(min=-5.00, max = 5.00, value=1.20, description = 'Weight 1',orientation='horizontal', continuous_update=False)\n", "w1.style.handle_color = 'blue'\n", "b1 = widgets.FloatSlider(min=-5.00, max = 5.00, value=1.20, description = 'Bias 1',orientation='horizontal', continuous_update=False)\n", "b1.style.handle_color = 'red'\n", "w2 = widgets.FloatSlider(min=-5.00, max = 5.00, value=0.41, description = 'Weight 2',orientation='horizontal', continuous_update=False)\n", "w2.style.handle_color = 'green'\n", "\n", "\n", "def update_plot(w1, w2, b1, act_func): # create plots\n", " \n", " x = np.linspace(-2, 2, 1000) \n", " x1 = (x * w1) + b1\n", " if act_func == 'Sigmoid':\n", " y = 1/(1 + np.exp(-x1)) # sigmoid function\n", "\n", " elif act_func == 'Tanh':\n", " y = np.tanh(x1) # tanh function\n", "\n", " elif act_func == 'ReLU':\n", " zero = np.zeros(len(x1))\n", " y = np.max([zero, x1], axis=0) # relu function\n", " \n", " x2 = (y * w2)\n", " \n", " if act_func == 'Sigmoid':\n", " y2 = 1/(1 + np.exp(-x2)) \n", "\n", " elif act_func == 'Tanh':\n", " y2 = np.tanh(x2)\n", "\n", " elif act_func == 'ReLU':\n", " zero = np.zeros(len(x2))\n", " y2 = np.max([zero, x2], axis=0)\n", " fig = plt.subplots(figsize=(15, 5))\n", " plt.title('Interactive Neural Network: \"Inter- net\"', fontsize=20)\n", " plt.plot(x, y2)\n", " if act_func == 'Sigmoid' or act_func == 'ReLU': # compare against each option explicitly\n", " y_values = df['n_Perm_zero_to_one'].values\n", " if act_func == 'Tanh':\n", " y_values = df['norm_Permeability'].values\n", " \n", " plt.plot(df['norm_Porosity'].values,y_values, 'o', markerfacecolor='red', markeredgecolor='black', alpha=0.7, label = \"Test Data\")\n", " if act_func == 'Sigmoid' or act_func == 'ReLU':\n", " y_min = -.1\n", " if act_func == 'Tanh':\n", " y_min = -1.1\n", " plt.ylim(y_min,1.1)\n", " plt.xlim(-1.1,1.1)\n", " plt.xlabel(\"Porosity\") \n", " plt.ylabel(\"Permeability\") \n", " plt.legend()\n", " plt.show()\n", "\n", "def ann_plot(w1, b1, w2): # display input and output of function\n", " \n", " img = mpimg.imread(r\"https://raw.githubusercontent.com/GeostatsGuy/Resources/master/EmptyNeuralNet1.png\")\n", " plt.figure(figsize=(15,12.5))\n", " plt.text(350,190, b1,{'color': 'red', 'fontsize': 24})\n", " plt.text(465,190, w1,{'color': 'blue', 'fontsize': 24})\n", " plt.text(190,200, w1,{'color': 'blue', 'fontsize': 24})\n", " plt.text(735,225, w2,{'color': 'green', 'fontsize': 24})\n", " 
plt.text(555,200, w2,{'color': 'green', 'fontsize': 24})\n", " plot = plt.imshow(img) \n", " plt.axis('off') # clear x-axis and y-axis\n", " \n", "interactive_plot = widgets.interactive_output(update_plot, {'act_func': act_func,'w1': w1, 'b1': b1, 'w2': w2})\n", "interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating\n", "\n", "update_ann = widgets.interactive_output(ann_plot, {'w1': w1, 'b1': b1, 'w2': w2})\n", "\n", "# create dashboard/formatting\n", "\n", "ui = widgets.HBox([act_func],) # basic widget formatting \n", "uia = widgets.HBox([w1,b1,w2],)\n", "uib = widgets.HBox([ui, uia])\n", "uic = widgets.HBox([interactive_plot, update_ann])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Interactive Neural Network Demonstration\n", "\n", "* adjust the weights and bias and select the activation function\n", "* observe the fit across the test data\n", "* interactive plot demonstration with ipywidgets, matplotlib packages\n", "\n", "#### John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin\n", "\n", "##### [LinkedIn](https://www.linkedin.com/in/john-mccarthy2)\n", "\n", "#### Michael Pyrcz, Associate Professor, University of Texas at Austin \n", "\n", "##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1) | [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy)\n", "\n", "### The Inputs\n", "\n", "Interact with the widgets to adjust the parameters and fit the data provided\n", "\n", "* **weight 1** is multiplied by each input before it is passed to the hidden layer\n", "* **bias 1** is added to the product of the input and weight 1\n", "* **weight 2** is multiplied by the output of the hidden layer once the activation function computation is done" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "45799753b45f4afbb9373c0a229e8537", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Text(value=' Simple Artificial Neural Network Demo, John Eric McCarthy II a…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7b5a32a5dd714ffd848e9d7c40baf431", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HBox(children=(Dropdown(description='Act. Function:', index=2, layout=Layout(height='30px', widt…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "53712f849d354d34809c452163925b27", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(Output(), Output()))" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(l, uib, uic) # display the interactive plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Single Layer Neural Network\n", "\n", "\n", "\n", "#### Increased Complexity\n", "\n", "Below, you can find a new neural network that receives 3 inputs and predicts the output. Since we are predicting a probability, we will use the sigmoid activation function. 
The hidden layer creates a weighted sum of the inputs and adds a bias; we will call this weighted sum \"s\". Next, the activation function takes \"s\" as an input and the result is sent to the output layer. By design choice and for the sake of simplicity, there will not be a weight between the hidden and output layer for this example. This is great, but how does our ANN learn?\n", "\n", "#### Introducing Backpropagation\n", "\n", "\n", "\n", "The process of running inputs through the neural network and receiving an output is called **forward propagation** or feed forward. ANNs require training and adjustment in order to learn. The weights between each node are the only thing that we have control over when it comes to changing our outputs. Below, we use **batch gradient descent** to adjust our weights throughout training. This is done by comparing the predicted result to the actual result and measuring the generated error. The **learning rate** affects how fast our neural network learns, or in other words, how much the weights are updated. We then **backpropagate** our error and repeat the process by passing our training data through the ANN. Backpropagation is an algorithm for supervised learning that forces the ANN to \"learn from mistakes\". There are two main ways that an ANN can learn. In a **supervised learning** model, the ANN learns on a labeled dataset, providing an answer key that the ANN can use to evaluate its accuracy on training data. An **unsupervised learning** model, in contrast, provides unlabeled data that the ANN tries to make sense of by extracting features and patterns on its own.\n", "\n", "When we update the weights we have to make a few simple calculations. First, we find the error by calculating the difference between the desired and actual output. We then multiply this by the learning rate to find the error rate. Next, we multiply the error rate by the input and the gradient (derivative) of the activation function. Activation functions are designed so that their derivatives are easy to calculate. In our ANN below, we use the sigmoid function because its derivative is simply itself times 1 minus itself: $f'(x) = f(x)(1 - f(x))$. Super cool! Although these calculations are simple, they do stack up fast.\n", "\n", "Our goal is to minimize the error or **cost**. The cost can be calculated by finding the squared difference between the desired and actual output after each iteration. We will plot both the weight and cost changes of the ANN while it trains below. \n", "\n", "#### How it Works\n", "This neural network accepts 3 inputs: *x1*, *x2*, and *x3*. Based on those inputs, a number is then predicted by the ANN. Below is a set of data that contains information on 4 animals. The ANN predicts the likelihood of an animal being a mammal or not based on: whether it has hair, the number of legs it has, and the number of eyes it has. We will begin by extracting features from the dataset to put into our ANN. Then, we will encode the data as 0s and 1s.\n", "\n", "\n", "\n", "**Input Layer:**\n", "* Hair?\n", " - No = 0\n", " - Yes = 1\n", "* 2 or 4 Legs?\n", " - 2 Legs = 0\n", " - 4 Legs = 1\n", "* More Than 1 Eye?\n", " - More than 1 Eye = 1\n", " - 1 Eye or Less = 0\n", "\n", "**Output Layer:**\n", "* Mammal?\n", " - No = 0\n", " - Yes = 1\n", " \n", "Below we will create an ANN that uses the above as a guide for the inputs and outputs. While training the model we will backpropagate the error and adjust the weights. We will also record the cost function throughout training."
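, "\n", "Before the full class below, here is a minimal sketch of a single forward pass and weight update for one sigmoid neuron, following the update rule just described (the input, target, and starting weights are illustrative):\n", "\n", "```python\n", "# one training iteration for a single sigmoid neuron (illustrative values)\n", "X = np.array([[1., 0., 1.]]) # one training example: hair, legs, eyes\n", "y = np.array([[1.]]) # desired output: mammal\n", "w = np.array([[0.5], [-0.2], [0.1]]) # current weights\n", "lr = 1.0 # learning rate\n", "\n", "out = 1/(1 + np.exp(-np.dot(X, w))) # forward pass: sigmoid of the weighted sum\n", "error = (y - out) * lr # error scaled by the learning rate\n", "w += np.dot(X.T, error * out * (1 - out)) # update: input times error times sigmoid gradient\n", "print(w) # weights nudged toward the desired output\n", "```"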
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Random starting weights: \n", "[[-0.16595599]\n", " [ 0.44064899]\n", " [-0.99977125]]\n", "Weights after training: \n", "[[ 9.67299303]\n", " [-0.2078435 ]\n", " [-4.62963669]]\n" ] } ], "source": [ "# create title\n", "\n", "l = widgets.Text(value=' Mammal Prediction Demo, John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin',\n", " layout=Layout(width='950px', height='30px'))\n", "\n", "# create dropdown menus\n", "\n", "hair = widgets.Dropdown(options=['Yes', 'No'], value='Yes', description='Hair?', disabled=False, layout=Layout(width='100px', height='30px'), style = {'description_width': 'initial'})\n", "legs = widgets.Dropdown(options=['2', '4'], value='2', description='Number of Legs?', disabled=False, layout=Layout(width='150px', height='30px'), style = {'description_width': 'initial'})\n", "n_eyes = widgets.Dropdown(options=['Yes', 'No'], value='Yes', description='More than 1 eye?', disabled=False, layout=Layout(width='175px', height='30px'), style = {'description_width': 'initial'})\n", "\n", "class NeuralNetwork():\n", " \n", " def __init__(self):\n", " \n", " np.random.seed(1) # seed the random number generator \n", " self.weights = 2 * np.random.random((3, 1)) - 1 # set the weights to a 3x1 matrix with values from -1 to 1 and mean 0\n", " self.weight_change = np.empty((0,3,1)) # empty array to capture the weights after each iteration\n", " self.learning_rate = 1 # define the learning rate of the model\n", " self.cost = np.empty((0,4,1)) # empty array to store the cost function values\n", " \n", " def sigmoid(self, x): # the sigmoid function takes in the weighted sum of the inputs and outputs a number between 0 and 1\n", "\n", " return 1 / (1 + np.exp(-x))\n", "\n", " def sigmoid_derivative(self, x): # the derivative of the sigmoid function used to calculate necessary weight adjustments\n", "\n", " return x * (1 - x)\n", "\n", " def train(self, training_inputs, training_outputs, epochs): # train the model by evaluating the error and adjusting the weights to get better results\n", " \n", " for iteration in range(epochs): # pass training set through the neural network\n", " \n", " output = self.think(training_inputs)\n", "\n", " \n", " error = (training_outputs - output) * self.learning_rate # calculate the error rate\n", "\n", " # multiply error by input and gradient of the sigmoid function\n", " # less confident weights are adjusted more through the nature of the function\n", " adjustments = np.dot(training_inputs.T, error * self.sigmoid_derivative(output))\n", "\n", " \n", " self.weights += adjustments # adjust synaptic weights\n", " \n", " # store new weight values\n", " self.weight_change = np.append(self.weight_change, [self.weights], axis=0)\n", " \n", " # evaluate the cost function\n", " loss_error = 0\n", " loss_error += (training_outputs - output) ** 2\n", " self.cost = np.append(self.cost, [loss_error], axis=0)\n", " \n", "\n", " def think(self, inputs): # run the inputs through the neural network to get the outputs\n", " \n", " inputs = inputs.astype(float)\n", " output = self.sigmoid(np.dot(inputs, self.weights))\n", " return output\n", "\n", "if __name__ == \"__main__\": # initialize the single neuron neural network\n", " \n", " neural_network = NeuralNetwork()\n", "\n", " print(\"Random starting weights: \")\n", " print(neural_network.weights)\n",
"\n", " training_inputs = np.array([[0,0,1], # the training set, with 4 examples consisting of 3 input values and 1 output value\n", " [1,1,1],\n", " [1,0,1],\n", " [0,1,1]])\n", "\n", " training_outputs = np.array([[0,1,1,0]]).T # 4 output values\n", "\n", " neural_network.train(training_inputs, training_outputs, 10000) # train the neural network\n", "\n", " print(\"Weights after training: \")\n", " print(neural_network.weights)\n", "\n", "def test_ann(hair, legs, n_eyes):\n", "\n", " if hair == 'Yes':\n", " x1 = 1\n", " if hair == 'No':\n", " x1 = 0\n", " if legs == '2':\n", " x2 = 0\n", " if legs == '4':\n", " x2 = 2\n", " if n_eyes == 'Yes':\n", " x3 = 1\n", " if n_eyes == 'No':\n", " x3 = 0 \n", "\n", " probability = neural_network.think(np.array([x1, x2, x3]))\n", " if probability > .5:\n", " print('\\033[1m' + 'The probability of the animal being a mammal is {}%'.format(np.round(probability * 100),2))\n", " if probability < .5:\n", " print('\\033[1m' + 'The probability of the animal not being a mammal is {}%'.format(np.round(100 - (probability * 100),2)))\n", " if probability == .5:\n", " print('\\033[1m' + 'The probability of the animal being a mammal is {}% \\nThis is due to the nature of the sigmoid function. When all inputs are \"0\", .5 is returned by the function '.format(np.round(100 - (probability * 100),2)))\n", "\n", "\n", "weight1 = neural_network.weight_change[:,0] # store 3 weight changes for later use\n", "weight2 = neural_network.weight_change[:,1]\n", "weight3 = neural_network.weight_change[:,2]\n", "\n", "cost2 = np.sum(neural_network.cost, axis = 1) # store sum of costs\n" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "# plot weight change and cost function\n", "\n", "def weight_plot():\n", " fig = plt.subplots(figsize=(15, 5))\n", " plt.title('Weight Change Across Epochs', fontsize=20)\n", " x = np.linspace(0, len(neural_network.weight_change[:,0]), len(neural_network.weight_change[:,0])) \n", " plt.plot(x, weight1, label = 'Weight 1')\n", " plt.plot(x,weight2, label = 'Weight 2')\n", " plt.plot(x,weight3, label = 'Weight 3')\n", " plt.ylim(-5,10)\n", " plt.xlim(0,len(neural_network.weight_change[:,0] + 100))\n", " plt.xlabel(\"Epochs\") \n", " plt.ylabel(\"Weight Change\") \n", " plt.legend()\n", " plt.show()\n", "\n", "def cost_plot(): \n", " fig = plt.subplots(figsize=(15, 5))\n", " plt.title('Cost Change Across Epochs', fontsize=20)\n", " x = np.linspace(0, len(cost2), len(cost2)) \n", " plt.plot(x, cost2, label = 'cost')\n", " plt.ylim(-.07,1.25)\n", " plt.xlim(0, 200)\n", " plt.xlabel(\"Epochs\") \n", " plt.ylabel(\"cost\") \n", " plt.legend()\n", " plt.show()\n", "\n", "show_probability = widgets.interactive_output(test_ann,{'hair': hair, 'legs': legs, 'n_eyes': n_eyes})\n", "show_probability.clear_output(wait = True) # reduce flickering by delaying plot updating \n", "weight_plot = widgets.interactive_output(weight_plot, {})\n", "cost_plot = widgets.interactive_output(cost_plot, {})\n", "\n", "# create dashboard/formatting\n", "\n", "uia = widgets.HBox([hair, legs, n_eyes],)\n", "uib = widgets.HBox([weight_plot, cost_plot],)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Interactive Predictive Neural Network Demonstration\n", "\n", "* test the neural network by answering the given questions\n", "* interactive plot demonstration with ipywidgets, matplotlib packages\n", "\n", "#### John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin\n", "\n", "##### 
[LinkedIn](https://www.linkedin.com/in/john-mccarthy2)\n", "\n", "#### Michael Pyrcz, Associate Professor, University of Texas at Austin \n", "\n", "##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1) | [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy)\n", "\n", "### The Inputs\n", "\n", "Answer the following questions to find out the probability of an animal with those features being a mammal, based on what the ANN learned:\n", "* Hair?\n", "* 2 or 4 Legs?\n", "* More Than 1 Eye?" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0e9a17a180fc436b8bf66b2a6e578947", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Text(value=' Mammal Prediction Demo, John Eric McCarthy II, Undergradua…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b3d24bd030ff4e56a29f2d5490306b00", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(Dropdown(description='Hair?', layout=Layout(height='30px', width='100px'), options=('Yes', 'No'…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "424dcbe028684e49823b902864538394", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ef0acdf08537454c83c153d68bee4a74", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(Output(), Output()))" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(l, uia, show_probability, uib) # display the interactive plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Overfitting and Underfitting\n", "\n", "Advanced neural networks undergo a multitude of transformations and changes in the hopes that the model will be able to make accurate predictions regardless of the dataset it is provided. Part of that training may be the introduction of a validation dataset. The **validation data** is a sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. When used, we are essentially asking the model, \"Can you make accurate predictions when given a set of unknown inputs?\" For successful models, the answer to that question is yes. The great thing about using a validation data set is that you know the expected outputs. This makes calculating error and seeing what went wrong much easier. \n", "\n", "It's important to understand the general concept behind **underfitting and overfitting**, so let's run through a few real-life examples. Overfitting is like using a very specific training set to learn. For example, imagine someone learning English using only Shakespearean text. That person would definitely know English; however, it would be a very specific version of it. This would lead to complications when trying to understand other English text. 
Underfitting is like trying to learn English by listening to episodes of Friends, but only listening to sentences that begin with \"The\", \"I\", and \"A\". The problem here is that these sentences are limited and are a poor representation of the language.\n", "\n", "You can gain a general understanding of how the model performs by comparing the bias of the training set and the variance of the validation set with the fit of the model. High bias leads to underfitting, while high variance leads to overfitting. Below, you can find a visual representation of underfitting versus overfitting while observing statistical data.\n", "\n", "#### Adding Gaussian Distributed Noise to the Data\n", "\n", "Here, we are going to add Gaussian distributed noise to the porosity and permeability data. Gaussian distributed meaning that the values are drawn from a normal distribution with a mean of 0; noise meaning that the values are random perturbations around that mean.\n", "\n", "#### Gaussian Distribution\n", "\n", "In the demo below you will have control over $\sigma$, the **standard deviation** of the noise being added to the inputs and outputs. The standard deviation is the amount of variation or dispersion of our values. We will also be calculating the sample variance, sample standard deviation, and mean squared error.\n", "\n", "## \begin{equation}\n", "y = \frac{1}{\sigma\sqrt{2\pi}}e^{ -\frac{(x-\mu)^2}{2\sigma^2}}\n", "\end{equation}\n", "\n", "#### Sample Variance\n", "## \begin{equation}\n", "\sigma^2 = \frac{\sum_{i=1}^n (x_i - \mu)^2}{n-1} \n", "\end{equation}\n", "\n", "#### Sample Standard Deviation\n", "## \begin{equation}\n", "\sigma = \sqrt{\frac{\sum_{i=1}^n (x_i - \mu)^2}{n-1}} \n", "\end{equation}\n", "\n", "#### Mean Squared Error\n", "## \begin{equation}\n", "MSE = \frac{1}{n}{\sum_{i=1}^n (y_i - \hat{y}_i)^2} \n", "\end{equation}\n", "\n", "Below we will create a model that utilizes polynomial regression to fit to training data. We will then test that model with test data. Notice that the train/test data is a split of the ```Stochastic_1D_por_perm_demo.csv``` values. We will keep track of:\n", "* the sample variance and standard deviation of the training data \n", "* testing/training data error"
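, "\n", "First, a minimal, non-interactive sketch of the same experiment: add Gaussian noise, fit a low-degree and a high-degree polynomial, and compare train/test mean squared error (the noise level, degrees, and split fraction are illustrative):\n", "\n", "```python\n", "# compare train/test MSE for a low- and a high-degree polynomial fit (illustrative settings)\n", "from sklearn.metrics import mean_squared_error\n", "\n", "np.random.seed(1)\n", "x = df['Porosity'].values[:, np.newaxis]\n", "y = df['Permeability'].values + np.random.normal(loc=0, scale=5.0, size=len(df)) # add Gaussian noise\n", "x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.3, random_state=1)\n", "\n", "for degree in [1, 15]: # a simple model versus a very flexible one\n", " model = Pipeline([('poly', PolynomialFeatures(degree=degree, include_bias=False)),\n", " ('linear', LinearRegression())])\n", " model.fit(x_tr, y_tr)\n", " print('degree {}: train MSE = {:.0f}, test MSE = {:.0f}'.format(degree,\n", " mean_squared_error(y_tr, model.predict(x_tr)),\n", " mean_squared_error(y_te, model.predict(x_te))))\n", "```"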
] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "def warn(*args, **kwargs):\n", " pass\n", "import warnings\n", "warnings.warn = warn\n", "\n", "# create title\n", "\n", "l = widgets.Text(value=' Overfitting vs Underfitting Demo, John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin',\n", " layout=Layout(width='950px', height='30px'))\n", "\n", "std_deviation = widgets.FloatSlider(min=0, max = 1, value=0, step = 0.1, description = 'Added STD',orientation='horizontal',style = {'description_width': 'initial'}, continuous_update=True)\n", "degrees = widgets.IntSlider(min=1, max = 20, value=1, step = 1, description = 'Degrees',orientation='horizontal', style = {'description_width': 'initial'}, continuous_update=True)\n", "n_samples = widgets.IntSlider(min=15, max = 80, value=30, step = 1, description = 'N. Training Samples',orientation='horizontal', style = {'description_width': 'initial'}, continuous_update=True)\n", "show_test = widgets.Checkbox(value=False, description='Show Test Data', disabled=False, indent=False)\n", "\n", "\n", "def noise_func(std_deviation, degrees, n_samples, show_test):\n", " \n", " np.random.seed(1) # seed the random number generator\n", " \n", " df = pd.read_csv('Stochastic_1D_por_perm_demo.csv') # read a .csv file in as a DataFrame\n", " df = df.drop(\"Unnamed: 0\", axis=1) # drop the unnecessary column\n", " df['X_train_data'] = df['Porosity'].sample(n=int(n_samples), random_state=1) + np.random.normal(loc=0,scale=std_deviation,size=int(n_samples)) # add std to porosity\n", " df['Y_train_data'] = df['Permeability'].sample(n=int(n_samples), random_state=1) + np.random.normal(loc=0,scale=std_deviation,size=int(n_samples)) # add std to permeability\n", " df = df.sort_values('X_train_data') # sort values\n", " df = df.dropna(subset=['Y_train_data', 'X_train_data']) # drop rows with NaN\n", " \n", " \n", " x_train = df['X_train_data'] # extract training data\n", " y_train = df['Y_train_data'] # extract training data\n", " \n", " \n", " df2 = pd.read_csv('Stochastic_1D_por_perm_demo.csv') # read a .csv file in as a DataFrame\n", " df2 = df2.drop(\"Unnamed: 0\", axis=1) # drop the unnecessary column\n", " df2['X_test_data'] = df2['Porosity'].drop(x_train.index) # remove training data, leave test data\n", " df2['Y_test_data'] = df2['Permeability'].drop(y_train.index) # remove training data, leave test data\n", " \n", " \n", " x_test = df2['X_test_data'] # extract test data\n", " y_test = df2['Y_test_data'] # extract test data\n", " \n", " \n", " polynomial_features = PolynomialFeatures(degree=int(degrees), include_bias=False)\n", " linear_regression = LinearRegression()\n", " pipeline = Pipeline([(\"polynomial_features\", polynomial_features), (\"linear_regression\", linear_regression)])\n", " pipeline.fit(x_train.values[:, np.newaxis], y_train) # fit to training data\n", " \n", " \n", " fig, ax = plt.subplots(figsize=(15, 5))\n", " model_pred = np.linspace(x_train.min(), x_train.max(),105)\n", " ax.plot(model_pred, pipeline.predict(model_pred[:, np.newaxis]), color='black', label=\"Model\")\n", " ax.scatter(x_train, y_train, c='red', edgecolors='black', alpha=0.7, label = \"{}% Train Data\".format(round(len(x_train) / 105 * 100, 1)))\n", " plt.title(\"Training Data vs. 
Model Predictions\", fontsize=20)\n", " plt.xlabel(\"Porosity\") \n", " plt.ylabel(\"Permeability\") \n", " \n", " if show_test == True:\n", " ax.scatter(x_test, y_test,c='blue',edgecolors='black',alpha=0.7,label = \"{}% Test Data\".format(round((105 - len(x_train)) / 105 * 100, 1)))\n", " plt.legend()\n", " \n", "def mse_plot(std_deviation, degrees, n_samples):\n", " \n", " np.random.seed(1) # seed the random number generator\n", " df = pd.read_csv('Stochastic_1D_por_perm_demo.csv') # read a .csv file in as a DataFrame\n", " df = df.drop(\"Unnamed: 0\", axis=1) # drop the unnecessary column\n", " df['X_train_data'] = df['Porosity'].sample(n=int(n_samples), random_state=1) + np.random.normal(loc=0,scale=std_deviation,size=int(n_samples)) # add std to porosity\n", " df['Y_train_data'] = df['Permeability'].sample(n=int(n_samples), random_state=1) + np.random.normal(loc=0,scale=std_deviation,size=int(n_samples)) # add std to permeability\n", " \n", " df = df.dropna(subset=['Y_train_data', 'X_train_data']) # drop rows with NaN\n", " \n", " \n", " x_train = df['X_train_data'] # extract training data\n", " y_train = df['Y_train_data'] # extract training data\n", "\n", " df2 = pd.read_csv('Stochastic_1D_por_perm_demo.csv') # read a .csv file in as a DataFrame\n", " df2 = df2.drop(\"Unnamed: 0\", axis=1) # drop the unnecessary column\n", " df2['X_test_data'] = df2['Porosity'].drop(x_train.index) # remove training data, leave test data\n", " df2['Y_test_data'] = df2['Permeability'].drop(y_train.index) # remove training data, leave test data\n", " \n", " \n", " df2 = df2.dropna(subset=['X_test_data', 'Y_test_data'])\n", " x_test = df2['X_test_data'] # extract test data\n", " y_test = df2['Y_test_data'] # extract test data\n", " \n", " polynomial_features = PolynomialFeatures(degree=int(degrees), include_bias=False)\n", " linear_regression = LinearRegression()\n", " pipeline = Pipeline([(\"polynomial_features\", polynomial_features), (\"linear_regression\", linear_regression)])\n", " \n", " pipeline.fit(x_train.values[:, np.newaxis], y_train) # fit to training data\n", " \n", " train_scores = cross_val_score(pipeline, x_train.values[:, np.newaxis], y_train, scoring=\"neg_mean_squared_error\", cv=10) # evaluate the model using crossvalidation\n", " test_scores = cross_val_score(pipeline, x_test.values[:, np.newaxis], y_test, scoring=\"neg_mean_squared_error\", cv=10) # evaluate the model using crossvalidation\n", " \n", " df['train_pred'] = pipeline.predict(x_train.values[:, np.newaxis])\n", " df2['test_pred'] = pipeline.predict(x_test.values[:, np.newaxis])\n", " \n", " train_error = (y_train - df['train_pred']) # calculate the training error (residuals)\n", " test_error = (y_test - df2['test_pred']) # calculate the test error (residuals)\n", " \n", " df = df.sort_values('train_pred') # prepare data for plotting\n", " df2 = df2.sort_values('test_pred') # prepare data for plotting\n", " \n", " fig2, ax2 = plt.subplots(figsize=(7.5, 5))\n", " ax2.hist(train_error, color='red', alpha=.2, label='Train Error')\n", " ax2.hist(test_error, color='blue', alpha=.2, label='Test Error')\n", " plt.title(\"Error vs. 
Frequency\", fontsize=20)\n", " plt.xlabel(\"Error\") \n", " plt.ylabel(\"Frequency\")\n", " plt.legend()\n", " plt.show()\n", " print(\"Training MSE = {} \\nTest MSE = {}\".format(int(-train_scores.mean()),int(-test_scores.mean()))) # mean squared error for training and test data\n", " \n", "def variance_output(std_deviation, n_samples): # calculate sample variance and std for training data\n", " \n", " np.random.seed(1) # seed the random number generator\n", " \n", " y1 = df['Permeability'].sample(n=int(n_samples), random_state=1) + np.random.normal(loc=0,scale=std_deviation,size=int(n_samples))\n", " mean = y1.mean()\n", " variance = sum(((y1 - mean) **2)) / ((len(y1) - 1))\n", " \n", " \n", " print('Sample Variance of Training Data = {}'.format(np.round(variance,2)))\n", " print('Sample std of Training Data = {}'.format(np.round(np.sqrt(variance),2)))\n", " \n", "\n", "variance = widgets.interactive_output(variance_output, {'std_deviation': std_deviation, 'n_samples': n_samples})\n", "interactive_plot = widgets.interactive_output(noise_func, {'std_deviation': std_deviation, 'degrees': degrees, 'n_samples': n_samples, 'show_test': show_test})\n", "interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating\n", "interactive_plot2 = widgets.interactive_output(mse_plot, {'std_deviation': std_deviation, 'degrees': degrees, 'n_samples': n_samples})\n", "interactive_plot2.clear_output(wait = True) # reduce flickering by delaying plot updating\n", "\n", "# create dashboard/formatting \n", "\n", "ui = widgets.HBox([std_deviation, degrees, n_samples, show_test],)\n", "ui2 = widgets.HBox([interactive_plot, interactive_plot2],)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Interactive Underfitting vs Overfitting Demonstration \n", "\n", "* observe underfitting and overfitting by changing the data variance and bias\n", "* interactive plot demonstration with sklearn, ipywidgets, matplotlib packages\n", "\n", "#### John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin\n", "\n", "##### [LinkedIn](https://www.linkedin.com/in/john-mccarthy2)\n", "\n", "#### Michael Pyrcz, Associate Professor, University of Texas at Austin \n", "\n", "##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1) | [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy)\n", "\n", "### The Inputs\n", "\n", "Adjust the sliders to decrease the model error between training and test data\n", "\n", "* **Added Standard Deviation:** the deviation of the porosity/permeability data from its mean; adjusting this will affect the variance of the training data\n", "* **Degrees:** the degree of the polynomial used to fit the data\n", "* **Number of Training Samples:** the number of samples taken from the dataset used for training\n", "\n", "**Note:** This is not an ANN (training one interactively with this much data would be too slow). Here, we are using linear regression on polynomial features to fit the training data and comparing that fit to test data."
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "96fb4b6c61c045bda1140ddfed500242", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Text(value=' Overfitting vs Underfitting Demo, John Eric McCarthy II, Un…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f26b48ceaf0d4eb5a259a09c24e5135b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatSlider(value=0.0, description='Added STD', max=1.0, style=SliderStyle(description_width='i…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5bcb43b7b3134338b726e513b2c2e3a4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ffc32e4c636047779c61ec4b44e962ee", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(Output(), Output()))" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(l, ui, variance, ui2) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Advanced ANN using Tensorflow and Keras\n", "\n", "Congratulations, you are on the final stretch for understanding the basics of an artificail neural network! We are now going to use **TensorFlow and Keras** to build a model. In short, TensorFlow is an end-to-end open source platform for machine learning. Keras, on the other hand, is a high-level neural networks library. Keras allows for users to perform deep learnign with ease on both CPUs and GPUs. Both frameworks provide high-level application program interfaces (APIs) for building and training models with ease." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "df2 = pd.read_csv(r'https://raw.githubusercontent.com/GeostatsGuy/GeoDataSets/master/Random_Parabola.csv') # read a .csv file in as a DataFrame\n", "df2 = df2.drop(\"Unnamed: 0\", axis=1) # drop the unnecessary column" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The minimum value in the X column is: -1.0\n", "The maximum value in the X column is: 1.0\n", "The minimum value in the Y column is: 0.0\n", "The maximum value in the Y column is: 1.0\n" ] } ], "source": [ "# find the minimums and maximums of the data\n", "\n", "x_min = df2['X'].values.min()\n", "print(\"The minimum value in the X column is: \" + str(x_min))\n", "x_max = df2['X'].values.max()\n", "print(\"The maximum value in the X column is: \" + str(x_max))\n", "y_min = df2['Y'].values.min()\n", "print(\"The minimum value in the Y column is: \" + str(y_min))\n", "y_max = df2['Y'].values.max()\n", "print(\"The maximum value in the Y column is: \" + str(y_max))" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
XYnorm_Xnorm_Y
0-1.01.00-1.01.00
1-0.90.81-0.90.62
2-0.80.64-0.80.28
3-0.70.49-0.7-0.02
4-0.60.36-0.6-0.28
\n", "
" ], "text/plain": [ " X Y norm_X norm_Y\n", "0 -1.0 1.00 -1.0 1.00\n", "1 -0.9 0.81 -0.9 0.62\n", "2 -0.8 0.64 -0.8 0.28\n", "3 -0.7 0.49 -0.7 -0.02\n", "4 -0.6 0.36 -0.6 -0.28" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# normalize data\n", "\n", "df2['norm_X'] = (df2['X'] - x_min)/(x_max - x_min) * 2 - 1\n", "df2['norm_Y'] = (df2['Y'] - y_min)/(y_max - y_min) * 2 - 1\n", "df2.head()" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countmeanstdmin25%50%75%max
Porosity105.011.9964023.6204283.2587389.57274211.66436114.20985421.68484
Permeability105.0170.01031095.37803744.505875104.656351139.628789200.101407605.71014
norm_Porosity105.0-0.0515990.392967-1.000000-0.314667-0.0876400.1886531.00000
norm_Permeability105.0-0.5527320.339905-1.000000-0.785638-0.661004-0.4454941.00000
n_Perm_zero_to_one105.00.2236340.1699520.0000000.1071810.1694980.2772531.00000
\n", "
" ], "text/plain": [ " count mean std min 25% \\\n", "Porosity 105.0 11.996402 3.620428 3.258738 9.572742 \n", "Permeability 105.0 170.010310 95.378037 44.505875 104.656351 \n", "norm_Porosity 105.0 -0.051599 0.392967 -1.000000 -0.314667 \n", "norm_Permeability 105.0 -0.552732 0.339905 -1.000000 -0.785638 \n", "n_Perm_zero_to_one 105.0 0.223634 0.169952 0.000000 0.107181 \n", "\n", " 50% 75% max \n", "Porosity 11.664361 14.209854 21.68484 \n", "Permeability 139.628789 200.101407 605.71014 \n", "norm_Porosity -0.087640 0.188653 1.00000 \n", "norm_Permeability -0.661004 -0.445494 1.00000 \n", "n_Perm_zero_to_one 0.169498 0.277253 1.00000 " ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe().transpose()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Separation of Training and Testing Data\n", "\n", "We also need to split our data into training / testing datasets so that we:\n", "\n", "* can train our artificial neural networks using the training data \n", "\n", "* while testing their performance with the withheld testing (validation) data." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "X2 = df2.iloc[:,[0,2]] # extract the predictor feature - X\n", "y2 = df2.iloc[:,[1,3]] # extract the response feature - Y\n", "X2_train, X2_test, y2_train, y2_test = train_test_split(X2, y2, test_size=0.2, random_state=73073)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Specify the Prediction Locations\n", "\n", "Given this training and testing data, let's specify the prediction locations over the range of the observed depths at regularly spaced $nbins$ locations. " ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "# Specify the prediction locations\n", "\n", "nbins = 1000\n", "x_bins = np.linspace(x_min, x_max, nbins) # set the bins for prediction\n", "norm_x_bins = (x_bins-x_min)/(x_max-x_min)*2-1 # use normalized bins" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we will create the widgets that control our advanced ANN's hyperparameters and plot the model's fit on the given data." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "# create title\n", "\n", "l = widgets.Text(value=' ANN Demo, John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin',\n", " layout=Layout(width='950px', height='30px'))\n", "\n", "# create sliders for hyperparameters\n", "\n", "width = widgets.IntSlider(min=5, max = 50, value=10, step = 1, description = 'Width of Hidden Layers',orientation='horizontal', style = {'description_width': 'initial'}, continuous_update=False)\n", "learning_rate = widgets.FloatLogSlider(value=.001, base=10, min=-3, max=-2, step=0.01, description = 'Learning Rate', orientation='horizontal', style = {'description_width': 'initial'}, continuous_update=False)\n", "epochs = widgets.IntSlider(min=500, max = 1100, value=1000, step = 100, description = 'Epochs',orientation='horizontal', style = {'description_width': 'initial'}, continuous_update=False)\n", "n_hidden = widgets.IntSlider(min=1, max = 3, value=1, step = 1, description = 'Number of Hidden Layers',orientation='horizontal', style = {'description_width': 'initial'}, continuous_update=False)\n", "\n", "uia = widgets.HBox([width, learning_rate],) # basic widget formatting\n", "uib = widgets.HBox([n_hidden, epochs],)\n", "\n", "\n", "def model(width, learning_rate, epochs, n_hidden):\n", "\n", " # Design the neural network\n", "\n", " if n_hidden == 1:\n", " model_2 = Sequential([\n", " Dense(1, activation='linear', input_shape=(1,)), # input layer\n", " Dense(int(width), activation='relu'),\n", " Dense(1, activation='linear'), # output layer\n", " ])\n", " if n_hidden == 2:\n", " model_2 = Sequential([\n", " Dense(1, activation='linear', input_shape=(1,)), # input layer\n", " Dense(int(width), activation='relu'),\n", " Dense(int(width), activation='relu'),\n", " Dense(1, activation='linear'), # output layer\n", " ])\n", " if n_hidden == 3:\n", " model_2 = Sequential([\n", " Dense(1, activation='linear', input_shape=(1,)), # input layer\n", " Dense(int(width), activation='relu'),\n", " Dense(int(width), activation='relu'),\n", " Dense(int(width), activation='relu'),\n", " Dense(1, activation='linear'), # output layer\n", " ])\n", " \n", " # Select the Optimizer\n", " adam = keras.optimizers.Adam(lr=learning_rate, beta_1=0.9, beta_2=0.999, epsilon=1e-07, decay=0.0, amsgrad=False) # adam optimizer\n", " #sgd = keras.optimizers.SGD(lr=0.001, momentum=0.0, decay = 0.0, nesterov=False) # stochastic gradient descent\n", "\n", " # Compile the Machine\n", " model_2.compile(optimizer=adam,loss='mse',metrics=['accuracy'])\n", "\n", " # Train the Network\n", " hist_2 = model_2.fit(X2_train['norm_X'], y2_train['norm_Y'],\n", " batch_size=5, epochs=int(epochs),\n", " validation_data=(X2_test['norm_X'], y2_test['norm_Y']),verbose = 0)\n", "\n", " # Predict with the Network\n", " pred_norm_y = model_2.predict(np.array(norm_x_bins)) # predict with our ANN\n", " pred_y = ((pred_norm_y + 1)/2*(y_max - y_min)+y_min)\n", "\n", " # Plot the Model Predictions\n", "\n", " plt.subplots(figsize=(7.5,5))\n", " plt.plot(x_bins,pred_y,'black',linewidth=7)\n", " plt.plot(X2_train['X'].values,y2_train['Y'].values, 'o', markerfacecolor='red', markeredgecolor='black', markersize=20, alpha=0.7, label = \"Train\")\n", " plt.plot(X2_test['X'].values,y2_test['Y'].values, 'o', markerfacecolor='blue', markeredgecolor='black', markersize=20, alpha=0.7, label = \"Test\")\n", " plt.xlabel('X')\n", " plt.ylabel('Y')\n", " plt.legend()\n", " plt.subplots_adjust(left=0.0, bottom=0.0, 
    "    plt.legend(prop={'size': 20})\n",
    "    \n",
    "# create dashboard/formatting\n",
    "    \n",
    "interactive_plot = widgets.interactive_output(model, {'width': width, 'learning_rate': learning_rate, 'epochs': epochs, 'n_hidden': n_hidden})\n",
    "interactive_plot.clear_output(wait = True) # reduce flickering by delaying plot updating"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
    "### Advanced Artificial Neural Network Demonstration\n",
    "\n",
    "* change the hyperparameters so that the model fits the testing and training data\n",
    "* interactive plot demonstration with tensorflow, ipywidgets, matplotlib packages\n",
    "\n",
    "#### John Eric McCarthy II, Undergraduate Student, The University of Texas at Austin\n",
    "\n",
    "##### [LinkedIn](https://www.linkedin.com/in/john-mccarthy2)\n",
    "\n",
    "#### Michael Pyrcz, Professor, The University of Texas at Austin \n",
    "\n",
    "##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1) | [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy)\n",
    "\n",
    "### The Inputs\n",
    "\n",
    "Adjust the hyperparameters to fit the parabola:\n",
    "\n",
    "**Width of Hidden Layers:** the number of nodes in a given hidden layer\n",
    "\n",
    "**Number of Hidden Layers:** the number of hidden layers between the input and output layers; adding hidden layers moves the model toward deep learning\n",
    "\n",
    "**Learning Rate:** the step size for the weight updates; it controls how much the weights are adjusted after each iteration\n",
    "\n",
    "**Epochs:** the number of times the model works through the entire training dataset\n"
] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "900d6b35626d47ce80c98f28b056029e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Text(value=' ANN Demo, John Eric McCarthy II, Undergraduate Student…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "26eba05d880a41e29ecd566b33a7e25a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntSlider(value=10, continuous_update=False, description='Width of Hidden Layers', max=50, min=…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "3dbef4aa2798489a91f80cb295bb201d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntSlider(value=1, continuous_update=False, description='Number of Hidden Layers', max=3, min=1…" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fe912aefbcdf422fabf60da4790c552a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(l, uia, uib, interactive_plot) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Comments\n", "\n", "This was an interactive workflow covering the basics of neural networks.
\n", "\n", "The Texas Center for Geostatistics has many other demonstrations on the basics of working with DataFrames, ndarrays, univariate statistics, plotting data, declustering, data transformations, trend modeling and many other workflows available [here](https://github.com/GeostatsGuy/PythonNumericalDemos), along with a package for geostatistics in Python called [GeostatsPy](https://github.com/GeostatsGuy/GeostatsPy). \n", " \n", "We hope this was helpful,\n", "\n", "*John Eric* and *Michael*\n", "\n", "***\n", "\n", "#### More About Michael Pyrcz:\n", "\n", "### Michael Pyrcz, Professor, The University of Texas at Austin \n", "*Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions*\n", "\n", "With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development. \n", "\n", "For more about Michael check out these links:\n", "\n", "#### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)\n", "\n", "#### Want to Work Together?\n", "\n", "I hope this content is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.\n", "\n", "* Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I'd be happy to drop by and work with you! \n", "\n", "* Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!\n", "\n", "* I can be reached at mpyrcz@austin.utexas.edu.\n", "\n", "I'm always happy to discuss,\n", "\n", "*Michael*\n", "\n", "Michael Pyrcz, Ph.D., P.Eng. 
Professor, Cockrell School of Engineering and The Jackson School of Geosciences, The University of Texas at Austin\n", "\n", "#### More Resources Available at: [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig) | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 4 }