{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction: IPython Widgets\n", "\n", "In this notebook, we will get an introduction to IPython widgets. These are tools that allow us to build interactivity into our notebooks, often with a single line of code. These widgets are very useful for data exploration and analysis, for example, selecting certain data or updating charts. In effect, widgets allow you to turn a Jupyter Notebook into an interactive dashboard instead of a static document." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the cell below if needed. You can also do this from the command line. If you are in JupyterLab, [check out the instructions for that environment](https://ipywidgets.readthedocs.io/en/stable/user_install.html). " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:15.249992Z", "start_time": "2019-02-11T21:59:12.848653Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[33mYou are using pip version 18.1, however version 19.0.1 is available.\n", "You should consider upgrading via the 'pip install --upgrade pip' command.\u001b[0m\n", "Enabling notebook extension jupyter-js-widgets/extension...\n", " - Validating: \u001b[32mOK\u001b[0m\n" ] } ], "source": [ "!pip install -U -q ipywidgets\n", "!jupyter nbextension enable --py widgetsnbextension" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These are the other imports we will use. 
" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:16.649010Z", "start_time": "2019-02-11T21:59:15.252863Z" } }, "outputs": [ { "data": { "text/html": [ "" ], "text/vnd.plotly.v1+html": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/vnd.plotly.v1+html": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/vnd.plotly.v1+html": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Standard Data Science Helpers\n", "import numpy as np\n", "import pandas as pd\n", "import scipy\n", "\n", "import plotly.plotly as py\n", "import plotly.graph_objs as go\n", "from plotly.offline import iplot, init_notebook_mode\n", "init_notebook_mode(connected=True)\n", "\n", "import cufflinks as cf\n", "cf.go_offline(connected=True)\n", "cf.set_config_file(colorscale='plotly', world_readable=True)\n", "\n", "# Extra options\n", "pd.options.display.max_rows = 30\n", "pd.options.display.max_columns = 25\n", "\n", "# Show all code cells outputs\n", "from IPython.core.interactiveshell import InteractiveShell\n", "InteractiveShell.ast_node_interactivity = 'all'" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:16.677478Z", "start_time": "2019-02-11T21:59:16.650547Z" } }, "outputs": [], "source": [ "import os\n", "from IPython.display import Image, display, HTML" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data\n", "\n", "For this project, we'll work with my medium stats data. You can grab your own data or just use mine! " ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:17.739135Z", "start_time": "2019-02-11T21:59:16.679089Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
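<table>\n", "  <thead>\n",
"    <tr><th></th><th>claps</th><th>days_since_publication</th><th>fans</th><th>link</th><th>num_responses</th><th>publication</th><th>published_date</th><th>read_ratio</th><th>read_time</th><th>reads</th><th>started_date</th><th>tags</th><th>text</th><th>title</th><th>title_word_count</th><th>type</th><th>views</th><th>word_count</th><th>claps_per_word</th><th>editing_days</th><th>&lt;tag&gt;Education</th><th>&lt;tag&gt;Data Science</th><th>&lt;tag&gt;Towards Data Science</th><th>&lt;tag&gt;Machine Learning</th><th>&lt;tag&gt;Python</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>129</th><td>2</td><td>597.301123</td><td>2</td><td>https://medium.com/p/screw-the-environment-but...</td><td>0</td><td>None</td><td>2017-06-10 14:25:00</td><td>42.17</td><td>7</td><td>70</td><td>2017-06-10 14:24:00</td><td>[Climate Change, Economics]</td><td>Screw the Environment, but Consider Your Walle...</td><td>Screw the Environment, but Consider Your Wallet</td><td>8</td><td>published</td><td>166</td><td>1859</td><td>0.001076</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"    <tr><th>125</th><td>18</td><td>589.983168</td><td>3</td><td>https://medium.com/p/the-vanquishing-of-war-pl...</td><td>0</td><td>None</td><td>2017-06-17 22:02:00</td><td>30.34</td><td>14</td><td>54</td><td>2017-06-17 22:02:00</td><td>[Climate Change, Humanity, Optimism, History]</td><td>The Vanquishing of War, Plague and Famine Part...</td><td>The Vanquishing of War, Plague and Famine</td><td>8</td><td>published</td><td>178</td><td>3891</td><td>0.004626</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"    <tr><th>132</th><td>51</td><td>577.363292</td><td>20</td><td>https://medium.com/p/capstone-project-mercedes...</td><td>0</td><td>None</td><td>2017-06-30 12:55:00</td><td>20.02</td><td>42</td><td>222</td><td>2017-06-30 12:00:00</td><td>[Machine Learning, Python, Udacity, Kaggle]</td><td>Capstone Project: Mercedes-Benz Greener Manufa...</td><td>Capstone Project: Mercedes-Benz Greener Manufa...</td><td>7</td><td>published</td><td>1109</td><td>12025</td><td>0.004241</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td></tr>\n",
"    <tr><th>126</th><td>0</td><td>576.520688</td><td>0</td><td>https://medium.com/p/home-of-the-scared-5af0fe...</td><td>0</td><td>None</td><td>2017-07-01 09:08:00</td><td>35.85</td><td>9</td><td>19</td><td>2017-06-30 18:21:00</td><td>[Politics, Books, News, Media Criticism]</td><td>Home of the Scared A review of A Culture of Fe...</td><td>Home of the Scared</td><td>4</td><td>published</td><td>53</td><td>2533</td><td>0.000000</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"    <tr><th>121</th><td>0</td><td>572.533035</td><td>0</td><td>https://medium.com/p/the-triumph-of-peace-f485...</td><td>0</td><td>None</td><td>2017-07-05 08:51:00</td><td>8.47</td><td>14</td><td>5</td><td>2017-07-03 20:18:00</td><td>[Books, Psychology, History, Humanism]</td><td>The Triumph of Peace A review of The Better An...</td><td>The Triumph of Peace</td><td>4</td><td>published</td><td>59</td><td>3892</td><td>0.000000</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"  </tbody>\n",
"</table>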
\n", "
" ], "text/plain": [ " claps days_since_publication fans \\\n", "129 2 597.301123 2 \n", "125 18 589.983168 3 \n", "132 51 577.363292 20 \n", "126 0 576.520688 0 \n", "121 0 572.533035 0 \n", "\n", " link num_responses \\\n", "129 https://medium.com/p/screw-the-environment-but... 0 \n", "125 https://medium.com/p/the-vanquishing-of-war-pl... 0 \n", "132 https://medium.com/p/capstone-project-mercedes... 0 \n", "126 https://medium.com/p/home-of-the-scared-5af0fe... 0 \n", "121 https://medium.com/p/the-triumph-of-peace-f485... 0 \n", "\n", " publication published_date read_ratio read_time reads \\\n", "129 None 2017-06-10 14:25:00 42.17 7 70 \n", "125 None 2017-06-17 22:02:00 30.34 14 54 \n", "132 None 2017-06-30 12:55:00 20.02 42 222 \n", "126 None 2017-07-01 09:08:00 35.85 9 19 \n", "121 None 2017-07-05 08:51:00 8.47 14 5 \n", "\n", " started_date tags \\\n", "129 2017-06-10 14:24:00 [Climate Change, Economics] \n", "125 2017-06-17 22:02:00 [Climate Change, Humanity, Optimism, History] \n", "132 2017-06-30 12:00:00 [Machine Learning, Python, Udacity, Kaggle] \n", "126 2017-06-30 18:21:00 [Politics, Books, News, Media Criticism] \n", "121 2017-07-03 20:18:00 [Books, Psychology, History, Humanism] \n", "\n", " text \\\n", "129 Screw the Environment, but Consider Your Walle... \n", "125 The Vanquishing of War, Plague and Famine Part... \n", "132 Capstone Project: Mercedes-Benz Greener Manufa... \n", "126 Home of the Scared A review of A Culture of Fe... \n", "121 The Triumph of Peace A review of The Better An... \n", "\n", " title title_word_count \\\n", "129 Screw the Environment, but Consider Your Wallet 8 \n", "125 The Vanquishing of War, Plague and Famine 8 \n", "132 Capstone Project: Mercedes-Benz Greener Manufa... 
7 \n", "126 Home of the Scared 4 \n", "121 The Triumph of Peace 4 \n", "\n", " type views word_count claps_per_word editing_days \\\n", "129 published 166 1859 0.001076 0 \n", "125 published 178 3891 0.004626 0 \n", "132 published 1109 12025 0.004241 0 \n", "126 published 53 2533 0.000000 0 \n", "121 published 59 3892 0.000000 1 \n", "\n", " Education Data Science Towards Data Science \\\n", "129 0 0 0 \n", "125 0 0 0 \n", "132 0 0 0 \n", "126 0 0 0 \n", "121 0 0 0 \n", "\n", " Machine Learning Python \n", "129 0 0 \n", "125 0 0 \n", "132 1 1 \n", "126 0 0 \n", "121 0 0 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_parquet('https://github.com/WillKoehrsen/Data-Analysis/blob/master/medium/data/medium_data_2019_01_26?raw=true')\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:17.847235Z", "start_time": "2019-02-11T21:59:17.741021Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
clapsdays_since_publicationfansnum_responsesread_ratioread_timereadstitle_word_countviewsword_countclaps_per_wordediting_days<tag>Education<tag>Data Science<tag>Towards Data Science<tag>Machine Learning<tag>Python
count133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000133.000000
mean1815.263158248.407273352.0526327.04511329.07466212.9172936336.3007527.12782023404.0300753029.1203010.95763820.3308270.7293230.6090230.4360900.3834590.315789
std2449.074661179.370879479.0601179.05610812.4176709.5107959007.2847263.15847533995.6364962393.4144561.84675674.1115790.4459890.4898140.4977740.4880670.466587
min0.0000001.2186290.0000000.0000008.1100001.0000001.0000002.0000003.000000163.0000000.000000-13.0000000.0000000.0000000.0000000.0000000.000000
25%121.00000074.54382223.0000000.00000020.0200008.000000363.0000005.0000001375.0000001653.0000000.0521150.0000000.0000000.0000000.0000000.0000000.000000
50%815.000000245.416130136.0000004.00000027.06000010.0000002049.0000007.0000007608.0000002456.0000000.4215251.0000001.0000001.0000000.0000000.0000000.000000
75%2700.000000376.080598528.00000012.00000034.91000014.0000007815.0000008.00000030141.0000003553.0000001.0993665.0000001.0000001.0000001.0000001.0000001.000000
max13600.000000597.3011232588.00000059.00000074.37000054.00000041978.00000016.000000173714.00000015063.00000017.891817349.0000001.0000001.0000001.0000001.0000001.000000
\n", "
" ], "text/plain": [ " claps days_since_publication fans num_responses \\\n", "count 133.000000 133.000000 133.000000 133.000000 \n", "mean 1815.263158 248.407273 352.052632 7.045113 \n", "std 2449.074661 179.370879 479.060117 9.056108 \n", "min 0.000000 1.218629 0.000000 0.000000 \n", "25% 121.000000 74.543822 23.000000 0.000000 \n", "50% 815.000000 245.416130 136.000000 4.000000 \n", "75% 2700.000000 376.080598 528.000000 12.000000 \n", "max 13600.000000 597.301123 2588.000000 59.000000 \n", "\n", " read_ratio read_time reads title_word_count views \\\n", "count 133.000000 133.000000 133.000000 133.000000 133.000000 \n", "mean 29.074662 12.917293 6336.300752 7.127820 23404.030075 \n", "std 12.417670 9.510795 9007.284726 3.158475 33995.636496 \n", "min 8.110000 1.000000 1.000000 2.000000 3.000000 \n", "25% 20.020000 8.000000 363.000000 5.000000 1375.000000 \n", "50% 27.060000 10.000000 2049.000000 7.000000 7608.000000 \n", "75% 34.910000 14.000000 7815.000000 8.000000 30141.000000 \n", "max 74.370000 54.000000 41978.000000 16.000000 173714.000000 \n", "\n", " word_count claps_per_word editing_days Education \\\n", "count 133.000000 133.000000 133.000000 133.000000 \n", "mean 3029.120301 0.957638 20.330827 0.729323 \n", "std 2393.414456 1.846756 74.111579 0.445989 \n", "min 163.000000 0.000000 -13.000000 0.000000 \n", "25% 1653.000000 0.052115 0.000000 0.000000 \n", "50% 2456.000000 0.421525 1.000000 1.000000 \n", "75% 3553.000000 1.099366 5.000000 1.000000 \n", "max 15063.000000 17.891817 349.000000 1.000000 \n", "\n", " Data Science Towards Data Science Machine Learning \\\n", "count 133.000000 133.000000 133.000000 \n", "mean 0.609023 0.436090 0.383459 \n", "std 0.489814 0.497774 0.488067 \n", "min 0.000000 0.000000 0.000000 \n", "25% 0.000000 0.000000 0.000000 \n", "50% 1.000000 0.000000 0.000000 \n", "75% 1.000000 1.000000 1.000000 \n", "max 1.000000 1.000000 1.000000 \n", "\n", " Python \n", "count 133.000000 \n", "mean 0.315789 \n", "std 0.466587 \n", "min 
0.000000 \n", "25% 0.000000 \n", "50% 0.000000 \n", "75% 1.000000 \n", "max 1.000000 " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Simple Widgets\n", "\n", "Let's get started using some widgets! We'll start off pretty simple just to see how the interface works." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:17.894781Z", "start_time": "2019-02-11T21:59:17.849710Z" } }, "outputs": [], "source": [ "import ipywidgets as widgets\n", "from ipywidgets import interact, interact_manual" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make a function interactive, all we have to do is use the `interact` decorator. This will automatically infer the input types for us! " ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:18.144261Z", "start_time": "2019-02-11T21:59:17.897741Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6063ca449ade47acae25016b44ecef8d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Text(value='claps', description='column'), IntSlider(value=5000, description='x', max=15…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "@interact\n", "def show_articles_more_than(column='claps', x=5000):\n", " display(HTML(f'

Showing articles with more than {x} {column}

'))\n", " display(df.loc[df[column] > x, ['title', 'published_date', 'read_time', 'tags', 'views', 'reads']])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `interact` decorator automatically inferred we want a `text` box for the `column` and an `int` slider for `x`! This makes it incredibly simple to add interactivity. We can also set the options how we want." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:18.251039Z", "start_time": "2019-02-11T21:59:18.147133Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6a9ed18d11cb49dda4a6517c600111b8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(IntSlider(value=3000, description='x', max=5000, min=1000, step=100), Dropdown(descripti…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "@interact\n", "def show_titles_more_than(x=(1000, 5000, 100),\n", " column=list(df.select_dtypes('number').columns), \n", " ):\n", " # display(HTML(f'

Showing articles with more than {x} {column}

'))\n", " display(df.loc[df[column] > x, ['title', 'published_date', 'read_time', 'tags', 'views', 'reads']])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This now gives us a `dropdown` for the `column` selection and still an `int` slider for `x`, but with limits. This can be useful when we need to enforce certain constraints on the interaction." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Image Explorer\n", "\n", "Let's see another quick example of creating an interactive function. This one allows us to display images from a folder." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:18.359381Z", "start_time": "2019-02-11T21:59:18.253803Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7e9847476587483892f6cb844b32aea9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='file', options=('1080_1189866210-spanish-sunset.jpg', '1080_201401…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fdir = 'nature/'\n", "\n", "@interact\n", "def show_images(file=os.listdir(fdir)):\n", "    display(Image(os.path.join(fdir, file)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You could use this, for example, if you have a training set of images that you'd like to run through quickly." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# File Browser\n", "\n", "We can do a similar operation to create a very basic file browser. Instead of having to manually run the command every time, we can just use this function to look through our files." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:18.526419Z", "start_time": "2019-02-11T21:59:18.361052Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 23368\r\n", "drwxr-xr-x 26 williamkoehrsen staff 832 Jan 26 10:09 \u001b[34mnature\u001b[m\u001b[m\r\n", "drwxr-xr-x 42 williamkoehrsen staff 1344 Jan 26 10:35 \u001b[34mimages\u001b[m\u001b[m\r\n", "drwxr-xr-x 18 williamkoehrsen staff 576 Jan 27 09:55 \u001b[34massorted\u001b[m\u001b[m\r\n", "-rw-r--r-- 1 williamkoehrsen staff 5978832 Jan 27 17:02 Widgets Overview-Extended Work.ipynb\r\n", "drwxr-xr-x 4 williamkoehrsen staff 128 Jan 27 17:30 \u001b[34m.ipynb_checkpoints\u001b[m\u001b[m\r\n", "-rw-r--r--@ 1 williamkoehrsen staff 8196 Jan 28 09:03 .DS_Store\r\n", "drwxr-xr-x 38 williamkoehrsen staff 1216 Feb 11 09:20 \u001b[34m..\u001b[m\u001b[m\r\n", "-rw-r--r-- 1 williamkoehrsen staff 5968304 Feb 11 16:59 Widgets-Overview.ipynb\r\n", "drwxr-xr-x 9 williamkoehrsen staff 288 Feb 11 16:59 \u001b[34m.\u001b[m\u001b[m\r\n" ] } ], "source": [ "!ls -a -t -r -l" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:18.608223Z", "start_time": "2019-02-11T21:59:18.528880Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4a2c8e4c49cd4428b116b44f9b5812ef", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='dir', options=('additive_models', 'statistics', 'recall_precision'…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import subprocess\n", "import pprint\n", "\n", "root_dir = '../'\n", "dirs = [d for d in os.listdir(root_dir) if not '.' 
in d]\n", "\n", "@interact\n", "def show_dir(dir=dirs):\n", " x = subprocess.check_output(f\"cd {root_dir}{dir} && ls -a -t -r -l -h\", shell=True).decode()\n", " print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Dataframe Explorer\n", "\n", "Let's look at a few more examples of using widgets to explore data. Here we create a widget that quickly lets us find correlations between columns." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:18.682329Z", "start_time": "2019-02-11T21:59:18.610843Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5b489f5b32d246c2b7a4fbc43a310d30", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='column1', options=('claps', 'days_since_publication', 'fans', 'num…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "@interact\n", "def correlations(column1=list(df.select_dtypes('number').columns), \n", " column2=list(df.select_dtypes('number').columns)):\n", " print(f\"Correlation: {df[column1].corr(df[column2])}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's one to describe a specific column." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:18.734969Z", "start_time": "2019-02-11T21:59:18.683939Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "63d7d541f06e488cb2529456f9966776", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='column', options=('claps', 'days_since_publication', 'fans', 'link…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "@interact\n", "def describe(column=list(df.columns)):\n", "    print(df[column].describe())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Interactive Widgets for Plots\n", "\n", "We can use the same basic approach to create interactive widgets for plots. This expands the capabilities of the already powerful plotly visualization library." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:19.742380Z", "start_time": "2019-02-11T21:59:18.736434Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "83c8632a69b54ba3882dd597c69e7e09", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='x', options=('claps', 'days_since_publication', 'fans', 'num_respo…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "@interact\n", "def scatter_plot(x=list(df.select_dtypes('number').columns), \n", "                 y=list(df.select_dtypes('number').columns)[1:]):\n", "    df.iplot(kind='scatter', x=x, y=y, mode='markers', \n", "             xTitle=x.title(), yTitle=y.title(), title=f'{y.title()} vs {x.title()}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's add some options to control the color scheme." 
] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:20.056907Z", "start_time": "2019-02-11T21:59:19.745853Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e7c2a01e69894b4198e110fe0918b58c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='x', options=('claps', 'days_since_publication', 'fans', 'num_respo…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "@interact\n", "def scatter_plot(x=list(df.select_dtypes('number').columns), \n", " y=list(df.select_dtypes('number').columns)[1:],\n", " theme=list(cf.themes.THEMES.keys()), \n", " colorscale=list(cf.colors._scales_names.keys())):\n", " \n", " df.iplot(kind='scatter', x=x, y=y, mode='markers', \n", " xTitle=x.title(), yTitle=y.title(), \n", " text='title',\n", " title=f'{y.title()} vs {x.title()}',\n", " theme=theme, colorscale=colorscale)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next plot lets us choose the grouping category for the plot. 
" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:20.694691Z", "start_time": "2019-02-11T21:59:20.059308Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "af9543c50e77482ca705ee384e73d712", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='x', options=('claps', 'days_since_publication', 'fans', 'num_respo…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df['binned_read_time'] = pd.cut(df['read_time'], bins=range(0, 56, 5))\n", "df['binned_read_time'] = df['binned_read_time'].astype(str)\n", "\n", "df['binned_word_count'] = pd.cut(df['word_count'], bins=range(0, 100001, 1000))\n", "df['binned_word_count'] = df['binned_word_count'].astype(str)\n", "\n", "@interact\n", "def scatter_plot(x=list(df.select_dtypes('number').columns), \n", " y=list(df.select_dtypes('number').columns)[1:],\n", " categories=['binned_read_time', 'binned_word_count', 'publication', 'type'],\n", " theme=list(cf.themes.THEMES.keys()), \n", " colorscale=list(cf.colors._scales_names.keys())):\n", " \n", " df.iplot(kind='scatter', x=x, y=y, mode='markers', \n", " categories=categories, \n", " xTitle=x.title(), yTitle=y.title(), \n", " text='title',\n", " title=f'{y.title()} vs {x.title()}',\n", " theme=theme, colorscale=colorscale)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may have noticed this plot was a little slow to update. When that is the case, we can use `interact_manual` which only updates the function when the button is pressed." 
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:20.751182Z", "start_time": "2019-02-11T21:59:20.696239Z" } }, "outputs": [], "source": [ "from ipywidgets import interact_manual" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:20.859857Z", "start_time": "2019-02-11T21:59:20.753440Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "672948c1f1da4ea49b694808966eadbe", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='x', options=('claps', 'days_since_publication', 'fans', 'num_respo…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "@interact_manual\n", "def scatter_plot(x=list(df.select_dtypes('number').columns), \n", "                 y=list(df.select_dtypes('number').columns)[1:],\n", "                 categories=['binned_read_time', 'binned_word_count', 'publication', 'type'],\n", "                 theme=list(cf.themes.THEMES.keys()), \n", "                 colorscale=list(cf.colors._scales_names.keys())):\n", "    \n", "    df.iplot(kind='scatter', x=x, y=y, mode='markers', \n", "             categories=categories, \n", "             xTitle=x.title(), yTitle=y.title(), \n", "             text='title',\n", "             title=f'{y.title()} vs {x.title()}',\n", "             theme=theme, colorscale=colorscale)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Making Our Own Widgets\n", "\n", "The decorator `interact` (or `interact_manual`) is not the only way to use widgets. We can also explicitly create our own. One of the most useful I've found is the `DatePicker`." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:20.916568Z", "start_time": "2019-02-11T21:59:20.861736Z" } }, "outputs": [], "source": [ "df.set_index('published_date', inplace=True)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:21.047340Z", "start_time": "2019-02-11T21:59:20.920596Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "01f5a46fd1854010b895745b3de75f5c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(DatePicker(value=Timestamp('2018-01-01 00:00:00'), description='start_date'), DatePicker…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def print_articles_published(start_date, end_date):\n", " start_date = pd.Timestamp(start_date)\n", " end_date = pd.Timestamp(end_date)\n", " stat_df = df.loc[(df.index >= start_date) & (df.index <= end_date)].copy()\n", " total_words = stat_df['word_count'].sum()\n", " total_read_time = stat_df['read_time'].sum()\n", " num_articles = len(stat_df)\n", " print(f'You published {num_articles} articles between {start_date.date()} and {end_date.date()}.')\n", " print(f'These articles totalled {total_words:,} words and {total_read_time/60:.2f} hours to read.')\n", " \n", "_ = interact(print_articles_published,\n", " start_date=widgets.DatePicker(value=pd.to_datetime('2018-01-01')),\n", " end_date=widgets.DatePicker(value=pd.to_datetime('2019-01-01')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this function, we use a `Dropdown` and a `DatePicker` to plot one column cumulatively up to a certain time. Instead of having to write this ourselves, we can just let `ipywidgets` do all the work!" 
] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:21.284716Z", "start_time": "2019-02-11T21:59:21.049952Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a74cbf5512d645438d4670e35f37b202", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='column', options=('claps', 'days_since_publication', 'fans', 'num_…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def plot_up_to(column, date):\n", " date = pd.Timestamp(date)\n", " plot_df = df.loc[df.index <= date].copy()\n", " plot_df[column].cumsum().iplot(mode='markers+lines', \n", " xTitle='published date',\n", " yTitle=column, \n", " title=f'Cumulative {column.title()} Until {date.date()}')\n", " \n", "_ = interact(plot_up_to, column=widgets.Dropdown(options=list(df.select_dtypes('number').columns)), \n", " date = widgets.DatePicker(value=pd.to_datetime('2019-01-01')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Dependent Widgets\n", "\n", "How do we get a value of a widget to depend on that of another? Using the `observe` method.\n", "\n", "Going back to the Image Browser earlier, let's make a function that allows us to change the directory for the images to list." 
] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:21.382907Z", "start_time": "2019-02-11T21:59:21.286742Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "62525add0f8948c78862cc59528dc710", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='fdir', options=('images', 'nature', 'assorted'), value='images'), …" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "directory = widgets.Dropdown(options=['images', 'nature', 'assorted'])\n", "images = widgets.Dropdown(options=os.listdir(directory.value))\n", "\n", "def update_images(*args):\n", "    images.options = os.listdir(directory.value)\n", "\n", "directory.observe(update_images, 'value')\n", "\n", "def show_images(fdir, file):\n", "    display(Image(f'{fdir}/{file}'))\n", "\n", "_ = interact(show_images, fdir=directory, file=images)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also assign the result of the `interact` call to a variable and then reuse the widget. This has unintended effects, though! 
" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:21.616488Z", "start_time": "2019-02-11T21:59:21.387700Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c01e7415a1d84c2a842dce775337afc1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='tag', options=('Towards Data Science', 'Education', 'Machine Learn…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def show_stats_by_tag(tag):\n", " display(df.groupby(f'{tag}').describe()[['views', 'reads', 'claps', 'read_ratio']])\n", " \n", "stats = interact(show_stats_by_tag,\n", " tag=widgets.Dropdown(options=['Towards Data Science', 'Education', 'Machine Learning', 'Python', 'Data Science']))" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:21.815239Z", "start_time": "2019-02-11T21:59:21.620545Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c01e7415a1d84c2a842dce775337afc1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='tag', options=('Towards Data Science', 'Education', 'Machine Learn…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "stats.widget" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now changing the value in one location changes it in both places! This can be a slight inconvenience, but on the plus side, now we can reuse the interactive element." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Linked Values\n", "\n", "We can link the value of two widgets to each other using the `jslink` function. This ties the values to be the same." 
] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:21.947232Z", "start_time": "2019-02-11T21:59:21.817802Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f8fbc7b1bbc44bc7897d704e60446181", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(IntText(value=100, description='column1_value'), IntSlider(value=100, description='colum…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def show_less_than(column1_value, column2_value):\n", " display(df.loc[(df['views'] < column1_value) & \n", " (df['reads'] < column2_value), \n", " ['title', 'read_time', 'tags', 'views', 'reads']])\n", " \n", "column1_value=widgets.IntText(value=100, description='First')\n", "column2_value=widgets.IntSlider(value=100, description='Second')\n", "\n", "linked = widgets.jslink((column1_value, 'value'),\n", " (column2_value, 'value'))\n", "\n", "less_than = interact(show_less_than, column1_value=column1_value,\n", " column2_value=column2_value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I'm not exactly sure why you would want to link two widgets, but there you go! We can unlink them by calling the `unlink` method on the link object (sometimes syntax does make sense)."
] }, { "cell_type": "code", "execution_count": 26, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:22.001784Z", "start_time": "2019-02-11T21:59:21.948880Z" } }, "outputs": [], "source": [ "linked.unlink()" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:22.072554Z", "start_time": "2019-02-11T21:59:22.003500Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f8fbc7b1bbc44bc7897d704e60446181", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(IntText(value=100, description='column1_value'), IntSlider(value=100, description='colum…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "less_than.widget" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Conclusions\n", "\n", "These widgets are not going to change your life, but they do bring notebooks closer to interactive dashboards. I've only shown some of their capabilities, so be sure to look at the [documentation for the full details]. The Jupyter Notebook is useful by itself, but with additional tools, it can be an even better tool for data exploration and analysis. Thanks to the efforts of many open-source developers and contributors, we have these great technologies, so we might as well get the most from these libraries! 
" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:22.128193Z", "start_time": "2019-02-11T21:59:22.074179Z" } }, "outputs": [], "source": [ "cscales = ['Greys', 'YlGnBu', 'Greens', 'YlOrRd', 'Bluered', 'RdBu',\n", " 'Reds', 'Blues', 'Picnic', 'Rainbow', 'Portland', 'Jet',\n", " 'Hot', 'Blackbody', 'Earth', 'Electric', 'Viridis', 'Cividis']" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:22.211843Z", "start_time": "2019-02-11T21:59:22.130241Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7be3c5f5f98049a992d708dd400d7ffb", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='colorscale', options=('Greys', 'YlGnBu', 'Greens', 'YlOrRd', 'Blue…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import plotly.figure_factory as ff\n", "\n", "corrs = df.corr()\n", "\n", "@interact_manual\n", "def plot_corrs(colorscale=cscales):\n", " figure = ff.create_annotated_heatmap(z = corrs.round(2).values, \n", " x =list(corrs.columns), \n", " y=list(corrs.index), \n", " colorscale=colorscale,\n", " annotation_text=corrs.round(2).values)\n", " iplot(figure)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "ExecuteTime": { "end_time": "2019-02-11T21:59:22.438146Z", "start_time": "2019-02-11T21:59:22.213746Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "964144e410ac466cb6c10667659cc9c5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(Dropdown(description='column1', options=('claps', 'views', 'read', 'word_count'), value=…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "@interact\n", "def plot_spread(column1=['claps', 'views', 'read', 'word_count'], \n", " column2=['views', 'claps', 'read', 'word_count']):\n", " 
df.iplot(kind='ratio',\n", " y=column1,\n", " secondary_y=column2,\n", " title=f'{column1.title()} and {column2.title()} Spread Plot',\n", " xTitle='Published Date')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "hide_input": false, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }