{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Indexing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Setting up the data\n", "\n", "Let's create the structures that will be used later in this notebook" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "np.random.seed(42) # Setting the random seed" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.37454012, 0.95071431, 0.73199394, 0.59865848, 0.15601864,\n", " 0.15599452, 0.05808361, 0.86617615, 0.60111501, 0.70807258])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# a vector: 10 random samples drawn uniformly from [0, 1)\n", "v = np.random.rand(10)\n", "v" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.02058449, 0.96990985],\n", " [0.83244264, 0.21233911],\n", " [0.18182497, 0.18340451],\n", " [0.30424224, 0.52475643],\n", " [0.43194502, 0.29122914],\n", " [0.61185289, 0.13949386],\n", " [0.29214465, 0.36636184],\n", " [0.45606998, 0.78517596],\n", " [0.19967378, 0.51423444],\n", " [0.59241457, 0.04645041]])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# a matrix: a 10 x 2 array of random samples drawn uniformly from [0, 1)\n", "M = np.random.rand(10, 2)\n", "M" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can index elements in an array using square brackets and indices:" ] }, { "cell_type": "code", "execution_count": 5, 
"metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.3745401188473625" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# v is a vector, and has only one dimension, taking one index\n", "v[0]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.21233911067827616" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# M is a matrix, or a 2 dimensional array, taking two indices \n", "M[1,1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If we omit an index of a multidimensional array it returns the whole row (or, in general, a N-1 dimensional array) " ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.83244264, 0.21233911])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M[1] " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The same thing can be achieved with using `:` instead of an index: " ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.83244264, 0.21233911])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M[1,:] # row 1" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.96990985, 0.21233911, 0.18340451, 0.52475643, 0.29122914,\n", " 0.13949386, 0.36636184, 0.78517596, 0.51423444, 0.04645041])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M[:,1] # column 1" ] 
}, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can assign new values to elements in an array using indexing:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "M[0,0] = 1" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[1. , 0.96990985],\n", " [0.83244264, 0.21233911],\n", " [0.18182497, 0.18340451],\n", " [0.30424224, 0.52475643],\n", " [0.43194502, 0.29122914],\n", " [0.61185289, 0.13949386],\n", " [0.29214465, 0.36636184],\n", " [0.45606998, 0.78517596],\n", " [0.19967378, 0.51423444],\n", " [0.59241457, 0.04645041]])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# also works for rows and columns\n", "M[1,:] = 0\n", "M[:,1] = -1" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1. , -1. ],\n", " [ 0. , -1. ],\n", " [ 0.18182497, -1. ],\n", " [ 0.30424224, -1. ],\n", " [ 0.43194502, -1. ],\n", " [ 0.61185289, -1. ],\n", " [ 0.29214465, -1. ],\n", " [ 0.45606998, -1. ],\n", " [ 0.19967378, -1. ],\n", " [ 0.59241457, -1. 
]])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Index slicing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Index slicing is the technical name for the syntax `M[lower:upper:step]` to extract part of an array:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4, 5])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.array([1,2,3,4,5])\n", "a" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([2, 3])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[1:3]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Array slices are **mutable**: if they are assigned a new value the original array from which the slice was extracted is modified:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([ 1, -2, -3, 4, 5])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[1:3] = [-2,-3]\n", "\n", "a" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* We can omit any of the three parameters in `M[lower:upper:step]`:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([ 1, -2, -3, 4, 5])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[::] # 
lower, upper, step all take the default values" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([ 1, -3, 5])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[::2] # step is 2, lower and upper default to the beginning and end of the array" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([ 1, -2, -3])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[:3] # first three elements" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([4, 5])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[3:] # elements from index 3" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* Negative indices count from the end of the array (positive indices from the beginning):" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "a = np.array([1,2,3,4,5])" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[-1] # the last element in the array" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([3, 4, 5])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[-3:] # the last three elements" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { 
"slide_type": "subslide" } }, "source": [ "* Index slicing works exactly the same way for multidimensional arrays:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3, 4],\n", " [10, 11, 12, 13, 14],\n", " [20, 21, 22, 23, 24],\n", " [30, 31, 32, 33, 34],\n", " [40, 41, 42, 43, 44]])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[n+m*10 for n in range(5)] \n", " for m in range(5)])\n", "A" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[11, 12, 13],\n", " [21, 22, 23],\n", " [31, 32, 33]])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# a block from the original array\n", "A[1:4, 1:4]" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 2, 4],\n", " [20, 22, 24],\n", " [40, 42, 44]])" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# strides\n", "A[::2, ::2]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Indexing and Array Memory Management" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Numpy arrays support two different way of storing data into memory, namely\n", "\n", "* F-Contiguous \n", " - i.e. *column-wise* storage, Fortran-like\n", "* C-Contiguous\n", " - i.e. 
*row-wise* storage, C-like\n", " \n", "The **storage** strategy is controlled by the parameter `order` of `np.array`\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "import numpy as np\n", "FC = np.array([[1, 2, 3], [4, 5, 6], \n", " [7, 8, 9], [10, 11, 12]], order='F')" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "CC = np.array([[1, 2, 3], [4, 5, 6], \n", " [7, 8, 9], [10, 11, 12]], order='C')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* **Note**: the storage order does not change the meaning of indexing operations" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "FC[0, 1]" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "CC[0, 1]" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "(4, 3)" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "FC.shape" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(4, 3)" ] }, "execution_count": 32, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "CC.shape" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fancy Indexing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Fancy indexing is the name for when an array or list is used in-place of an index: " ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[10, 11, 12, 13, 14],\n", " [20, 21, 22, 23, 24],\n", " [30, 31, 32, 33, 34]])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row_indices = [1, 2, 3]\n", "A[row_indices]" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([11, 22, 34])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "col_indices = [1, 2, -1] # remember, index -1 means the last element\n", "A[row_indices, col_indices]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* We can also index **masks**: \n", "\n", " - If the index mask is an Numpy array of with data type `bool`, then an element is selected (True) or not (False) depending on the value of the index mask at the position each element: " ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b = np.array([n for n in range(5)])\n", "b" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 2])" ] 
}, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row_mask = np.array([True, False, True, False, False])\n", "b[row_mask]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* Alternatively:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 2])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# same thing\n", "row_mask = np.array([1,0,1,0,0], dtype=bool)\n", "b[row_mask]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This feature is very useful to conditionally select elements from an array, using for example comparison operators:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ,\n", " 6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.arange(0, 10, 0.5)\n", "x" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([False, False, False, False, False, False, False, False, False,\n", " False, False, True, True, True, True, True, True, True,\n", " True, True])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mask = (5 < x)\n", "\n", "mask" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([5.5, 6. , 6.5, 7. , 7.5, 8. , 8.5, 9. 
, 9.5])" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[mask]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we can use the condition (mask) array directly within brackets to index the array" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([5.5, 6. , 6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[(5 < x)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Exercises on Indexing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Ex 3.1 \n", "\n", "Generate a three-dimensional array of any size containing random numbers taken from a uniform distribution (_guess the numpy function in `np.random`_). Then print out separately the first entry along each of the three axes (i.e. 
`x, y, z`) \n", "\n", "\n", "* _hint_: Slicing NumPy arrays works much like slicing Python lists" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Ex 3.2\n", "\n", "Create a vector and print out its elements in reverse order\n", "\n", "#### Hint: Use slicing for this exercise" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Ex 3.3\n", "\n", "Generate a $7 \\times 7$ matrix and replace all the elements in odd rows and even columns with `1`.\n", "\n", "#### Hint: Use slicing to solve this exercise!\n", "\n", "#### Note: Then take a look at the original matrix." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Use fancy indexing** to get all the elements of the previous matrix that are equal to `1`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Ex 3.4 \n", "\n", "Generate a `10 x 10` matrix of numbers `A`. Then, generate a numpy array of integers in range `1-9`. Pick `5` random values (with no repetition) from this array and use these values to extract rows from the original matrix `A`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Ex 3.5 \n", "\n", "Repeat the previous exercise but this time extract columns from `A`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Ex 3.6\n", "\n", "Generate an array of numbers from `0` to `20` with step `0.5`. \n", "Extract all the values greater than a randomly generated number in the same range.\n", "\n", "#### Hint: Try to write the condition as an expression and save it to a variable. Then, use this variable in square brackets to index.... this is when the magic happens!" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3.7 (NumPy EuroSciPy)", "language": "python", "name": "numpy-euroscipy" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }