{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab 03: Linear and logistic regressions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of this lab is to explore linear and logistic regression, implement them yourself, and learn to use their respective scikit-learn implementations.\n", "\n", "Let us start by loading some of the usual libraries." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Populating the interactive namespace from numpy and matplotlib\n" ] } ], "source": [ "import pandas as pd\n", "%pylab inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1. Linear regression\n", "\n", "We will now implement a linear regression, first using the closed-form solution, and then using gradient descent.\n", "\n", "## 1.1 Linear regression data\n", "\n", "Our first data set concerns the quality ratings of a white _vinho verde_. Each wine is described by a number of physico-chemical descriptors such as acidity, sulfur dioxide content, density or pH." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>fixed acidity</th><th>volatile acidity</th><th>citric acid</th><th>residual sugar</th><th>chlorides</th><th>free sulfur dioxide</th><th>total sulfur dioxide</th><th>density</th><th>pH</th><th>sulphates</th><th>alcohol</th><th>quality</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>7.0</td><td>0.27</td><td>0.36</td><td>20.7</td><td>0.045</td><td>45.0</td><td>170.0</td><td>1.0010</td><td>3.00</td><td>0.45</td><td>8.8</td><td>6</td></tr>\n",
"    <tr><th>1</th><td>6.3</td><td>0.30</td><td>0.34</td><td>1.6</td><td>0.049</td><td>14.0</td><td>132.0</td><td>0.9940</td><td>3.30</td><td>0.49</td><td>9.5</td><td>6</td></tr>\n",
"    <tr><th>2</th><td>8.1</td><td>0.28</td><td>0.40</td><td>6.9</td><td>0.050</td><td>30.0</td><td>97.0</td><td>0.9951</td><td>3.26</td><td>0.44</td><td>10.1</td><td>6</td></tr>\n",
"    <tr><th>3</th><td>7.2</td><td>0.23</td><td>0.32</td><td>8.5</td><td>0.058</td><td>47.0</td><td>186.0</td><td>0.9956</td><td>3.19</td><td>0.40</td><td>9.9</td><td>6</td></tr>\n",
"    <tr><th>4</th><td>7.2</td><td>0.23</td><td>0.32</td><td>8.5</td><td>0.058</td><td>47.0</td><td>186.0</td><td>0.9956</td><td>3.19</td><td>0.40</td><td>9.9</td><td>6</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " fixed acidity volatile acidity citric acid residual sugar chlorides \\\n", "0 7.0 0.27 0.36 20.7 0.045 \n", "1 6.3 0.30 0.34 1.6 0.049 \n", "2 8.1 0.28 0.40 6.9 0.050 \n", "3 7.2 0.23 0.32 8.5 0.058 \n", "4 7.2 0.23 0.32 8.5 0.058 \n", "\n", " free sulfur dioxide total sulfur dioxide density pH sulphates \\\n", "0 45.0 170.0 1.0010 3.00 0.45 \n", "1 14.0 132.0 0.9940 3.30 0.49 \n", "2 30.0 97.0 0.9951 3.26 0.44 \n", "3 47.0 186.0 0.9956 3.19 0.40 \n", "4 47.0 186.0 0.9956 3.19 0.40 \n", "\n", " alcohol quality \n", "0 8.8 6 \n", "1 9.5 6 \n", "2 10.1 6 \n", "3 9.9 6 \n", "4 9.9 6 " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# load the regression task data\n", "wine_data = pd.read_csv('data/winequality-white.csv', sep=\";\")\n", "wine_data.head(5)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Load the data into X and y data arrays\n", "X_regr = wine_data.drop(['quality'], axis=1).values\n", "y_regr = wine_data['quality'].values\n", "\n", "# Standardize the data\n", "from sklearn import preprocessing\n", "sc = preprocessing.StandardScaler()\n", "sc.fit(X_regr)\n", "X_regr = sc.transform(X_regr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.2 Cross-validation\n", "\n", "Let us create a cross-validation utility function (similar to what we have done in Lab 3, but for regression)." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Set up folds for cross-validation\n", "from sklearn import model_selection\n", "folds_regr = model_selection.KFold(n_splits=10, shuffle=True)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def cross_validate_regr(design_matrix, labels, regressor, cv_folds):\n", "    \"\"\" Perform a cross-validation and return the predictions.\n", "    \n", "    Parameters:\n", "    -----------\n", "    design_matrix: (n_samples, n_features) np.array\n", "        Design matrix for the experiment.\n", "    labels: (n_samples, ) np.array\n", "        Vector of labels.\n", "    regressor: Regressor instance; must have the following methods:\n", "        - fit(X, y) to train the regressor on the data X, y\n", "        - predict(X) to apply the trained regressor to the data X and return estimates\n", "    cv_folds: iterable of (train_indices, test_indices) pairs\n", "        Cross-validation iterator, e.g. the output of the split() method\n", "        of an sklearn cross-validation object.\n", "    \n", "    Returns:\n", "    --------\n", "    pred: (n_samples, ) np.array\n", "        Vector of predictions (same order as labels).\n", "    \"\"\"\n", "    pred = np.zeros(labels.shape)\n", "    for tr, te in cv_folds:\n", "        # Train on the training folds, predict on the held-out fold\n", "        regressor.fit(design_matrix[tr, :], labels[tr])\n", "        pred[te] = regressor.predict(design_matrix[te, :])\n", "    return pred" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.3 Linear regression with scikit-learn\n", "\n", "__Question__ Cross-validate scikit-learn's [linear_model.LinearRegression](http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html) on your data." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean squared error: 0.568\n" ] } ], "source": [ "from sklearn import linear_model\n", "\n", "# Initialize a LinearRegression model\n", "regr = linear_model.LinearRegression()\n", "\n", "# Cross-validate it\n", "pred = cross_validate_regr(X_regr, y_regr, regr, folds_regr.split(X_regr, y_regr))\n", "\n", "from sklearn import metrics\n", "print(\"Mean squared error: %.3f\" % metrics.mean_squared_error(y_regr, pred))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2. Logistic regression\n", "\n", "We will now turn to logistic regression, a linear model for binary classification.\n", "\n", "## 2.1 Logistic regression data\n", "\n", "Our second data set comes from the world of bioinformatics. In this data set, each observation is a tumor, described by the expression of 3,000 genes. The expression of a gene is a measure of how much of that gene is present in the biological sample. Because this affects how much of the protein this gene codes for is produced, and because proteins dictate what cells can do, gene expression gives us valuable information about the tumor. In particular, the expression of the same gene in the same individual differs between tissues (although the DNA is the same): this is why blood cells look different from skin cells. In our data set, there are two types of tumors: breast tumors and ovary tumors. Let us see if gene expression can be used to separate them!" 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Load the classification task data\n", "breast_data = pd.read_csv('data/small_Breast_Ovary.csv')\n", "\n", "# Drop the 'Tissue' and 'ID_REF' columns to create the design matrix\n", "X_clf = np.array(breast_data.drop(['Tissue', 'ID_REF'], axis=1).values)\n", "\n", "# Use the 'Tissue' column to create the labels (0=Breast, 1=Ovary)\n", "y_clf = np.array(breast_data['Tissue'].values)\n", "y_clf[np.where(y_clf == 'Breast')] = 0\n", "y_clf[np.where(y_clf == 'Ovary')] = 1\n", "# Use the built-in int: the np.int alias was removed from recent NumPy versions\n", "y_clf = y_clf.astype(int)\n", "\n", "#sc = preprocessing.StandardScaler()\n", "#sc.fit(X_clf)\n", "#X_clf = sc.transform(X_clf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Question:__ How many samples do we have? How many belong to each class? How many features do we have?" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of samples : 542\n", "Class Breast : 344\n", "Class Ovary : 198\n" ] } ], "source": [ "print(\"Number of samples : \", len(y_clf))\n", "print(\"Class Breast : \", sum(y_clf == 0))\n", "print(\"Class Ovary : \", sum(y_clf == 1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.2 Cross-validation\n", "\n", "Let us create a cross-validation utility function (similar to what we have done in Lab 3)." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# Set up folds for cross-validation\n", "from sklearn import model_selection\n", "folds_clf = model_selection.StratifiedKFold(n_splits=10, shuffle=True)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def cross_validate_clf(design_matrix, labels, classifier, cv_folds):\n", "    \"\"\" Perform a cross-validation and return the predictions.\n", "    \n", "    Parameters:\n", "    -----------\n", "    design_matrix: (n_samples, n_features) np.array\n", "        Design matrix for the experiment.\n", "    labels: (n_samples, ) np.array\n", "        Vector of labels.\n", "    classifier: sklearn classifier object\n", "        Classifier instance; must have the following methods:\n", "        - fit(X, y) to train the classifier on the data X, y\n", "        - predict_proba(X) to apply the trained classifier to the data X and return probability estimates\n", "    cv_folds: iterable of (train_indices, test_indices) pairs\n", "        Cross-validation iterator, e.g. the output of the split() method\n", "        of an sklearn cross-validation object.\n", "    \n", "    Returns:\n", "    --------\n", "    pred: (n_samples, ) np.array\n", "        Vector of predicted probabilities of class 1 (same order as labels).\n", "    \"\"\"\n", "    pred = np.zeros(labels.shape)\n", "    for tr, te in cv_folds:\n", "        # Train on the training folds, then score the held-out fold\n", "        classifier.fit(design_matrix[tr, :], labels[tr])\n", "        # Column 1 of predict_proba holds the probability of the positive class\n", "        pred[te] = classifier.predict_proba(design_matrix[te, :])[:, 1]\n", "    return pred" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.3 Logistic regression with scikit-learn\n", "\n", "__Question__ Cross-validate scikit-learn's [linear_model.LogisticRegression](http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) on your data." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.948\n" ] } ], "source": [ "from sklearn import linear_model\n", "\n", "# Initialize a LogisticRegression model. 
\n", "# Use C=1e7 so that the regularization is negligible (we'll talk about regularization next time!)\n", "clf = linear_model.LogisticRegression(C=1e7)\n", "\n", "# Cross-validate it with cross_validate_clf, which returns predicted probabilities of class 1\n", "ypred_logreg = cross_validate_clf(X_clf, y_clf, clf, folds_clf.split(X_clf, y_clf))\n", "\n", "# Threshold the probabilities at 0.5 to obtain class predictions\n", "print(\"Accuracy: %.3f\" % metrics.accuracy_score(y_clf, ypred_logreg > 0.5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question:** Plot the ROC curve. Use plt.semilogx to use a logarithmic scale on the x-axis. This \"spreads out\" the curve a little, making it easier to read." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[Figure: ROC curve for the logistic regression; x-axis: False Positive Rate (log scale), y-axis: True Positive Rate]
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fpr_logreg, tpr_logreg, thresholds = metrics.roc_curve(y_clf, ypred_logreg, pos_label=1)\n", "auc_logreg = metrics.auc(fpr_logreg, tpr_logreg)\n", "\n", "plt.semilogx(fpr_logreg, tpr_logreg, '-', color='orange', \n", " label='AUC = %0.3f' % auc_logreg)\n", "plt.xlabel('False Positive Rate', fontsize=16)\n", "plt.ylabel('True Positive Rate', fontsize=16)\n", "plt.title('ROC curve: Logistic regression', fontsize=16)\n", "plt.legend(loc=\"lower right\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data scaling\n", "See [preprocessing.StandardScaler](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question** Scale the data, and compute the cross-validated predictions of the logistic regression on the scaled data." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.952\n" ] } ], "source": [ "from sklearn import preprocessing\n", "\n", "# Scale the data with preprocessing.StandardScaler\n", "# Initialize a scaler\n", "scaler = preprocessing.StandardScaler()\n", "# Scale your design matrix\n", "X_clf_scaled = scaler.fit_transform(X_clf)\n", "\n", "# Initialize a LogisticRegression model. 
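\n", "# Use C=1e7 so that the regularization is negligible (we'll talk about regularization next time!)\n", "clf = linear_model.LogisticRegression(C=1e7)\n", "\n", "# Cross-validate it for the scaled data with cross_validate_clf (predicted probabilities of class 1)\n", "ypred_logreg_scaled = cross_validate_clf(X_clf_scaled, y_clf, clf, folds_clf.split(X_clf_scaled, y_clf))\n", "\n", "# Threshold the probabilities at 0.5 to obtain class predictions\n", "print(\"Accuracy: %.3f\" % metrics.accuracy_score(y_clf, ypred_logreg_scaled > 0.5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question** Plot the two ROC curves (one for the logistic regression on the original data, one for the logistic regression on the scaled data) on the same plot." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEhCAYAAACUW2yNAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAA640lEQVR4nO3dd3wVZfb48c9JAwKCSBNBDC4q0oWAYqEsIgoK4vpVsIENG6u4uyroz7K2dXexd9aCWAAVUUREbKCuqzQREERQokSQJiAQWpLz++OZhMltmSQ3uUk879frvu6dmWdmzsy9d87M80wRVcUYY4zxS0p0AMYYYyofSw7GGGPCWHIwxhgTxpKDMcaYMJYcjDHGhLHkYIwxJowlhwQSkeEior7XXhH5XkTuFZGaUcbpKiJTRGS9iOwRkSwReUJEmkUpnyoiV4vIf0VkqzfOahF5TkQ6l+8SVm7eunupAudX8H1nlHCcS+IxLQMiMltEZic6jqogJdEBGAD+D8gGDgAGA2O8z3/2FxKRC4Hngc+A64C1wNHAjcDZInKyqi72la8NvAt0BZ4C7gV2AK2AC4APgfrluWCmiHeA7sC6EowzHPc/fS4O0zJwdaIDqCrELoJLHBEZjtvYH6Gqq3z93wdOAOqoar7X7yjga2A6cE5Bf29YA+BLIB9oq6r7vP7PABcCvVT1fxHmP1hVp5bT4hVLRGqo6p4Ezj8L+ExVL0hUDMXx9nJTVPXECpxnMm7bkFsB80rob8BEZ9VKldNCoBbQ0NdvFJAM/NmfGABUdTNwM3AEcBaAiDTF7XX+J1Ji8MYrNjGISE8ReV9EtonIThH5WkQu9Q1XEbkjZJwMr/9wX7/xIpItIt1F5HMR2QX8S0RmiMiCCPNtKiK5IjLK16+liLwsIhu96rFFIjK4uGUoCxHpJiIfiMgOb/k/FJFuEcpd51VT7RaRuSJyvNc93lcmrCpIRM4Tka+86W8TkSUicoU3bDbQEzjBV/U4O9q0vP6Xi8hCEdklIltEZI6IHF/MMqqI3CMio0VkNbAXaO8N6+kt83Zv+d8TkXYh4yeLyN0isk5EckTkIxFpHfrbEJE7vH7tvOnsAF71hqWLyD+9Ks+93vstIpLkG7+OiDwqIj953/9677tpHfI9LPct/3z/byRStZKIHCUiU8VVu+4SkS9E5NSQMgWxHyEi73jf148icps/xuqkWi5UNZABbAM2+/r1AeararRqhHdwRw5/9Lp745LJtNIGISKDcFVPacAVwCBc9cZhpZxkPWASMBE4DXgFmAB0FpE2IWXP894nerEcijs66ghcDwzEJdEpIjLQF3NBYrqjlDEWEpEOwBxc1dtw4CKgLjBHRDr6yl0GPAR8gFtH471lO7CY6Z8IvOTN40xc9eJ/fONdDXwFLMZVIXUnRrWIiIwFxuHWyzm4qsNPgBYBFnc4MAD4m/e+VkQG4L7/Hd60zsNVd37qfR8F/o7bOZmAW/73iP27e8tb5oHAgyKS4o1zGfAw7rfxDHAr8G/feA96y/V3oC9wJbAIb32JyPnA/bjfTH/gfOB14KBogYjIIbhq2o7ASG/6W4F3ROS0CKNMBT7CfV9verEMi7GsVZeq2itBL9wfUoGjcPXK9YFLgFxgZEjZXcDEYqb3CzDD+3xTwbRLGZsAWcB8IClGOQXuCOmX4fUf7us33us3KKRsLVwi/EdI/0UFy+J1PwtsBBqElHsfWOTrPsxbf7cFWMYs4KUYw1/HbSgO9PWrC/wKvOF1JwFr/LF6/c/ylnd8hO87w+v+G/BrMTHOxlV9RfvtFEyrFZAHPFCK71px7Ve1QvqvAj4M6VcX2AQ85HXXxyWPJ0LK/SX0twHc4fW7LqTshV7/HiH9b8EdxTT2upfGWj7gMWBhgPU529c91vu9tPL1SwZW+Kfli/3ikOktAWaV5j9W2V925FA5fAvsw210ngWeVtXHSjEdiWNMR+E2tM9oSDVWGeTi2kwKqeouYApwvogIgIi0x+3JTfAVPRWYAWwTkZSCF26Ps6OI1PWm96OqpqjqnXGItwcwXVW3+uL9DbdX3NPr1dx7vRYy7lve8sYyD6gvIi+JyOkicmAZYj0Zl6jGlXL8md53AYCIHAH8AXg5ZH3nAP/DrRtw1U+1CV/+12PMK7Q681TgR+DzkHnNAlKB47xy84DhInKziGSKaxvxmwd08qqeThaR9ADL3QP4Qn1tfqqahzv66FTwu/J5J6R7KcGOzKocSw6Vw2DcGUX9cVUTV4vIRSFlsnF75BGJOzOpIW4vFt97aauAGvjmGy8bvD9eqAnAoUAvr/tCYDtuA1ugMa5aZ1/Iq6DaoQHxdxCRzwb6hf1neTX13jf4C3jLuSnWxFV1Dq4q6VDcBnOjV4feoRSxlvX7Cl3Oxt77s4Sv89N984u4/MD6Es7rsAjzmesNL5jXn4GncUfX84ANIvKgLwlMAK4CjsXtNPwqIm+EtsuEiPUdC+Fn8/0a0r0HiHjaeVVnp7JWDksL9lxE5CNcHfO/RWSKqu70ynwIXCoiTTVyu8MAXLL/yOuejatmOAO3B1ZSBRu2iNdP+OzBtUn4RdtQRzs1bg7wE3CBiMwBhgKv+/dkce0vnwL/jDKNtcXEWRq/AgdH6H8w+zcSBd9FY38Bb6/Wf0JBRKr6OvC6iNTBJcd/AjNFpHkJj9j839eKEoxXGEpId0F71xjcDkuovd67f/m/8Q1vUsJ5rcbV90eSBaCqO7x4xojIYcDZwH1eLDepq+d5GnhaROoDp+DaICbjEkYksb5jJTwZ/G7YkUMlo+60vhtwfzZ/4+PDuAbnR0PPjhCRg3DXMKwC3vCmsxZXzz9CRLpHmpeInBkjlO9wf8rLCqp7ovgRaBfSb0CM8mG8P/XLuD97f1w1zYSQYjOBDsA3qjo/wqs8ToecAwwQkQMKenifz/CGgdtTz8YdAfidSQl2vlR1h6pOx23cmrI/we7BtcsU5wPc72NE0HkWYwXu+28bZX0XXE+zBNhJ+PKHdscyE3f0tCPKvMKOwLzqw/u9+Yf+/lDVLao6GXc2VNhwnznAcVL0DLJk4FzgK1XdXoLlqFbsyKESUtVpIjIP+JuIPKaqu1R1ubhTHJ8BPhSRp3B7ba1xF8EdCPRV7xoHzyjgSF/5D3CNh4fjzuTIxJ1xESkGFXca6RvAR974G3EX3TVW1du9opOA/ycitwBfACfh9vxLagJur/ApXJXYnJDht+GqGT4RkcdwG676uD/+4ap6CYC3R/k9cGfAdocWInJ2hP7/A+7CVaF8KCL/xO1J3gSkA3cCqGq+iPwd+I+460pew63f0biG9qh7/yJyJ24P+2PckU9z4FpcA/tGr9gyXDXjud5ybVfVsCMDVf1eRB4E/uIlsGm4I8duwLfehjIw7/u/BnhLRNJwG9lNXrzHAz+p6gOqukVEHgJuFpHtuN9YZ6DgdOcgRz8vAxfj1vP9uOt50nBtHgOBM1U1R0T+5y3XEtzvuCeubeoFABEZh6uO/B+umutIXBVlrCPnB3GN+++LyO3Ab7idsiMp4U5OtZPoFvHf84v9Z5y0ijDsFG/Y9SH9j8Orn8YdTv+I26AeGmUeqcA1wOe4H/5e3CH8M0CHADH+Ebfx2uG9vsZ3xgauvvVhXKLajjuE70bks5Wyi5nXPG+8e6MMb+7F/bO3HOtwZytd4CuTQYQzqKJML8srG+l1tlfmWPYn1Z246r1uEaY1yvsuduPO8DoR2AI8GOH7zvC6B+DqxtfhjhDW4Or4D/GNczCuIX67N+7sSNPylb8SVy25B1clMhvoXsx6UODuKMO6404i2OItWxZuh6C7r0wycA+unn6XN8/jCTkzif1n/KREmE9Nb/i3vtjnef1SvDL/xJ3au837LpYA1/qmMcyb9wZvGqtxG/+6vjKz8Z2t5PU7CreTtM1bxi+AU0PKRIwd97vOSvS2pDxedoW0MeVARLrijnQuUtUXEx1PRROR/8MdbfRQ1U8THY8pOUsOxpSRiLTEHZ19ijs6Oxp3UdheoJ2q5iQwvHInIsfijoK+xO15d8FVq60AjlfbyFRJ1uZgTNntwrV9XIRrB9mCq4oaXd0Tg2cH7nqBa3AXyW3AHTWMscRQddmRgzHGmDB2KqsxxpgwlhyMMcaEqRZtDg0bNtSMjIxEh2GMMVXKggULNqlqo0jDqkVyyMjIYP78+YkOwxhjqhQR+THaMKtWMsYYE8aSgzHGmDCWHIwxxoSx5GCMMSZMhSYHEXlORDaIyNIow0VEHhGRVSKyWEQ6V2R8xhhjnIo+chiPeyRgNKcBR3ivEcCTFRCTMcaYEBV6KquqflLMI/sGARO8+7F8ISIHxnjymTHGVCt5ebB3r++1R9mzex+5u3PI3ZND3h73nr83h/x97tWgRUtadWkb91gq23UOzdj/7GNwT9hqRoRnvIrICLynXrVoUS2f722MiaOCDe+ePUU3wMV1Ryuzb28++ft2o7k5SF4OmptDUn4Okp9Dsu4kKT+HZHJIIYcUce+pSTmkSg5pye5VIzmHGik51EzJIT11J7XSckivkUO6914/LYeU5EiPXd9v9qc30qpLtKfnll5lSw6RHkcZ8c6AqjoOGAeQmZlpdw80JkFUi+7xBt24lmaDHHScvNxcknX/xjk1yW2Aa9fc6Ta8IRvhiO9pORxQI4faNXYW7V8nh/SDckivsav4lRMiX4W9eenulZ/O3vza5Go6uaSTSwPyOJQcSWeHpJMv6WheOko6JKcjKemQko6k1iYpNZ2ktHSSa6TTuntxj3kvncqWHLJxz5It0JzyeXC8MVWCKuTmVszGtizTDX5zZ6VG6p4iG9vaNXZG3jCn53BArRyapOdQp2YOdQ7IoXbNHGp7G+xaaTnulZpDzVS38a+R4vbGU5L2FR9KiHxSyZPa5Es6+UnpaLLbKJOcDin1kZR0xNsoJ6Wmo2mu2w2v7b17r+TI70lJNagpQs0SR1fxKltymAaMFJFJuMczbrP2BlNe/Bve8t6TLcs45XFX/bS0/a8aNdx7zRp5HJC+i3q13Yb54Ho7qVPL2zDXchtl/0a8Vqq3YU7ZXzVSI3knacluTz0tyatO8apUkskhWXOQyJUBsUXZ2JJcD1KaRhleO+aGumi/WiQlpdq5/T4VmhxEZCL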
t/3z/byRStZKIHCUiU8VVu+4SkS9E5NSQMgWxHyEi73jf148icps/xuqkWi5UNZABbAM2+/r1AeararRqhHdwRw5/9Lp745LJtNIGISKDcFVPacAVwCBc9cZhpZxkPWASMBE4DXgFmAB0FpE2IWXP894nerEcijs66ghcDwzEJdEpIjLQF3NBYrqjlDEWEpEOwBxc1dtw4CKgLjBHRDr6yl0GPAR8gFtH471lO7CY6Z8IvOTN40xc9eJ/fONdDXwFLMZVIXUnRrWIiIwFxuHWyzm4qsNPgBYBFnc4MAD4m/e+VkQG4L7/Hd60zsNVd37qfR8F/o7bOZmAW/73iP27e8tb5oHAgyKS4o1zGfAw7rfxDHAr8G/feA96y/V3oC9wJbAIb32JyPnA/bjfTH/gfOB14KBogYjIIbhq2o7ASG/6W4F3ROS0CKNMBT7CfV9verEMi7GsVZeq2itBL9wfUoGjcPXK9YFLgFxgZEjZXcDEYqb3CzDD+3xTwbRLGZsAWcB8IClGOQXuCOmX4fUf7us33us3KKRsLVwi/EdI/0UFy+J1PwtsBBqElHsfWOTrPsxbf7cFWMYs4KUYw1/HbSgO9PWrC/wKvOF1JwFr/LF6/c/ylnd8hO87w+v+G/BrMTHOxlV9RfvtFEyrFZAHPFCK71px7Ve1QvqvAj4M6VcX2AQ85HXXxyWPJ0LK/SX0twHc4fW7LqTshV7/HiH9b8EdxTT2upfGWj7gMWBhgPU529c91vu9tPL1SwZW+Kfli/3ikOktAWaV5j9W2V925FA5fAvsw210ngWeVtXHSjEdiWNMR+E2tM9oSDVWGeTi2kwKqeouYApwvogIgIi0x+3JTfAVPRWYAWwTkZSCF26Ps6OI1PWm96OqpqjqnXGItwcwXVW3+uL9DbdX3NPr1dx7vRYy7lve8sYyD6gvIi+JyOkicmAZYj0Zl6jGlXL8md53AYCIHAH8AXg5ZH3nAP/DrRtw1U+1CV/+12PMK7Q681TgR+DzkHnNAlKB47xy84DhInKziGSKaxvxmwd08qqeThaR9ADL3QP4Qn1tfqqahzv66FTwu/J5J6R7KcGOzKocSw6Vw2DcGUX9cVUTV4vIRSFlsnF75BGJOzOpIW4vFt97aauAGvjmGy8bvD9eqAnAoUAvr/tCYDtuA1ugMa5aZ1/Iq6DaoQHxdxCRzwb6hf1neTX13jf4C3jLuSnWxFV1Dq4q6VDcBnOjV4feoRSxlvX7Cl3Oxt77s4Sv89N984u4/MD6Es7rsAjzmesNL5jXn4GncUfX84ANIvKgLwlMAK4CjsXtNPwqIm+EtsuEiPUdC+Fn8/0a0r0HiHjaeVVnp7JWDksL9lxE5CNcHfO/RWSKqu70ynwIXCoiTTVyu8MAXLL/yOuejatmOAO3B1ZSBRu2iNdP+OzBtUn4RdtQRzs1bg7wE3CBiMwBhgKv+/dkce0vnwL/jDKNtcXEWRq/AgdH6H8w+zcSBd9FY38Bb6/Wf0JBRKr6OvC6iNTBJcd/AjNFpHkJj9j839eKEoxXGEpId0F71xjcDkuovd67f/m/8Q1vUsJ5rcbV90eSBaCqO7x4xojIYcDZwH1eLDepq+d5GnhaROoDp+DaICbjEkYksb5jJTwZ/G7YkUMlo+60vhtwfzZ/4+PDuAbnR0PPjhCRg3DXMKwC3vCmsxZXzz9CRLpHmpeInBkjlO9wf8rLCqp7ovgRaBfSb0CM8mG8P/XLuD97f1w1zYSQYjOBDsA3qjo/wqs8ToecAwwQkQMKenifz/CGgdtTz8YdAfidSQl2vlR1h6pOx23cmrI/we7BtcsU5wPc72NE0HkWYwXu+28bZX0XXE+zBNhJ+PKHdscyE3f0tCPKvMKOwLzqw/u9+Yf+/lDVLao6GXc2VNhwnznAcVL0DLJk4FzgK1XdXoLlqFbsyKESUtVpIjIP+JuIPKaqu1R1ubhTHJ8BPhSRp3B7ba1xF8EdCPRV7xoHzyjgSF/5D3C
Nh4fjzuTIxJ1xESkGFXca6RvAR974G3EX3TVW1du9opOA/ycitwBfACfh9vxLagJur/ApXJXYnJDht+GqGT4RkcdwG676uD/+4ap6CYC3R/k9cGfAdocWInJ2hP7/A+7CVaF8KCL/xO1J3gSkA3cCqGq+iPwd+I+460pew63f0biG9qh7/yJyJ24P+2PckU9z4FpcA/tGr9gyXDXjud5ybVfVsCMDVf1eRB4E/uIlsGm4I8duwLfehjIw7/u/BnhLRNJwG9lNXrzHAz+p6gOqukVEHgJuFpHtuN9YZ6DgdOcgRz8vAxfj1vP9uOt50nBtHgOBM1U1R0T+5y3XEtzvuCeubeoFABEZh6uO/B+umutIXBVlrCPnB3GN+++LyO3Ab7idsiMp4U5OtZPoFvHf84v9Z5y0ijDsFG/Y9SH9j8Orn8YdTv+I26AeGmUeqcA1wOe4H/5e3CH8M0CHADH+Ebfx2uG9vsZ3xgauvvVhXKLajjuE70bks5Wyi5nXPG+8e6MMb+7F/bO3HOtwZytd4CuTQYQzqKJML8srG+l1tlfmWPYn1Z246r1uEaY1yvsuduPO8DoR2AI8GOH7zvC6B+DqxtfhjhDW4Or4D/GNczCuIX67N+7sSNPylb8SVy25B1clMhvoXsx6UODuKMO6404i2OItWxZuh6C7r0wycA+unn6XN8/jCTkzif1n/KREmE9Nb/i3vtjnef1SvDL/xJ3au837LpYA1/qmMcyb9wZvGqtxG/+6vjKz8Z2t5PU7CreTtM1bxi+AU0PKRIwd97vOSvS2pDxedoW0MeVARLrijnQuUtUXEx1PRROR/8MdbfRQ1U8THY8pOUsOxpSRiLTEHZ19ijs6Oxp3UdheoJ2q5iQwvHInIsfijoK+xO15d8FVq60AjlfbyFRJ1uZgTNntwrV9XIRrB9mCq4oaXd0Tg2cH7nqBa3AXyW3AHTWMscRQddmRgzHGmDB2KqsxxpgwlhyMMcaEqRZtDg0bNtSMjIxEh2GMMVXKggULNqlqo0jDqkVyyMjIYP78+YkOwxhjqhQR+THaMKtWMsYYE8aSgzHGmDCWHIwxxoSx5GCMMSZMhSYHEXlORDaIyNIow0VEHhGRVSKyWEQ6V2R8xhhjnIo+chiPeyRgNKcBR3ivEcCTFRCTMcaYEBV6KquqflLMI/sGARO8+7F8ISIHxnjymTHGVCt5ebB3r++1R9mzex+5u3PI3ZND3h73nr83h/x97tWgRUtadWkb91gq23UOzdj/7GNwT9hqRoRnvIrICLynXrVoUS2f722MiaOCDe+ePUU3wMV1Ryuzb28++ft2o7k5SF4OmptDUn4Okp9Dsu4kKT+HZHJIIYcUce+pSTmkSg5pye5VIzmHGik51EzJIT11J7XSckivkUO6914/LYeU5EiPXd9v9qc30qpLtKfnll5lSw6RHkcZ8c6AqjoOGAeQmZlpdw80JkFUi+7xBt24lmaDHHScvNxcknX/xjk1yW2Aa9fc6Ta8IRvhiO9pORxQI4faNXYW7V8nh/SDckivsav4lRMiX4W9eenulZ/O3vza5Go6uaSTSwPyOJQcSWeHpJMv6WheOko6JKcjKemQko6k1iYpNZ2ktHSSa6TTuntxj3kvncqWHLJxz5It0JzyeXC8MVWCKuTmVszGtizTDX5zZ6VG6p4iG9vaNXZG3jCn53BArRyapOdQp2YOdQ7IoXbNHGp7G+xaaTnulZpDzVS38a+R4vbGU5L2FR9KiHxSyZPa5Es6+UnpaLLbKJOcDin1kZR0xNsoJ6Wmo2mu2w2v7b17r+TI70lJNagpQs0SR1fxKltymAaMFJFJuMczbrP2BlNe/Bve8t6TLcs45XFX/bS0/a8aNdx7zRp5HJC+i3q13Yb54Ho7qVPL2zDXchtl/0a8Vqq3YU7ZXzVSI3knacluTz0tyatO8apUkskhWXOQyJUBsUXZ2JJcD1KaRhleO+aGumi/WiQlpdq5/T4VmhxEZCL
QC2goItnA7bhnHKOqT+GeldsfWAXk4B46bqogVdi3r+I3tiWdbnko2NhG2gD7u+vUgYMOijy8aD+lZto+t9ectrPo3nLBxtlXh11Qr50i7j2Zna6KRXNIUlcnLrk5kJcD/vf8PSVfWEkuutdc5L1JhA1ytLKx3muBRKpxNuWpos9WGlrMcMU9TcrEoBq+QUzk3m+0fuWhYIMZaWPr765bN9gGuqRlwsZJzadG6m7Skna6PWXcxpfckA1vxPedAcp47xqhUTIf2OO9IkmuGX2Pukaj4HvYUfe8a0NSavl80SbhKlu1Uqls2wbvvLO/O9JheGi/IGWCjpefv38vuSI2yPtKXp1aLJGiG75YG8V69cpxYxujTHJyCXcg8/cF3/gWbsyjbLB358COSOOWvFESJPoGuUYDSDm0RPXYEatSkmpCUnIpYjPGqRaPCRXJVKi8t+xOSir/DWdZxklLg5SK3E1QhbzdwTfaJd3LLnjPL0UWTUqLvacceK/a36920X5JaVZNYioFEVmgqpmRhlWLI4fWrWHChKL9Iv33QvsFKRNkPJHYG+TkqrQDl59Xgj3tnSXcwPve49YoWRvSDoTkQ0peJRKpbjupWvwljCmzavFPqF0bunZNdBTlTBXy94ZUgcRxo10wTn4pGgsiNkp6e8upB5e8SiTSsOSatrdtTAWqFskh4TTf1T1HrceOUp8daOPtG0fzSx6bv1GySB12HajZuJR72CHv1ihpTLVTPZLD3q2wZqrXUVwLcujwvOgb9qB75nm7Sx6zJIU3OhY2SjYMXo8ds8GylpuPMcaUUPVIDju+h0/Pit/0kmpE31NOP6hse9nWKGmMqQKqR3KodzScNtHXo7iWZyn62RoljTGmiOqxFUxOh/odEx2FMcZUG1YhbYwxJowlB2OMMWEsORhjjAljycEYY0wYSw7GGGPCWHIwxhgTxpKDMcaYMJYcjDHGhLHkYIwxJowlB2OMMWEsORhjjAljycEYY0wYSw7GGGPCWHIwxhgTpsTJQUTqiMhhImLPhjTGmGoqcHIQkdNFZCGwDfgeaO/1f0ZEziun+IwxxiRAoOQgImcCbwGbgJtCxlsNDIt7ZMYYYxIm6JHD7cDzqnoK8FDIsKVAu3gGZYwxJrGCJoejgcneZw0ZtgVoELeIjDHGJFzQ5PAb0DDKsAxgY1yiMcYYUykETQ7vA2NE5EBfPxWRGsBI4N14B2aMMSZxUgKWuwWYC6wAZuCqlkYDHYB6wJnlEZwxxpjECHTkoKpZQGdgOtAXyAN6AF8Ax6rq2qAzFJFTRWSFiKwSkdERhtcTkbdF5GsR+UZELg46bWOMMfER9MgBVc0GLi3LzEQkGXgcl2CygXkiMk1Vl/mKXQMsU9UzRKQRsEJEXlbVvWWZtzHGmOCCXufwkYi0jjLsSBH5KOD8ugGrVPUHb2M/CRgUUkaBA0REgDrAr0BuwOkbY4yJg6AN0r2AulGGHQD0DDidZsAaX3e218/vMdyps2uBJcB1qpofOiERGSEi80Vk/saNdrKUMcbEU0nurRR6fUOBPwA7Ak5DAky3H7AIOAToBDwmImGJSVXHqWqmqmY2atQo4OyNMcYEEbXNwWsILmgMVmCciGwPKVYLd3X0hwHnlw0c6utujjtC8LsYuE9VFVglIquB1rizpYwxxlSAWEcO+bizkvJwe/z+7oLXZuBJgjdUzwOOEJGWIpIGDAGmhZT5CegDICJNgKOAHwJO3xhjTBxEPXJQ1ReAFwBE5GPgKlX9tiwzU9VcERkJvAckA8+p6jcicqU3/CngLmC8iCzBJaWbVHVTWeZrjDGmZMTV3lRtmZmZOn/+/ESHYYwxVYqILFDVzEjDAl/n4E2oI66ap2boMFWdULrwjDHGVDaBkoN3T6V3gOMKennv/sMOSw7GGFNNBD2V9V7cbbl74BLDYOCPwMu4xuJu5RKdMcaYhAiaHPrhEsQXXne2qs5W1YuAD4DryiM4Y4wxiRE0OTQFflD
VPGA37qroAm8AA+IdmDHGmMQJmhx+AQ70Pv8IdPcNaxXPgIwxxiRe0LOVPsMlhOnAi8DtIpKBuyHeMMIvZDPGGFOFBU0Of8fd6wjg37jG6XOBdFxi+HP8QzPGGJMogZKDqn4PfO993gf81XsZY4yphkpyV9aIROQYEZkaj2CMMcZUDjGPHLwnt3UBWgDfq+pXvmGZwO1AfyD0bq3GGGOqsKhHDiLSHPgS+B/wKjBfRCaLSJqIPOMN+yNwP3B4RQRrjDGmYsQ6crgP9xyFW4GFQEvgZuC/uKOJF4DRqrq+vIM0xhhTsWIlhz7AHao6tqCHiKzAXRH9qKraVdHGGFNNxWqQbsT+22UU+J/3/lr5hGOMMaYyiJUckoC9If0KunPKJxxjjDGVQXHXOZwhIu183Um423QPFJFO/oKq+lycYzPGGJMgxSWHW6L0vy2kWwFLDsYYU03ESg4tKywKY4wxlUrU5KCqP1ZkIMYYYyqPMt8+wxhjTPVjycEYY0wYSw7GGGPCWHIwxhgTxpKDMcaYMCVKDiKSJCLtRKSniNQur6CMMcYkVuDkICLXAL8AXwMfAUd5/d8UkWvLJzxjjDGJECg5iMjlwMPAm7hnR4tv8KfAn+IemTHGmIQJeuTwF+B+VR0BhD4S9Fu8owhjjDHVQ9Dk0BJ4L8qwncCBcYnGGGNMpRA0OWwCMqIMOwr4OegMReRUEVkhIqtEZHSUMr1EZJGIfCMic4JO2xhjTHwETQ5vA7eJiP9Z0SoiDYHrcW0RxRKRZOBx4DSgDTBURNqElDkQeAIYqKptgf8LGKMxxpg4CZoc/h+wB1iKe0yoAo8Ay4E84M6A0+kGrFLVH1R1LzAJGBRS5jzgDVX9CUBVNwSctjHGmDgJlBxUdTOQCfwDSAW+x93R9TGgu6puCzi/ZsAaX3e218/vSKC+iMwWkQUiclHAaRtjjImT4h72U0hVtwN3ea/Skgj9NEJMXYA+QC3gfyLyhap+V2RCIiOAEQAtWrQoQ0jGGGNCBb3O4YHQx4KWUjZwqK+7ObA2QpmZqrpTVTcBnwAdQyekquNUNVNVMxs1ahSH0IwxxhQI2uZwMbBARJaKyA0iEloVFNQ84AgRaSkiacAQYFpImbeAk0QkRUTSgWNxbRvGGGMqSNDk0AQ4B1iFq1b6UUQ+EJELS3KPJVXNBUbirplYDryqqt+IyJUicqVXZjkwE1gMzAWeUdWlgZfIGGNMmYlqaJV/MSOI1AeGAucD3YEcYKqqXhj/8ILJzMzU+fPnJ2r2xhhTJYnIAlXNjDSsxLfsVtUtqvqEqp4A9Aa24E4/NcYYU00EPlupgFeNdDZwAdALyAWmxDcsY4wxiRT0bKUk77YXLwPrgeeAGsDVwMGqek45xmiMMaaCBT1yWAs0wjVI/xN4UVWzyisoY4wxiRU0OUwBJqjql+UZjDHGmMohUHJQ1WvKOxBjjDGVR9TkICI9gIWqusP7HJOqfhLXyIwxxiRMrCOH2cBxuAvRZhN+D6QC4g1LjmdgxhhjEidWcugNLPM+/5HoycEYY0w1EzU5qOoc3+fZFRKNMcaYSiHodQ4/iEjYnVG9Ye1E5If4hmWMMSaRgt4+IwN30VskNYHD4hKNMcaYSqEk91aK1uaQCWwteyjGGGMqi1insl4PXO91KvC2iOwNKVYLOAj3LGhjjDHVRKyzlX4APvQ+DwPmAxtDyuzBndH0TPxDM8YYkyixzlZ6C/dUNkQE4E5VXV1BcRljjEmgoLfPuLi8AzHGGFN5xGpzuA33iM613udYVFXvim9oxhhjEiXWkcMduGc5r/U+x6K4Z0sbY4ypBmK1OSRF+myMMab6s42+McaYMEFvn3GkiHTzddcSkX+IyNsiMrL8wjPGGJMIQY8cHgPO9nXfA/wVOAR4UETsYUDGGFONBE0OHYD/AohIEnARcJOqdgHuBkaUT3jGGGMSIWhyOBDY7H0+BqgPvO51zwYOj2tUxhhjEipoclgPtPI+nwJ
8r6prvO46QG68AzPGGJM4ga6QBqYB/xCRdsBw4GnfsPa4+zAZY4ypJoImh9G45zb0wyWKe33DBgKz4hyXMcaYBAp6b6WdwOVRhh0f14iMMcYkXNAjBwBE5CCgO+4ZDpuBL1T11/IIzBhjTOIETg4icjfu2gb/40L3iMhYVb017pEZY4xJmKBXSI8CbgZeAnoDR3vvLwE3i8i1QWcoIqeKyAoRWSUio2OU6yoieSJydrQyxhhjykfQI4crgYdV9XpfvxXAHBHZAVwNPFLcREQkGXgc6AtkA/NEZJqqLotQ7p/AewHjM8YYE0dBr3PIAN6JMuwdb3gQ3YBVqvqDqu7FPXt6UIRyfwamABsCTtcYY0wcBU0Om4F2UYa1Zf/V08VpBqzxdWd7/QqJSDNgMPBUrAmJyAgRmS8i8zduDH20tTHGmLIImhymAneJyIUikgogIikiMhS4E7eXH4RE6Kch3Q/h7tuUF2tCqjpOVTNVNbNRo0YBZ2+MMSaIoG0OY4COwAvAcyLyK+501mTgM1xjdRDZwKG+7ua4J835ZQKTRASgIdBfRHJV9c2A8zDGGFNGQS+C2y4iPYABwEm4xPArMAd4V1VD9/6jmQccISItgZ+BIcB5IfNqWfBZRMYD0y0xGGNMxYqZHESkIXAB7qZ7W4ApqnpTaWemqrnew4Hewx11PKeq34jIld7wmO0MxhhjKoZE2+kXkaOATwB/hX4ecLaqvlUBsQWWmZmp8+fPT3QYxhhTpYjIAlXNjDQsVoP03cBuoBdQG3f31bnAA/EO0BhjTOUSKzkcC9ymqp+o6i5V/Qa4AsgQETs9yBhjqrFYyaEZ7ipovxW401EPKbeIjDHGJFys5CC4Nga//ADjGWOMqeKKO5X17yKyydddcBHbXd61DgVUVYfFNzRjjDGJEis5/IS7+2qoH3G3zPALep2DMcaYKiBqclDVjAqMwxhjTCVibQfGGGPCWHIwxhgTxpKDMcaYMJYcjDHGhLHkYIwxJowlB2OMMWGCPuwHABHpAPQAGgBPq+ovItIKWK+q28sjQGOMMRUvUHIQkRrAS8BZuKukFXgb+AX4F/AdMLqcYjTGGFPBglYr3QOcDFwINKHos6DfBfrFOS5jjDEJFLRaaSjw/1T1FRFJDhm2GsiIa1TGGGMSKuiRQwNgeYxp1IhPOMYYYyqDoMlhNdA9yrBuhD/3wRhjTBUWNDlMAEaLyPlAmtdPRaQ3cD3wXHkEZ4wxJjGCJod/Ae8ALwIFz3H4DPgAmKmqj5ZDbMYYYxIkUIO0quYBQ0TkcdyZSY2BzbjEMKcc4zPGGJMAJboITlU/BT4tp1iMMcZUEnb7DGOMMWGCXiGdTzGPAlXV0OsfjDHGVFFBq5XuJDw5NABOwV3jMD6OMRljjEmwoA3Sd0Tq710t/TawLY4xGWOMSbAytTl4ZzE9AYyKSzTGGGMqhXg0SNcADorDdIwxxlQSQRukW0TonQa0A+4D5sczKGOMMYkVtEE6i8hnKwnwPXBN0BmKyKnAw0Ay8Iyq3hcy/HzgJq9zB3CVqn4ddPrGGGPKLmhyuDhCv93Aj8A8r+2hWF4D9uNAXyAbmCci01R1ma/YaqCnqm4RkdOAccCxAeM0xhgTB8UmB2+DvghYq6obyzi/bsAqVf3Bm/YkYBBQmBxU9XNf+S+A5mWcpzHGmBIK0iCtuDaFY+Iwv2bAGl93ttcvmktxT5ozxhhTgYo9clDVfBFZA9SOw/wkQr+IV157twO/FDgxyvARwAiAFi0itZcbY4wpraCnsj4NjBKRtGJLxpYNHOrrbg6sDS0kIh2AZ4BBqro50oRUdZyqZqpqZqNGjcoYljHGGL+gDdIHAH8AfhCRmcA6iu7xq6reHmA684AjRKQl8DMwBDjPX8A7bfYN4EJV/S5gfMYYY+IoanIQkR+Awd5ppDf7Bl0SobgCxSYHVc0VkZHAe7hTWZ9T1W9E5Epv+FPAbbj7Nj0hIgC5qpoZcHmMMcbEQaw
jhwzc1c+oatxu7a2qM4AZIf2e8n2+DLgsXvMzxhhTcvY8B2OMMWGKSw4xn+FgjDGmeiquQfrvIrIpwHRUVYfFIyBjjDGJV1xy6ATsCTAdO8IwxphqpLjkcKaqzq2QSIwxxlQa1iBtjDEmTNCL4IwxUezbt4/s7Gx2796d6FCMiahmzZo0b96c1NTUwONYcjCmjLKzsznggAPIyMjAu3DTmEpDVdm8eTPZ2dm0bNky8HhRq5VUNcnaG4wp3u7du2nQoIElBlMpiQgNGjQo8ZGttTkYEweWGExlVprfpyUHY4wxYSw5GGPIysqiXbt2JRpn+PDhvP7662WeblZWFq+88kqJ5h3Jxo0bSU1N5emnny7Sv06dOkW6x48fz8iRIwu7J0yYQLt27Wjbti1t2rRh7NixZY5l5syZHHXUUbRq1Yr77rsvYpktW7YwePBgOnToQLdu3Vi6dGnhsIyMDNq3b0+nTp3IzNx/39E77riDZs2a0alTJzp16sSMGTMiTTouLDkYYxIqXsnhtdde47jjjmPixImBx3n33Xd56KGHmDVrFt988w0LFy6kXr16ZYojLy+Pa665hnfffZdly5YxceJEli1bFlbu3nvvpVOnTixevJgJEyZw3XXXFRn+8ccfs2jRIubPn1+k//XXX8+iRYtYtGgR/fv3L1OssVhyMKaK27lzJwMGDKBjx460a9eOyZMnAzBv3jyOP/54OnbsSLdu3di+fTtZWVmcdNJJdO7cmc6dO/P555+HTS8vL48bbriBrl270qFDh8I9cVVl5MiRtGnThgEDBrBhw4aI8SxYsICOHTvSvXt3Hn/88cL+0eY9evRoPv30Uzp16sSDDz4YM8ZOnTpFXQ8TJ07k/vvvJzs7m59//jnQuvvHP/7B2LFjOeSQQwB3yufll18eaNxo5s6dS6tWrTj88MNJS0tjyJAhvPXWW2Hlli1bRp8+fQBo3bo1WVlZrF+/vkzzjic7ldWYOBo1ChYtiu80O3WChx6KPnzmzJkccsghvPPOOwBs27aNvXv3cu655zJ58mS6du3Kb7/9Rq1atWjcuDHvv/8+NWvWZOXKlQwdOjRsz/TZZ5+lXr16zJs3jz179nDCCSdwyimn8NVXX7FixQqWLFnC+vXradOmDZdcEv54l4svvphHH32Unj17csMNNxT2jzbv++67j7FjxzJ9+nQAcnJyosa4KMrKXbNmDb/88gvdunXjnHPOYfLkyfzlL38pdt0uXbqULl26FFvu5Zdf5t///ndY/1atWoVVrf38888ceuj+B142b96cL7/8Mmzcjh078sYbb3DiiScyd+5cfvzxR7Kzs2nSpAkiwimnnIKIcMUVVzBixIjC8R577DEmTJhAZmYm999/P/Xr1y82/tKw5GBMFde+fXv+9re/cdNNN3H66adz0kknsWTJEpo2bUrXrl0BqFu3LuCOMkaOHMmiRYtITk7mu+/CH7Y4a9YsFi9eXLjR27ZtGytXruSTTz5h6NChJCcnc8ghh/DHP/4xbNxt27axdetWevbsCcCFF17Iu+++C7iLBYubd0nK+U2aNIlzzjkHgCFDhnDppZfGTA4lPXvn/PPP5/zzzw9UVjX8VnOR5jd69Giuu+46OnXqRPv27TnmmGNISXGb5P/+978ccsghbNiwgb59+9K6dWt69OjBVVddxa233oqIcOutt/LXv/6V5557rkTLEpQlB2PiKNYefnk58sgjWbBgATNmzGDMmDGccsopnHnmmRE3SA8++CBNmjTh66+/Jj8/n5o1a4aVUVUeffRR+vXrV6T/jBkzit2oqmrUMkHmXZJyfhMnTmT9+vW8/PLLAKxdu5aVK1dyxBFHUKtWLfbu3UtaWhoAv/76Kw0bNgSgbdu2LFiwIGKi8yvJkUPz5s1Zs2ZNYXd2dnZhtZVf3bp1ef755wG33lq2bFl4kVpB+caNGzN48GDmzp1Ljx49aNKkSeH4l19+OaeffnrsFVMG1uZgTBW3du1
a0tPTueCCC/jb3/7GwoULad26NWvXrmXevHkAbN++ndzcXLZt20bTpk1JSkrixRdfJC8vL2x6/fr148knn2Tfvn0AfPfdd+zcuZMePXowadIk8vLyWLduHR9//HHYuAceeCD16tXjs88+AyjcWANR533AAQewffv2YsuBq5sPtWLFCnbu3MnPP/9MVlYWWVlZjBkzhkmTJgHQs2dPXnrpJQB27drFq6++Su/evQEYM2YMN954I7/88gsAe/bs4ZFHHgmbx/nnn1/YCOx/RTpbq2vXrqxcuZLVq1ezd+9eJk2axMCBA8PKbd26lb179wLwzDPP0KNHD+rWrcvOnTsL18fOnTuZNWtW4Rlf69atKxx/6tSpJT7DrCTsyMGYKm7JkiXccMMNJCUlkZqaypNPPklaWhqTJ0/mz3/+M7t27aJWrVp88MEHXH311fzpT3/itddeo3fv3tSuXTtsepdddhlZWVl07twZVaVRo0a8+eabDB48mI8++oj27dtz5JFHFlYdhXr++ee55JJLSE9PL3L0EW3eHTp0ICUlhY4dOzJ8+PCo5TZt2hSxymbixIkMHjy4SL8//elPDBkyhFtvvZWHH36YK664gkceeQRV5aKLLqJHjx4A9O/fn/Xr13PyyScXHvVEakcpiZSUFB577DH69etHXl4el1xyCW3btgXgqafcE5GvvPJKli9fzkUXXURycjJt2rTh2WefBWD9+vWFy5Obm8t5553HqaeeCsCNN97IokWLEBEyMjLCTtuNJ4m0squazMxMDW1UM6aiLF++nKOPPjrRYVR706dP54cffuDaa69NdChVUqTfqYgsUNXMSOXtyMEYUyWUZ/26CWdtDsYYY8JYcjDGGBPGkoMxxpgwlhyMMcaEseRgjDEmjCUHY35H+vfvz9atW2OWue222/jggw9KNf3Zs2cHOquoV69eYfd0CvXQQw+Rk5NTqjj8Bg0aRPfu3Yv0i3S7cf+tvb/77jv69+9Pq1atOProoznnnHPKfFO8X3/9lb59+3LEEUfQt29ftmzZErHcww8/XHgL8YciXHI/duxYRIRNmzYV9lu8eDHdu3enbdu2tG/fPi7PM7fkYMzvgKqSn5/PjBkzOPDAA2OWvfPOOzn55JMrJrAY4pEctm7dysKFC9m6dSurV68ONM7u3bsZMGAAV111FatWrWL58uVcddVVbNy4sUyx3HffffTp04eVK1fSp0+fiM95WLp0Kf/5z3+YO3cuX3/9NdOnT2flypWFw9esWcP7779PixYtCvvl5uZywQUX8NRTT/HNN98we/ZsUlNTyxQrWHIwplp44IEHaNeuHe3atSvc28zKyuLoo4/m6quvpnPnzqxZs4aMjIzCPc677rqL1q1b07dvX4YOHVr4kBv/XnVGRga33347nTt3pn379nz77beAuy318ccfzzHHHMPxxx/PihUrYsa3a9cuhgwZQocOHTj33HPZtWtX4bCrrrqKzMxM2rZty+233w7AI488wtq1a+ndu3fhrS4ilQN3pDNt2rSI850yZQpnnHEGQ4YMKbydRnFeeeUVunfvzhlnnFHYr3fv3mW+VcVbb73FsGHDABg2bBhvvvlmWJnly5dz3HHHkZ6eTkpKCj179mTq1KmFw6+//nr+9a9/Fbl/1axZs+jQoQMdO3YEoEGDBiQnJ5cpVrCL4IyJrwWjYMui+E6zfifo8lD0WS5YwPPPP8+XX36JqnLsscfSs2dP6tevz4oVK3j++ed54okniowzf/58pkyZwldffUVubi6dO3eOeuvqhg0bsnDhQp544gnGjh3LM888Q+vWrfnkk09ISUnhgw8+4Oabb2bKlClRY3zyySdJT09n8eLFLF68mM6dOxcOu+eeezjooIPIy8ujT58+LF68mGuvvZYHHniAjz/+uPAmeZHKdejQgTvvvDPqfCdOnMjtt99OkyZNOPvssxkzZkzUsgWC3sZ7+/btnHTSSRGHvfLKK7Rp06ZIv/Xr19O
0aVMAmjZtGvF5GO3ateOWW25h8+bN1KpVixkzZhQ+CW7atGk0a9asMAkU+O677xAR+vXrx8aNGxkyZAg33nhjsfEXx5KDMVXcZ599xuDBgwvvQXTWWWfx6aefMnDgQA477DCOO+64iOMMGjSIWrVqARTZSw511llnAdClSxfeeOMNwN0cb9iwYaxcuRIRKbxJXzSffPJJ4W0vOnToQIcOHQqHvfrqq4wbN47c3FzWrVvHsmXLigwvabkC69evZ9WqVZx44omICCkpKSxdupR27dpFvHNsSW/jfcABB0R9vkRpHX300dx000307duXOnXq0LFjR1JSUsjJyeGee+5h1qxZYePk5uby2WefMW/ePNLT0+nTpw9dunQpfJBQaVV4chCRU4GHgWTgGVW9L2S4eMP7AznAcFVdWNFxGlMqMfbwy0us+6NFurFeceOEqlGjBgDJycnk5uYCcOutt9K7d2+mTp1KVlYWvXr1KnY6kTa+q1evZuzYscybN4/69eszfPjwiI2pQcv5TZ48mS1bthTeBvu3335j0qRJ3H333TRo0KBIg3DobbznzJlT7PKU9MihSZMmrFu3jqZNm7Ju3ToaN24ccdxLL72USy+9FICbb76Z5s2b8/3337N69erCo4bs7Gw6d+7M3Llzad68OT179iyMv3///ixcuLDMyaFC2xxEJBl4HDgNaAMMFZE2IcVOA47wXiOAJysyRmOqmh49evDmm2+Sk5PDzp07mTp1atSNVoETTzyRt99+m927d7Njx47Cp8gFtW3bNpo1awbA+PHjA8VYcPvupUuXsnjxYsBtsGvXrk29evVYv3594YOBoOitvGOVGzNmTJF6+QITJ05k5syZhbfxXrBgQWG7Q69evZg8eXLhLbPHjx9f2LZx3nnn8fnnnxdZJzNnzmTJkiVFpl9w5BDpFZoYAAYOHMgLL7wAwAsvvMCgQYMirquC6qaffvqJN954g6FDh9K+fXs2bNhQuCzNmzdn4cKFHHzwwfTr14/FixeTk5NDbm4uc+bMiTj/kqroI4duwCpV/QFARCYBgwD/07cHARPU7dp8ISIHikhTVV0XPjljTOfOnRk+fDjdunUD3C23jznmGLKysqKO07VrVwYOHEjHjh057LDDyMzMpF69eoHneeONNzJs2DAeeOCBYh+UA64x+eKLL6ZDhw506tSpMNaOHTtyzDHH0LZtWw4//HBOOOGEwnFGjBjBaaedRtOmTfn444+jlluyZEnY8xKysrL46aefilSptWzZkrp16/Lll19y+umns2DBArp06UJycjJ/+MMfCm+nXatWLaZPn86oUaMYNWoUqampdOjQgYcffjjw+olk9OjRnHPOOTz77LO0aNGC1157DXDP47jsssuYMWMG4G43vnnzZlJTU3n88ceLfQxo/fr1+ctf/kLXrl0REfr378+AAQPKFCtU8C27ReRs4FRVvczrvhA4VlVH+spMB+5T1c+87g+Bm1R1fsi0RuCOLGjRokWXH3/8sYKWwpiiquotu3fs2EGdOnXIycmhR48ejBs3rkhDcVXRr18/3nvvvUSHUelV9lt2R2rxCc1OQcqgquOAceCe51D20Iz5fRkxYgTLli1j9+7dDBs2rEomBsASQzmp6OSQDRzq624OrC1FGWNMGb3yyiuJDsFUYhV9Edw84AgRaSkiacAQIPTqlWnAReIcB2yz9gZT2VWHJyqa6qs0v88KPXJQ1VwRGQm8hzuV9TlV/UZErvSGPwXMwJ3Gugp3KuvFFRmjMSVVs2ZNNm/eTIMGDUp8rrwx5U1V2bx5MzVr1izRePYMaWPKaN++fWRnZ8flZmfGlIeaNWvSvHnzsHsuVaYGaWOqndTU1MILrYypLuzGe8YYY8JYcjDGGBPGkoMxxpgw1aJBWkS2ASuLLVg69YBtCZ5OacYt6ThByzcENhVb6vclXr+R8pKI+MprnlX1/1jS8UpStiz/ycNUtVHEIapa5V/AuMo+7bJMpzT
jlnScoOWB+Yn+vivbqzx/f1U1vvKaZ1X9P5Z0vBKWLZf/ZHWpVnq7Cky7LNMpzbglHac812F1V9nXXSLiK695VtX/Y0nHS/hvqlpUK5mKIyLzNcp50caYilde/8nqcuRgKs64RAdgjCmiXP6TduRgjDEmjB05GGOMCWPJwRhjTBhLDsYYY8JYcjBxIyKHi8izIvJ6omMx5vdIRGqLyAsi8h8ROb8s07LkYAAQkedEZIOILA3pf6qIrBCRVSIyOtY0VPUHVb20fCM15velhP/Ns4DXVfVyYGBZ5mvJwRQYD5zq7yEiycDjwGlAG2CoiLQRkfYiMj3k1bjiQzbmd2E8Af+buMcqr/GK5ZVlpvY8BwOAqn4iIhkhvbsBq1T1BwARmQQMUtV/AKdXcIjG/C6V5L8JZOMSxCLKuPNvRw4mlmbs3wsB98NrFq2wiDQQkaeAY0RkTHkHZ8zvWLT/5hvAn0TkScp4Cw47cjCxRHogctSrJlV1M3Bl+YVjjPFE/G+q6k7g4njMwI4cTCzZwKG+7ubA2gTFYozZr9z/m5YcTCzzgCNEpKWIpAFDgGkJjskYUwH/TUsOBgARmQj8DzhKRLJF5FJVzQVGAu8By4FXVfWbRMZpzO9Nov6bduM9Y4wxYezIwRhjTBhLDsYYY8JYcjDGGBPGkoMxxpgwlhyMMcaEseRgjDEmjCUHU2oiMlxENMrr5BJMJ0tExpdjqKHz88eZKyKrReR5EWke5/lkePMY7us3XEQuiVC2YF1mxDOGYuLrFWFd/CQiT4hI/VJOc5SInBXvWE3Fs3srmXj4P9zl/H7LEhFICYwHnsb9BzoBfweOF5FOqrorTvNYB3QHvvf1G+7N87mQsu94ZdfFad4lcS3uitt0oA9wE+7WDGeUYlqjgM9wN4AzVZglBxMPi1R1VaKDKKGfVfUL7/NnIrIdlzBOI04bNlXdA3xRbEFXdiOwMR7zLYXlvnXxkfdsjstE5GBV/SVBMZkEs2olU25E5BQRmSEi60QkR0SWishfvQeVxBrvYO9Rh2tFZI83fpEHColIuoj806sS2uu93yIipf1Nz/PeW3nTbyoiE0RkkxfDYhG5oCRxhlYrichsoCdwgq8qZ7Y3rEi1krfeFkRYN0296p9Rvn4tReRlEdnoxbFIRAaXcj0ALPTeW/jm0VVEXvdu37BL3BPI7hWRWr4yWcBhwPm+5RvvG95RRKaJyBZvGv8VkZPKEKcpR3bkYOIhWUT8vyVV1TzgcOBD4FFgN5AJ3AE0AmI9cvRF3EbmBtw965vgqjvSAbx5vYd7AtZdwBLgOOBW4CDgr6VYhpbe+1YRqQ3MAeoDN3sxXAC8KCLpqjouSJwRXA28BCQDV3j9fotSdgIwUUTaqKq/iu48730igIgcCnwJbACuxx19nAtMEZEzVbU0N2PLwD1FLMvXrwXuATLjge1AW+A23Hc8xCszGJgBfI37nvHiQUQ6A58CXwGXAzm427t/ICLHq2pYIjQJpqr2slepXrj6c43w+ixCWcHtjNwCbAGSfMOygPG+7h3AtTHme6E3nx4h/W8B9gKNi4lbgXu8eGriEstyYCdwCO6GZgr0ChnvA9xGODlgnBnedIb7+s2Osn4K1mWG110L2Ab8I6TcImCGr/tZ3Aa4QUi593HVfbHWQy9vnqd46+IA4ExcwhobY7yC7/ICIN8/b++7fCnCOB966zjN1y/Z6/dmon/L9gp/WbWSiYfBQFff61IorAJ5WkR+xG209wF3AwcCsZ45PQ+4QUSuE/e86tAHm5wK/Ah8LiIpBS9gFpCK29gX52Yvnl24O17uA/qr6lqgB65NYnbIOC/hjnraBIyz1NQ1ik/BVdEIgIi0BzrijioKnIrbW98Wsi7eAzqKSN0As3sPt/y/AVOBT3BHQ4VEpK5Xjfc9sMcr/yIuURwRa+Je1VNP4DUg3xej4BJujwAxmgpmycHEw1JVne97rfDq/qfhnjV9N/BHXOK4xxunZozpneu
NeyOwGPhZRG7ztSc0xlXn7At5zfWGNwgQ83NePMcADVW1g6rO8YYdROSzhn7xDQ8SZ1lNwJ011MvrvhBXpfOWr0xj4CLC18W/veFB1sU1uHVxMjAZGICrovN7HlcN9AjQ1yt/jTcs1ncJbn0le9MMjXMkUD+O68zEibU5mPLyB1wbw4Wq+lJBTxEp9vRIVd2A2/BcIyJHAcNwp5puBJ4ENgOrgXOiTCIrQHzrVHV+lGG/AkdF6H+w9745YJxlNQf4CbhAROYAQ4HXteiptptxdfn/jDKNIE8H+65gXYjIR7i2k5tF5HlVXSMiNXEPr79DVR8uGMk7kgliK6766XGKHvUUUtX8gNMyFcSSgykvBY2y+wp6iEgqcH5JJqKqK3AbqiuBdl7vmcCfgB2q+m0cYg01B/g/ETlBVf/r638ers1hecA4I9mDq9svlqqqiLyMS0BTcY+CDN24zsRdH/GNxuH6DG+eo3ANx6O9edfA7fnvCyk+PMIk9uDaS/zT3Ckin+KqxBZaIqgaLDmY8rIc1y5wj4jk4TYs1xc3kojUw9VDvwx86403CHfm0Cyv2Mu4h6h/KCL3486OScMdrQwEzlTVnDLEPh64DnhDRG7BXeB3Pq465QpVzQsYZyTLgKtF5FzcxXHbvcQSzQRgDPAU7oyoOSHDb8NVp30iIo/hjprq4xLU4aoadjV2cVT1axGZAlwqIveo6loR+QL4q4isAzYBlwDNoizfSSJyOq4abpOqZgF/wbVlvCciz+Kq7RoCnXEN/LHOXjOJkOgWcXtV3Rf7z7BpFWV4J9zVsjm4DeydwGX4zsrxymXhna2E20t9GvgGdzbQb7iG3/NCpl0Td7rkt7i91V+9cncAKcXErcDdxZRpimtw3eRNfzFwgW94sXES+Wylg3ENyNu9YbND1mVGhFjmecPujRJrc+AZ4Gdcw/863NlKFxSzjL286Z4cYdjRuNNZH/Yty7te3BuAx3BtE0XO6gJa46q5crxh40OmOckbf4/3m5iGOxEg4b9nexV92WNCjTHGhLEzBIwxxoSx5GCMMSaMJQdjjDFhLDkYY4wJY8nBGGNMGEsOxhhjwlhyMMYYE8aSgzHGmDCWHIwxxoT5/7ZBxrESkymUAAAAAElFTkSuQmCC", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fpr_logreg_scaled, tpr_logreg_scaled, thresholds = metrics.roc_curve(y_clf, ypred_logreg_scaled, pos_label=1)\n", "auc_logreg_scaled = metrics.auc(fpr_logreg_scaled , tpr_logreg_scaled )\n", "\n", "plt.semilogx(fpr_logreg_scaled, tpr_logreg_scaled, '-', color='blue', \n", " label='scaled data; AUC = %0.3f' % auc_logreg_scaled)\n", "plt.semilogx(fpr_logreg, tpr_logreg, '-', color='orange', \n", " label='original data; AUC = %0.3f' % auc_logreg)\n", "\n", "plt.xlabel('False Positive Rate', fontsize=16)\n", "plt.ylabel('True Positive Rate', fontsize=16)\n", "plt.title('ROC curve: Logistic regression', fontsize=16)\n", "plt.legend(loc=\"lower right\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In a cross-validation setting, we ignore the samples from the test fold when training the classifier. This also means that scaling should be done on the training data only. \n", "\n", "In scikit-learn, we can use a scaler to make centering and scaling happen independently on each feature by computing the relevant statistics on the samples *in the training set*. \n", "The mean and standard deviation will be stored to be used on the test data.\n", "\n", "**Question** Rewrite the cross_validate method to include a scaling step." 
] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "def cross_validate_clf_with_scaling(design_matrix, labels, classifier, cv_folds):\n", "    \"\"\" Perform a cross-validation with per-fold scaling and return the predictions.\n", "    \n", "    Parameters:\n", "    -----------\n", "    design_matrix: (n_samples, n_features) np.array\n", "        Design matrix for the experiment.\n", "    labels: (n_samples, ) np.array\n", "        Vector of labels.\n", "    classifier: sklearn classifier object\n", "        Classifier instance; must have the following methods:\n", "        - fit(X, y) to train the classifier on the data X, y\n", "        - predict_proba(X) to apply the trained classifier to the data X and return probability estimates \n", "    cv_folds: sklearn cross-validation object\n", "        Cross-validation iterator.\n", "    \n", "    Return:\n", "    -------\n", "    pred: (n_samples, ) np.array\n", "        Vector of predictions (same order as labels).\n", "    \"\"\"\n", "    pred = np.zeros(labels.shape)\n", "    for tr, te in cv_folds:\n", "        # Fit the scaler on the training fold only, then scale both folds with it\n", "        scaler = preprocessing.StandardScaler()\n", "        Xtr = scaler.fit_transform(design_matrix[tr, :])\n", "        Xte = scaler.transform(design_matrix[te, :])\n", "        classifier.fit(Xtr, labels[tr])\n", "        pred[te] = classifier.predict_proba(Xte)[:, 1]\n", "    return pred" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question** Now use the cross_validate_clf_with_scaling method to cross-validate the logistic regression on our data." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.9538745387453874\n" ] } ], "source": [ "clf = linear_model.LogisticRegression(C=1e6) \n", "ypred_logreg_scaled_ = cross_validate_clf_with_scaling(X_clf, y_clf, clf, folds_clf.split(X_clf, y_clf))\n", "print(metrics.accuracy_score(y_clf, np.where(ypred_logreg_scaled_ > 0.5, 1, 0)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question** Again, compare the AUROC and ROC curves with those obtained previously. What do you conclude?" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEhCAYAAACUW2yNAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAABILklEQVR4nO3dd3wVVfr48c+TQgld2oJIExEFEgxBhRUQEVBQEMUFFAWVVUDEyiL6W3VtX11ZsaxiByugKBZERaVbKSIgLE2CIggIgpCEkPL8/jiT680tyQ25SSA879drXsnMnDnzzM3NnJlzZs4RVcUYY4zxF1PWARhjjDnyWOFgjDEmiBUOxhhjgljhYIwxJogVDsYYY4JY4WCMMSaIFQ5lSESGiYj6TYdEZJOIPCgilcJs00FE3haRHSKSKSKpIvK0iBwfJn28iIwSkS9EZK+3zWYReUlEkkv2CI9s3mf3WinuL+/v3bSI21wdjbwMiMh8EZlf1nEcDeLKOgADwKXAVqAa0B8Y7/1+g38iEbkCmAwsBm4EtgGnAP8ABojIuaq60i99FeAjoAPwDPAgcABoAQwBPgdqleSBmXw+BDoC24uwzTDc/+lLUcjLwKiyDuBoIfYSXNkRkWG4k/1JqrrRb/mnwF+Bqqqa6y07GfgemAX8LW+5t6428A2QC7RW1Sxv+QvAFcDZqvpViP33V9WZJXR4hRKRiqqaWYb7TwUWq+qQsoqhMN5VbpyqnlWK+4zFnRuyS2FfZfodMOFZtdKRaTlQGajjt+wmIBa4wb9gAFDV3cAdwEnAxQAi0gB31fl8qILB267QgkFEuorIpyKyT0TSROR7EbnGb72KyD0B2zT1lg/zWzZFRLaKSEcR+VJEMoB/i8hsEVkWYr8NRCRbRG7yW9ZMRF4XkV1e9dgKEelf2DEUh4icLiKficgB7/g/F5HTQ6S70aumOigi34pIJ29+il+aoKogEblMRL7z8t8nIqtE5Dpv3XygK/BXv6rH+eHy8pb/XUSWi0iGiPwuIgtEpFMhx6gi8oCI3C4im4FDQFtvXVfvmPd7x/+JiLQJ2D5WRO4Xke0iki4ic0WkVeB3Q0Tu8Za18fI5ALzprUsQkYe9Ks9D3s87RSTGb/uqIvKkiPzk/f13eH+bVgF/h7V+x7/U/zsSqlpJRE4WkZniql0zRORrETkvIE1e7C
eJyIfe32uLiNzlH2N5Ui4PqhxoCuwDdvst6w4sVdVw1Qgf4u4czvHmu+EKk/cPNwgR6YereqoAXAf0w1VvNDnMLGsA04CpwPnAG8ArQLKInBqQ9jLv51QvlhNwd0dJwM1AX1wh+raI9PWLOa9guucwY/QRkURgAa7qbRhwJVAdWCAiSX7phgOPAZ/hPqMp3rHVLCT/s4DXvH1chKtefN5vu1HAd8BKXBVSRwqoFhGRCcBzuM/lb7iqw4VA4wgOdxjQB7jN+7lNRPrg/v4HvLwuw1V3LvL+Hnn+hbs4eQV3/J9Q8PfuPe+Y+wITRSTO22Y48Djuu/EC8E/gEb/tJnrH9S+gBzACWIH3eYnI5cB/cN+Z3sDlwAzguHCBiEhDXDVtEjDay38v8KGInB9ik5nAXNzf610vlqEFHOvRS1VtKqMJ9w+pwMm4euVawNVANjA6IG0GMLWQ/H4FZnu/j8vL+zBjEyAVWArEFJBOgXsCljX1lg/zWzbFW9YvIG1lXEH4fwHLV+Qdizf/IrALqB2Q7lNghd98E+/zuyuCY0wFXitg/QzciaKm37LqwB7gHW8+BvjZP1Zv+cXe8U4J8fdu6s3fBuwpJMb5uKqvcN+dvLxaADnAo4fxt1Zc+1XlgOUbgc8DllUHfgMe8+Zr4QqPpwPS3RL43QDu8ZbdGJD2Cm95l4Dld+LuYup586sLOj7gv8DyCD7P+X7zE7zvSwu/ZbHAOv+8/GK/KiC/VcCcw/kfO9Inu3M4MvwPyMKddF4EnlXV/x5GPhLFmE7GnWhf0IBqrGLIxrWZ+KhqBvA2cLmICICItMVdyb3il/Q8YDawT0Ti8ibcFWeSiFT38tuiqnGqem8U4u0CzFLVvX7x/oG7Ku7qLWrkTW8FbPued7wFWQLUEpHXROQCEalZjFjPxRVUzx3m9h97fwsAROQk4ETg9YDPOx34CvfZgKt+qkLw8c8oYF+B1ZnnAVuALwP2NQeIB8700i0BhonIHSKSIq5txN8SoJ1X9XSuiCREcNxdgK/Vr81PVXNwdx/t8r5Xfj4MmF9NZHdmRx0rHI4M/XFPFPXGVU2MEpErA9JsxV2RhyTuyaQ6uKtY/H4ebhVQbb/9RstO7x8v0CvACcDZ3vwVwH7cCTZPPVy1TlbAlFftUJvoO47QTwP9yp9PeTXwfu70T+Ad528FZa6qC3BVSSfgTpi7vDr0xMOItbh/r8DjrOf9fJHgz/wCv/2FPH5gRxH31STEfr711uft6wbgWdzd9RJgp4hM9CsEXgFGAmfgLhr2iMg7ge0yAQr6GwvBT/PtCZjPBEI+dn60s0dZjwyr865cRGQuro75ERF5W1XTvDSfA9eISAMN3e7QB1fYz/Xm5+OqGS7EXYEVVd6JLeT7E34ycW0S/sKdqMM9GrcA+AkYIiILgMHADP8rWVz7yyLg4TB5bCskzsOxB/hLiOV/4c+TRN7fop5/Au+q1v+BgpBUdQYwQ0Sq4grHh4GPRaRREe/Y/P9e64qwnS+UgPm89q7xuAuWQIe8n/7H/4Pf+vpF3NdmXH1/KKkAqnrAi2e8iDQBBgAPebGMU1fP8yzwrIjUAnri2iCm4wqMUAr6GyvBhcExw+4cjjDqHusbi/tn8298fBzX4Pxk4NMRInIc7h2GjcA7Xj7bcPX814pIx1D7EpGLCghlPe6fcnhedU8YW4A2Acv6FJA+iPdP/Trun703rprmlYBkHwOJwA+qujTEVBKPQy4A+ohItbwF3u8XeuvAXalvxd0B+LuIIlx8qeoBVZ2FO7k14M8CNhPXLlOYz3Dfj2sj3Wch1uH+/q3DfN5579OsAtIIPv7A+YJ8jLt7OhBmX0F3YF714X+8/Qd+/1DV31V1Ou5pqKD1fhYAZ0r+J8higYHAd6q6vwjHUa7YncMRSFXfF5ElwG0i8l9VzVDVteIecXwB+FxEnsFdtbXCvQRXE+ih3jsOnpuAln7pP8M1HjbHPcmRgnviIlQMKu4x0neAud
72u3Av3dVT1bu9pNOA/ycidwJfA51xV/5F9QruqvAZXJXYgoD1d+GqGRaKyH9xJ65auH/85qp6NYB3RbkJuDfCdofGIjIgxPKvgPtwVSifi8jDuCvJcUACcC+AquaKyL+A58W9V/IW7vO9HdfQHvbqX0TuxV1hz8Pd+TQCxuAa2Hd5ydbgqhkHese1X1WD7gxUdZOITARu8Qqw93F3jqcD//NOlBHz/v7XA++JSAXcSfY3L95OwE+q+qiq/i4ijwF3iMh+3HcsGch73DmSu5/Xgatwn/N/cO/zVMC1efQFLlLVdBH5yjuuVbjvcVdc29TLACLyHK468itcNVdLXBVlQXfOE3GN+5+KyN3AH7iLspYU8SKn3CnrFvFjeeLPJ05ahFjX01t3c8DyM/Hqp3G301twJ9QTwuwjHrge+BL3xT+Eu4V/AUiMIMZzcCevA970PX5PbODqWx/HFVT7cbfwpxP6aaWthexribfdg2HWN/Li/sU7ju24p5WG+KVpSognqMLkl+qlDTUN8NKcwZ+Fahqueu/0EHnd5P0tDuKe8DoL+B2YGOLv3dSb74OrG9+Ou0P4GVfH39Bvm7/gGuL3e9vOD5WXX/oRuGrJTFyVyHygYyGfgwL3h1nXEfcQwe/esaXiLgg6+qWJBR7A1dNnePvsRMCTSfz5xE9ciP1U8tb/zy/2Jd6yOC/Nw7hHe/d5f4tVwBi/PIZ6+97p5bEZd/Kv7pdmPn5PK3nLTsZdJO3zjvFr4LyANCFjx32vU8v6XFISk70hbUwJEJEOuDudK1X11bKOp7SJyKW4u40uqrqorOMxRWeFgzHFJCLNcHdni3B3Z6fgXgo7BLRR1fQyDK/EicgZuLugb3BX3u1x1WrrgE5qJ5mjkrU5GFN8Gbi2jytx7SC/46qibi/vBYPnAO59getxL8ntxN01jLeC4ehldw7GGGOC2KOsxhhjgljhYIwxJki5aHOoU6eONm3atKzDMMaYo8qyZct+U9W6odaVi8KhadOmLF26tKzDMMaYo4qIbAm3zqqVjDHGBLHCwRhjTBArHIwxxgSxwsEYY0yQUi0cROQlEdkpIqvDrBcReUJENorIShFJLs34jDHGOKV95zAFNyRgOOcDJ3nTtcCkUojJGGNMgFJ9lFVVFxYyZF8/4BWvP5avRaRmASOfGWNMqcnOzmbu3LkcPHiwxPelCjk5kJUFhw7BoUNKduZBsjMzyDmUQe6hDHKyDpKblUHb5A6cd2nfqMdwpL3ncDx/jn0MboSt4wkxxquIXIs36lXjxuVyfG9jTBRlZWVFfHLPzYXsbP+TM3z66SxmzHihFCItmsHnnHFMFA6hhqMM2TOgqj4HPAeQkpJivQcaU45kZ8PBg5CZ6X4WNoVL98cfB9i8eR6HDuWyZcsHbN78YrFju6prPxrUiCWhUhpVKqSTUCGNKpXSqRyfTpVKbr5yhQxiChpc109mTiUO5VYmK6cKhzSBbKqQTQK5JJAjCeTGVEFjEtDYBCQuAeKqEBOfgFRIIC4+geatCxoF9fAdaYXDVtxYsnkaUTIDxxtjwlB1V8pFOTlH40TuP+XkFByjSC5VKqZRrdJ+qlY6QNWK+8jRhVSI3UPlChlUij9I1YQMlmz6lK27f8m37WPD2tCkTjaV4tOpFJ9Oxbg0KsREdjKvVQUa1f2UnJhqaExVcmOrQVw1iP8LUqEaMRWqEluxGrGVqiHx1SC+GsRV9X6GmI+rCjGxxfhrlZwjrXB4HxgtItNwwzPus/YGc6zJzS36CTfaaYsrLg4qVXJTxYpQuXIutaqlcVy1/dSttp+atQ9Qs8p+qie4Ke8kv23nev448CsV4zKoEJtBhdiDxMdkEC8ZxEkGsRwkFvfT35fr4ZEPw8ez/JHjIK4KtWpWo+nxtcKfrH0/qwbMe+m8k/mRduIsCaV6jCIyFTgbqCMiW4G7cWMco6rP4MbK7Q1sBNJxg44bU2pU/6zSKMkTdEHpsrKKfxwVK/55cs
47QfvPV68O9erlX+afLidnJzu2feZdhbupYtxBd9KOcSfuOMkgPuagd9J2J+wYzSAm9yCSmwE5edNBN4WT5aacfXDDE8U77tcm3Uvr1m0grgrEJUBsZRoc34j69esXL+NjUGk/rTS4kPWKG03KHKNUg0+cpX31nJtbvGMQgcqVw594K1WCatWC14c7kUeUrmIuleLTqBS7nwqyn5ic/ZC9H7IOeD/3+/08UOD8/v1/0Pa2dLb8Fp2/aVENHz6c668v+mmgWrVqnHjiiSUQ0bGpXNwdpaXBV1+53wMHtitovihpj7VtC0ubmxv65Frck3NmJsUWH1/wybRKFahduwgn3iKmi4tzBUSBNNc7KRf95E3WfkgP2C47LfIPKK6Kr757WWosP+2JhdjKEFsfYpvw0Kur2fJbOlUqV+CLaWPdFXhcFTfFJrjJ78o8mnXmsbGxnHrqqcTEWOcNZa1cDBMqkqJgXXYfKUrqpBtJ2ooVIbYk2vd8J/MwJ+vsA+HnQ5zkF61OY9f+CPcdU9E7eVeG2Ep//oyrDDF+y/zn4/zT+v0eUwm8E29WVhaDBg0Ku9s9e/ZQq1atKHx45kglIstUNSXUunJx53DSSfDkk3/OB161FTRflLTH2raFrQt1cq5QIYKr5tLgfzIv7OTtu/ou4KQf4sr8173w5YYQ+w57Mq8HMU3YvjeX0Q/PLcLBZHrT3sP6KAozYsQIRowYkW9ZgwYNrGA4xpWLwqF6dejVq6yjMMUSdDLfX/h8iJP8t2v38PPONMgpQv1UbEV3xR1XyfvpXWHH1YeYpm55bOCVeGWG3/Ff9u47ECLDyE7m999/PxdeeGHkcZaAuLg4WrVqZdU4Jki5KBxMGcjNgZw0yNpPzsG9fPrZ56Tt/909nZKdAbkZkH0Qcg+6+Ry/KfegW5dv2aHI9x1b6c8p5s/qlMzcOlx+78bDOJi8k3nR1a9fnzlz5hR5u4oVK9KyZUvkiLjNMiZY+SgcsvbDjgXeTGAbSgGtrMVJm299lNJGM74I0/687Te+WbbWO5mHOKn7lnsndN/8nyfzeWvg6c8oRQe9KbQxY8YwfPjwUomkadOmVKtWrVT2ZUxpKh+Fw/718PnZZR1FiVGFz1bDvvTo533ZU5BVyNuokXr7xX9xUsuT/3yaJTbBXdVL6VVZ5FWT2BW5McVTPgqHai2h+zN+CwJPDAW0sh5BaZcs/4Etv/watG7Vmo3c+8jzlJQWzZswc/pkiK3i6tcP42RevXp16wDRmHKkfBQO8dWgfreyjqJYcnNz6XzhWWQW8KD/888/z5lnnhn1fTdv3pyEhISo52uMOXqVj8KhHFBVMjMzuf7664MeKwSoUqUKzZo1K4PIjDHHIiscjjD169enTZuS6YLXGGMiZQ83G2OMCWKFgzHGmCBWOBhjjAlibQ6l6MCBA3zyySdkZ2cHrcstbj/RxhgTRVY4lKLJkyczZsyYAtPUrl27lKIxxpjwrHAoRQe98ReXLl0a8r2C2NhYTjrppNIOyxhjgljhUAZatWpFlSpVyjoMY4wJywqHCH3//fesWbOmWHl89913UYrGGGNKlhUOEerfvz+bN28udj7VqlWjQoUKUYjIGGNKjhUOETp48CCXXnop9913X7HyqVOnDvHx8VGKyhhjSoYVDkVQs2ZNTj755LIOwxhjStwxXTgcOnSIWbNmkZGRUWja9PQSGEzBGGOOUEUuHESkKlAb2KaqWdEPqej27NnDtGnTANe7qb+C5j/44AOmT58e8X7q1KlTjCiNMeboIYEnz7AJRS4A7gWScONMnq6qy0XkBWCuqr5RcmEWGltkBxHG/PnzadiwYWH7oFmzZsTGxhZnV8YYc8QQkWWqmhJqXUR3DiJyEfA28DkwDvi33+rNwFCgzAqH1q1bM2PGDN984BCRBc3XqFGDevXqlWyAxhhzlIm0WuluYLKqDheROPIXDquBUVGPrAgqVapEq1atyjIEY4wpVyLtlfUUIK9yPr
AK53dcG4QxxphyItLC4Q8gXGtsU2BXVKIxxhhzRIi0cPgUGC8iNf2WqYhUBEYDH0U7MGOMMWUn0jaHO4FvgXXAbFzV0u1AIlADuKgkgjPGGFM2IrpzUNVUIBmYBfQAcoAuwNfAGaq6LdIdish5IrJORDaKyO0h1tcQkQ9E5HsR+UFEroo0b2OMMdER8UtwqroVuKY4OxORWOApXAGzFVgiIu+rqn93p9cDa1T1QhGpC6wTkddV9VBx9m2MMSZyEd05iMhcEQn5rKiItBSRuRHu73Rgo6r+6J3spwH9AtIoUE3cywhVgT1A8LiaxhhjSkykDdJnA9XDrKsGdI0wn+OBn/3mt3rL/P0X9+jsNmAVcKOqBg2wLCLXishSEVm6a5c9LGWMMdEUaeEAwe835DkROBBhHhJiWWC+vYAVQEOgHfBfEQkqmFT1OVVNUdWUunXrRrh7Y4wxkQjb5uA1BOc1BivwnIjsD0hWGWiD61YjEluBE/zmG+HuEPxdBTykrtOnjSKyGWiFe1rKGGNMKSjoziEX91RSDu6K338+b9oNTCLyhuolwEki0kxEKgCDgPcD0vwEdAcQkfrAycCPEeZvjDEmCsLeOajqy8DLACIyDxipqv8rzs5UNVtERgOfALHAS6r6g4iM8NY/A9wHTBGRVbhCaZyq/lac/RpjjCmaiLvsPpKlpKTo0qVLyzoMY4w5qhS7y26/jJJw1TyVAtep6iuHF54xxpgjTaTjOdQEPgTOzFvk/fS/7bDCwRhjyolIH2V9ENctdxdcwdAfOAd4HddYfHqJRGeMMaZMRFo49MIVEF9781tVdb6qXgl8BtxYEsEZY4wpG5EWDg2AH1U1BziIeys6zztAn2gHZowxpuxEWjj8CtT0ft8CdPRb1yKaARljjCl7kT6ttBhXIMwCXgXuFpGmuA7xhhL8IpsxxpijWKSFw79wfR0BPIJrnB4IJOAKhhuiH5oxxpiyElHhoKqbgE3e71nArd5kjDGmHCpKr6whichpIjIzGsEYY4w5MhR45+CN3NYeaAxsUtXv/NalAHcDvYHA3lqNMcYcxcLeOYhII+Ab4CvgTWCpiEwXkQoi8oK37hzgP0Dz0gjWGGNM6SjozuEh3DgK/wSWA82AO4AvcHcTLwO3q+qOkg7SGGNM6SqocOgO3KOqE/IWiMg63BvRT6qqvRVtjDHlVEEN0nX5s7uMPF95P98qmXCMMcYcCQoqHGKAQwHL8ubTSyYcY4wxR4LC3nO4UETa+M3H4Lrp7isi7fwTqupLUY7NGGNMGSmscLgzzPK7AuYVsMLBGGPKiYIKh2alFoUxxpgjStjCQVW3lGYgxhhjjhzF7j7DGGNM+WOFgzHGmCBWOBhjjAlihYMxxpggVjgYY4wJUqTCQURiRKSNiHQVkSolFZQxxpiyFXHhICLXA78C3wNzgZO95e+KyJiSCc8YY0xZiKhwEJG/A48D7+LGjha/1YuAS6IemTHGmDIT6Z3DLcB/VPVaIHBI0P/h3UUYY4wpHyItHJoBn4RZlwbUjEo0xhhjjgiRFg6/AU3DrDsZ+CXSHYrIeSKyTkQ2isjtYdKcLSIrROQHEVkQad7GGGOiI9LC4QPgLhHxHytaRaQOcDOuLaJQIhILPAWcD5wKDBaRUwPS1ASeBvqqamvg0ghjNMYYEyWRFg7/D8gEVuOGCVXgCWAtkAPcG2E+pwMbVfVHVT0ETAP6BaS5DHhHVX8CUNWdEeZtjDEmSiIqHFR1N5AC/B8QD2zC9ej6X6Cjqu6LcH/HAz/7zW/1lvlrCdQSkfkiskxErowwb2OMMVFS2GA/Pqq6H7jPmw6XhFimIWJqD3QHKgNficjXqro+X0Yi1wLXAjRu3LgYIRljjAkU6XsOjwYOC3qYtgIn+M03AraFSPOxqqap6m/AQiApMCNVfU5VU1Q1pW7dulEIzRhjTJ5I2xyuApaJyGoRGSsigVVBkVoCnCQizUSkAjAIeD8gzXtAZxGJE5EE4A
xc24YxxphSEmnhUB/4G7ARV620RUQ+E5EritLHkqpmA6Nx70ysBd5U1R9EZISIjPDSrAU+BlYC3wIvqOrqiI/IGGNMsYlqYJV/IRuI1AIGA5cDHYF0YKaqXhH98CKTkpKiS5cuLavdG2PMUUlElqlqSqh1Re6yW1V/V9WnVfWvQDfgd9zjp8YYY8qJiJ9WyuNVIw0AhgBnA9nA29ENyxhjTFmK9GmlGK/bi9eBHcBLQEVgFPAXVf1bCcZojDGmlEV657ANqItrkH4YeFVVU0sqKGOMMWUr0sLhbeAVVf2mJIMxxhhzZIiocFDV60s6EGOMMUeOsIWDiHQBlqvqAe/3AqnqwqhGZowxpswUdOcwHzgT9yLafIL7QMoj3rrYaAZmjDGm7BRUOHQD1ni/n0P4wsEYY0w5E7ZwUNUFfr/PL5VojDHGHBEifc/hRxEJ6hnVW9dGRH6MbljGGGPKUqTdZzTFvfQWSiWgSVSiMcYYc0QoSt9K4docUoC9xQ/FGGPMkaKgR1lvBm72ZhX4QEQOBSSrDByHGwvaGGNMOVHQ00o/Ap97vw8FlgK7AtJk4p5oeiH6oRljjCkrBT2t9B5uVDZEBOBeVd1cSnEZY4wpQ5F2n3FVSQdijDHmyFFQm8NduCE6t3m/F0RV9b7ohmaMMaasFHTncA9uLOdt3u8FUdzY0sYYY8qBgtocYkL9bowxpvyzk74xxpggkXaf0VJETvebrywi/yciH4jI6JILzxhjTFmI9M7hv8AAv/kHgFuBhsBEEbHBgIwxphyJtHBIBL4AEJEY4EpgnKq2B+4Hri2Z8IwxxpSFSAuHmsBu7/fTgFrADG9+PtA8qlEZY4wpU5EWDjuAFt7vPYFNqvqzN18VyI52YMYYY8pORG9IA+8D/ycibYBhwLN+69ri+mEyxhhTTkRaONyOG7ehF66geNBvXV9gTpTjMsYYU4Yi7VspDfh7mHWdohqRMcaYMhfpnQMAInIc0BE3hsNu4GtV3VMSgRljjCk7ERcOInI/7t0G/+FCM0Vkgqr+M+qRGWOMKTORviF9E3AH8BrQDTjF+/kacIeIjIl0hyJynoisE5GNInJ7Aek6iEiOiAwIl8YYY0zJiPTOYQTwuKre7LdsHbBARA4Ao4AnCstERGKBp4AewFZgiYi8r6prQqR7GPgkwviMMcZEUaTvOTQFPgyz7kNvfSROBzaq6o+qegg39nS/EOluAN4GdkaYrzHGmCiKtHDYDbQJs641f749XZjjgZ/95rd6y3xE5HigP/BMQRmJyLUislRElu7aFTi0tTHGmOKItHCYCdwnIleISDyAiMSJyGDgXtxVfiQkxDINmH8M129TTkEZqepzqpqiqil169aNcPfGGGMiEWmbw3ggCXgZeElE9uAeZ40FFuMaqyOxFTjBb74RbqQ5fynANBEBqAP0FpFsVX03wn0YY4wppkhfgtsvIl2APkBnXMGwB1gAfKSqgVf/4SwBThKRZsAvwCDgsoB9Ncv7XUSmALOsYDDGmNJVYOEgInWAIbhO934H3lbVcYe7M1XN9gYH+gR31/GSqv4gIiO89QW2MxhjjCkdEu6iX0ROBhYC/hX6OcAAVX2vFGKLWEpKii5durSswzDGmKOKiCxT1ZRQ6wpqkL4fOAicDVTB9b76LfBotAM0xhhzZCmocDgDuEtVF6pqhqr+AFwHNBURezzIGGPKsYIKh+Nxb0H7W4d7HLVhiUVkjDGmzBVUOAiujcFfbgTbGWOMOcoV9ijrv0TkN7/5vJfY7vPedcijqjo0uqEZY4wpKwUVDj/hel8NtAXXZYa/SN9zMMYYcxQIWzioatNSjMMYY8wRxNoOjDHGBLHCwRhjTBArHIwxxgSxwsEYY0wQKxyMMcYEscLBGGNMkEgH+wFARBKBLkBt4FlV/VVEWgA7VHV/SQRojDGm9EVUOIhIReA14GLcW9IKfAD8CvwbWA/cXkIxGmOMKWWRVis9AJwLXAHUJ/9Y0B8Bva
IclzHGmDIUabXSYOD/qeobIhIbsG4z0DSqURljjClTkd451AbWFpBHxeiEY4wx5kgQaeGwGegYZt3pBI/7YIwx5igWaeHwCnC7iFwOVPCWqYh0A24GXiqJ4IwxxpSNSAuHfwMfAq8CeeM4LAY+Az5W1SdLIDZjjDFlJKIGaVXNAQaJyFO4J5PqAbtxBcOCEozPGGNMGSjSS3CqughYVEKxGGOMOUJY9xnGGGOCRPqGdC6FDAWqqoHvPxhjjDlKRVqtdC/BhUNtoCfuHYcpUYzJGGNMGYu0QfqeUMu9t6U/APZFMSZjjDFlrFhtDt5TTE8DN0UlGmOMMUeEaDRIVwSOi0I+xhhjjhCRNkg3DrG4AtAGeAhYGs2gjDHGlK1IG6RTCf20kgCbgOsj3aGInAc8DsQCL6jqQwHrLwfGebMHgJGq+n2k+ZujV1ZWFlu3buXgwYNlHYox5UqlSpVo1KgR8fHxEW8TaeFwVYhlB4EtwBKv7aFQXgP2U0APYCuwRETeV9U1fsk2A11V9XcROR94DjgjwjjNUWzr1q1Uq1aNpk2bIiKFb2CMKZSqsnv3brZu3UqzZs0i3q7QwsE7oa8AtqnqrsMPEXA9uG5U1R+9vKcB/QBf4aCqX/ql/xpoVMx9mqPEwYMHrWAwJspEhNq1a7NrV9FO35E0SCuuTeG0wwkswPHAz37zW71l4VyDG2nOHCOsYDAm+g7n/6rQOwdVzRWRn4EqhxNUgFARhnzz2usO/BrgrDDrrwWuBWjcOFR7uTHGmMMV6aOszwI3iUiFQlMWbCtwgt98I2BbYCIRSQReAPqp6u5QGanqc6qaoqopdevWLWZYxkRHamoqbdq0KdI2w4YNY8aMGcXONzU1lTfeeKNI+y5Jb731FqeccgrdunVj6dKljBkzBoD58+fz5ZdfFrL1nyZOnEilSpXYt+/Pd22nTJnC6NGj86U7++yzWbrUPTh54MABrrvuOk488URat25Nly5d+Oabb4p1PKrKmDFjaNGiBYmJiSxfvjxkurlz55KcnEybNm0YOnQo2dnZgDvuGjVq0K5dO9q1a8e9997r26Zp06a0bduWdu3akZKSUqw4oyXSBulqwInAjyLyMbCd/Ff8qqp3R5DPEuAkEWkG/AIMAi7zT+A9NvsOcIWqro8wPmOOeXmFw2WXXVZ44hKkqqgqL774Ik8//TTdunUD8J305s+fT9WqVenUqVNE+U2dOpUOHTowc+ZMhg0bFtE2w4cPp1mzZmzYsIGYmBh+/PFH1q4NN9JxZD766CM2bNjAhg0b+Oabbxg5cmRQgZObm8vQoUP5/PPPadmyJXfddRcvv/wy11xzDQCdO3dm1qxZIfOfN28ederUKVaM0RT2zkFEfhSRJG/2DqChN10N3An8v4CpUKqaDYwGPsGNSf2mqv4gIiNEZISX7C5cv01Pi8gKEbF3KEypSEtLo0+fPiQlJdGmTRumT58OwJIlS+jUqRNJSUmcfvrp7N+/n9TUVDp37kxycjLJyckhr4RzcnIYO3YsHTp0IDExkWeffRZwJ8/Ro0dz6qmn0qdPH3bu3BkynmXLlpGUlETHjh156qmnfMvD7fv2229n0aJFtGvXjokTJ0YUI8Cjjz5KmzZtaNOmDY899hgA48aN4+mnn/alueeee/jPf/4DwCOPPOI7prvvvtsX0ymnnMKoUaNITk7mvvvuY/HixYwYMYKxY8cyf/58LrjgAlJTU3nmmWeYOHEi7dq1Y9GiRbz//vvcddddIWPbtGkTBw4c4P7772fq1Klh/3aB23zzzTfcf//9xMS4U1zz5s3p06dPRNuH895773HllVciIpx55pns3buX7du350uze/duKlasSMuWLQHo0aMHb7/9drH2W1YKunNoinv7GVWNWtfeqjobmB2w7Bm/34cDw6O1P3N0uukmWLEiunm2awfeuS+kjz/+mIYNG/Lhhx8CsG/fPg4dOsTAgQOZPn06HTp04I8//qBy5crUq1
ePTz/9lEqVKrFhwwYGDx7sq9LI8+KLL1KjRg2WLFlCZmYmf/3rX+nZsyffffcd69atY9WqVezYsYNTTz2Vq6++Oiieq666iieffJKuXbsyduxY3/Jw+37ooYeYMGGC78o0PT290BiXLVvG5MmT+eabb1BVzjjjDLp27cqgQYO46aabGDVqFABvvvkmH3/8MXPmzGHDhg18++23qCp9+/Zl4cKFNG7cmHXr1jF58mRfoTJv3jwmTJhASkoK8+fPB1z1yYgRI6hatSq33XabL46+ffuG/JtMnTqVwYMH07lzZ9atW8fOnTupV69e+D8i8MMPP9CuXTtiYwvvKHrgwIGsW7cuaPktt9zClVdemW/ZL7/8wgkn/Fkr3qhRI3755RcaNGjgW1anTh2ysrJYunQpKSkpzJgxg59//vMZnK+++oqkpCQaNmzIhAkTaN26NeAajHv27ImIcN1113HttdcWGntJK9JgP8aUZ23btuW2225j3LhxXHDBBXTu3JlVq1bRoEEDOnToAED16tUBd5cxevRoVqxYQWxsLOvXB9eAzpkzh5UrV/raE/bt28eGDRtYuHAhgwcPJjY2loYNG3LOOecEbbtv3z727t1L165dAbjiiiv46CP34F5WVlah+4403eLFi+nfvz9VqrjnTS6++GIWLVrEmDFj2LlzJ9u2bWPXrl3UqlWLxo0b88QTTzBnzhxOO809vHjgwAE2bNhA48aNadKkCWeeeWaRPvPCTJs2jZkzZxITE8PFF1/MW2+9xfXXXx/26ZuiPpWTd3cYCdXgZ2cC9yciTJs2jZtvvpnMzEx69uxJXJw7zSYnJ7NlyxaqVq3K7Nmzueiii9iwYQMAX3zxBQ0bNmTnzp306NGDVq1a0aVLlyIdS7QVVjgUOIaDMSWloCv8ktKyZUuWLVvG7NmzGT9+PD179uSiiy4KecKZOHEi9evX5/vvvyc3N5dKlSoFpVFVnnzySXr16pVv+ezZsws9ialq2DSR7LsoMYYzYMAAZsyYwa+//sqgQYN86cePH891112XL21qaqqvgImWlStXsmHDBnr06AHAoUOHaN68Oddffz21a9fm999/z5d+z5491KlTh5o1a/qOOa9aKZyi3Dk0atQo313A1q1badiwYdC2HTt2ZNEiN2DmnDlzfIVy3oUFQO/evRk1ahS//fYbderU8eVTr149+vfvz7ffflvmhUNh1UX/EpFXIpheLpVojSlB27ZtIyEhgSFDhnDbbbexfPlyWrVqxbZt21iyZAkA+/fvJzs7m3379tGgQQNiYmJ49dVXyckJ7iSgV69eTJo0iaysLADWr19PWloaXbp0Ydq0aeTk5LB9+3bmzZsXtG3NmjWpUaMGixcvBuD111/3rQu372rVqrF///5C0/nr0qUL7777Lunp6aSlpTFz5kw6d+4MwKBBg5g2bRozZsxgwIABvmN66aWXOHDgAOCqWsK1mYQTGOfMmTMZP358ULqpU6dyzz33kJqaSmpqKtu2beOXX35hy5YtdOjQgS+++IJff/0VgKVLl5KZmckJJ5zAiSeeSEpKCnfffbev8NuwYQPvvfde0D6mT5/OihUrgqbAggFc1dcrr7yCqvL1119To0aNfFVKefI+j8zMTB5++GFGjHDNqb/++qsvnm+//Zbc3Fxq165NWlqa7/NIS0tjzpw5RX7irSQUdufQDsiMIB+7wzBHvVWrVjF27FhiYmKIj49n0qRJVKhQgenTp3PDDTeQkZFB5cqV+eyzzxg1ahSXXHIJb731Ft26dQt51Tx8+HBSU1NJTk5GValbty7vvvsu/fv3Z+7cubRt25aWLVv6qo4CTZ48mauvvpqEhIR8dx/h9p2YmEhcXBxJSUkMGzYsohiTk5MZNmwYp59+ui/mvCqj1q1bs3//fo4//njfSbBnz56sXbuWjh07AlC1alVee+21iOr381x44YUMGDCA9957jyeffJJNmzblu6rOM23aNF9VWp7+/fszbd
o0xo0bx+OPP07v3r3Jzc2latWqTJ061Xen8MILL3DrrbfSokULEhISqF27No888kjEMYbSu3dvZs+e7ctz8uTJ+da98MILNGzYkEceeYRZs2aRm5vLyJEjfdWGM2bMYNKkScTFxVG5cmWmTZuGiLBjxw769+8PQHZ2NpdddhnnnXdesWKNBgl3W+kNDXqmqn5buiEVXUpKigY2tJmjz9q1aznllFPKOgxTyoYMGcLEiROx95VKVqj/LxFZpqohX6ywBmljTJl67bXXyjoEE0LUHlE1xhhTfljhYIwxJkjYaqVovvhmjDHm6GIFgDHGmCBWOBhjjAlihYMxh6F3797s3bu3wDR33XUXn3322WHln9dRXWH8u6kO57HHHiM9Pf2w4vDXr18/3/sNeUJ1N161alXf7+vXr6d37960aNGCU045hb/97W/s2LGjWHHs2bOHHj16cNJJJ9GjR4+gN6XzPP7447Rp04bWrVv7OhT0N2HCBESE3377zbds5cqVdOzYkdatW9O2bdtjejxzKxyMKQJVJTc3l9mzZ1OzZs0C0957772ce+65pRNYAaJROOzdu5fly5ezd+9eNm/eHNE2Bw8epE+fPowcOZKNGzeydu1aRo4cWeThKgM99NBDdO/enQ0bNtC9e3ceeuihoDSrV6/m+eef59tvv+X7779n1qxZvn6MAH7++Wc+/fTTfAOFZWdnM2TIEJ555hl++OEH5s+fT3x8fLFiPZpZ4WCMn1DdVwd2R/3zzz/TtGlT3xXnfffdR6tWrejRoweDBw9mwoQJQP6r6qZNm3L33XeTnJxM27Zt+d///ge4bhQ6derEaaedRqdOnUL28+MvIyODQYMGkZiYyMCBA8nIyPCtGzlyJCkpKbRu3drXlfYTTzzBtm3b6Natm29chVDpwN3pvP/++yH3+/bbb3PhhRf6utSIxBtvvEHHjh258MILfcu6detW7K4h3nvvPYYOHQrA0KFDeffdd4PSrF27ljPPPJOEhATi4uLo2rUrM2fO9K2/+eab+fe//52v/6o5c+aQmJhIUpIbqaB27dpFevO7vLGX4MyRadlN8PuK6OZZqx20fyz8LsN0X12rVq2g7qjzLF26lLfffpvvvvuO7OxskpOTad++fcj869Spw/Lly3n66aeZMGECL7zwAq1atWLhwoXExcXx2WefcccddxTY//+kSZNISEhg5cqVrFy5kuTkZN+6Bx54gOOOO46cnBy6d+/OypUrGTNmDI8++mi+gWRCpUtMTMw3MlmgqVOncvfdd1O/fn0GDBgQsi+kQKtXrw77Wfjbv3+/rz+nQG+88QannnpqvmU7duzwdefRoEGDkH07tWnThjvvvJPdu3dTuXJlZs+e7Rts6P333+f444/3FQJ51q9fj4jQq1cvdu3axaBBg/jHP/5RaPzllRUOxnjCdV/dt2/fsN1RL168mH79+lG5cmWAfFfJgS6++GIA2rdvzzvvvAO4zvGGDh3Khg0bEBFfJ33hLFy40DfcZmJiIomJib51b775Js899xzZ2dls376dNWvW5Ftf1HR5duzYwcaNGznrrLMQEeLi4li9ejVt2rQJ2XNsUbvNrlatGiuiPHjHKaecwrhx4+jRowdVq1YlKSmJuLg40tPTeeCBB5gzZ07QNtnZ2SxevJglS5aQkJBA9+7dad++Pd27d49qbEcLKxzMkamAK/ySUlD31eG6oy5om0AVK1YEIDY21jeu8D//+U+6devGzJkzSU1N5eyzzy40n1An382bNzNhwgSWLFlCrVq1GDZsWMjG1EjT+Zs+fTq///47zZo1A+CPP/5g2rRp3H///UFdZ+d1mw2u474FCxYUejxFvXOoX78+27dvp0GDBmzfvj3s4D/XXHONb3jOO+64g0aNGrFp0yY2b97su2vYunUrycnJfPvttzRq1IiuXbv64u/duzfLly8/ZgsHa3MwxlNQ99XhnHXWWXzwwQccPHiQAwcO+EaRi9S+ffs4/v
jjAZgyZUpEMeZ137169WpWrlwJuBN2lSpVqFGjBjt27MjXm6l/F9kFpRs/fny+evk8U6dO5eOPP/Z1nb1s2TJfu8PZZ5/N9OnTOXTokO8Y8to2LrvsMr788st8n8nHH3/MqlWr8uWfd+cQagosGMB1nf3yy26UgJdffpl+/fqF/Kzyqpt++ukn3nnnHQYPHkzbtm3ZuXOn71gaNWrE8uXL+ctf/kKvXr1YuXIl6enpZGdns2DBgpD7P1bYnYMxnnDdV6empobdpkOHDvTt25ekpCSaNGlCSkoKNWrUiHif//jHPxg6dCiPPvpoyBHhAo0cOZKrrrqKxMRE2rVr54s1KSmJ0047jdatW9O8eXP++te/+ra59tprOf/882nQoAHz5s0Lm27VqlVBw3Wmpqby008/5atSa9asGdWrV+ebb77hggsuYNmyZbRv357Y2FhOPPFEnnnGjfpbuXJlZs2axU033cRNN91EfHw8iYmJPP744xF/PqHcfvvt/O1vf+PFF1+kcePGvPXWW4Abj2P48OHMnu1GIb7kkkvYvXs38fHxPPXUU9SqVavAfGvVqsUtt9xChw4dEBF69+5d7HGnj2Zhu+w+mliX3eXD0dpl94EDB6hatSrp6el06dKF5557Ll9D8dGiV69efPLJJ2Udhikh1mW3MaXs2muvZc2aNRw8eJChQ4celQUDYAWDyccKB2OK6Y033ijrEIyJOmuQNsYYE8QKB2OMMUGscDDGGBPECgdjjDFBrHAwJopSU1OL3LFcqG6vDyff1NTUctM4npmZybnnnku7du2YPn06w4cPZ82aNQA8+OCDEeeTnZ1NnTp1gvqC8u84EYK7SP/oo49ISUnhlFNOoVWrVtx2223FPCLXd1fbtm1p0aIFY8aMCfl2/aFDh7jqqqto27YtSUlJzJ8/H3Bvkbdr18431alTh5tuuglwn9XAgQNp0aIFZ5xxRoHv5RSFFQ7GlBPlpXDIzs7mu+++IysrixUrVjBw4EBeeOEF39vKRSkc5syZw8knn8ybb74ZcVcnq1evZvTo0bz22musXbuW1atX07x588M6Fn8jR47kueeeY8OGDWzYsIGPP/44KM3zzz8PuBcSP/30U2699VZyc3OD3iJv0qSJr6+uF198kVq1arFx40Zuvvlmxo0bV+xYwQoHY3zS0tLo06cPSUlJtGnThunTpwOwZMkSOnXqRFJSEqeffjr79+8nNTWVzp07k5ycTHJyMl9++WVQfjk5OYwdO5YOHTqQmJjIs88+C7j+mEaPHs2pp55Knz59QvYqCu5KMykpiY4dO/LUU0/5lofb9+23386iRYto164dEydOjCjGvO7I//73v9O6dWt69uzp6wZ8xYoVnHnmmSQmJtK/f/+Qg+ps2bKF7t27k5iYSPfu3fnpp5/Yt28fTZs2JTc3F4D09HROOOEEsrKy2LRpE+eddx7t27enc+fOvq7Lhw0bxi233EK3bt34+9//zpAhQ1ixYgXt2rVj06ZNvkGNbr/9djIyMmjXrh2XX3454PpA2rZtW8jPcOrUqdx44400btyYr7/+OmSaQP/+97+58847adWqFQBxcXGMGjUqom3D2b59O3/88QcdO3ZERLjyyitDdjW+Zs0aX19O9erVo2bNmkGDOW3YsIGdO3f6unbx78J8wIABfP7550Xq8yssVT3qp/bt26s5+q1Zs8b3+4033qhdu3aN6nTjjTcWuP8ZM2bo8OHDffN79+7VzMxMbdasmX777beqqrpv3z7NysrStLQ0zcjIUFXV9evXa953cPPmzdq6dWtVVX322Wf1vvvuU1XVgwcPavv27fXHH3/Ut99+W88991zNzs7WX375RWvUqKFvvfVWUDxt27bV+fPnq6rqbbfd5ss33L7nzZunffr08W0fLp2/zZs3a2xsrH733XeqqnrppZfqq6++GrT/f/7znyE/vwsuuECnTJmiqqovvvii9uvXT1VV+/btq3PnzlVV1WnTpu
k111yjqqrnnHOOrl+/XlVVv/76a+3WrZuqqg4dOlT79Omj2dnZIY+la9euumTJElVVrVKlSlAcoaSnp2uDBg00LS1Nn332Wb3hhht865o0aaK7du3yzfvv77TTTtMVK1YUmv/cuXM1KSkpaOrYsWNQ2iVLlmj37t198wsXLsx3fHmeffZZHTBggGZlZemPP/6oNWrU0BkzZuRL869//UtvvfVW33zr1q31559/9s03b94837Hl8f//ygMs1TDnVXsJzhhP27Ztue222xg3bhwXXHABnTt3ZtWqVTRo0IAOHToAUL16dcDdZYwePZoVK1YQGxvL+vXrg/KbM2cOK1eu9LUn7Nu3jw0bNrBw4UIGDx5MbGwsDRs2DNmn0r59+9i7dy9du3YF4IorrvB1kpeVlVXovouSrlmzZrRr1w5w3YmnpqYG7X/o0KFceumlQdt+9dVXvu7Hr7jiCt/4BwMHDmT69Ol069aNadOmMWrUKA4cOMCXX36ZL5/MzEzf75deemlUB9eZNWsW3bp1IyEhgUsuuYT77ruPiRMnEhsbG5Wuxrt16xZxV+Ma4ko+1P6uvvpq1q5dS0pKCk2aNKFTp07ExeU/TU+bNo1XX321yHkXVakXDiJyHvA4EAu8oKoPBawXb31vIB0YpqrLSztOU7ZCjflb0lq2bMmyZcuYPXs248ePp2fPnlx00UUh/9EmTpxI/fr1+f7778nNzaVSpUpBaVSVJ598kl69euVbPnv27EL/eVU1bJpI9l2UdHldiYPrTtx/dLmiyou5b9++jB8/nj179rBs2TLOOecc0tLSqFmzZtgTarhu0Q/X1KlT+eKLL2jatCkAu3fvZt68eZx77rm+rsbzuucO7Go8r0qvIPPmzePmm28OWp6QkBBUhdeoUSO2bt3qm9+6dSsNGzYM2jYuLo6JEyf65jt16sRJJ53km//+++/Jzs7ON4hSo0aN+Pnnn2nUqBHZ2dns27eP4447rsDYI1GqbQ4iEgs8BZwPnAoMFpHAPnHPB07ypmuBSaUZozl2bdu2jYSEBIYMGcJtt93G8uXLadWqFdu2bWPJkiWAe2ok7x+wQYMGxMTE8Oqrr5KTkxOUX69evZg0aZJvAJ/169eTlpZGly5dmDZtGjk5OWzfvp158+YFbVuzZk1q1KjB4sWLAXzddANh9+3fNXdB6SJRo0YNatWqxaJFiwB49dVXfXcR/jp16uTrvvv111/nrLPOAqBq1aqcfvrp3HjjjVxwwQXExsZSvXp1mjVr5utFVVX5/vvvI44pT3x8fL5Bkbp3784vv/ySL80ff/zB4sWL+emnn3zdcz/11FNMnToVcF2N51195+Tk8Nprr/m6Gh87diwPPvig704rNzeXRx99NCiOvDuHwClU206DBg2oVq0aX3/9NarKK6+8ErKr8bzu4gE+/fRT4uLi8nUbPnXqVAYPHpxvG/8uzGfMmME555xzVN45nA5sVNUfAURkGtAPWOOXph/wilcf9rWI1BSRBqq6vZRjNceYVatWMXbsWGJiYoiPj2fSpElUqFCB6dOnc8MNN5CRkUHlypX57LPPGDVqFJdccglvvfUW3bp1C3nVO3z4cFJTU0lOTkZVqVu3Lu+++y79+/dn7ty5tG3blpYtW4Y86QJMnjyZq6++moSEhHx3H+H2nZiYSFxcHElJSQwbNiyiGAvy8ssvM2LECNLT02nevDmTJ08OSvPEE09w9dVX88gjj1C3bt18aQYOHMill17qexwTXAEycuRI7r//frKyshg0aFChV+iBrr32WhITE0lOTubVV19l48aNQVfK77zzDuecc06+u6J+/frxj3/8g8zMTP75z38ycuRIkpKSUFXOO+88hgwZArjP8bHHHmPw4MGkp6cjIlHpunvSpEkMGzaMjIwMzj//fM4//3zADVu6dOlS7r33Xnbu3EmvXr2IiYnh+OOPz1d9BG4Uv7wuyfNcc801XHHFFbRo0YLjjjsu4jG+C1OqXX
aLyADgPFUd7s1fAZyhqqP90swCHlLVxd7858A4VV0akNe1uDsLGjdu3H7Lli2ldBSmpBytXXabsrN69WpeeumlkFf2Jr+idtld2o+yhrrXCSydIkmDqj6nqimqmlK3bt2oBGeMObq0adPGCoYSUtqFw1bgBL/5RkDgA8qRpDHGGFOCSrtwWAKcJCLNRKQCMAh4PyDN+8CV4pwJ7LP2hmNHaVZzGnOsOJz/q1JtkFbVbBEZDXyCe5T1JVX9QURGeOufAWbjHmPdiHuU9arSjNGUnUqVKrF7925q164dlactjDGuYNi9e3fYR5nDsTGkzREjKyuLrVu3cvDgwbIOxZhypVKlSjRq1Ij4+Ph8y20MaXNUiI+Pp1mzZmUdhjEG63jPGGNMCFY4GGOMCWKFgzHGmCDlokFaRHYBh/OKdA1gX5TDORL2H618Dzefom4XafpopasD/FbA+qOVfZ9LJp/y/H1uoqqh3yIO15f3sTABz5XH/Ucr38PNp6jbRZo+WukooA/7o3my73PJ5HOsfp+P9WqlD8rp/qOV7+HmU9TtIk0f7XTlTVkft32fi5b+iP4+l4tqJWOKQkSWaphnu4052pTU9/lYv3Mwx6bnyjoAY6KoRL7PdudgjDEmiN05GGOMCWKFgzHGmCBWOBhjjAlihYMxfkSkuYi8KCIzyjoWY4pKRKqIyMsi8ryIXF6cvKxwMOWGiLwkIjtFZHXA8vNEZJ2IbBSR2wvKQ1V/VNVrSjZSYyJXxO/1xcAMVf070Lc4+7XCwZQnU4Dz/BeISCzwFHA+cCowWEROFZG2IjIrYKpX+iEbU6gpRPi9xg2r/LOXLKc4O7XxHEy5oaoLRaRpwOLTgY2q+iOAiEwD+qnq/wEXlHKIxhRZUb7XwFZcAbGCYl78252DKe+O588rKXD/PMeHSywitUXkGeA0ERlf0sEZc5jCfa/fAS4RkUkUs9sNu3Mw5V2owajDvvmpqruBESUXjjFREfJ7rappwFXR2IHdOZjybitwgt98I2BbGcViTLSU+PfaCgdT3i0BThKRZiJSARgEvF/GMRlTXCX+vbbCwZQbIjIV+Ao4WUS2isg1qpoNjAY+AdYCb6rqD2UZpzFFUVbfa+t4zxhjTBC7czDGGBPECgdjjDFBrHAwxhgTxAoHY4wxQaxwMMYYE8QKB2OMMUGscDCHTUSGiYiGmc4tQj6pIjKlBEMN3J9/nNkisllEJotIoyjvp6m3j2F+y4aJyNUh0uZ9lk2jGUMh8Z0d4rP4SUSeFpFah5nnTSJycbRjNaXP+lYy0XAp7nV+f2vKIpAimAI8i/sfaAf8C+gkIu1UNSNK+9gOdAQ2+S0b5u3zpYC0H3ppt0dp30UxBvfGbQLQHRiH65rhwsPI6yZgMa4DOHMUs8LBRMMKVd1Y1kEU0S+q+rX3+2IR2Y8rMM4nSic2Vc0Evi40oUu7C9gVjf0ehrV+n8Vcb1yL4SLyF1X9tYxiMmXMqpVMiRGRniIyW0S2i0i6iKwWkVu9gUoK2u4v3lCH20Qk09s+32A8IpIgIg97VUKHvJ93isjhfqeXeD9bePk3EJFXROQ3L4aVIjKkKHEGViuJyHygK/BXv6qc+d66fNVK3ue2LMRn08Cr/rnJb1kzEXldRHZ5cawQkf6H+TkALPd+NvbbRwcRmeF135AhbgSyB0Wksl+aVKAJcLnf8U3xW58kIu+LyO9eHl+ISOdixGlKkN05mGiIFRH/75Kqag7QHPgceBI4CKQA9wB1gYKG63wVd5IZi+uzvj6uuiMBwNvXJ7gRsO4DVgFnAv8EjgNuPYxjaOb93CsiVYAFQC3gDi+GIcCrIpKgqs9FEmcIo4DXgFjgOm/ZH2HSvgJMFZFTVdW/iu4y7+dUABE5AfgG2AncjLv7GAi8LSIXqerhdMbWFDeKWKrfssa4AWSmAPuB1sBduL/xIC9Nf2A28D3u74wXDyKSDCwCvgP+DqTjukb/TEQ6qWpQQWjKmKraZNNhTb
j6cw0xLQ6RVnAXI3cCvwMxfutSgSl+8weAMQXs9wpvP10Clt8JHALqFRK3Ag948VTCFSxrgTSgIa5DMwXODtjuM9xJODbCOJt6+QzzWzY/zOeT91k29eYrA/uA/wtItwKY7Tf/Iu4EXDsg3ae46r6CPoezvX329D6LasBFuAJrQgHb5f0thwC5/vv2/pavhdjmc+8zruC3LNZb9m5Zf5dtCp6sWslEQ3+gg990DfiqQJ4VkS24k3YWcD9QEyhovOYlwFgRuVHcWM+BA5ucB2wBvhSRuLwJmAPE4072hbnDiycD1+NlFtBbVbcBXXBtEvMDtnkNd9dzaoRxHjZ1jeJv46poBEBE2gJJuLuKPOfhrtb3BXwWnwBJIlI9gt19gjv+P4CZwELc3ZCPiFT3qvE2AZle+ldxBcVJBWXuVT11Bd4Ccv1iFFyB2yWCGE0ps8LBRMNqVV3qN63z6v7fx43TfD9wDq7geMDbplIB+Q30tv0HsBL4RUTu8mtPqIerzskKmL711teOIOaXvHhOA+qoaqKqLvDWHUfop4Z+9VsfSZzF9QruqaGzvfkrcFU67/mlqQdcSfBn8Yi3PpLP4nrcZ3EuMB3og6ui8zcZVw30BNDDS3+9t66gvyW4zyvWyzMwztFArSh+ZiZKrM3BlJQTcW0MV6jqa3kLRaTQxyNVdSfuxHO9iJwMDMU9aroLmATsBjYDfwuTRWoE8W1X1aVh1u0BTg6x/C/ez90RxllcC4CfgCEisgAYDMzQ/I/a7sbV5T8cJo9IRgdbn/dZiMhcXNvJHSIyWVV/FpFKuMHr71HVx/M28u5kIrEXV/30FPnvenxUNTfCvEwpscLBlJS8RtmsvAUiEg9cXpRMVHUd7kQ1AmjjLf4YuAQ4oKr/i0KsgRYAl4rIX1X1C7/ll+HaHNZGGGcombi6/UKpqorI67gCaCZuKMjAk+vHuPcjftAovJ/h7fMmXMPx7d6+K+Ku/LMCkg8LkUUmrr3EP880EVmEqxJbbgXB0cEKB1NS1uLaBR4QkRzcieXmwjYSkRq4eujXgf952/XDPTk0x0v2Om4Q9c9F5D+4p2Mq4O5W+gIXqWp6MWKfAtwIvCMid+Je8LscV51ynarmRBhnKGuAUSIyEPdy3H6vYAnnFWA88AzuiagFAevvwlWnLRSR/+LummrhCqjmqhr0NnZhVPV7EXkbuEZEHlDVbSLyNXCriGwHfgOuBo4Pc3ydReQCXDXcb6qaCtyCa8v4RERexFXb1QGScQ38BT29ZspCWbeI23T0Tvz5hE2LMOvb4d6WTcedYO8FhuP3VI6XLhXvaSXcVeqzwA+4p4H+wDX8XhaQdyXc45L/w12t7vHS3QPEFRK3AvcXkqYBrsH1Ny//lcAQv/WFxknop5X+gmtA3u+tmx/wWTYNEcsSb92DYWJtBLwA/IJr+N+Oe1ppSCHHeLaX77kh1p2Ce5z1cb9j+ciLeyfwX1zbRL6nuoBWuGqudG/dlIA8p3nbZ3rfifdxDwKU+ffZpvyTDRNqjDEmiD0hYIwxJogVDsYYY4JY4WCMMSaIFQ7GGGOCWOFgjDEmiBUOxhhjgljhYIwxJogVDsYYY4JY4WCMMSbI/wdd5DiiZykrVgAAAABJRU5ErkJggg==", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fpr_logreg_scaled_, tpr_logreg_scaled_, thresholds = metrics.roc_curve(y_clf, ypred_logreg_scaled_, pos_label=1)\n", "auc_logreg_scaled_ = metrics.auc(fpr_logreg_scaled_, tpr_logreg_scaled_)\n", "\n", "plt.semilogx(fpr_logreg_scaled, tpr_logreg_scaled, '-', \n", " color='blue', label='scaled data overfit; AUC = %0.3f' % auc_logreg_scaled)\n", "plt.semilogx(fpr_logreg, tpr_logreg, '-', color='orange', \n", " label='original data; AUC = %0.3f' % auc_logreg)\n", "plt.semilogx(fpr_logreg_scaled_, tpr_logreg_scaled_, '-', color='black', \n", " label='scaled data no overfit; AUC = %0.3f' % auc_logreg_scaled_)\n", "\n", "\n", "plt.xlabel('False Positive Rate', fontsize=16)\n", "plt.ylabel('True Positive Rate', fontsize=16)\n", "plt.title('ROC curve: Logistic regression', fontsize=16)\n", "plt.legend(loc=\"lower right\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.13" } }, "nbformat": 4, "nbformat_minor": 2 }