{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[Home Page](../START_HERE.ipynb)\n", "\n", "[Previous Notebook](04-Challenge.ipynb)\n", "\n", "[1](01-Intro_to_Dask.ipynb)\n", "[2](02-CuDF_and_Dask.ipynb)\n", "[3](03-CuML_and_Dask.ipynb)\n", "[4](04-Challenge.ipynb)\n", "[5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# K-Means Challenge - Solution\n", "\n", "K-Means is a basic but powerful clustering method that is optimized via Expectation Maximization. It randomly selects K data points in X as initial centroids and assigns each sample to its closest centroid. For every resulting cluster of points, a mean is computed, and this becomes the new centroid. The assignment and update steps repeat until convergence.\n", "\n", "cuML’s KMeans supports the scalable KMeans++ initialization method. This method is more stable than randomly selecting K points.\n", "\n", "The model can take array-like objects, either on the host as NumPy arrays or on the device (as Numba or cuda_array_interface-compliant arrays), as well as cuDF DataFrames as the input.\n", "\n", "For information about cuDF, refer to the [cuDF documentation](https://docs.rapids.ai/api/cudf/stable).\n", "\n", "For additional information on cuML's k-means implementation: https://docs.rapids.ai/api/cuml/stable/api.html#cuml.KMeans.\n", "\n", "The given solution implements cuML on a single GPU. Your task is to convert the entire code using Dask so that it can run on multi-node, multi-GPU systems. Your coding task begins here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports\n", "\n", "Let's begin by importing the libraries necessary for this implementation." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import cudf\n", "import cupy\n", "import matplotlib.pyplot as plt\n", "from cuml.cluster import KMeans as cuKMeans\n", "from cuml.datasets import make_blobs\n", "from sklearn.cluster import KMeans as skKMeans\n", "from sklearn.metrics import adjusted_rand_score\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Parameters\n", "\n", "Here we will define the data and model parameters which will be used while generating data and building our model. You can change these parameters and observe the change in the results." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "n_samples = 10000\n", "n_features = 2\n", "\n", "n_clusters = 5\n", "random_state = 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generate Data\n", "\n", "Generate isotropic Gaussian blobs for clustering." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "device_data, device_labels = make_blobs(n_samples=n_samples,\n", " n_features=n_features,\n", " centers=n_clusters,\n", " random_state=random_state,\n", " cluster_std=0.1)\n", "\n", "device_data = cudf.DataFrame(device_data)\n", "device_labels = cudf.Series(device_labels)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Copy dataset from GPU memory to host memory.\n", "# This is done to later compare CPU and GPU results.\n", "host_data = device_data.to_pandas()\n", "host_labels = device_labels.to_pandas()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Scikit-learn model\n", "\n", "Here we will use Scikit-learn to define our model. 
The arguments to the model include:\n", "\n", "- n_clusters: int, default=8\n", "The number of clusters to form as well as the number of centroids to generate.\n", "\n", "- init: {‘k-means++’, ‘random’}, callable or array-like of shape (n_clusters, n_features), default=’k-means++’\n", "Method for initialization. ‘k-means++’ selects initial cluster centers for k-means clustering in a smart way to speed up convergence.\n", "\n", "- max_iter: int, default=300\n", "Maximum number of iterations of the k-means algorithm for a single run.\n", "\n", "- random_state: int, RandomState instance or None, default=None\n", "Determines random number generation for centroid initialization. Use an int to make the randomness deterministic.\n", "\n", "- n_jobs: int, default=None\n", "The number of OpenMP threads to use for the computation. Parallelism is sample-wise on the main Cython loop which assigns each sample to its closest center. None or -1 means using all processors. This argument is deprecated since scikit-learn 0.23, as the warning in the output of the next cell shows.\n", "\n", "### Fit" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/envs/rapids/lib/python3.7/site-packages/sklearn/cluster/_kmeans.py:974: FutureWarning: 'n_jobs' was deprecated in version 0.23 and will be removed in 0.25.\n", " \" removed in 0.25.\", FutureWarning)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 42 s, sys: 1.77 s, total: 43.8 s\n", "Wall time: 731 ms\n" ] }, { "data": { "text/plain": [ "KMeans(n_clusters=5, n_jobs=-1, random_state=0)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "kmeans_sk = skKMeans(init=\"k-means++\",\n", " n_clusters=n_clusters,\n", " n_jobs=-1,\n", " random_state=random_state)\n", "\n", "kmeans_sk.fit(host_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## cuML Model\n", "\n", "### Fit" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, 
"outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 2.45 s, sys: 86.8 ms, total: 2.53 s\n", "Wall time: 36.1 ms\n" ] }, { "data": { "text/plain": [ "KMeans(handle=, n_clusters=5, max_iter=300, tol=0.0001, verbose=4, random_state=0, init='k-means||', n_init=1, oversampling_factor=40, max_samples_per_batch=32768, output_type='cudf')" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "kmeans_cuml = cuKMeans(init=\"k-means||\",\n", " n_clusters=n_clusters,\n", " oversampling_factor=40,\n", " random_state=random_state)\n", "\n", "kmeans_cuml.fit(device_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize Centroids\n", "\n", "Scikit-learn's k-means implementation uses the `k-means++` initialization strategy while cuML's k-means uses `k-means||`. As a result, the exact centroids found may not be exact as the std deviation of the points around the centroids in `make_blobs` is increased.\n", "\n", "*Note*: Visualizing the centroids will only work when `n_features = 2` " ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAA6oAAAJOCAYAAAC+3vo+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABChklEQVR4nO3dd5gdV30//vdnV9WSi2TLvRcMhhiDhR1wABNKgACm905i4BcIJKRQ8iWkkACBEBIDDj30bjBgiiF0MEYyNu7G3cJNbpKtrt3z+2OvySJ2bWl3pZ1dvV7Ps8/ee+fMOZ97R7Pa986ZmWqtBQAAALqib7ILAAAAgOEEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBWDcqqpV1aHbYJwXVtWPxrDe96rqT7Z02bZUVcdX1bLJrmNr69L7rKqvV9ULJrsOAH6XoAoATDlVdWVVPWI8fbTWHtNa+5+JqgmAiSOoAsBWUFUzJrsGRlZD/A4E0GF+SANMY1W1X1V9saqWV9XNVXVS7/U3VdXHh7U7sDd9d0bv+feq6p+r6idVdUdVfaWqdq2qT1TVyqr6eVUduJk1vKiqLqyq26vq8qp66bBlx1fVsqp6TVXdWFXXVdWLhi3ftapO7Y15ZpJD7mKcOVX18d77vK1X4x4jtNurqn5ZVX81Sj8v7tV7a1V9s6oOGLbsXVV1Ta+epVX14GHL3lRVn+/VsDLJC3uf4z9V1Y977/9bVbXbZn5uf15VF1TVvsM+p78Z9jk9saoeW1WXVNUtVfX6Yev2VdVrq+qy3ufx2apaOGz556rq+qpaUVU/qKp7D1v2kap6d1V9rVfzz6rqkN6yqqp39mpY0fsc7zNK/Qur6sNVdW3vs/zSKO1+a9p4b/x/7j3eraq+2tuet1TVD3vv7WNJ9k/yld6/z7/ptf/93r/Z26rqnKo6fli/36uqN1fVj5OsTnJwDZv2Xb1p5VX19l69V1TVY4atf1Dvs7q9qr7d+4x+sw8BMLEEVYBpqqr6k3w1yVVJDkyyT5JPb0EXz0zyvN56hyT5aZIPJ1mY5MIkf7+Z/dyY5HFJdkryoiTvrKr7D1u+Z5Kde+O8JMm7q2pBb9m7k6xNsleSF/e+RvOCXj/7Jdk1ycuSrBneoIbC9feTnNRae/umHVTVE5O8PsmTkyxK8sMknxrW5OdJjsrQZ/DJJJ+rqjnDlp+Q5PNJdknyid5rz+69792TzEoyYkDepI7/l+SFSR7aWrvzfM49k8zJ0Of0xiTvT/LcJEcneXCSN1bVwb22f57kiUkemmTvJLdm6LO809eTHNar6axhtd7pWUn+IcmCJJcmeXPv9UcleUiSe/Te4zOS3DzK2/hYkh2S3Ls3zjvv7n2P4DVJlmVoW+yRoW3TWmvPS3J1kse31ua31t5WVfsk+VqSf87Q9vmrJF+oqkXD+ntekhOT7Jih/WJTxya5OMluSd6W5INVVb1ln0xyZob+bb2p1xcAW4mgCjB9HZOhkPLXrbVVrbW1rbUtuRDRh1trl7XWVmQo2FzWWvt2a21jks8lud/mdNJa+1qvn9Za+36Sb2UoWN1pQ5J/bK1taK2dluSOJIf3gvZTkryxV/95Se7qfMINGQoRh7bWBlprS1trK4ctPyLJ95L8fWvtfaP08dIk/9pau7D3Pv8lyVF3HlVtrX28tXZza21ja+0dSWYnOXzY+j9trX2ptTbYWrszJH+4tXZJ7/lnMxR0R1NV9e9J/ijJw1pryzd5f29urW3I0B8cdkvyrtba7a2185Ocn+TIYe/jDa21Za21dRkKVk+t3hHz1tqHeuvduey+VbXzsLG+2Fo7s/cZfGJYzRsyFPLumaR6n9N1I7yJvZI8JsnLWmu39rbt9+/ifY9mQ4b+SHFAr48fttbaKG2fm+S01tppvc//9CRLkjx2WJuPtNbO722/DSP0cVVr7f2ttYEM/VvbK8keVbV/kgdk6N/i+t5+dOoY3g8Am0l
QBZi+9svQL94bx7j+DcMerxnh+fzN6aSqHlNVZ/Smbt6WoeAwfPrrzZvUuLrX96IkM5JcM2zZSEfB7vSxJN9M8unedNO3VdXMYcufk+TXGTriOZoDkryrN3X0tiS3JKkMHcVMDU1RvrA37fW2DB3BHf5ersnvun6E9zaaXTJ0xO9fe38gGO7mXoBK/u9I8Wjb5IAkpwx7HxcmGchQ6Oqvqrf0pgWvTHJlb53h72PEmltr/5vkpAwdnb2hqt5XVTuN8D72S3JLa+3Wu3ivm+PfMnRE91s1NG38tXfR9oAkT7vzPffe9x9kKGzeaaTtM9xv3ndrbXXv4fwM/cHnlmGvbU5fAIyDoAowfV2TZP8a+aI+qzI0LfNOe26NAqpqdpIvJHl7kj1aa7skOS1D4e/uLE+yMUOh5077j9a4d8TtH1prRyR5UIamGz9/WJM3JbkpySd7R2tHck2Sl7bWdhn2Nbe19pMaOh/1b5M8PcmC3ntZscl7Ge1o3+a6tVf3h6vquHH0c02Sx2zyPua01n6doanIJyR5RIaC9oG9dTZnm6S19p+ttaMzNKX3Hkn+epTxF1bVLpvR5eqM8m+xd9T3Na21g5M8PslfVtXD71w8wpgf2+Q9z2utvWV4+ZtRz0iuy9D7GV7nfqM1BmD8BFWA6evMDP2C/ZaqmldDFxu6M/ycneQhVbV/b8rn67ZSDbMyND12eZKNvYvTPGpzVuwdPfxikjdV1Q5VdUSGzkMdUVU9rKp+rxdCV2Zo2ujAsCYbkjwtybwkH6uRr/p6cpLX3Xlxoarauaqe1lu2Y4aC8/IkM6rqjRk673ZCtda+l6Gjv6dU1bFj7ObkJG++c8pyVS2qqhN6y3ZMsi5D55bukKHpzZulqh5QVcf2jlSvytD5wwObtutNB/56kvdU1YKqmllVDxml27OTPLt3pPfRGTqv9s7xHldVh/bOE13ZG+vO8W5IcvCwfj6e5PFV9Ue9vubU0EWo9t3c9zea1tpVGZpG/KaqmlVVD8xQcAZgKxFUAaapXtB7fJJDM3ThmWUZuvhNeufvfSbJL5MszdBFl7ZGDbdn6MI+n83Q0cJnZ8vO7XtFhqZeXp/kIxm6mNNo9szQtN6VGZrq+v0MhZfh9azP0IWSdk/yoU3DamvtlCRvzdD04ZVJzsvQuZbJ0LTirye5JENTkNdmK03/7G2fFyU5taqOHkMX78rQ5/ytqro9yRkZulBQknw0Q/X/OskFvWWba6cMXcTp1l4fN2foaPlInpehPw5clKELar16lHavytC/09syFNC/NGzZYUm+naHzln+a5D29IJ8k/5rk73rTfP+qtXZNho4Uvz5Df0y4JkNHeyfqd53nJHlght7zP2do/1k3QX0DsIka/ZoEAACMpKo+k+Si1trmXv0agC3giCoAwN3oTXs+pIbu4/roDB29/dIklwUwbY10gQ0AAH7bnhk6Z3rXDE2jf3lr7ReTWxLA9GXqLwAAAJ1i6i8AAACd0umpv7vttls78MADJ7sMAAAAJtjSpUtvaq0tGmlZp4PqgQcemCVLlkx2GQAAAEywqrpqtGWm/gIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApmx1Uq+pDVXVjVZ037LWFVXV6Vf2q933BKOs+uqourqpLq+q1E1E4AAAA09OWHFH9SJJHb/Laa5N8p7V2WJLv9J7/lqrqT/LuJI9JckSSZ1XVEWOqFgAAgGlvs4Nqa+0HSW7Z5OUTkvxP7/H/JHniCKsek+TS1trlrbX1ST7dWw8AAAB+x3jPUd2jtXZdkvS+7z5Cm32SXDPs+bLeayOqqhOraklVLVm+fPk4ywMAAGCq2RYXU6oRXmujNW6tva+1tri1tnjRokVbsSwAAAC6aLxB9Yaq2itJet9vHKHNsiT7DXu+b5JrxznutLZhw0B
uuGllVq9ZP9mlAAAAbHMzxrn+qUlekOQtve9fHqHNz5McVlUHJfl1kmcmefY4x52WBgYG86HP/iSf+9pZGRwczMBAy4OOPjh/9dJHZsHOO0x2eQAAANvEZgfVqvpUkuOT7FZVy5L8fYYC6mer6iVJrk7ytF7bvZN8oLX22Nbaxqp6RZJvJulP8qHW2vkT+za6Z/WadfnmDy7Mz866IoOtZf99Fmbt2g3ZZ69dssOcmfn4KT/P9TeuSEuyx27z89BjD88vL16Wy65cng0bB3/Tzw/O/FUuuuz6fPK/XpLZs8b7dwUAAIDuq9ZGPV100i1evLgtWbJkssvYIkvPvTqv/ddTsmbdhgnt9/fvd1D+8TWPzw5zZ01ovwAAAJOhqpa21haPtGxbXExpu/HX//LFvOpNn53wkJokZ/ziipzwJ+/NORcum/C+AQAAukRQnQAXX35Dnv3KD+anSy/fquOsWbshf/3mL7rIEgAAMK056XEM1qxdn89+ZWm+8I1f5NYVq7MtZ0+vX78xf/uvp+SIw/bMIx98RA490C18AACA6cU5qlto3boNOfF1n8hVy27JxoHBu19hK+nrq8yc0Z/HPOzeec2fPiJVI92uFgAAoJucozqBvvKdc7Ps+tsmNaQmyeBgy7r1G/ON712Q//3JxZNaCwAAwEQSVLfQV79zbtat2zjZZfzG2nUb8skv/XyyywAAAJgwzlHdQuvXjzGktpYHXX9hnnT5T3PUTZdlxuBgLt9pz5x60LH5+gGLs75/5phruvHm28e8LgAAQNc4orqFDjtg9y1ep9pgXnfW5/LS876RL89/RI7a7/Qcst8Z+fsdXpdjr7gi7/7ee7PTulVjrmnfvRaMeV0AAICuEVS3wB2r1uV7Z16yxes99+LvZq8Vt+WRC7+QD284MXfMnJs2d31+OP8BedpOH8n320Pyd2d8bsx1Peb4e495XQAAgK4RVLfAqaefk4GBLbtK8syBjXnqZT/OK+e9LWtnzMzsHW5Pf//GVCX9/Rsze94defNur8iBty3PATfdPKa6PvzZH6fLV28GAADYEoLqFvjqd87d4nWOXn5prpi1by7tPyQzZq4fsU3NGsxn5p2Q4y8d29V7b751VX5x/jVjWhcAAKBrBNUtcN2NK7d4nZ3Xr8pV7cDMmL3mrvues2vm3TG2W94MtmTZdbeNaV0AAICuEVS3wMDglgfJFbPmZZ8N16ev766vFrzvxutyc3Yda2nZa/edx7wuAABAlwiqW2DOrC2/m8/SRYfmkI1X5aC1y0ZtM6NtzNNWnpYv7/zwMdVVlRz9e/uPaV0AAICuEVS3wOIjD9jidTb0z8jH93503nr9WzN3cITpv63ltTe+NxfNOCzX7jtnTHXNnNGf21etHdO6AAAAXSOoboFnnfCAMa33mSOPzZUz98uXrzgxT11xWuYNrMqswfV58Koz85Flf5VjV5+T/2/3N2fHXa8dU//rNwzkM19ZMqZ1AQAAukZQ3QIH77/bmNabMXt93v7AP86bF/55/vjW7+WsSx+fCy95RF5343tz2qxH5ol7fjA73PPSzJw19qOip//wojGvCwAA0CVbftLldmz9hoExrzt3xxW55AFz8sqb/zK3Ld8/Axtmp3/muuyy6OrsvuvScYXUZGwXegIAAOgiQXUL7LLT3FSSNsb1Z85am4V7XZ6Fe10+kWVlRn9fHnrsYRPaJwAAwGQx9XcLVFUOO3j3yS7jt1Qlc+fMzLPHeP4sAABA1wiqW+h1f/bozOjvxsfW39+XB9z3wLz/rc/Nol13nOxyAAAAJoSpv1vosAN3z7ve9PT863u+kV9fvyKtjXUi8JapGroNzfEPvEcec/wROfzgPTN71ozMnj1zm4wPADBdDQ4OpqpSVZNdCtAjqI7BfY/YN58+6U9y2VXL8+p//FxuvW31Vh1v3tyZefgf3CsvfNoDs7sjpwAAE+KCn16c9/31x3LBGZekr6+y+NH3y8ve/vzse4+9J7s
02O7VtjoiOBaLFy9uS5Z0+/6gt9y2Kv/wH1/L0nOvntB++yp50188Ln943D0ntF8AAJJf/uCCvP6xb8661et/6/Xqq9z3oUfkOX/31Bz1sPv81rLBwcF891M/zhff9dXcct1tOeSoA/Os1z05937Q4duydJg2qmppa23xiMsE1Ynx3Z9ekn87+ZvZsHEwfX2V9es3ZtbMGVmzbkMGB0f+jO+cXTJ8E8yc0Zf+/r78y988McccdeDWLxwAYDv0p0f+Za4875q7b1jJ4Ucfkpe+4/k59T3fzI+//PNsWLthaFEls+bOyivf/Sf5oxc8bCtXDNOPoLqNbBwYzEWXXp/16zfmnofumSR528nfyg9+9qvMmNGf9Rs2Zv+9FuSA/XbNvnsuyJH32ifHHnVQrrtxRU49/Zxcc92tOezA3fP4Rx6Z3RbMn+R3AwAwffz60uvyuXecml98+9ysuPn2rNrCU7f6+vsyODDyfetn7zArn7v+A5k7f+5ElArbDUF1kt2xal2W33J7dls4PzvOmzPZ5QAAbDduuvaWvPq4N+SGq27aamPM3XFOXvP+l+ehT3/QVhsDpqO7CqouprQNzJ83O/PnzZ7sMgAAthu3XH9rPveOU/P5f/9qspWPy6y5fW2uvujXW3cQ2M5044agAAAwQa674oa8+F6vyuffsfVD6p0+8eYvZM2qtdtmMNgOCKoAAEwbq1auzluf/59ZtWLNNh13YMNAnrDj8/KCe7wyP/3K1D91DSabqb8AAEx5v/zBBXn3qz6Uq85floGNA5NWx7WXXp83nvDW9PVVFu61IE/688fmya/+48yY6ddu2BKOqAIAMKWd96ML8/rHvjmXn3PVpIbU4QYHW2769S356D98Nn/3+LdkcHDkKwYDI/OnHQAAprT3/uX/ZN3q9ePqY5e2No/KVdk3t2dD+rM0u+eM7JXBGt9xnXWr1+f8n1ycX3zn3Bz9yPuOqy/YnjiiCgDAlLTmjjU5/WPfz6+WXj7mPqq1vKidlw/nm9kts/P9PC5L86g8Prfmw/nfHNa27H6rI1l7x9r87yd/NO5+YHviiCoAAFPOD79wRt76gpNSlbQ29kv7vjjn5fdyW47Lx7M8u2Zmbk9f1ucjeVQemvPzzvxbXtMenV/X+KYUj/eIL2xvHFEFAGBKueqCa/LWF/xX1q1el7Wr1o25n0VtdR6TK/OMnJRbMzdzcnP6sz6VpD/r86MclnfkWXlursiGNn9cNR929EHjWh+2N4IqAABTyuf//avZuH7juPt5bK7IV3N0bsmCzMjIt7P5Uh6U++eizM2icY216z4Lx7U+bG8EVQAAppRfnXV5BjaO/yq6B2ZFfpiHZGZuH7XN2szOeTkou2Z8R1Tb4NinJ8P2SFAFAGBK2eOA8R3dvNPG9KU/lb7c9fmjs7M+67PDuMbadW9HVGFLCKoAAEwpT/rzx2bODrPH3c85WZQ/zvczmFmjtlmUW3PPXJ3zs9+Yx5kzb3bue/wRY14ftkeCKgAAU8pRD7tPTnjFozN77ugBc3N8J/vnuPwih+f6Udu8PF/KqTk+MzP2W+C86Yt/nf7+/jGvD9sjQRUAgCnnT97y3PzHj/45Oy4Y+7mja2pm/j3H5mN5Q47Puan833mvO+eOvD4fzR/k3Lw1L8wuuWRMY9z/kb+Xox953zHXCNsr91EFAGBKOvR+B+VP3/bcnPTnH8z6NRvG1MdParfc3o7LX+ZTeWNW55c5JPOyOotzcU7PA/OkvDM75/uZWXdscd9z5s3OY1788DHVBds7QRUAgCnrUS84Pj865cycedpZY+7j3JqdV7Q/yN6Zl10zN2syP6/JXya5OrvmtDGF1P4ZfdnjgEU57knHjLku2J6Z+gsAwJTVP6M//3Tq3+Yl//KscfUzs1Zled2Yi+qqXFXnZ1F9KYvqrDGF1FTyiOc+JP/xo3/OzFkzx1UXbK8cUQUAYErr6+vLM1/75DzwhGPyzhP/O+f/+KJJq2Xm7Jn
5+BUnZeGebkcD4+GIKgAA08IB99o3//HDf8onrz45/TO2/VV2j3zoEfnUspOFVJgAgioAANPKon13zUln/us2DavH/vHRecd3/yE777rTNhsTpjNBFQCAaefQow7Kv3//H3LwfQ/YquP09w9dNOk1H3z5Vh0HtjeCKgAA09IRDzw8//2Lt+ez138gf/icP0hq4vqeMWtGDj/m0Lzq5BPzgfPfmQW77zxxnQMupgQAwPS2YPed89J/e37O/s55WXnLHdm4fuOY+zr8AYfkbz/6yux3+D4TWCGwKUdUAQCY9hbuuSAnn/32nPBnj87CvXbJ/AXzs8dBi9LXv/m/Dj/0GQ/Kf53xr0IqbAPVWpvsGka1ePHitmTJkskuAwCAaeq///qj+cp7vpkN6zZkcLBl9g6zkyRz58/OqpVrstOCebnv8ffOiW9/QXbda8EkVwvTS1Utba0tHmmZqb8AAGy3Xvpvz8/Dn/PgfPND381ty1fkqIfdJ3/4nAdn7rw5k10abNcEVQAAtmuHHnVQDv3Pgya7DGAY56gCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANAp4w6qVXV4VZ097GtlVb16kzbHV9WKYW3eON5xAQAAmJ7GfXua1trFSY5KkqrqT/LrJKeM0PSHrbXHjXc8AAAApreJnvr78CSXtdaumuB+AQAA2E5MdFB9ZpJPjbLsgVV1TlV9varuPVoHVXViVS2pqiXLly+f4PIAAADougkLqlU1K8kTknxuhMVnJTmgtXbfJP+V5Euj9dNae19rbXFrbfGiRYsmqjwAAACmiIk8ovqYJGe11m7YdEFrbWVr7Y7e49OSzKyq3SZwbAAAAKaJiQyqz8oo036ras+qqt7jY3rj3jyBYwMAADBNjPuqv0lSVTskeWSSlw577WVJ0lo7OclTk7y8qjYmWZPkma21NhFjAwAAML1MSFBtra1Osusmr5087PFJSU6aiLEAAACY3ib6qr8AAAAwLoIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdMiFBtaqurKpzq+rsqloywvKqqv+sqkur6pdVdf+JGBcAAIDpZ8YE9vWw1tpNoyx7TJLDel/HJnlv7zsAAAD8lm019feEJB9tQ85IsktV7bWNxgYAAGAKmaig2pJ8q6qWVtWJIyzfJ8k1w54v6732O6rqxKpaUlVLli9fPkHlAQAAMFVMVFA9rrV2/wxN8f2zqnrIJstrhHXaSB211t7XWlvcWlu8aNGiCSoPAACAqWJCgmpr7dre9xuTnJLkmE2aLEuy37Dn+ya5diLGBgAAYHoZd1CtqnlVteOdj5M8Ksl5mzQ7Ncnze1f//f0kK1pr1413bAAAAKafibjq7x5JTqmqO/v7ZGvtG1X1siRprZ2c5LQkj01yaZLVSV40AeMCAAAwDY07qLbWLk9y3xFeP3nY45bkz8Y7FgAAANPftro9DQAAAGwWQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOg
UQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4Zd1Ctqv2q6rtVdWFVnV9VrxqhzfFVtaKqzu59vXG84wIAADA9zZiAPjYmeU1r7ayq2jHJ0qo6vbV2wSbtfthae9wEjAcAAMA0Nu4jqq2161prZ/Ue357kwiT7jLdfAAAAtk8Teo5qVR2Y5H5JfjbC4gdW1TlV9fWquvdd9HFiVS2pqiXLly+fyPIAAACYAiYsqFbV/CRfSPLq1trKTRafleSA1tp9k/xXki+N1k9r7X2ttcWttcWLFi2aqPIAAACYIiYkqFbVzAyF1E+01r646fLW2srW2h29x6clmVlVu03E2AAAAEwvE3HV30rywSQXttb+fZQ2e/bapaqO6Y1783jHBgAAYPqZiKv+HpfkeUnOraqze6+9Psn+SdJaOznJU5O8vKo2JlmT5JmttTYBYwMAADDNjDuottZ+lKTups1JSU4a71gAAABMfxN61V8AAAAYL0EVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE6ZMdkFAMB00DZelQzemPQfnPQtHHqewdSMg5JsSNKfbLwobe3pSQZSsx+WzLxfqmpyCweADhJUAWCM2uCtaevPSVa+ORm8NkP/rW5IMtj7Stpo6676n2TW/ZIF70/VrG1TMABMEYIqAGyhNnhb2m1/k6z/Qe4MpEM2bEEva5P1P01b/vS0GQcltS6Z9eDU3Cek+uZNcMUAMLUIqgCwBVobSLv5GcnAFRPT4eAFyfoLhh6v+27aHe9Kdv1sasb+E9M/AExBLqYEAFugrfvexIXU3zGQtFvSbnvFVuofAKYGR1QB4G4MrvtJsuLvksFl22bAjRdncP3F6Zt1+LYZDwA6xhFVALgLg2tOT2594bYLqUmSltzy+Ayu+co2HBMAukNQBYBRDK7+XLLizyavgBV/lbbx0skbHwAmiaAKACMYvO3vk5VvmOQqWtpNT0lrGye5DgDYtpyjCgDDtIHr025+QTI4jgsmDbbkh6uTX6wben6/2cmDd0j6agydrUm79dWphSeNvR4AmGIEVQDoaYO3pd30xKTdMvZOfrQ67a9uyh2D8/PzBcdm3fqZue/7z8ouM5dn4F/2yI6PbVve5/pvpbV1qZo99roAYAoRVAGgp63+dNJWjL2DH67O4Ik35t1H/2XO2e3+2W3X2zJnzrp8f81jsscFV+dP/7+T8uu3HZp9nn77lte2bmlqzoPGXhsATCHOUQWAO639RpKBsa072DL4NzflPff/i1xx8OHZb98
bMnfuulQlc3dYn5WL98wnH/aSzPx/1+WWm3fa8v5X/k1aWzu22gBgihFUAeBOg7eOfd0frcmqgfk5e9HR2XmnO0Zsct0RB2Z936xc8tE9xlDbzYnb1QCwnRBUASBJG7w9Gbxx7B2cszZLdjkmu+162+htqnL5gYfn5u/OHcMAA2lrvjzW6gBgShFUASBJ1v80yfguVrRuw8zMmbPuLtv09w9k7boxjtPWjG09AJhiBFUASJIMJjWW28f03G9OjrrhrKxdM2v0Nq3l0CsuzI377jO2Mfr3G9t6ADDFCKoAkCSzjknahrGvf9zc7Dj79ux9/pWjNjnsiguSDcl+Tx/jubAzjxzbegAwxQiqAJCk+hYmOzwvqbGcP5qkKoNv2SMv/tl7ss+5lyZt2P1SW8s9LjsvJ3z9k/nk778oDzrm7DEM0Jea88ix1QYAU8yE3Ee1qh6d5F1J+pN8oLX2lk2WV2/5Y5OsTvLC1tpZEzE2AEyU2vFv0vr3SG4/KcnKLV5/xz9qWfaOw/L4138+Az/rz2UH3jP9MwZy6BUXZXBj5X0P/vM87k0/y8IFY7hXa/8+qRmm/gKwfRh3UK2q/iTvTvLIJMuS/LyqTm2tXTCs2WOSHNb7OjbJe3vfAaAzqio174VpOzw/7YYjkgxucR/7PmVlbnnYXrn8Y3tmxf/ukLXrZuXsBx6d/Z92a15wzDfHFlKTZIZpvwBsPybiiOoxSS5trV2eJFX16SQnJBkeVE9I8tHWWktyRlXtUlV7tdaum4DxAWBCVfWl9e+bDFw9pvUXLlyZB75qZfKqCSxq7lMnsDMA6LaJOEd1nyTXDHu+rPfalrZJklTViVW1pKqWLF++fALKA4AxmPeyJGM8X3Wi9R+amv2gya4CALaZiQiqI13Lv42hzdCLrb2vtba4tbZ40aJF4y4OAMai5j4l2eHpmaDLOYzdrONTu30xNZ5b5wDAFDMRQXVZkuFXd9g3ybVjaAMAnVFV6dvpDclu30j6tvVFjGYms5+Q7Pbz9C18X6rmbOPxAWByTURQ/XmSw6rqoKqaleSZSU7dpM2pSZ5fQ34/yQrnpwIwFfTN2D+16JvJ/Ddm5AlCE6gOTRZ+Nn17np++BW9P34ydt+54ANBR457P1FrbWFWvSPLNDN2e5kOttfOr6mW95ScnOS1Dt6a5NEO3p3nReMcFgG2lakZq/nMzOPOw5NbnTXDvM5JZv5/a+c2p/r0muG8AmJqqtRFPFe2ExYsXtyVLlkx2GQDwG4OrPpnc/k9JBsbZ0+xk539MzX5Eqm/HiSgNAKaUqlraWls80rJJvkIEAEwtffOenTbnD9PWfCkZuC41897JnMeltQ3Jms8na76VDJybZOPondTcZM4TUnOe6CJJADACQRUAtlD175ma/7Lffi1J5r8kmf+SDA5uTG57ZbL+f/N/F7nfOZmxXzJj/9TcZyazjhVSAWAUgioATLC+vhnJwvemtXXJxquSvp1S/XtOdlkAMGUIqgCwlVTNTmbeY7LLAIApZyJuTwMAAAATRlAFAACgUwRVAAAAOkVQBQAAoFNcTAkAAGCSrdu4MT+6+qqsWLc2R+25Vw5esHCyS5pUgioAAMAk+uolF+WvT/9GNg4OZrAN3X97p9mz8/cPfXhOOPye2+V9t6u1dvetJsnixYvbkiVLJrsMAACACXf1itty0s/OyOcvOv8u2x22YGHe+sg/yk6z52S3HeZlp9mzt1GFW1dVLW2tLR5pmSOqAAAA28gta1bn25ddmnedeUauu+P2zVrnV7fekid/9lO/eb7r7Dn578c/Mfffe5+tVeakE1QBAAC2ssHW8uYffi8fPfusDIyzr5vXrc1TP//pJMlRe+yZhx90SC68aXn2mD8/T7/37+XwXXcbd72TTVAFAADYyt51xo/zkbPPykSfeHn2Ddfn7Buu/83zj//y7Lxs8TH5i98/boJH2rYEVQAAgAmwYWAgS6+7Nqs3bMhRe+6ZhXN3SJLcuOq
O/NfPf7ZtahgczH+deUZWrluXP7nf4uyz007bZNyJJqgCAACM0+mXXZq//vY3MjDYUpWsHxjIvjvulCtuu3XCj6Jujv855xf55C/PzvOPun9e/wcPnXJXDu6b7AIAAACmsl/ecH1e9c2vZeW6dVm1YX3uWL8+6wcGcvkkhdQ7bWgtn/jlOfnSRRdOYhVjI6gCAACMw0lnnpF1GzdOdhkjWjuwMe9Zsm2mHU8kU38BAADG4ewbrpuwI6eHX3tdnvHTn+WgG5dnw4z+/PCeh+eUBxydO+bMGXOfV6+4bYKq23YcUQUAABiHnWePPUTeaebGjXn7xz+VD733g7nxjn3ynzP+v3xk7Ytz1Pduz/fe+JYcv+RXY+574+Bg1m7cMO4atyVHVAEAAMbhuUcelbf++AdZO47pv2/+zOez420bc+S9vp61mZe+XdakZm7MVzY8OPe74dJ89tOvzMvzJ/nF4t23uO/+vr5887JLc8Lh9xpzfduaI6oAAADj8Kz7HJn7LNpjzOsfcv0NeciFl+TZC0/O+jkzMmPB7embtTFVSd+sjTlnvwPz+kNenVee+r0MrNryo7cbBwdz46o7xlzfZBBUAQAAxmFWf3/e+8ePH/P6z/jpz/KJQx41dCR17voR25y6/0Nyr9WXZa+zx3abmUMW7Drm+iaDoAoAADBOC+fuMOZ1D7jp5ixZf2z65q0Ztc2Gvpm5YP4h2evCsV226f577TXW8iaFoAoAADBO47k9zdqZMzNv7brUzLvuY/7gHVmzcWyB+L+X/HxM600WQRUAAGAcNgwM5Emf+eSY1//+EYfnKbd9PW3D6Ne6PXD1shy0+tr8Ytd7jGmMz15w7ljLmxSCKgAAwDicctEFufTWm8e8/teOOir3X31ejrt2lDDZWv7fJe/Lx3Z7UupeN45pjPUDA2OubzIIqgAAAOPw8V+enYE2tnNHk2TdrJl55XNekI9e+ld54WVfydyBtb9Zdsiqq/Ohc96YvVbflH/d/08z+6BrxzTGUXtOrXNU3UcVAABgHFauXzfuPs48cv887/mvzCtP+X7+3xXvzUXzDsz8wdXZc+1N+Z9FT8mJ93pz5jzi/PTPW3v3nY3gH49/+Lhr3JYEVQAAgHE4dp/9smzFigyOs5+L77tzXn7o07PgvOdlr/Nb1m7cIectPDj997w+Oxy0dMwh9Un3PCIHLVg4zuq2LUEVAABgHF569APy1UsuyppxXPn3Tv3z1mblsWuz8tgkWZkdc/24+jtm733yb4989Ljr2tacowoAADAOBy9YmA+f8JTsMW9+dpg5MzvOmpXZ/f05bMGuk1rXWx/+qHzqKc9IX9Wk1jEWjqgCAACM0zH77JufvPjEnLf8xtyxbl2OWLR7WlqO/cDJ2TA43knBW+4vjn1Qnnbv39vm404UQRUAAGACVFV+b/c9fuu1zz/92XnG5z6Vtdvo9jAzknzu6c/OfafYVX43ZeovAADAVvJ7u++Rc1/+57nPot236jh9SZ58zyNyxp++fMqH1MQRVQAAgK2qv68vb3vko/Pkz3xiqxxZ3W+nnXPas5+XebNmT3jfk8URVQAAgK3snrstygef8OTsMW9eZvZNTAzrr8oJh98zpz7zudMqpCaOqAIAAGwTD9xv//zkxS/NRTctzwU3Lc/7l/48l916S1prv3MP1kryiIMOyfWr7kilct899szj7nF4jt57n1SS1Rs2ZM6MGemfoNDbNYIqAADANlJVudei3XOvRbvnKfe6d5avWpXVGzZkwZw5+cZlv8qvbrk5++20c044/F7Zec6cUfuZN2vWNqx62xNUAQAAJsmiefN+8/jpU/h2MhNteh4nBgAAYMoSVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BRBFQAAgE4RVAEAAOgUQRUAAIBOEVQBAADoFEEVAACAThFUAQAA6BR
BFQAAgE4RVAEAAOiUGeNZuar+Lcnjk6xPclmSF7XWbhuh3ZVJbk8ykGRja23xeMYFAABg+hrvEdXTk9yntXZkkkuSvO4u2j6stXaUkAoAAMBdGVdQba19q7W2sff0jCT7jr8kAAAAtmcTeY7qi5N8fZRlLcm3qmppVZ14V51U1YlVtaSqlixfvnwCywMAAGAquNtzVKvq20n2HGHRG1prX+61eUOSjUk+MUo3x7XWrq2q3ZOcXlUXtdZ+MFLD1tr7krwvSRYvXtw24z0AAAAwjdxtUG2tPeKullfVC5I8LsnDW2sjBsvW2rW97zdW1SlJjkkyYlAFAABg+zauqb9V9egkf5vkCa211aO0mVdVO975OMmjkpw3nnEBAACYvsZ7jupJSXbM0HTes6vq5CSpqr2r6rRemz2S/KiqzklyZpKvtda+Mc5xAQAAmKbGdR/V1tqho7x+bZLH9h5fnuS+4xkHAACA7cdEXvUXAAAAxk1QBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgU8YVVKvqTVX166o6u/f12FHaPbqqLq6qS6vqteMZEwAAgOltxgT08c7W2ttHW1hV/UneneSRSZYl+XlVndpau2ACxgYAAGCa2RZTf49Jcmlr7fLW2vokn05ywjYYFwAAgCloIoLqK6rql1X1oapaMMLyfZJcM+z5st5rI6qqE6tqSVUtWb58+QSUBwAAwFRyt0G1qr5dVeeN8HVCkvcmOSTJUUmuS/KOkboY4bU22nittfe11ha31hYvWrRo894FAAAA08bdnqPaWnvE5nRUVe9P8tURFi1Lst+w5/smuXazqgMAAGC7M96r/u417OmTkpw3QrOfJzmsqg6qqllJnpnk1PGMCwAAwPQ13qv+vq2qjsrQVN4rk7w0Sapq7yQfaK09trW2sapekeSbSfqTfKi1dv44xwUAAGCaGldQba09b5TXr03y2GHPT0ty2njGAgAAYPuwLW5PAwAAAJtNUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTBFUAAAA6RVAFAACgUwRVAAAAOkVQBQAAoFMEVQAAADpFUAUAAKBTZkx2AQAAbFuttfzqjkuz9Naz0lrLQxY9OPvusM9vll12x2VZcutZuX3
DHTlyl/vkmIUPSFVNctXA9qRaa5Ndw6gWL17clixZMtllAABMG2sG1uSfzvuX/Hrdtb/1+m4zds1RC++bJbcszW0bV/zWsv705w92fVBWDKzMLjN3zvGLHpqD5h+4DasGpqOqWtpaWzziMkEVAGD78bqz/y7Xrr9uXH1UKofOOyTPO/A52XeHfdJf/RNUHbA9uaugOq6pv1X1mSSH957ukuS21tpRI7S7MsntSQaSbBytGAAAto4fL/9p3n/FB9My/oMULS2/WnVp3nj+P2RWzczDdj8+KzaszLkrzk1VX45ZuDhP2PtxWTBrwQRUDmyPxhVUW2vPuPNxVb0jyYq7aP6w1tpN4xkPAIAtMzA4kFcsfXVWt9Vbpf/1bUO+ecPpv/Xa92/8YX5+y5L8w73/PrvOXrhVxgWmtwm5mFINnV3/9CR/OBH9AQAwdpfffkU+euXHc9WaqzOYwW0+/kAGsmrj6nxh2Sk58ZCXbPPxgalvoq76++AkN7TWfjXK8pbkW1XVkvx3a+19o3VUVScmOTFJ9t9//wkqDwBgemut5ddrrs1/XfLuXL/+hskuJ4MZzM9vXZITI6gCW+5ug2pVfTvJniMsekNr7cu9x89K8qm76Oa41tq1VbV7ktOr6qLW2g9GatgLse9Lhi6mdHf1AQBs7y674/K877IP5Pp1kx9QhxtoA5NdAjBF3W1Qba094q6WV9WMJE9OcvRd9HFt7/uNVXVKkmOSjBhUAQDYfDesvSFvvejtWTe4brJL+R33mH/YZJcATFETMfX3EUkuaq0tG2lhVc1L0tdau733+FFJ/nECxgUA2O599bqvZ8Pg+nH1scPt63Lc1y7Ng75xWXa+ZU3u2Gl2znzEQfn+E+6R2xfOHXO/T933yeOqC9h+9U1AH8/MJtN+q2rvqjqt93SPJD+qqnOSnJnka621b0zAuAAA273zV1yQwXHccmb/S27OvzzrlBz0i1vy7oc8I09/wHvzV4v+OfWNRfmnp34lB//wtjH3/ZObzxjzusD2rVrr7mmgixcvbkuWLJnsMgAAOutlS1+RNQNrxrTu/NvW5p+fc0o+8LxH5QNXviqDG/sza+c70j97QwbWzczRl16cd571j/m7dzwjq4/ZMKYxXnOPV+fIXX5vTOsC01tVLW2tLR5p2UQcUQUAYBLcvuH2rB1YO+b1H/rli3PO/Q/MB658Vfpmr88Oe9yaGXM2pCqZMWdDzrnPwfnCIY/K7//H8qxbMW9MY/z7Je/KR6742JhrBLZPgioAwBT1qzsuzazMHPP6x512WT53j0cPHUmdP3Lg/eKRj8gJV5+eG885eExjtLR8d/n38rErPzHmOoHtj6AKADBFzayZ6esf+69zu9y0OkuXHZdZO98xapvrdtw9Owysyc1nHjrmcZLkf2/8Xq5adfW4+gC2H4IqAMAUdfhOh2c8lxtZM39mdr51Tfpnj37+6Y7r7kiryqo1O419oCSDGcz3b3R3QmDzCKoAAFPUrL6Zef6Bzxnz+ksedmCeesNXM7Bu9OnDT/jV/+bb+z0oM0eZGrwlrl59zbj7ALYPgioAwBR23G4Pyj3m32NM637nKffK0675ag6++voRl+9z+w154S+/lA/v+8zstfjC8ZSZJFk3uG7cfQDbB0EVAGCKO/GQF6dSW7zejfvtlA//xUPykTNek2cv+Vp2Wnt7kmTuhjV52oXfyEe+8rq854jn5NxFh2f3Iy8bd507ztxx3H0A24cZk10AAADjs2j2ojxp7xPyxWu/tMXrnv34vbJsxyfmISddkVd86uNZN2Nm5m5cmx/tfXT+/Kh/yC/2uHeOePa3MnvnVeOqcVbNyuIFR4+rD2D7IagCAEwDM/rG/mvdTcf35VP3u1fes/SPsupn++W2dQszsHPLXosvzP2OPGXcIbUvfZk3Y16O2+2B4+oH2H4IqgAA08DKjSvHtf7snVdlzz+8MPnD8Z+L2pe+zOwbukDTQBvIvXa6Z15y0Aszu3/
2uPsGtg+CKgDANHDADgekL30ZzOBkl5KWlpPu/67cuv6WzJ8xP/NmzJvskoApxsWUAACmgcULj86c/jmTXUaSZLfZu2VW38zsMWcPIRUYE0EVAGAamNU3M2884g3ZacZOk1zHrJyw9+MmtQZg6jP1FwBgmthr7p75z/v9ey6+/ZJ89dqv5dyV52+Tce88J3WwDeSRezw8f7DbcdtkXGD6ElQBAKaRqso9dzo899zp8KzeuDpfWHZKfnrTz7JqcHxX7h3JzMzIojmL8oe7H5/Z/bNz5M5HZpdZO0/4OMD2p1prk13DqBYvXtyWLFky2WUAAExpGwY35BVnvTprB9dOSH996csh8w/OH+35yNxvl6PGdWscYPtVVUtba4tHWuanCgDANDezb2ZefY9X5p2X/GcG2kA2to2pVGbUjDxl3yfl6AX3z+k3fCcXr7w4V625+nfXr5l54YHPy95z986MvhnZd+4+6SuXOgG2HkdUAQC2E7euvzXfvfH7uWLVldlt9q75w90flv122Pe32ly56qp85MqP5urV16Qvfdlx5vw8c79n5NhdHzBJVQPT1V0dURVUAQD4HXdsuCMb2obsMnOXVNVklwNMQ6b+AgCwRebPnD/ZJQDbMScXAAAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACdIqgCAADQKYIqAAAAnSKoAgAA0CnVWpvsGkZVVcuTXDXB3e6W5KYJ7pOtz3abmmy3qcl2m5pst6nJdpuabLepy7brlgNaa4tGWtDpoLo1VNWS1triya6DLWO7TU2229Rku01NttvUZLtNTbbb1GXbTR2m/gIAANApgioAAACdsj0G1fdNdgGMie02NdluU5PtNjXZblOT7TY12W5Tl203RWx356gCAADQbdvjEVUAAAA6TFAFAACgU6ZdUK2qp1XV+VU1WFWLN1n2uqq6tKourqo/GmX9hVV1elX9qvd9wbapnOGq6jNVdXbv68qqOnuUdldW1bm9dku2cZlsoqreVFW/HrbtHjtKu0f39sNLq+q127pOfltV/VtVXVRVv6yqU6pql1Ha2d864O72nxryn73lv6yq+09Gnfyfqtqvqr5bVRf2fkd51Qhtjq+qFcN+fr5xMmrlt93dzz37W/dU1eHD9qOzq2plVb16kzb2tylgxmQXsBWcl+TJSf57+ItVdUSSZya5d5K9k3y7qu7RWhvYZP3XJvlOa+0tvV8AXpvkb7d+2QzXWnvGnY+r6h1JVtxF84e11ty4uTve2Vp7+2gLq6o/ybuTPDLJsiQ/r6pTW2sXbKsC+R2nJ3lda21jVb01yesy+s89+9sk2sz95zFJDut9HZvkvb3vTJ6NSV7TWjurqnZMsrSqTh/h594PW2uPm4T6uGt39XPP/tYxrbWLkxyV/OZn5q+TnDJCU/tbx027I6qttQt7/0A3dUKST7fW1rXWrkhyaZJjRmn3P73H/5PkiVulUDZLVVWSpyf51GTXwoQ5JsmlrbXLW2vrk3w6Q/sdk6S19q3W2sbe0zOS7DuZ9XCXNmf/OSHJR9uQM5LsUlV7betC+T+ttetaa2f1Ht+e5MIk+0xuVUwQ+1u3PTzJZa21qya7ELbctAuqd2GfJNcMe74sI/8nsUdr7bpk6D+WJLtvg9oY3YOT3NBa+9Uoy1uSb1XV0qo6cRvWxehe0Zv+9KFRps5v7r7I5Hhxkq+Pssz+Nvk2Z/+xj3VYVR2Y5H5JfjbC4gdW1TlV9fWquve2rYxR3N3PPftbtz0zox/ssL913JSc+ltV306y5wiL3tBa+/Joq43wmnvzTKLN3I7Pyl0fTT2utXZtVe2e5PSquqi19oOJrpX/c1fbLUNTnv4pQ/vWPyV5R4aCz291McK69sWtbHP2t6p6Q4amKH5ilG7sb5N
vc/Yf+1hHVdX8JF9I8urW2spNFp+V5IDW2h298/u/lKHppEyuu/u5Z3/rqKqaleQJGTqdZVP2tylgSgbV1tojxrDasiT7DXu+b5JrR2h3Q1Xt1Vq7rjd148ax1Mjdu7vtWFUzMnS+8dF30ce1ve83VtUpGZoW5xfnrWhz97+qen+Sr46waHP3RSbQZuxvL0jyuCQPb6PcYNv+1gmbs//YxzqoqmZmKKR+orX2xU2XDw+urbXTquo9VbWbc8In12b83LO/dddjkpzVWrth0wX2t6lhe5r6e2qSZ1bV7Ko6KEN/NTlzlHYv6D1+QZLRjtCy9T0iyUWttWUjLayqeb2LUqSq5iV5VIYupsUk2eS8nCdl5O3x8ySHVdVBvb92PjND+x2TpKoenaGLJz2htbZ6lDb2t27YnP3n1CTP712N9PeTrLjzlBYmR+96Cx9McmFr7d9HabNnr12q6pgM/Y5287arkk1t5s89+1t3jTorz/42NUzJI6p3paqelOS/kixK8rWqOru19kettfOr6rNJLsjQ1LY/u/OKv1X1gSQnt9aWJHlLks9W1UuSXJ3kaZPyRkhGOK+gqvZO8oHW2mOT7JHklN7PmRlJPtla+8Y2r5Lh3lZVR2Vo2tOVSV6a/PZ2611Z9hVJvpmkP8mHWmvnT1K9DDkpyewMTWtLkjNaay+zv3XPaPtPVb2st/zkJKcleWyGLhq4OsmLJqtefuO4JM9Lcm793+3WXp9k/+Q32+2pSV5eVRuTrEnyzNFmN7DNjPhzz/7WfVW1Q4aujv7SYa8N3272tymgbBMAAAC6ZHua+gsAAMAUIKgCAADQKYIqAAAAnSKoAgAA0CmCKgAAAJ0iqAIAANApgioAAACd8v8DGbPhr/+zDrsAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig = plt.figure(figsize=(16, 10))\n", "plt.scatter(host_data.iloc[:, 0], host_data.iloc[:, 1], c=host_labels, s=50, cmap='viridis')\n", "\n", "#plot the sklearn kmeans centers with blue filled circles\n", "centers_sk = kmeans_sk.cluster_centers_\n", "plt.scatter(centers_sk[:,0], centers_sk[:,1], c='blue', s=100, alpha=.5)\n", "\n", "#plot the cuml kmeans centers with red circle outlines\n", "centers_cuml = kmeans_cuml.cluster_centers_\n", "plt.scatter(cupy.asnumpy(centers_cuml[0].values), \n", " cupy.asnumpy(centers_cuml[1].values), \n", " facecolors = 'none', edgecolors='red', s=100)\n", "\n", "plt.title('cuml and sklearn kmeans clustering')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compare Results" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 27.5 ms, sys: 961 µs, total: 28.4 ms\n", "Wall time: 27.7 ms\n" ] } ], "source": [ "%%time\n", "cuml_score = adjusted_rand_score(host_labels, kmeans_cuml.labels_.to_array())\n", "sk_score = adjusted_rand_score(host_labels, kmeans_sk.labels_)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "compare kmeans: cuml vs sklearn labels_ are equal\n" ] } ], "source": [ "threshold = 1e-4\n", "\n", "passed = (cuml_score - sk_score) < threshold\n", "print('compare kmeans: cuml vs sklearn labels_ are ' + ('equal' if passed else 'NOT equal'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# K-Means Multi-Node Multi-GPU (MNMG) Solution\n", "\n", "K-Means multi-Node multi-GPU implementation leverages Dask to spread data and computations across multiple workers. 
cuML uses a One Process Per GPU (OPG) layout, which maps a single Dask worker to each GPU.\n", "\n", "The main difference between cuML's MNMG implementation of k-means and the single-GPU version is that the fit can be performed in parallel for each iteration, sharing only the centroids between iterations. The MNMG version also provides the same scalable k-means++ initialization algorithm as the single-GPU version.\n", "\n", "Unlike the single-GPU implementation, the MNMG k-means API requires a Dask DataFrame or Array as input. `predict()` and `transform()` return the same type as the input. The Dask cuDF DataFrame API is very similar to the Dask DataFrame API, but the underlying DataFrames are cuDF rather than Pandas. Dask CuPy arrays are also available.\n", "\n", "For information about cuDF, refer to the [cuDF documentation](https://docs.rapids.ai/api/cudf/stable).\n", "\n", "For additional information on cuML's k-means implementation: \n", "https://docs.rapids.ai/api/cuml/stable/api.html#cuml.dask.cluster.KMeans." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports\n", "\n", "Let's begin by importing the libraries necessary for this implementation." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from cuml.dask.cluster.kmeans import KMeans as cuKMeans\n", "from cuml.dask.common import to_dask_df\n", "from cuml.dask.datasets import make_blobs\n", "from cuml.metrics import adjusted_rand_score\n", "from dask.distributed import Client, wait\n", "from dask_cuda import LocalCUDACluster\n", "from dask_ml.cluster import KMeans as skKMeans\n", "import cupy as cp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Start Dask Cluster\n", "\n", "We can use the `LocalCUDACluster` to start a Dask cluster on a single machine with one worker mapped to each GPU. This is called one-process-per-GPU (OPG). 
" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.\n", "Perhaps you already have a cluster running?\n", "Hosting the HTTP server on port 33117 instead\n", " http_address[\"port\"], self.http_server.port\n" ] } ], "source": [ "cluster = LocalCUDACluster(threads_per_worker=1)\n", "client = Client(cluster)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Parameters\n", "\n", "Here we will define the data and model parameters which will be used while generating data and building our model. You can change these parameters and observe the change in the results." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "n_samples = 100000\n", "n_features = 2\n", "\n", "n_total_partitions = len(list(client.has_what().keys()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generate Data\n", "\n", "Generate isotropic Gaussian blobs for clustering.\n", "\n", "### Device\n", "\n", "We can generate a Dask cuPY Array of synthetic data for multiple clusters using `cuml.dask.datasets.make_blobs`." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "X_dca, Y_dca = make_blobs(n_samples, \n", " n_features,\n", " centers = 5, \n", " n_parts = n_total_partitions,\n", " cluster_std=0.1, \n", " verbose=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Host\n", "\n", "We collect the Dask cuPy Array on a single node as a cuPy array. Then we transfer the cuPy array from device to host memory into a Numpy array." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "X_cp = X_dca.compute()\n", "X_np = cp.asnumpy(X_cp)\n", "del X_cp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Scikit-learn model\n", "\n", "The arguments to the model object include:\n", "\n", "- n_clusters: int, default=8\n", "The number of clusters to form as well as the number of centroids to generate.\n", "\n", "- init{‘k-means++’, ‘random’}, callable or array-like of shape (n_clusters, n_features), default=’k-means++’\n", "Method for initialization:\n", "\n", "- ‘k-means++’ : selects initial cluster centers for k-mean clustering in a smart way to speed up convergence. \n", "- max_iterint, default=300\n", "Maximum number of iterations of the k-means algorithm for a single run.\n", "\n", "- random_state: int, RandomState instance or None, default=None\n", "Determines random number generation for centroid initialization. Use an int to make the randomness deterministic. .\n", "\n", "- n_jobs: int, default=None\n", "The number of OpenMP threads to use for the computation. Parallelism is sample-wise on the main cython loop which assigns each sample to its closest center. None or -1 means using all processors.\n", "\n", "### Fit and predict\n", "\n", "Since a scikit-learn equivalent to the multi-node multi-GPU K-means in cuML doesn't exist, we will use Dask-ML's implementation for comparison." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 9.07 s, sys: 442 ms, total: 9.51 s\n", "Wall time: 18 s\n" ] }, { "data": { "text/plain": [ "KMeans(n_clusters=5, n_jobs=-1, random_state=100)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "kmeans_sk = skKMeans(init=\"k-means||\",\n", " n_clusters=5,\n", " n_jobs=-1,\n", " random_state=100)\n", "\n", "kmeans_sk.fit(X_np)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 340 ms, sys: 22.6 ms, total: 362 ms\n", "Wall time: 544 ms\n" ] } ], "source": [ "%%time\n", "labels_sk = kmeans_sk.predict(X_np).compute()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## cuML Model\n", "\n", "### Fit and predict" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 25.2 ms, sys: 4.1 ms, total: 29.3 ms\n", "Wall time: 130 ms\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "kmeans_cuml = cuKMeans(init=\"k-means||\",\n", " n_clusters=5,\n", " random_state=100)\n", "\n", "kmeans_cuml.fit(X_dca)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 28.3 ms, sys: 2.86 ms, total: 31.2 ms\n", "Wall time: 255 ms\n" ] } ], "source": [ "%%time\n", "labels_cuml = kmeans_cuml.predict(X_dca).compute()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compare Results" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "score = adjusted_rand_score(labels_sk, labels_cuml)" ] }, { "cell_type": "code", "execution_count": 11, 
"metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "compare kmeans: cuml vs sklearn labels_ are equal\n" ] } ], "source": [ "passed = score == 1.0\n", "print('compare kmeans: cuml vs sklearn labels_ are ' + ('equal' if passed else 'NOT equal'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion\n", "\n", "Using Dask, we were able to reduce the computation time from 18 seconds to 130 milliseconds, which is around 140th of the time. Thus we have learnt how to effectively use Dask to optimize and accelerate our data science pipeline. If you want to explore Dask in detail, refer to the documentation [here](https://docs.dask.org/en/latest/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Licensing\n", " \n", "This material is released by OpenACC-Standard.org, in collaboration with NVIDIA Corporation, under the Creative Commons Attribution 4.0 International (CC BY 4.0)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Previous Notebook](04-Challenge.ipynb)\n", "     \n", "     \n", "     \n", "     \n", "[1](01-Intro_to_Dask.ipynb)\n", "[2](02-CuDF_and_Dask.ipynb)\n", "[3](03-CuML_and_Dask.ipynb)\n", "[4](04-Challenge.ipynb)\n", "[5]\n", "     \n", "     \n", "     \n", "     \n", "\n", "\n", "     \n", "     \n", "     \n", "     \n", "     \n", "   \n", "[Home Page](../START_HERE.ipynb)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 4 }