{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "     \n", "     \n", "     \n", "     \n", "     \n", "   \n", "[Home Page](../Start_Here.ipynb)\n", "\n", "\n", "[Previous Notebook](Manipulation_of_Image_Data_and_Category_Determination_using_Text_Data.ipynb)\n", "     \n", "     \n", "     \n", "     \n", "[1](The_Problem_Statement.ipynb)\n", "[2](Approach_to_the_Problem_&_Inspecting_and_Cleaning_the_Required_Data.ipynb)\n", "[3](Manipulation_of_Image_Data_and_Category_Determination_using_Text_Data.ipynb)\n", "[4](Countering_Data_Imbalance.ipynb)\n", "[5]\n", "     \n", "     \n", "     \n", "     \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Competition :\n", "\n", "In this exercise participant need to tune and work on improving overall acuracy of our model. \n", "\n", "To help you get started by pointing out some obvious ways in which you can make the model more efficient. \n", "\n", "- Epochs \n", "- Batch Size \n", "- Optimizers : We have used SGD as a optimizer. Participant can try applying other optimizer and test to obtain quick convergence.\n", "- Data Augmentation : Remember, we mentioned we have an imbalanced dataset. You could try differnet augmentation techniques for the minority classes.\n", "- Model : If you have exploited all the bbove methods to improve your model, you can change the model by adding more Layers to it and see if that improves that accuracy.\n", "\n", "Note, before you start tweaking and training your model ,it would be worthwhile to refer to these to see how they affect your model : \n", "\n", "[Epochs impact on Overfitting](https://datascience.stackexchange.com/questions/27561/can-the-number-of-epochs-influence-overfitting ) \n", "\n", "\n", "[Effect of Batch Size on Training Dynamics](https://medium.com/mini-distill/effect-of-batch-size-on-training-dynamics-21c14f7a716e)\n", "\n", "[Introduction to Optimizers](https://algorithmia.com/blog/introduction-to-optimizers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Training the Model with Data Augmentation : \n", "\n", "\n", "We created a new function called `augmentation(name,category,filenames,labels,i)` and here you can add more samples to Category which have imbalanced data. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import sys\n", "sys.path.append('/workspace/python/source_code')\n", "\n", "from utils import * \n", "import os\n", "os.environ[\"CUDA_VISIBLE_DEVICES\"]=\"0\"\n", "\n", "def augmentation(name,category,filenames,labels,i):\n", " # Important Constants\n", " file_path = \"Dataset/Aug/\"\n", " images = []\n", " (h, w) = (232,232)\n", " center = (w / 2, h / 2)\n", " angle90 = 90\n", " angle180 = 180\n", " angle270 = 270\n", " scale = 1.0\n", " img = load_image(name , interpolation = cv2.INTER_LINEAR)\n", " \n", " ## ~~ Add Augmentations here ~~\n", " if category == 0 :\n", " images.append(cv2.flip(img,0))\n", " elif category == 1 :\n", " pass\n", " elif category == 2 :\n", " pass\n", " elif category == 3 :\n", " pass\n", " elif category == 4 :\n", " pass\n", " elif category == 5 :\n", " pass\n", " elif category == 6 :\n", " pass\n", " elif category == 7 :\n", " images.append(cv2.flip(img,0))\n", " \n", " ## ~~ Augmentation ends here ~~\n", " for j in range(len(images)):\n", " cv2.imwrite(file_path+str(i+j)+'.jpeg',images[j])\n", " filenames.append(file_path+str(i+j)+'.jpeg')\n", " labels.append(category)\n", " i = i + len(images)\n", " return i" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### We pass this function to our `load_dataset()` function to generate these augmentations. \n", "\n", "Kindly wait for a couple of minutes while it takes to augment the images." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "filenames,labels = load_dataset(augment_fn = augmentation)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set the Size of the Validation set\n", "val_filenames , val_labels = make_test_set(filenames,labels,val=0.1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Make train test set \n", "test = 0.1\n", "from sklearn.model_selection import train_test_split\n", "x_train, x_test, y_train, y_test = train_test_split(filenames, labels, test_size=test, random_state=1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "y_train = tf.one_hot(y_train,depth=8)\n", "y_test = tf.one_hot(y_test,depth=8)\n", "val_labels = tf.one_hot(val_labels,depth=8)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make Dataset compatible with Tensorflow Data Pipelining.\n", "\n", "# ~~ Change the batch Size here ~~\n", "batch_size = 64\n", "# ~~ Change the batch Size here ~~\n", "\n", "train,test,val = make_dataset((x_train,y_train,batch_size),(x_test,y_test,32),(val_filenames,val_labels,32))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Model Architecture :\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "tf.random.set_seed(1337)\n", "\n", "import tensorflow.keras\n", "from tensorflow.keras.models import Sequential\n", "from tensorflow.keras.layers import Dense, Conv2D, Flatten ,Dropout, MaxPooling2D\n", "from tensorflow.keras import backend as K \n", "\n", "#Reset Graphs and Create Sequential model\n", "\n", "K.clear_session()\n", "model = Sequential()\n", "\n", "## ~~ Change Model or Parameters Here\n", "#Convolution Layers\n", "\n", "model.add(Conv2D(64, kernel_size=10,strides=3, activation='relu', 
input_shape=(232,232,3)))\n", "model.add(MaxPooling2D(pool_size=(3, 3),strides=2))\n", "model.add(Conv2D(256, kernel_size=5,strides=1,activation='relu'))\n", "model.add(MaxPooling2D(pool_size=(3, 3),strides=2))\n", "model.add(Conv2D(288, kernel_size=3,strides=1,padding='same',activation='relu'))\n", "model.add(MaxPooling2D(pool_size=(2, 2),strides=1))\n", "model.add(Conv2D(272, kernel_size=3,strides=1,padding='same',activation='relu'))\n", "model.add(Conv2D(256, kernel_size=3,strides=1,activation='relu'))\n", "model.add(MaxPooling2D(pool_size=(3, 3),strides=2))\n", "model.add(Dropout(0.5))\n", "model.add(Flatten())\n", "\n", "# Linear (fully connected) layers\n", "\n", "model.add(Dense(3584,activation='relu'))\n", "model.add(Dense(2048,activation='relu'))\n", "model.add(Dense(8, activation='softmax'))\n", "\n", "\n", "## ~~ Change Model or Parameters Here\n", "\n", "# Print the model summary\n", "\n", "model.summary()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import functools\n", "\n", "# Define the number of epochs\n", "\n", "## ~~ Change Number of Epochs Here ~~\n", "epochs = 24\n", "## ~~ Change Number of Epochs Here ~~\n", "\n", "\n", "# Include a top-2 accuracy metric\n", "top2_acc = functools.partial(tensorflow.keras.metrics.top_k_categorical_accuracy, k=2)\n", "top2_acc.__name__ = 'top2_acc'\n", "\n", "\n", "## ~~ Change Optimizer or Parameters Here\n", "# Optimizer\n", "sgd = tensorflow.keras.optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.9)\n", "## ~~ Change Optimizer or Parameters Here\n", "\n", "# Compile the model with a loss function, optimizer and metrics\n", "model.compile(loss=tensorflow.keras.losses.categorical_crossentropy,\n", "              optimizer=sgd,\n", "              metrics=['accuracy',top2_acc])\n", "\n", "# Train the model\n", "trained_model = model.fit(train,\n", "                          epochs=epochs,\n", "                          verbose=1,\n", "                          validation_data=val)\n", "\n", "# Evaluate the model against the held-out test set\n", "score = model.evaluate(test, verbose=0)\n", "print('Test loss:', score[0])\n", "print('Test accuracy:', score[1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualisations" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "f = plt.figure(figsize=(15,5))\n", "ax = f.add_subplot(121)\n", "ax.plot(trained_model.history['accuracy'])\n", "ax.plot(trained_model.history['val_accuracy'])\n", "ax.set_title('Model Accuracy')\n", "ax.set_ylabel('Accuracy')\n", "ax.set_xlabel('Epoch')\n", "ax.legend(['Train', 'Val'])\n", "\n", "ax2 = f.add_subplot(122)\n", "ax2.plot(trained_model.history['loss'])\n", "ax2.plot(trained_model.history['val_loss'])\n", "ax2.set_title('Model Loss')\n", "ax2.set_ylabel('Loss')\n", "ax2.set_xlabel('Epoch')\n", "ax2.legend(['Train', 'Val'],loc= 'upper left')\n", "\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import seaborn as sn\n", "from sklearn.metrics import confusion_matrix\n", "import pandas as pd\n", "\n", "# Plot a heatmap of the confusion matrix on the validation set\n", "pred = model.predict(val)\n", "p = np.argmax(pred, axis=1)\n", "y_valid = np.argmax(val_labels, axis=1)\n", "results = confusion_matrix(y_valid, p)\n", "classes=['NC','TD','TC','H1','H2','H3','H4','H5']\n", "df_cm = pd.DataFrame(results, index = [i for i in classes], columns = [i for i in classes])\n", "plt.figure(figsize = (15,15))\n", "\n", "sn.heatmap(df_cm, annot=True, cmap=\"Blues\")" ] }, { "cell_type": "markdown", "metadata": 
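{}, "source": [ "Beyond the heatmap, a per-class breakdown shows which storm categories the augmentation actually helped. The cell below is a small sketch using scikit-learn's `classification_report`, reusing `y_valid`, `p` and `classes` from the confusion-matrix cell above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.metrics import classification_report\n", "\n", "# Per-class precision/recall/F1 on the validation set; minority classes with\n", "# low recall are good candidates for extra augmentation.\n", "print(classification_report(y_valid, p, labels=list(range(8)), target_names=classes))" ] }, { "cell_type": "markdown", "metadata": 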
{}, "source": [ "Let us now save our Model and the trained Weights for Future usage :" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Save Our Model \n", "model.save('cyc_pred_comp.h5')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Other Bootcamps\n", "The contents of this Bootcamp originates from [OpenACC GPU Bootcamp Github](https://github.com/gpuhackathons-org/gpubootcamp). Here are some additional Bootcamp which might be of interest: \n", "\n", "- [Physics Informed Neural Network](https://github.com/gpuhackathons-org/gpubootcamp/tree/master/hpc_ai/PINN)\n", "\n", "## License\n", "This material is released by OpenACC-Standard.org, in collaboration with NVIDIA Corporation, under the Creative Commons Attribution 4.0 International (CC BY 4.0).\n", "\n", "[Previous Notebook](Manipulation_of_Image_Data_and_Category_Determination_using_Text_Data.ipynb)\n", "     \n", "     \n", "     \n", "     \n", "[1](The_Problem_Statement.ipynb)\n", "[2](Approach_to_the_Problem_&_Inspecting_and_Cleaning_the_Required_Data.ipynb)\n", "[3](Manipulation_of_Image_Data_and_Category_Determination_using_Text_Data.ipynb)\n", "[4](Countering_Data_Imbalance.ipynb)\n", "[5]\n", "     \n", "     \n", "     \n", "     \n", "\n", "     \n", "     \n", "     \n", "     \n", "     \n", "   \n", "[Home Page](../Start_Here.ipynb)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }