Implementation of Vision Transformer in PyTorch

- `mhsa.py`: Implementation of the Multi-Head Self-Attention layer (a minimal sketch of such a layer follows the setup commands below)
- `vitconfigs.py`: Configs for the base (ViT-B), large (ViT-L) and huge (ViT-H) models as described by Dosovitskiy et al.
- `vit.py`: Implementation of the Vision Transformer
- `train.py`: Training script for ViT on the ImageNet dataset using DarkLight

Set up an environment with PyTorch and TensorRT. The easiest way is to use an NGC container like this (note that a CUDA GPU is required for training):
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.01-py3
python3 vit.py  # prints a verification message if the forward pass is successful
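The attention layer in `mhsa.py` follows the standard multi-head self-attention formulation from the Transformer literature. Below is a minimal sketch of such a layer; the class name, argument names and default sizes are illustrative assumptions, not the repository's actual API:

```python
# Illustrative sketch of a multi-head self-attention layer (not the repo's exact code).
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, dim=768, num_heads=12):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)   # joint projection for queries, keys, values
        self.proj = nn.Linear(dim, dim)      # output projection

    def forward(self, x):
        B, N, C = x.shape                                  # (batch, tokens, embed dim)
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)               # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale      # scaled dot-product scores
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)  # merge heads back together
        return self.proj(out)

# quick shape check: 2 images, 197 tokens (196 patches + CLS), ViT-B embed dim
x = torch.randn(2, 197, 768)
print(MultiHeadSelfAttention()(x).shape)  # torch.Size([2, 197, 768])
```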
In the docker container, mount an external volume containing the ImageNet dataset. The dataset should have the following format:
```
root
|
|--- train
|      |_ timg1.jpg
|      |_ timg2.jpg
|      ...
|
|--- val
       |_ vimg1.jpg
       |_ vimg2.jpg
       ...
```
The image file names contain the ImageNet class label.
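The actual data loading in `train.py` is handled by DarkLight. Purely as an illustration of the layout above, a dataset that recovers the label from the file name could look like the sketch below; the file-naming scheme (synset id before the underscore) and all class/variable names here are assumptions:

```python
# Illustrative only: a minimal dataset that derives the label from the file name.
# Assumes names like "n01440764_10026.JPEG", where the prefix before "_" is the
# ImageNet synset id -- the repo's real loading pipeline is DarkLight, not this class.
import os
from PIL import Image
from torch.utils.data import Dataset

class ImageNetFlatFolder(Dataset):
    def __init__(self, split_dir, transform=None):
        self.split_dir = split_dir
        self.files = sorted(os.listdir(split_dir))
        self.transform = transform
        # map each synset id (file name prefix) to an integer class index
        synsets = sorted({f.split("_")[0] for f in self.files})
        self.class_to_idx = {s: i for i, s in enumerate(synsets)}

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        name = self.files[idx]
        img = Image.open(os.path.join(self.split_dir, name)).convert("RGB")
        if self.transform is not None:
            img = self.transform(img)
        label = self.class_to_idx[name.split("_")[0]]
        return img, label
```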
Provide the path to the root directory in train.py.
Run training with
python3 train.py
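For reference, the three model configurations in `vitconfigs.py` follow the sizes reported by Dosovitskiy et al.; the numbers below are taken from the paper, while the dictionary layout and key names are only an illustrative assumption about how such configs might be organized:

```python
# Hyperparameters of ViT-Base/Large/Huge from Dosovitskiy et al. (Table 1).
# The dict structure is illustrative; vitconfigs.py may organize these differently.
VIT_CONFIGS = {
    "ViT-B": dict(layers=12, hidden_dim=768,  mlp_dim=3072, heads=12),  # ~86M params
    "ViT-L": dict(layers=24, hidden_dim=1024, mlp_dim=4096, heads=16),  # ~307M params
    "ViT-H": dict(layers=32, hidden_dim=1280, mlp_dim=5120, heads=16),  # ~632M params
}
```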
Visualize training progress with TensorBoard:
tensorboard --logdir=./runs
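The `./runs` directory is the default output location of PyTorch's `SummaryWriter`, which is what TensorBoard reads from. If you want to log additional metrics, the usual pattern is shown below; the tag names are examples, not necessarily what train.py logs:

```python
# Example of logging scalars for TensorBoard; tag names here are illustrative,
# not necessarily the ones train.py uses.
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()  # writes event files under ./runs/ by default

for step in range(100):
    loss = 1.0 / (step + 1)  # placeholder value standing in for the training loss
    writer.add_scalar("train/loss", loss, step)

writer.close()
```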
Want to become an expert in AI? AI Courses by OpenCV is a great place to start.