Deep Learning for Computer Vision

Companion repository for 4F12: Computer Vision (Engineering Tripos Part IIB, University of Cambridge).

Lectures

#	Topic	Video
1	Multi-layer Perceptrons	YouTube
2	Convolutional Neural Nets	YouTube

Notebooks

Interactive lecture notes are provided as Jupyter Notebooks:

lecture_notes_0.ipynb — Perceptrons, Classification, Backpropagation, Gradient Descent, Optimisation
lecture_notes_1.ipynb — Convolution, Pooling, ResNet, Segmentation, VAE, Object Detection, Face Recognition, Adversarial Attacks
lecture_notes_2.ipynb — RNNs, LSTM, Transformers, Vision Transformers, Generative Pretraining, CLIP, DINOv2

Topics Covered

Topic	Module(s)
Perceptrons & MLPs	`perceptron.py`, `multiperceptron.py`, `mlp.py`
Activations	`activations.py`
Gradient Descent & Optimisation	`gradient_descent.py`
Convolution & Pooling	`filters.py`, `cnn.py`, `pooling.py`
ResNet	`resnet.py`
Salience Maps	`salience.py`
Datasets (MNIST, CIFAR-10, LFW, EMNIST, COCO)	`datasets.py`
Variational Autoencoders	`vae.py`
FaceNet & Triplet Loss	`facenet.py`
Object Detection (YOLO)	`yolo.py`
Adversarial Examples (FGSM)	`adversarial.py`
Recurrent Neural Networks	`rnn.py`
Transformers (ViT, GPT, Encoder-Decoder)	`transformers_model.py`
Patch Clustering (K-means Tokenisation)	`patch_clustering.py`
CLIP (Zero-shot Classification)	`clip.py`
DINO / DINOv2 (Self-supervised Vision)	`dino.py`

Getting Started

It is recommended to use a virtual environment:

python -m venv .env
source .env/bin/activate   # Linux/macOS
# .env\Scripts\activate    # Windows
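
To confirm that the environment is active before installing anything, one option is to compare the interpreter's prefixes from within Python. This is a minimal sketch; the helper name `in_virtualenv` is illustrative and is not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment.

    Inside an environment created with `python -m venv`, sys.prefix points at
    the environment, while sys.base_prefix points at the base installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints `False`, the activation step above probably did not take effect in the current shell.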

Install dependencies:

python -m pip install pip --upgrade
pip install -r requirements.txt

Then follow the PyTorch installation instructions for your platform.

Pre-computed Results

The notebooks rely on pre-computed training results (saved weights, embeddings, etc.). To download them:

python download_results.py

This downloads ~69 MB of results files into results/. Use --force to re-download existing files.
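
As a rough sanity check after the download, the contents of `results/` can be counted and summed and then compared against the ~69 MB figure. The sketch below assumes only the `results/` directory name mentioned above; the helper `summarise_results` is hypothetical and does not reflect how `download_results.py` itself works.

```python
from pathlib import Path


def summarise_results(results_dir: str = "results") -> tuple[int, float]:
    """Return (number of files, total size in MB) found under results_dir.

    MB here means 10**6 bytes; a missing directory is reported as (0, 0.0).
    """
    root = Path(results_dir)
    if not root.is_dir():
        return 0, 0.0
    files = [p for p in root.rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes / 1e6


if __name__ == "__main__":
    count, size_mb = summarise_results()
    print(f"{count} files, {size_mb:.1f} MB in results/")
```

A total far below the expected size suggests an interrupted download, in which case re-running with `--force` may help.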

If you prefer to generate the results yourself, run the corresponding Python modules directly (e.g. python facenet.py), though this requires a GPU and considerably more time.

Warning: regeneration dependencies. Some data and results files depend on each other. Regenerating an upstream file invalidates every result derived from it, so those downstream results must be regenerated as well.

If you regenerate…	…you must also regenerate
`data/emnist_patch128.npz`	`transformer_emnist_gpt_128.results`, `transformer_emnist_128_finetune.results`, `rnn_emnist_patch_128.results`
`data/lfw_minitrain.npz`	`facenet.results`
`transformer_language_128.results`	`transformer_language_128_finetune.results`
`transformer_emnist_gpt_128.results`	`transformer_emnist_128_finetune.results`
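
Because these dependencies chain, it can help to compute the full set of files affected by a single regeneration. The sketch below encodes the table above as a dictionary and follows it transitively. The names `INVALIDATES` and `downstream` are illustrative and are not provided by the repository.

```python
# Direct dependencies copied from the table above: regenerating a key
# invalidates every file listed in its value.
INVALIDATES = {
    "data/emnist_patch128.npz": [
        "transformer_emnist_gpt_128.results",
        "transformer_emnist_128_finetune.results",
        "rnn_emnist_patch_128.results",
    ],
    "data/lfw_minitrain.npz": ["facenet.results"],
    "transformer_language_128.results": ["transformer_language_128_finetune.results"],
    "transformer_emnist_gpt_128.results": ["transformer_emnist_128_finetune.results"],
}


def downstream(changed: str) -> list[str]:
    """Return every file invalidated, directly or indirectly, by regenerating `changed`."""
    seen: list[str] = []
    stack = [changed]
    while stack:
        current = stack.pop()
        for dependent in INVALIDATES.get(current, []):
            if dependent not in seen:  # avoid listing a shared dependent twice
                seen.append(dependent)
                stack.append(dependent)
    return seen


if __name__ == "__main__":
    print(downstream("data/lfw_minitrain.npz"))  # ['facenet.results']
```

For example, `downstream("data/emnist_patch128.npz")` lists all three EMNIST results files, each exactly once even though the fine-tuned transformer is reachable by two routes.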

The scripts can be run standalone, but are primarily designed to be called from the Jupyter notebooks. Learn more about Jupyter.

Requirements

See requirements.txt. Key dependencies: PyTorch ≥ 2.0, torchvision, matplotlib, numpy, ultralytics (YOLO), open-clip-torch (CLIP).
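
To see which of these key dependencies are present in the current environment, the installed versions can be queried with the standard library. This is a sketch under assumptions: the distribution names below are the usual PyPI names (`torch` for PyTorch), the helper names are hypothetical, and the PyTorch check compares the major version only.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names for the key dependencies listed above (assumed PyPI names).
KEY_PACKAGES = [
    "torch",
    "torchvision",
    "matplotlib",
    "numpy",
    "ultralytics",
    "open-clip-torch",
]


def installed_versions(packages=KEY_PACKAGES) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def torch_meets_minimum(versions: dict, minimum_major: int = 2) -> bool:
    """Check the 'PyTorch >= 2.0' requirement using the major version number."""
    torch_version = versions.get("torch")
    if torch_version is None:
        return False
    return int(torch_version.split(".")[0]) >= minimum_major


if __name__ == "__main__":
    versions = installed_versions()
    for name, ver in versions.items():
        print(f"{name}: {ver or 'not installed'}")
    print("PyTorch >= 2.0:", torch_meets_minimum(versions))
```

`requirements.txt` remains the authoritative list; this only reports what is installed.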

License

See LICENSE.
