Image Colorizer using Deep Learning

This repository contains the core technical specifications, implementation frameworks, and performance evaluations for the Image Colorizer application. The system leverages Deep Learning techniques and the OpenCV library to automatically transform gray-scale (black and white) photos and videos into high-quality colorized versions by predicting semantic colors and tones based on visual features.

👥 Contributors

Anik Sur – Department of Computer Science & Engineering, Institute of Engineering & Management, Kolkata, India (2019-23)
Abhirup Dutta – Department of Computer Science & Engineering, Institute of Engineering & Management, Kolkata, India (2019-23)

📝 Abstract

Image colorization is the process of taking an input gray-scale image and producing an output colorized image that represents the semantic colors and tones of the input. In this paper, we use Deep Learning techniques, specifically a Convolutional Neural Network (CNN), to colorize black and white images. The OpenCV library is used to apply the model to black and white images as well as videos. The main use of our application is colorizing black and white images or videos that were taken when cameras could not capture color, as well as black and white sketches. With more training data, the application could provide more accurate results for arbitrary user inputs.

Index Terms: Colorization, Vision for Graphics, CNNs, Self-supervised learning.

🛠️ System Dependencies

The deployment code requires a working environment with the following frameworks and libraries:

Core Network Framework: Caffe

Python Scientific Computing Libraries:

* numpy (numerical array processing)
* pyplot / matplotlib (data and image visualization)
* skimage (image processing algorithms)
* scipy (advanced scientific computing)
* OpenCV (real-time computer vision)
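A quick way to confirm the environment is ready is to check that each library can be found by Python. The sketch below is illustrative only: the import names (`skimage`, `cv2`, `caffe`) are the usual ones for these packages, but they are assumed here rather than taken from the repository's code.

```python
from importlib.util import find_spec

# Display name -> import name. The import names are assumptions based on
# how these packages are normally imported, not on this repository's code.
REQUIRED_MODULES = {
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scikit-image": "skimage",
    "scipy": "scipy",
    "OpenCV": "cv2",
    "Caffe": "caffe",
}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the display names of dependencies that Python cannot locate."""
    return [name for name, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as Caffe.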

📐 System Architecture & Methodology

A. Lab Color Space Implementation

Unlike approaches that encode photos in the RGB model (an additive color model), this project uses the CIELAB ("Lab") color space.

L Channel (Lightness): Matches human perception of lightness and is the only input passed to the neural network.

A & B Channels: Represent the green-red and blue-yellow color components. The model is trained to predict these two remaining channels.

Advantage: Lab is designed to be approximately perceptually uniform, so distances in Lab track perceived color differences more closely than distances in RGB.
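The split into an L input and a/b targets can be illustrated with the standard sRGB to CIELAB conversion. The sketch below is a plain-Python version for a single pixel, assuming a D65 white point and sRGB components in the range 0 to 1. The project itself would more likely rely on OpenCV or skimage for this step, and the function names here are illustrative.

```python
# Reference white for the D65 illuminant (an assumption; it is the usual
# white point paired with sRGB).
D65_WHITE = (0.95047, 1.0, 1.08883)


def srgb_to_linear(c):
    """Undo the sRGB gamma curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Piecewise cube-root function from the CIELAB definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(red, green, blue):
    """Convert one sRGB pixel (components in [0, 1]) to (L, a, b)."""
    rl, gl, bl = (srgb_to_linear(v) for v in (red, green, blue))
    # Linear sRGB -> CIE XYZ under D65.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), D65_WHITE))
    lightness = 116 * fy - 16   # L: the channel fed to the network
    a = 500 * (fx - fy)         # a: green (negative) to red (positive)
    b = 200 * (fy - fz)         # b: blue (negative) to yellow (positive)
    return lightness, a, b
```

For any neutral gray pixel the a and b values come out at approximately zero, which is why a gray-scale photo carries information in the L channel only.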

B. The AI Training Pipeline

Model Framework: The network uses a pre-trained CNN to map gray-scale inputs directly to a distribution over quantized color values.

Layer Structural Design: Each convolution block chains 2 or 3 repeated Convolution and ReLU layers, followed by a BatchNorm layer. The base layout contains no pooling layers; all changes in resolution are achieved through spatial downsampling or upsampling between conv blocks.

Dataset Scale: The feed-forward network is trained on a dataset of over 1.3 million photos from ImageNet.

Target Mapping: Source color photos are decomposed using the CIELAB model; the L channel acts as the input feature, while the quantized A and B values serve as classification labels.
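To make the idea of classification over quantized colors concrete, the sketch below bins the ab plane into a regular grid, turns an (a, b) pair into a class label, and decodes a predicted distribution back to a color by taking its weighted mean. The grid width of 10, the range of -110 to 110, and all function names are assumptions for illustration; the trained model's actual bins and decoding rule may differ.

```python
GRID = 10            # assumed bin width in ab units
AB_MIN = -110        # assumed lower bound of the quantized ab range
BINS_PER_AXIS = 22   # (110 - (-110)) / GRID


def _axis_index(value):
    """Index of the bin containing value along one axis, clamped to the grid."""
    index = int((value - AB_MIN) // GRID)
    return min(max(index, 0), BINS_PER_AXIS - 1)


def ab_to_class(a, b):
    """Map a continuous (a, b) pair to a single integer class label."""
    return _axis_index(a) * BINS_PER_AXIS + _axis_index(b)


def class_to_ab(label):
    """Return the (a, b) centre of the bin a class label refers to."""
    ia, ib = divmod(label, BINS_PER_AXIS)
    return AB_MIN + (ia + 0.5) * GRID, AB_MIN + (ib + 0.5) * GRID


def expected_ab(distribution):
    """Decode {class_label: probability} as the weighted mean of bin centres."""
    total = sum(distribution.values())
    a = sum(p * class_to_ab(k)[0] for k, p in distribution.items()) / total
    b = sum(p * class_to_ab(k)[1] for k, p in distribution.items()) / total
    return a, b
```

During training, `ab_to_class` would supply the label for each pixel; at inference time, a decoding step such as `expected_ab` turns the predicted distribution into the a and b values that are recombined with the input L channel.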

📊 Performance Analysis

1. Pooling Layers Optimization Analysis

To evaluate possible model improvements, two pooling strategies were integrated into the network and compared on feature extraction:

Average Pooling: Extracts patches from the feature map and computes the mean of the features in each patch. When most activations in a patch are low, the mean is pulled down, which reduces contrast. It achieved a testing accuracy of 0.9842.

Max Pooling: Slides a 2D window over each feature map and keeps only the highest activation in each region, discarding less dominant features. Only the strongest activations are passed to subsequent layers. It achieved a testing accuracy of 0.9831.

| Pooling Strategy Configuration | Testing Accuracy |
| --- | --- |
| Average Pooling Layer | 0.9842 |
| Max Pooling Layer | 0.9831 |
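The difference between the two strategies is easiest to see on a small feature map. The following is a minimal plain-Python sketch using non-overlapping 2x2 windows; the helper names and the sample values are made up for illustration and are not taken from the project's network.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply non-overlapping size x size pooling to a 2D list of numbers."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Collect every value inside the current window.
            patch = [feature_map[r + i][c + j]
                     for i in range(size) for j in range(size)]
            out_row.append(reduce_fn(patch))
        pooled.append(out_row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Made-up activations; the top-right window holds one strong value (8).
fmap = [
    [1, 3, 0, 0],
    [5, 7, 0, 8],
    [2, 2, 1, 1],
    [2, 2, 1, 1],
]

avg_pooled = pool2d(fmap, 2, mean)  # [[4.0, 2.0], [2.0, 1.0]]
max_pooled = pool2d(fmap, 2, max)   # [[7, 8], [2, 1]]
```

In the top-right window the single strong activation of 8 is diluted to 2.0 by average pooling but kept intact by max pooling, which matches the contrast behaviour described above.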

2. Baseline Model Performance Benchmarking

Earlier architectures are reviewed here to show their performance limits on image recognition benchmarks:

LeNet-5 Model [3]: Yields a low testing accuracy of 66% on the CIFAR-10 dataset. Given that human-level accuracy on this dataset sits near 94%, it lacks sufficient recognition capability for color prediction.

DanNet Model [5]: Recognized as the first pure deep convolutional neural network to win computer vision contests, in 2011, laying structural foundations for AlexNet (2012), Highway Net (2015), and ResNet.

AlexNet Model [4]: Has 60 million parameters across 5 convolutional and 3 fully connected layers, logging results of 78.1% and 60.9% on early ImageNet subsets.
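The 60 million figure for AlexNet can be checked with a short calculation. The sketch below assumes the layer shapes of the original two-GPU AlexNet design, in which conv2, conv4, and conv5 are grouped so that each filter sees only half of the input channels. These shapes come from the published architecture, not from this repository.

```python
def conv_params(filters, kernel, channels_per_filter):
    """Weights plus one bias per filter for a square convolution kernel."""
    return filters * (kernel * kernel * channels_per_filter + 1)


def fc_params(inputs, outputs):
    """Weights plus one bias per output unit for a fully connected layer."""
    return inputs * outputs + outputs


ALEXNET_LAYERS = [
    ("conv1", conv_params(96, 11, 3)),
    ("conv2", conv_params(256, 5, 48)),    # grouped: 48 of 96 input channels
    ("conv3", conv_params(384, 3, 256)),
    ("conv4", conv_params(384, 3, 192)),   # grouped: 192 of 384 input channels
    ("conv5", conv_params(256, 3, 192)),   # grouped: 192 of 384 input channels
    ("fc6", fc_params(6 * 6 * 256, 4096)),
    ("fc7", fc_params(4096, 4096)),
    ("fc8", fc_params(4096, 1000)),
]

total_params = sum(count for _, count in ALEXNET_LAYERS)  # 60,965,224
fc_share = sum(count for name, count in ALEXNET_LAYERS
               if name.startswith("fc")) / total_params
```

Under these assumptions the total is about 61 million, and the three fully connected layers account for well over 90% of it.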

🏆 Key Experimental Results

The application yielded the following results:

Achieved a 32% improvement in accuracy over legacy baseline models in reproducing the ground-truth colors of an image.
Visual desaturation artifacts were significantly reduced compared to older approaches.
Demonstrated that colorization is a robust pretext task for self-supervised feature learning, acting as a cross-channel encoder.

🎓 Acknowledgment

This work was supported in part by the Grant-in-Aid Projects of the faculty members, Institute of Engineering & Management, Kolkata, West Bengal, India.

📚 Document References

[1] Z. Cheng, Q. Yang, and B. Sheng, "Deep colorization," in Proceedings of the IEEE International Conference on Computer Vision, pp. 415-423, 2015.
[2] "Black and white image colorization with OpenCV and deep learning," PyImageSearch, 2019. https://pyimagesearch.com/2019/02/25/black-and-white-image-colorization-with-opencv-and-deep-learning/
[3] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, pp. 2278-2324, 1998.
[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[5] D. C. Cireşan, U. Meier, L. M. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Computation, vol. 22, pp. 3207-3220, 2010.
[6] R. Zhang, P. Isola, and A. A. Efros, "Colorful image colorization," in Computer Vision – ECCV 2016, 2016.

Note: Delete the placeholder .txt file in /models and replace it with the .caffemodel file downloaded from https://drive.google.com/file/d/1N5CxEKOS5jec10I16oWUuxJYnLaKywNh/view?usp=sharing
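Before running the colorizer, it can help to confirm that the downloaded .caffemodel file is actually in the models folder. The sketch below is one possible check, followed by a loader built on OpenCV's `cv2.dnn.readNetFromCaffe`. The function names and the `prototxt_path` argument are hypothetical: this README does not name the network definition file, so its path has to be supplied by the caller.

```python
from pathlib import Path


def find_caffemodel(models_dir="models"):
    """Return the first .caffemodel file found in models_dir."""
    candidates = sorted(Path(models_dir).glob("*.caffemodel"))
    if not candidates:
        raise FileNotFoundError(
            f"No .caffemodel file in {models_dir!r}; download the weights "
            "and replace the placeholder .txt file."
        )
    return candidates[0]


def load_network(prototxt_path, models_dir="models"):
    """Load the Caffe network through OpenCV's dnn module.

    prototxt_path is a hypothetical argument: the name of the network
    definition file is not stated in this README.
    """
    import cv2  # imported lazily so the file check works without OpenCV

    weights = find_caffemodel(models_dir)
    return cv2.dnn.readNetFromCaffe(str(prototxt_path), str(weights))
```

Calling `find_caffemodel()` on a fresh clone should raise `FileNotFoundError` until the placeholder has been replaced with the downloaded weights.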
