Automatic colorization using deep neural networks. Colorful Image Colorization. In ECCV, 2016.

Overview

Colorful Image Colorization [Project Page]

Richard Zhang, Phillip Isola, Alexei A. Efros. In ECCV, 2016.

+ automatic colorization functionality for Real-Time User-Guided Image Colorization with Learned Deep Priors, SIGGRAPH 2017!

[Sept20 Update] Since it has been 3-4 years, I converted this repo to support minimal test-time usage in PyTorch. I also added our SIGGRAPH 2017 model (it's an interactive method but can also do automatic colorization). See the Caffe branch for the original release.


Clone the repository; install dependencies

git clone https://github.com/richzhang/colorization.git
cd colorization
pip install -r requirements.txt

Colorize! This script will colorize an image. The results should match the images in the imgs_out folder.

python demo_release.py -i imgs/ansel_adams3.jpg

Model loading in Python. The following loads pretrained colorizers. See demo_release.py for details on how to run the model. There are some pre- and post-processing steps: convert to Lab space, resize to 256x256, colorize, concatenate with the original full-resolution L channel, and convert to RGB.

import colorizers
colorizer_eccv16 = colorizers.eccv16().eval()
colorizer_siggraph17 = colorizers.siggraph17().eval()
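
The pre- and post-processing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in demo_release.py: `resize_nearest`, `fake_colorizer`, and `colorize_lab` are hypothetical names, the model is replaced by a stand-in that predicts zero chrominance, and the RGB/Lab conversions are assumed to happen outside the function (for example with an image library such as scikit-image).

```python
import numpy as np

def resize_nearest(arr, out_hw):
    """Nearest-neighbour resize of an (H, W, C) array; a stand-in for a real resampler."""
    h, w = arr.shape[:2]
    rows = np.arange(out_hw[0]) * h // out_hw[0]
    cols = np.arange(out_hw[1]) * w // out_hw[1]
    return arr[rows][:, cols]

def fake_colorizer(l_small):
    """Hypothetical stand-in for a pretrained model: maps L (h, w, 1) to ab (h, w, 2)."""
    return np.zeros(l_small.shape[:2] + (2,), dtype=np.float32)

def colorize_lab(l_full, model=fake_colorizer, hw=(256, 256)):
    """Resize L to the network size, predict ab, upsample ab, and rejoin with full-res L."""
    l_small = resize_nearest(l_full, hw)                  # network input resolution
    ab_small = model(l_small)                             # predicted chrominance
    ab_full = resize_nearest(ab_small, l_full.shape[:2])  # back to the original resolution
    return np.concatenate([l_full, ab_full], axis=2)      # (H, W, 3) Lab image

# Assumed input: the L channel of an image already converted to Lab space.
l_channel = np.random.rand(300, 400, 1).astype(np.float32) * 100.0
lab = colorize_lab(l_channel)
```

The point of the design is that only the chrominance is predicted at low resolution; the lightness channel is kept at full resolution, so fine detail in the input survives the round trip.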

Original implementation (Caffe branch)

The original implementation contained training and testing code for our network and AlexNet (for representation learning tests), as well as the representation learning tests themselves. It is in Caffe and is no longer supported. Please see the caffe branch for it.

Citation

If you find these models useful for your research, please cite them with these BibTeX entries.

@inproceedings{zhang2016colorful,
  title={Colorful Image Colorization},
  author={Zhang, Richard and Isola, Phillip and Efros, Alexei A},
  booktitle={ECCV},
  year={2016}
}

@article{zhang2017real,
  title={Real-Time User-Guided Image Colorization with Learned Deep Priors},
  author={Zhang, Richard and Zhu, Jun-Yan and Isola, Phillip and Geng, Xinyang and Lin, Angela S and Yu, Tianhe and Efros, Alexei A},
  journal={ACM Transactions on Graphics (TOG)},
  volume={36},
  number={4},
  year={2017},
  publisher={ACM}
}

Misc

Contact Richard Zhang at rich.zhang at eecs.berkeley.edu for any questions or comments.

Comments
  • the problem of caffe_traininglayers.py


    Hi, we are training the model on our own color image dataset, but we ran into a problem with your caffe_traininglayers.py. When running ./train/train_model.sh, I got the following error: TypeError: 'float' object cannot be interpreted as an index. The log is as follows:

    I0612 09:26:00.048768 1935 caffe.cpp:221] Starting Optimization
    I0612 09:26:00.048841 1935 solver.cpp:279] Solving LtoAB
    I0612 09:26:00.048853 1935 solver.cpp:280] Learning Rate Policy: step
    Traceback (most recent call last):
      File "/home/joy/colorization-master/resources/caffe_traininglayers.py", line 72, in forward
        top[0].data[...] = self.nnenc.encode_points_mtx_nd(bottom[0].data[...],axis=1)
      File "/home/joy/colorization-master/resources/caffe_traininglayers.py", line 329, in encode_points_mtx_nd
        (dists,inds) = self.nbrs.kneighbors(pts_flt)
      File "/home/dpa/anaconda2/lib/python2.7/site-packages/sklearn/neighbors/base.py", line 399, in kneighbors
        for s in gen_even_slices(X.shape[0], n_jobs)
      File "/home/dpa/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 800, in __call__
        while self.dispatch_one_batch(iterator):
      File "/home/dpa/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 658, in dispatch_one_batch
        self._dispatch(tasks)
      File "/home/dpa/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 566, in _dispatch
        job = ImmediateComputeBatch(batch)
      File "/home/dpa/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 180, in __init__
        self.results = batch()
      File "/home/dpa/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 72, in __call__
        return [func(*args, **kwargs) for func, args, kwargs in self.items]
      File "sklearn/neighbors/binary_tree.pxi", line 1310, in sklearn.neighbors.ball_tree.BinaryTree.query (sklearn/neighbors/ball_tree.c:10592)
      File "sklearn/neighbors/binary_tree.pxi", line 588, in sklearn.neighbors.ball_tree.NeighborsHeap.__init__ (sklearn/neighbors/ball_tree.c:4931)
    TypeError: 'float' object cannot be interpreted as an index

    Can anyone help me with this issue? Thanks very much!

    opened by crazyzsy 7
  • Colorization on 1929 movie


    Hi

    Thanks for this script ! I tried your colorization (v2 with class rebalancing) on some frames from Man With a Movie Camera (Vertov, 1929) here are the results : https://drive.google.com/open?id=0B5-1OeNPsecwdTZ4dS0wUUhIaUU


    • The results are all oranges :( what do you think I can do to have a better quality ?
    • Should I train my own network with frames from documentaries ?
    • Do you think I should start training from scratch or start from your network v2 ?
    • Could we introduce some kind of temporal consistency across a bunch of frames for movie colorization ?

    Thanks in advance ! Antoine.

    opened by ttoinou 7
  • Question regarding capability


    Hey guys! I am really impressed by this library.

    So I have a video fingerprinting technology (no, not histogram, perceptual identification) which is pretty freaking accurate to a high resolution of precision (10 seconds from the source video is enough to match). The only "weak point" I have is when color saturation is changed from the source video.

    Something I've always really wanted was a way to "normalize" the color. In fact, since my system works way better with color than black-and-white video, colorizing a black and white video would be awesome! This is something I absolutely must play with.

    The question I have that you may be able to answer is whether this would produce reasonably consistent results:

    1. Take source image, copy it, make it black and white
    2. Take source image, copy it, saturate the color a bit, make it black and white
    3. Colorize the result of #1
    4. Colorize the result of #2

    When you compare those two, how similar are they? Assume reasonable saturation change so it does not harm the user perception of the original image too much.

    If you know this off-hand, that'd be amazing! (I can't test right now)

    --Collin

    opened by rafajafar 5
  • Pretrained weights are inaccessible


    opened by IRDonch 4
  • Segmentation fault when running ./train/train_model.sh


    When running ./train/train_model.sh, I got the following error:

    I0418 09:54:54.054507 20872 layer_factory.hpp:77] Creating layer data
    I0418 09:54:54.054597 20872 db_lmdb.cpp:35] Opened lmdb ./caffe-colorization/examples/imagenet/ilsvrc12_train_lmdb
    I0418 09:54:54.054622 20872 net.cpp:84] Creating Layer data
    I0418 09:54:54.054630 20872 net.cpp:380] data -> data
    I0418 09:54:54.055681 20872 data_layer.cpp:45] output data size: 40,3,176,176
    I0418 09:54:54.076776 20872 net.cpp:122] Setting up data
    I0418 09:54:54.076802 20872 net.cpp:129] Top shape: 40 3 176 176 (3717120)
    I0418 09:54:54.076805 20872 net.cpp:137] Memory required for data: 14868480
    I0418 09:54:54.076812 20872 layer_factory.hpp:77] Creating layer img_lab
    *** Aborted at 1492480494 (unix time) try "date -d @1492480494" if you are using GNU date ***
    PC: @ 0x7f3516e6e873 std::_Hashtable<>::clear()
    *** SIGSEGV (@0x9) received by PID 20872 (TID 0x7f352eb42740) from PID 9; stack trace: ***
        @ 0x7f352bd584b0 (unknown)
        @ 0x7f3516e6e873 std::_Hashtable<>::clear()
        @ 0x7f3516e60346 google::protobuf::DescriptorPool::FindFileByName()
        @ 0x7f3516e3eac8 google::protobuf::python::cdescriptor_pool::AddSerializedFile()
        @ 0x7f352c3c17d0 PyEval_EvalFrameEx
        @ 0x7f352c4ea01c PyEval_EvalCodeEx
        @ 0x7f352c4403dd (unknown)
        @ 0x7f352c4131e3 PyObject_Call
        @ 0x7f352c433ae5 (unknown)
        @ 0x7f352c3ca123 (unknown)
        @ 0x7f352c4131e3 PyObject_Call
        @ 0x7f352c3be13c PyEval_EvalFrameEx
        @ 0x7f352c4ea01c PyEval_EvalCodeEx
        @ 0x7f352c3b8b89 PyEval_EvalCode
        @ 0x7f352c44d1b4 PyImport_ExecCodeModuleEx
        @ 0x7f352c44db8f (unknown)
        @ 0x7f352c44f300 (unknown)
        @ 0x7f352c44f5c8 (unknown)
        @ 0x7f352c4506db PyImport_ImportModuleLevel
        @ 0x7f352c3c7698 (unknown)
        @ 0x7f352c4131e3 PyObject_Call
        @ 0x7f352c4e9447 PyEval_CallObjectWithKeywords
        @ 0x7f352c3bc5c6 PyEval_EvalFrameEx
        @ 0x7f352c4ea01c PyEval_EvalCodeEx
        @ 0x7f352c3b8b89 PyEval_EvalCode
        @ 0x7f352c44d1b4 PyImport_ExecCodeModuleEx
        @ 0x7f352c44db8f (unknown)
        @ 0x7f352c44f300 (unknown)
        @ 0x7f352c44f5c8 (unknown)
        @ 0x7f352c4506db PyImport_ImportModuleLevel
        @ 0x7f352c3c7698 (unknown)
        @ 0x7f352c4131e3 PyObject_Call
    ./train/train_model.sh: line 2: 20872 Segmentation fault ./caffe-colorization/build/tools/caffe train -solver ./train/solver.prototxt -weights ./models/init_v2.caffemodel -gpu $1

    My GPU is a GTX1070-8G, memory is 32G, and the system is Ubuntu 16.04. The ImageNet lmdb file was created with ./caffe-colorization/examples/imagenet/create_imagenet.sh without resizing, resized to 256x256, and resized to 176x176, but all three cases led to the same error. I think it may relate to protobuf, so I tried protobuf-v3.2.0, protobuf-v3.2.0-rc.1, protobuf-v3.2.0rc2 and protobuf-v3.2.1, but again all cases led to the same error. With the latest caffe-1.0, the error is the same. Can anyone help me with this "Segmentation fault" issue? Thanks very much!

    opened by musicrainie 4
  • Help me


    [libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 213:13: Message type "caffe.ConvolutionParameter" has no field named "dilation".
    WARNING: Logging before InitGoogleLogging() is written to STDERR
    F0402 13:04:55.491230 21321 upgrade_proto.cpp:928] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: colorization_deploy_v0.prototxt
    *** Check failure stack trace: ***

    Ubuntu 14.04, GTX 960 2G. I installed NVIDIA's caffe following the steps at https://github.com/NVIDIA/DIGITS/blob/master/docs/UbuntuInstall.md

    Thank you richzhang

    opened by tzatter 4
  • Need a clarification why exactly the number 313


    Hi,

    I know that this is not the place for asking questions about the theory behind your research, but I just wanted to clarify the derivation of the number 313. I am trying to do a similar experiment on my own (https://github.com/preslavrachev/nn-photo-colorization), but I am approaching it from the ground up, basically setting up my own model and everything. I ended up reading your paper, and I got really interested in integrating some of your findings into my experiment. I understand the logic behind the ab space classification, but I can't figure out exactly why 313. Is this a heuristic you came up with, or is it something trivial that I'm just missing?

    I have to say, I am very interested in ML, but pretty new to the field, so please excuse my ignorance in case I have missed something extremely trivial.

    opened by preslavrachev 3
  • possible SciKit Learn version issue


    When training the network from scratch using ./train/train_model.sh 0, the following error happens at the "Solving LtoAB" step of the training:

    caffe_traininglayers.py", line 331, in encode_points_mtx_nd
        (dists,inds) = self.nbrs.kneighbors(pts_flt)
    
    ....
    
      File "sklearn/neighbors/binary_tree.pxi", line 1309, in sklearn.neighbors.ball_tree.BinaryTree.query (sklearn/neighbors/ball_tree.c:11514)
      File "sklearn/neighbors/binary_tree.pxi", line 587, in sklearn.neighbors.ball_tree.NeighborsHeap.__init__ (sklearn/neighbors/ball_tree.c:5582)
    TypeError: 'float' object cannot be interpreted as an index
    

    The data pts_flt is a float32 numpy ndarray. Could this be due to a version problem in sklearn itself (I am using 0.18.1)? Please let me know what version of scikit is used for this codebase and I'll match that and try training again.

    thanks, Aruni

    opened by AruniRC 3
  • Division by zero when class rebalancing ?


    Hey there! I've generated my own priors for training based on my own dataset. The problem is that they contain zeros, and when I apply the exponent -1, like this: self.prior_factor = self.mix ** -self.alpha, it generates Infinity values inside my vector because I'm essentially dividing by zero. Any idea how to fix it would be much appreciated. Thanks!
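
One way to keep the weights finite, sketched here under stated assumptions rather than taken from the repository, is to mix the empirical prior with a uniform distribution over all bins before applying the negative exponent, so that no entry of the mixture is zero. The function name `rebalancing_weights` and the parameter `lam` are hypothetical; `alpha` mirrors the exponent in the expression quoted above.

```python
import numpy as np

def rebalancing_weights(prior, lam=0.5, alpha=1.0):
    """Weights proportional to ((1 - lam) * p + lam / Q) ** -alpha, scaled so sum(p * w) == 1."""
    prior = np.asarray(prior, dtype=np.float64)
    prior = prior / prior.sum()                  # make sure it is a distribution
    uniform = np.full_like(prior, 1.0 / prior.size)
    mix = (1.0 - lam) * prior + lam * uniform    # strictly positive whenever lam > 0
    weights = mix ** -alpha                      # no division by zero
    return weights / np.sum(prior * weights)     # expected weight under the prior is 1

# Toy prior in which two bins never occur in the dataset.
prior = np.array([0.7, 0.3, 0.0, 0.0])
w = rebalancing_weights(prior)
```

With `lam = 0` the mixture reduces to the raw prior and the infinities return, so the smoothing term is what does the work here. Adding a small epsilon to the prior would be an alternative with a similar effect.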

    opened by nasdenkov 2
  • A strange question about the difference between train and deploy in prototxt


    Hi Richard, I have a question about the difference between ./colorization/model/colorization_deploy_v2.prototxt and ./colorization/model/colorization_train_val_v2.prototxt.

    At the end of colorization_deploy_v2.prototxt, you set up the following layer:

    # **********************
    # ***** Decoding *****
    # **********************
    layer {
      name: "class8_ab"
      type: "Convolution"
      bottom: "class8_313_rh"
      top: "class8_ab"
      convolution_param {
        num_output: 2
        kernel_size: 1
        stride: 1
        dilation: 1
      }
    }

    However, this layer does not seem to be trained in colorization_train_val_v2.prototxt, so I cannot understand how it maps 313x64x64 to 2x224x224. Could you tell me why it works?

    Thank you very much

    opened by Xinian 2
  • Are the probabilities in prior_probs.npy smoothed with a Gaussian kernel?


    Hi,

    I didn't find the code to generate the prior_probs.npy file, so I decided to ask here. Are the probabilities in the prior_probs.npy file raw empirical probabilities? Or are they smoothed probabilities?

    Many thanks!

    opened by mingo-x 2
  • How to realize image segmentation?


    Hello, I'm very sorry to bother you. I want to use your network for testing image segmentation, please tell me how to do it. Looking forward to your reply.

    opened by zouzhiwei217 0
  • colorization is not working properly, output is not good


    Has there been any change to the model that the repo downloads from the Amazon server? It is not working properly; the output is not good. I used this model earlier and it was OK. Is there something wrong with the Amazon model?


    opened by celikmustafa89 0
  • Model architecture does not have H*W*Q size tensors for (a,b) probability


    As mentioned in the paper, the loss function uses Z, Z^ which are of shape H*W*Q. However the model architecture computes the (a,b) probability distribution in a tensor of shape H/4 * W/4 * Q. How are we computing Z^ then? Is it not supposed to be predicted by the model and we use it for calculation of the loss?

    opened by tathagatv 1
  • annealed mean implementation


    Hi,

    I have a question regarding the annealed mean implementation (PyTorch code). If I understand correctly this step is implemented, for testing, as an additional convolutional layer applied to the distribution tensor (after the softmax layer). If this is correct, I do not understand how can the function f_T (equation (5) in the paper) be implemented with a convolutional layer. Or is it only for the case where the temperature T is equal to 1 (hence taking the mean and not the annealed mean)?
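
As background for this question, the small sketch below illustrates a general identity rather than the repository's code: applying f_T to a softmax output gives the same distribution as taking the softmax of the logits divided by T, and the remaining step (an expectation over bin centers) is linear, which is the kind of operation a 1x1 convolution with fixed weights can compute. The three-bin example, the `centers` values, and T = 0.38 are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed stably."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def anneal(probs, temperature):
    """f_T(z) = exp(log(z) / T) / sum_q exp(log(z_q) / T), applied to a distribution z."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

def expectation(probs, centers):
    """Weighted sum over bins; linear in probs, like a 1x1 convolution with fixed weights."""
    a = sum(p * c[0] for p, c in zip(probs, centers))
    b = sum(p * c[1] for p, c in zip(probs, centers))
    return (a, b)

logits = [2.0, 0.5, -1.0]                              # toy scores for 3 bins
centers = [(-20.0, 10.0), (0.0, 0.0), (30.0, -40.0)]   # hypothetical ab bin centers
T = 0.38                                               # illustrative temperature

via_probs = expectation(anneal(softmax(logits), T), centers)
via_logits = expectation(softmax(logits, T), centers)
```

Both routes give the same (a, b) pair up to floating-point error, so a temperature can in principle be folded into a scaling of the logits ahead of the softmax. Whether a given implementation does this is something to check in its code.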

    Thanks in advance!

    opened by laaRaa 0