This is an open-source project for the aesthetic evaluation of images, built on the Caffe deep learning framework and completed by the Victory team of Besti.

The Victory Group of Besti

Last update: Dec 15, 2022

ILGnet

In this paper we investigate the image aesthetics classification problem, that is, automatically classifying an image as having low or high aesthetic quality, which is a challenging problem beyond image recognition. Deep convolutional neural network (DCNN) methods have recently shown promising results for image aesthetics assessment. A powerful Inception module has also been proposed recently and shows very high performance in object classification. However, the Inception module has not yet been considered for the image aesthetics assessment problem. In this paper, we propose a novel DCNN structure codenamed ILGNet for image aesthetics classification, which introduces the Inception module and connects intermediate Local layers to the Global layer for the output. In addition, we use GoogLeNet, an image classification CNN pre-trained on the ImageNet dataset, and fine-tune our connected local and global layers on the large-scale aesthetics assessment AVA dataset [1]. The experimental results show that the proposed ILGNet outperforms state-of-the-art results in image aesthetics assessment on the AVA benchmark.

The AVA dataset

For a fair comparison, we adopted the same strategy as previous work to construct two sub-datasets of AVA.

[1] Naila Murray, Luca Marchesotti, Florent Perronnin. AVA: A Large-Scale Database for Aesthetic Visual Analysis. Computer Vision and Pattern Recognition (CVPR), 2012.

• AVA1: We chose a score of 5 as the boundary to divide the dataset into a high-quality class and a low-quality class. This gives 74,673 low-quality images and 180,856 high-quality images. The training and test sets contain 235,529 and 20,000 images, respectively.

• AVA2: To increase the gap between images with high aesthetic quality and images with low aesthetic quality, we first sort all images by their mean scores. We then pick the top 10% of images as good and the bottom 10% as bad, which selects 51,106 images from the AVA dataset. These images are randomly and evenly divided into a training set and a test set, each containing 25,553 images.
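The two partition rules above can be sketched in Python as follows. This is a minimal illustration, not the code in './src': the function names are hypothetical, scores are assumed to be means over 10 vote counts (ratings 1 to 10), δ is assumed to be a margin around the boundary, and a score exactly on the boundary is assumed to count as low.

```python
import random


def mean_score(vote_counts):
    """Mean rating from 10 vote counts; vote_counts[0] is the number of votes for score 1."""
    total = sum(vote_counts)
    return sum((i + 1) * c for i, c in enumerate(vote_counts)) / total


def label_ava1(score, boundary=5.0, delta=0.0):
    """AVA1-style label: 1 (high), 0 (low), or None inside the delta margin.

    Assumption: a score exactly on the boundary counts as low.
    """
    if score > boundary + delta:
        return 1
    if score <= boundary - delta:
        return 0
    return None


def split_ava2(scores, fraction=0.10, seed=0):
    """AVA2-style split: keep the top and bottom `fraction` of images by mean
    score, then divide them randomly and evenly into training and test lists."""
    ranked = sorted(scores, key=scores.get)  # image ids, lowest score first
    k = int(len(ranked) * fraction)
    bad = ranked[:k]
    good = ranked[len(ranked) - k:]
    labelled = [(image_id, 0) for image_id in bad] + [(image_id, 1) for image_id in good]
    random.Random(seed).shuffle(labelled)
    half = len(labelled) // 2
    return labelled[:half], labelled[half:]
```

Here `scores` is a dictionary mapping an image id to its mean score. The shuffle does not balance the two classes within each half; whether the original partition does so is not stated here.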

How to test

Please use the Caffe test tool to measure accuracy.
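As a rough sketch, the helper below assembles a `caffe test` command line and picks the number of iterations so that every test image is seen at least once. The helper name, the file names, and the batch size of 50 are placeholders, not values taken from this repository; the model file is assumed to define a TEST-phase data layer and an accuracy layer.

```python
import math


def caffe_test_command(model, weights, num_images, batch_size, gpu=None):
    """Build the argument list for Caffe's command-line test tool.

    iterations = ceil(num_images / batch_size), so the whole test set is covered.
    """
    iterations = math.ceil(num_images / batch_size)
    cmd = [
        "caffe", "test",
        "-model", model,        # network definition (placeholder file name)
        "-weights", weights,    # trained .caffemodel (placeholder file name)
        "-iterations", str(iterations),
    ]
    if gpu is not None:
        cmd += ["-gpu", str(gpu)]
    return cmd


# Hypothetical example: 20,000 test images with an assumed batch size of 50.
command = caffe_test_command("train.prototxt", "ILGnet-AVA1.caffemodel", 20000, 50)
print(" ".join(command))
```

If the number of test images is not a multiple of the batch size, the data layer wraps around and a few images are counted twice, so the reported accuracy is only approximate in that case.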

Accuracy on the random partition in './data'

The accuracy we achieve on the AVA1 dataset is 81.68% with δ=0. With Inception V4, the accuracy rises to 82.66%.

The accuracy we achieve on the AVA2 dataset is 85.50%. With Inception V4, the accuracy rises to 85.53%.

We achieve state-of-the-art aesthetic classification accuracy.

The random partition programs are in './src'.

The Trained Models

The size of the trained model is above 500MB.

You can download them from the BaiduYun cloud disk or Google Drive:

BaiduYun Links:

ILGnet-AVA1.caffemodel

ILGnet-AVA2.caffemodel

Google Drive Links:

ILGnet-AVA1.caffemodel

ILGnet-AVA2.caffemodel

Note: The previous deploy.prototxt was wrong. We have now uploaded the correct file. Thanks for your suggestions.

Our paper

Xin Jin, Jingying Chi, Siwei Peng, Yulu Tian, Chaochen Ye and Xiaodong Li. Deep Image Aesthetics Classification using Inception Modules and Fine-tuning Connected Layer. The 8th International Conference on Wireless Communications and Signal Processing (WCSP), Yangzhou, China, 13-15 October 2016. pdf (5.94 MB), oral presentation (19.1 MB), arXiv (1610.02256), [Project]

If you find our model/method/dataset useful, please cite our work:

@inproceedings{DBLP:conf/wcsp/JinCPTYL16,
  author = {Xin Jin and Jingying Chi and Siwei Peng and Yulu Tian and Chaochen Ye and Xiaodong Li},
  title = {Deep image aesthetics classification using inception modules and fine-tuning connected layer},
  booktitle = {8th International Conference on Wireless Communications {\&} Signal Processing, {WCSP} 2016, Yangzhou, China, October 13-15, 2016},
  pages = {1--6},
  year = {2016},
  crossref = {DBLP:conf/wcsp/2016},
  url = {http://dx.doi.org/10.1109/WCSP.2016.7752571},
  doi = {10.1109/WCSP.2016.7752571},
  timestamp = {Fri, 16 Dec 2016 12:48:17 +0100},
  biburl = {http://dblp.uni-trier.de/rec/bib/conf/wcsp/JinCPTYL16},
  bibsource = {dblp computer science bibliography, http://dblp.org}
}

Latest edit

Jan 15, 2017

Comments

Why does the same image give different outputs in two different tests?

I used the pretrained model you provide (https://pan.baidu.com/s/1slMv4yp) and only modified the model name and image name in your test code. But in two different tests I got different outputs, for example {good:0.6, bad:0.4} and {good:0.4, bad:0.6}. This confuses me, and I look forward to your answer.

opened by guoxiaolu 4
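Question about the exact training parameters for AVA1

Hello. Your train.prototxt is the one used for AVA2. Is the train.prototxt used for AVA1 training exactly the same? When I train with AVA1_solver.prototxt plus train.prototxt, loss=87.3365 appears very quickly. Even after lowering the learning rate and training for 100,000 iterations with batchsize=48, the accuracy is still only about 75%.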

opened by xujinheng 0
difference between train.prototxt and ILGNet_v4.prototxt

I would like to know what the difference is between train.prototxt and ILGNet_v4.prototxt. I would appreciate it if you could provide more information.

opened by leeqiaogithub 1
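Which pre-trained model is best when fine-tuning on my own dataset?
Hello, and thank you for your paper and code; I have learned a lot from them. This is my first time using caffe, so there are some things I do not quite understand and would like to ask about:

About data input: should I first generate lmdb files from train.txt/val.txt plus something like create_imagenet.sh? Can caffe take images directly as input?

If I want to train on my own dataset, should the pre-trained model be the ILGnet-AVA1.caffemodel you provide, or a caffemodel pre-trained only on ImageNet? (If the latter, where can I download it?) Looking forward to your reply. Best wishes!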
opened by YanZhiyuan0918 0
Random output value for same image

When I run test.py on the same image multiple times, I get different output values each time. The variation is large, ranging from 0.1 to 0.8 and 0.9. I'm using caffe 1.0.0.

opened by brunnoattorre 3
Why are the numbers of test images in this repository and in the paper different?

In the paper, the test set has 19930 images, but in this repository it has 20000. The readme.md says that the test set in this repository is a random partition, so the test accuracy differs: 81.68% in this repository and 79.25% in your paper. Could you please provide the image ids of the test set used in your paper? Thank you very much.

opened by liuwenran 1
How to deploy? : )
I am running into some issues trying to deploy the code. When I try to deploy the code, the temp_wl and loss1/classifier_wl layers are initialized randomly, so the output is random and doesn't work. As you suggested, I removed the following code:

weight_filler { type: "xavier" }
bias_filler { type: "constant" value: 0.2 }

But then, all the weights and biases were 0.

Could you advise me on how to properly deploy your pre-trained model? Do I need to modify deploy.prototxt? Currently, I am using deploy.prototxt and ILGnet-AVA1.caffemodel. Your test.py did not seem to work for me.
opened by richdu 5
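
One possible check for the deployment problem described above, offered as a hedged sketch rather than a confirmed diagnosis: Caffe copies trained weights into a net by matching layer names, so a layer in deploy.prototxt whose name does not appear in the .caffemodel keeps its filler initialization. The helpers below are hypothetical and use a simple regular expression that assumes `name:` is the first field inside each `layer { ... }` block. The list of layer names stored in the .caffemodel is assumed to be obtained separately.

```python
import re

# Matches `layer { name: "..."` (or the older `layers { ...`); assumes `name`
# is the first field of the block, which is common but not guaranteed.
LAYER_NAME = re.compile(r'\blayers?\s*\{\s*name:\s*"([^"]+)"')


def layer_names(prototxt_text):
    """Return layer names in the order they appear in a prototxt string."""
    return LAYER_NAME.findall(prototxt_text)


def unmatched_layers(prototxt_text, weight_layer_names):
    """Layers of the prototxt that have no same-named layer in the weights.

    Such layers would not receive trained parameters when weights are copied
    by name. Input or data layers may also show up here and can be ignored.
    """
    trained = set(weight_layer_names)
    return [name for name in layer_names(prototxt_text) if name not in trained]


# Toy example with made-up contents, not the real deploy.prototxt.
example = '''
name: "ILGNet"
layer { name: "conv1" type: "Convolution" }
layer {
  name: "temp_wl"
  type: "InnerProduct"
}
'''
print(unmatched_layers(example, ["conv1", "temp"]))
```

An empty result from `unmatched_layers` on the real files would suggest looking elsewhere, for example at preprocessing in test.py.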

Owner

The Victory Group of Besti

The VIsual CompuTing and infORmation securitY (Victory) lab of Beijing Electronic Science and Technology Institute (Besti)
