Accelerated tensor operations and dynamic neural networks based on reverse-mode automatic differentiation for every device that can run Swift, from watchOS to Linux

DL4S

Supports Linux, macOS, iOS, tvOS and watchOS

DL4S provides a high-level API for many accelerated operations common in neural networks and deep learning. It also has automatic differentiation built in, which lets you create and train neural networks without implementing backpropagation by hand and without a special Swift toolchain.

Features include implementations for many basic binary and unary operators, broadcasting, matrix operations, convolutional and recurrent neural networks, commonly used optimizers, second derivatives and much more. DL4S provides implementations for common network architectures, such as VGG, AlexNet, ResNet and Transformers.

While its primary purpose is deep learning and optimization, DL4S can also be used as a library for vectorized mathematical operations, similar to NumPy.

Read the full documentation

Overview

  1. Installation
  2. Features
    1. Layers
    2. Optimizers
    3. Losses
    4. Tensor Operations
    5. Engines
    6. Architectures
  3. Examples

Installation

iOS / tvOS / macOS

  1. In Xcode, select "File" > "Swift Packages" > "Add Package Dependency"
  2. Enter https://github.com/palle-k/DL4S.git into the Package URL field and click "Next".
  3. Select "Branch", "master" and click "Next".
  4. Enable the package product DL4S, select your app in the "Add to Target" column and click "Next".

Note: Installation via CocoaPods is no longer supported for newer versions.

Swift Package

Add the dependency to your Package.swift file:

.package(url: "https://github.com/palle-k/DL4S.git", .branch("master"))

Then add DL4S as a dependency to your target:

.target(name: "MyPackage", dependencies: ["DL4S"])
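
Put together, the two fragments above could sit in a manifest like the following sketch. The tools version and the package name MyPackage are placeholders chosen for illustration; adjust them to your project.

```swift
// swift-tools-version:5.1
// Minimal sketch of a Package.swift that depends on DL4S.
// "MyPackage" and the tools version are assumptions, not requirements of DL4S.
import PackageDescription

let package = Package(
    name: "MyPackage",
    dependencies: [
        // Tracks the master branch, as in the instructions above
        .package(url: "https://github.com/palle-k/DL4S.git", .branch("master"))
    ],
    targets: [
        .target(name: "MyPackage", dependencies: ["DL4S"])
    ]
)
```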

MKL / IPP / OpenMP Support

DL4S can be accelerated with Intel's Math Kernel Library, Integrated Performance Primitives and OpenMP (Installation Instructions).

On Apple devices, DL4S uses vectorized functions provided by the built-in Accelerate framework by default. If no acceleration library is available, a fallback implementation is used.

Compiling with MKL/IPP:

# After adding the APT repository as described in the installation instructions
sudo apt-get install intel-mkl-64bit-2019.5-075 intel-ipp-64bit-2019.5-075 libiomp-dev

export MKLROOT=/opt/intel/mkl
export IPPROOT=/opt/intel/ipp
export LD_LIBRARY_PATH=${MKLROOT}/lib/intel64:${IPPROOT}/lib/intel64:${LD_LIBRARY_PATH}

swift build -c release \
    -Xswiftc -DMKL_ENABLE \
    -Xlinker -L${MKLROOT}/lib/intel64 \
    -Xlinker -L${IPPROOT}/lib/intel64

TensorBoard Support

DL4S-Tensorboard provides a summary writer that can write TensorBoard-compatible logs.

LLDB Extension

DL4S includes an LLDB Python script that provides custom descriptions for Tensors (util/debugger_support/tensor.py).

To use enhanced summaries, run command script import /path/to/DL4S/util/debugger_support/tensor.py directly in LLDB, or add the command to your ~/.lldbinit file.

Then you can use the print or frame variable commands to print human-readable descriptions of tensors.

Features

Layers

Core:

  • Convolution
  • Transposed Convolution
  • Dense/Linear/Fully Connected
  • LSTM
  • Gated Recurrent Unit (GRU)
  • Vanilla RNN
  • Embedding
  • Multi-head Attention
  • Transformer Block

Pooling:

  • Max Pooling
  • Average Pooling
  • Adaptive Max Pooling
  • Adaptive Average Pooling

Norm:

  • Batch Norm
  • Layer Norm

Utility:

  • Bidirectional RNNs
  • Sequential
  • Lambda
  • Dropout

Activation:

  • Relu
  • LeakyRelu
  • Gelu
  • Tanh
  • Sigmoid
  • Softmax
  • Log Softmax
  • Dropout
  • Swish
  • Mish
  • LiSHT

Transformer:

  • Positional Encoding
  • Scaled Dot Product Attention
  • Multihead Attention
  • Pointwise Feed Forward
  • Transformer Encoder Block
  • Transformer Decoder Block

Optimizers

  • SGD
  • Momentum
  • Adam
  • AMSGrad
  • AdaGrad
  • AdaDelta
  • RMSProp

Losses

  • Binary Cross-Entropy
  • Categorical Cross-Entropy
  • Negative Log Likelihood (NLL Loss)
  • MSE
  • L1 & L2 regularization

Tensor Operations

The behavior of broadcast operations is consistent with NumPy's broadcasting rules.

  • broadcast-add
  • broadcast-sub
  • broadcast-mul
  • broadcast-div
  • matmul
  • neg
  • exp
  • pow
  • log
  • sqrt
  • sin
  • cos
  • tan
  • tanh
  • sum
  • max
  • relu
  • leaky relu
  • gelu
  • elu
  • elementwise min
  • elementwise max
  • reduce sum
  • reduce max
  • scatter
  • gather
  • conv2d
  • transposed conv2d
  • max pool
  • avg pool
  • subscript
  • subscript range
  • transpose
  • axis permute
  • reverse
  • im2col
  • col2im
  • stack / concat
  • swish activation
  • mish activation
  • lisht activation
  • diagonal matrix generation
  • diagonal extraction
  • band matrix generation
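
The NumPy broadcasting rules referred to at the top of this list can be sketched in plain Swift. The function below is a hypothetical helper written for illustration, not part of the DL4S API. It aligns two shapes from the right and allows a dimension to be stretched only when it is 1.

```swift
/// Returns the shape that results from broadcasting two shapes under
/// NumPy's rules, or nil if the shapes are incompatible.
/// Hypothetical helper for illustration; not a DL4S function.
func broadcastShape(_ lhs: [Int], _ rhs: [Int]) -> [Int]? {
    let rank = max(lhs.count, rhs.count)
    // Left-pad the shorter shape with 1s so both shapes have the same rank
    let a = Array(repeating: 1, count: rank - lhs.count) + lhs
    let b = Array(repeating: 1, count: rank - rhs.count) + rhs

    var result: [Int] = []
    for (x, y) in zip(a, b) {
        if x == y || y == 1 {
            result.append(x)      // equal sizes, or rhs is stretched
        } else if x == 1 {
            result.append(y)      // lhs is stretched
        } else {
            return nil            // sizes differ and neither is 1
        }
    }
    return result
}

print(broadcastShape([3, 2], [2]) as Any)
print(broadcastShape([8, 1, 6, 1], [7, 1, 5]) as Any)
print(broadcastShape([3, 2], [3]) as Any)
```

Under these rules a [3, 2] tensor can be combined with a [2] vector, while combining it with a [3] vector fails because the trailing dimensions 2 and 3 do not match.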

Engines

  • CPU (Accelerate framework for Apple Devices)
  • CPU (Intel Math Kernel Library and Integrated Performance Primitives)
  • CPU (Generic)
  • GPU (ArrayFire: OpenCL, CUDA)

For an experimental, early-stage GPU-accelerated version, check out the feature/arrayfire branch.

Architectures

Default implementations are provided for the following architectures:

  • ResNet18
  • VGG (11, 13, 16, 19)
  • AlexNet
  • Transformer

Examples

Some high-level examples have been implemented in other repositories.

Arithmetic & Differentiation

DL4S provides a high-level interface to many vectorized operations on tensors.

let a = Tensor<Float, CPU>([[1,2],[3,4],[5,6]], requiresGradient: true)
let prod = a.transposed().matrixMultiplied(with: a)
let s = prod.reduceSum()
let l = log(s)
print(l) // 5.1873856

When a tensor is marked as requiring a gradient, a compute graph is captured. The graph stores all operations that use that tensor directly or indirectly as an operand.

It is then possible to backpropagate through that graph using the gradients(of:) function:

// Backpropagate
let dl_da = l.gradients(of: [a])[0]

print(dl_da)
/*
[[0.034, 0.034]
 [0.078, 0.078]
 [0.123, 0.123]]
*/
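
The numbers above can be checked without DL4S. Writing s for the sum of all entries of aᵀ·a, we have s = Σ_i (Σ_j a[i][j])², so the derivative of log(s) with respect to a[i][j] is twice the sum of row i divided by s. The following plain-Swift sketch compares that formula with central finite differences. All names in it are made up for this illustration.

```swift
import Foundation

// s(a) = sum of all entries of aᵀ · a, for a matrix stored as an array of rows.
func sumOfGram(_ a: [[Double]]) -> Double {
    let columns = a[0].count
    var total = 0.0
    for row in a {
        for j in 0..<columns {
            for k in 0..<columns {
                total += row[j] * row[k]
            }
        }
    }
    return total
}

// l(a) = log(s(a)), the scalar computed in the example above.
func l(_ a: [[Double]]) -> Double {
    return log(sumOfGram(a))
}

let a: [[Double]] = [[1, 2], [3, 4], [5, 6]]
let s = sumOfGram(a)
let value = l(a)

// Analytic gradient: dl/da[i][j] = 2 * (sum of row i) / s
let analytic = a.map { row in row.map { _ in 2 * row.reduce(0, +) / s } }

// Central finite differences as an independent check
let h = 1e-6
var numeric = a
for i in a.indices {
    for j in a[i].indices {
        var plus = a
        var minus = a
        plus[i][j] += h
        minus[i][j] -= h
        numeric[i][j] = (l(plus) - l(minus)) / (2 * h)
    }
}

print(value)     // log(179)
print(analytic)  // rows: 6/179, 14/179, 22/179
print(numeric)
```

Rounded to three decimals, 6/179, 14/179 and 22/179 give the 0.034, 0.078 and 0.123 printed by DL4S.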

Second derivatives

The operations used during backpropagation are themselves differentiable. Therefore, second derivatives can be computed by computing the gradient of the gradient.

When higher order derivatives are required, the compute graph of the backwards pass has to be explicitly retained.

let t = Tensor<Float, CPU>([1,2,3,4], requiresGradient: true)

let result = t * t * t
print(result) // [1, 8, 27, 64]

let grad = result.gradients(of: [t], retainBackwardsGraph: true)[0]
print(grad) // [3, 12, 27, 48]

let secondGrad = grad.gradients(of: [t], retainBackwardsGraph: true)[0]
print(secondGrad) // [6, 12, 18, 24]

let thirdGrad = secondGrad.gradients(of: [t])[0]
print(thirdGrad) // [6, 6, 6, 6]
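
To see why the gradient of a gradient works, it can help to look at a toy scalar version of the idea. The sketch below is not how DL4S is implemented; the Value type and the gradient function are hypothetical names. The backward pass builds its result out of the same differentiable operations as the forward pass, so the returned gradient is itself a graph that can be differentiated again. Unlike DL4S, this toy version always retains the backward graph.

```swift
// A scalar node in a compute graph. Each input is stored together with a
// pullback that maps the upstream gradient to that input's gradient.
final class Value {
    let data: Double
    let inputs: [(Value, (Value) -> Value)]

    init(_ data: Double, inputs: [(Value, (Value) -> Value)] = []) {
        self.data = data
        self.inputs = inputs
    }
}

func + (a: Value, b: Value) -> Value {
    return Value(a.data + b.data, inputs: [(a, { $0 }), (b, { $0 })])
}

func * (a: Value, b: Value) -> Value {
    // The pullbacks use the overloaded *, so they extend the graph
    return Value(a.data * b.data, inputs: [(a, { $0 * b }), (b, { $0 * a })])
}

// Reverse-mode differentiation of `output` with respect to `x`.
func gradient(of output: Value, withRespectTo x: Value) -> Value {
    // Depth-first post-order: every node appears after its inputs
    var order: [Value] = []
    var visited = Set<ObjectIdentifier>()
    func visit(_ node: Value) {
        guard visited.insert(ObjectIdentifier(node)).inserted else { return }
        for (input, _) in node.inputs { visit(input) }
        order.append(node)
    }
    visit(output)

    // Walk from the output towards the leaves, accumulating gradients as nodes
    var grads: [ObjectIdentifier: Value] = [ObjectIdentifier(output): Value(1)]
    for node in order.reversed() {
        guard let upstream = grads[ObjectIdentifier(node)] else { continue }
        for (input, pullback) in node.inputs {
            let contribution = pullback(upstream)
            let key = ObjectIdentifier(input)
            grads[key] = grads[key].map { $0 + contribution } ?? contribution
        }
    }
    return grads[ObjectIdentifier(x)] ?? Value(0)
}

let t = Value(3)
let y = t * t * t
let first = gradient(of: y, withRespectTo: t)
let second = gradient(of: first, withRespectTo: t)
let third = gradient(of: second, withRespectTo: t)
print(y.data, first.data, second.data, third.data) // 27.0 27.0 18.0 6.0
```

For t = 3 this reproduces the third entry of each tensor printed above: 27, 27, 18 and 6.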

Convolutional Networks

Example for MNIST classification

// Input must have shape [batchSize, 1, 28, 28]
var model = Sequential {
   Convolution2D<Float, CPU>(inputChannels: 1, outputChannels: 6, kernelSize: (5, 5))
   Relu<Float, CPU>()
   MaxPool2D<Float, CPU>(windowSize: 2, stride: 2)
   
   Convolution2D<Float, CPU>(inputChannels: 6, outputChannels: 16, kernelSize: (5, 5))
   Relu<Float, CPU>()
   MaxPool2D<Float, CPU>(windowSize: 2, stride: 2)
   
   Flatten<Float, CPU>()
   
   Dense<Float, CPU>(inputSize: 256, outputSize: 120)
   Relu<Float, CPU>()
   
   Dense<Float, CPU>(inputSize: 120, outputSize: 10)
   LogSoftmax<Float, CPU>()
}

var optimizer = Adam(model: model, learningRate: 0.001)

// Single iteration of minibatch gradient descent
let batch: Tensor<Float, CPU> = ... // shape: [batchSize, 1, 28, 28]
let y_true: Tensor<Int32, CPU> = ... // shape: [batchSize]

// use optimizer.model, not model
let pred = optimizer.model(batch)
let loss = categoricalNegativeLogLikelihood(expected: y_true, actual: pred)

let gradients = loss.gradients(of: optimizer.model.parameters)
optimizer.update(along: gradients)
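
The inputSize of 256 for the first dense layer follows from the spatial sizes in the model above. The sketch below redoes that arithmetic in plain Swift. It assumes unpadded convolutions with stride 1, which is what the value 256 implies; convOutput is a hypothetical helper, not a DL4S function.

```swift
// Output size of a convolution or pooling window along one spatial axis.
// Hypothetical helper; assumes integer (floor) division as usual.
func convOutput(_ size: Int, kernel: Int, stride: Int = 1, padding: Int = 0) -> Int {
    return (size + 2 * padding - kernel) / stride + 1
}

var side = 28                                   // MNIST images are 28 x 28
side = convOutput(side, kernel: 5)              // first 5x5 convolution: 24
side = convOutput(side, kernel: 2, stride: 2)   // first 2x2 max pool: 12
side = convOutput(side, kernel: 5)              // second 5x5 convolution: 8
side = convOutput(side, kernel: 2, stride: 2)   // second 2x2 max pool: 4

let channels = 16                               // outputChannels of the second convolution
let flattened = channels * side * side
print(flattened) // 256
```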

Recurrent Networks

Example for MNIST classification

The Gated Recurrent Unit scans the image from top to bottom and uses the final hidden state for classification.

let model = Sequential {
    GRU<Float, CPU>(inputSize: 28, hiddenSize: 128, direction: .forward)
    Lambda<GRU<Float, CPU>.Outputs, Tensor<Float, CPU>, Float, CPU> { inputs in
        inputs.0
    }
    Dense<Float, CPU>(inputSize: 128, outputSize: 10)
    LogSoftmax<Float, CPU>()
}

var optimizer = Adam(model: model, learningRate: 0.001)

let batch: Tensor<Float, CPU> = ... // shape: [batchSize, 28, 28]
let y_true: Tensor<Int32, CPU> = ... // shape: [batchSize]

let x = batch.permuted(to: 1, 0, 2) // Swap first and second axis
let pred = optimizer.model(x)
let loss = categoricalNegativeLogLikelihood(expected: y_true, actual: pred)

let gradients = loss.gradients(of: optimizer.model.parameters)
optimizer.update(along: gradients)
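
The permutation in the example above turns a [batchSize, 28, 28] batch into a [28, batchSize, 28] sequence, so that each step of the recurrence sees one image row per sample. On nested Swift arrays the same reordering can be sketched as follows; swapFirstTwoAxes is a hypothetical helper for illustration only.

```swift
// Swaps the first two axes of a rank-3 nested array:
// [batch][step][feature] becomes [step][batch][feature].
func swapFirstTwoAxes<T>(_ x: [[[T]]]) -> [[[T]]] {
    guard let first = x.first else { return [] }
    return first.indices.map { step in
        x.map { sample in sample[step] }
    }
}

// Two samples, three steps (rows), two features per step
let batch = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]]
]
let sequence = swapFirstTwoAxes(batch)
print(sequence.count, sequence[0].count, sequence[0][0].count) // 3 2 2
print(sequence[0]) // [[1, 2], [7, 8]]
```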

Owner
Palle
CS Master Student @ TUM; WWDC Student Challenge Winner '20