Dogtector


Project description

Dogtector is a dog breed detection app for iOS that combines a YOLOv5 model with a Metal-based object decoder optimized for ultra-fast live detection on iOS devices.

Technical Overview

Requirements

The project was developed in Swift 5.5 using Xcode 13.1 and is designed for devices running iOS 14 and newer.

  • Xcode 13.1+
  • Swift 5.5+
  • iOS 14.0+

Technologies

The vast majority of the project was created in SwiftUI and Combine (UIKit was used only where there was no SwiftUI alternative for the desired component or functionality), and the porting of the YOLOv5 model was handled with CoreML and Metal.

  • SwiftUI
  • UIKit (when necessary)
  • Combine
  • CoreML
  • Metal

Object decoder

The most important part of the project was porting the trained YOLOv5 model to iOS, which turned out to be challenging for a large model trained on 417 object classes. The basic CPU approach to object decoding was not enough, as it resulted in just 1.7 fps of live detection even on devices equipped with the latest A15 chip. In the end, the approach that worked sufficiently well for the desired use case was implementing the object decoder in Metal and processing the model output on the GPU.

The resulting Metal-based object decoder has been optimized for all iPhones running iOS 14 and newer, taking advantage of the latest GPU enhancements on newer models while keeping support for devices as old as the iPhone 6s. As a result, two strategies for object decoding emerged (runtime selection is sketched just after this list):

  1. Fastest, fully Metal-based decoder - available only for devices with GPU Family 4 (A11 and newer), taking full advantage of fast GPU processing
  2. Slightly slower hybrid decoder (mostly Metal, partly CPU-based) - available for all older devices, as older GPU families seemed to have issues with atomic operations in loops in Metal kernels and do not support non-uniform threadgroup sizes
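
To illustrate how such a strategy could be chosen at runtime, here is a minimal Swift sketch based on the GPU family check described above; the DecoderStrategy type and function names are hypothetical and are not taken from Dogtector's source:

import Metal

// Minimal sketch (hypothetical names, not Dogtector's actual code) of how
// the decoding strategy could be selected at runtime.
enum DecoderStrategy {
    case fullMetal // A11 and newer: decode entirely on the GPU
    case hybrid    // older GPUs: mostly Metal, partly CPU
}

func selectDecoderStrategy(for device: MTLDevice) -> DecoderStrategy {
    // GPU Family Apple 4 (A11) introduced support for the features the
    // full-Metal decoder relies on, such as non-uniform threadgroup sizes.
    device.supportsFamily(.apple4) ? .fullMetal : .hybrid
}

if let device = MTLCreateSystemDefaultDevice() {
    print("Selected decoder: \(selectDecoderStrategy(for: device))")
}

supportsFamily(_:) is available from iOS 13, so it fits the app's iOS 14 deployment target.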

Performance

In the end, the Metal implementation proved fast enough for live detection with the computationally expensive model I intended to use. Compared to the CPU implementation, live detection on the iPhone 13 Pro turned out to be approximately 25x faster with the Metal decoder, going from the initial ~1.7 fps on the CPU to ~43 fps.

The speed of live detection depends largely on model parameters such as the number of classes and the model input size. However, devices with an Apple Neural Engine and faster GPUs perform better when compared using the same model.
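
For context, the sketch below shows how a CoreML model can be asked to use every available compute unit, including the Apple Neural Engine where present; the commented-out Detector class is the hypothetical class Xcode generates for a model named Detector.mlmodel:

import CoreML

// Hedged sketch: request CPU, GPU and Apple Neural Engine execution.
// Devices without an ANE fall back to GPU/CPU automatically.
let configuration = MLModelConfiguration()
configuration.computeUnits = .all
// Hypothetical generated class for a model named Detector.mlmodel:
// let detector = try Detector(configuration: configuration)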

The table below shows the results achieved on various devices for a YOLOv5m model trained on 417 object classes with a 288x288 input size:

Device                     | iPhone 13 Pro        | iPhone 12 Pro        | iPhone 11 Pro | iPhone X | iPhone 6s
---------------------------|----------------------|----------------------|---------------|----------|----------
Chip                       | A15                  | A14                  | A13           | A11      | A9
ANE                        | 5th gen              | 4th gen              | 3rd gen       | private  | none
Apple GPU family           | 8                    | 7                    | 6             | 4        | 3
Live detection performance | 30+ fps (throttled*) | 30+ fps (throttled*) | ~15 fps       | ~5 fps   | ~2.7 fps

*Camera output has been limited to 30 fps for stability and battery reasons, but in raw conditions the A15 and A14 achieved ~43 fps and ~35 fps of live detection, respectively.
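
For reference, such a cap can be applied roughly as follows; this is an illustrative sketch, not necessarily how Dogtector configures its capture session:

import AVFoundation

// Illustrative sketch: cap the camera output at a given frame rate for
// stability and battery life by constraining the frame duration.
func capFrameRate(of device: AVCaptureDevice, to fps: Int32) throws {
    try device.lockForConfiguration()
    defer { device.unlockForConfiguration() }
    device.activeVideoMinFrameDuration = CMTime(value: 1, timescale: fps)
    device.activeVideoMaxFrameDuration = CMTime(value: 1, timescale: fps)
}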

Build and run

Dogtector has been built with the pure iOS SDK without any external libraries or frameworks, so building and running is as simple as opening the Dogtector.xcodeproj file with Xcode and running the Dogtector target.

Important note: this repository does not contain the trained model used in the production version of Dogtector, so if you would like to build the application with a trained model you need to follow the instructions from the Bringing your own model section. However, if you just want to preview how the application works without getting your own model, you can download the production version of the application from the App Store.

Bringing your own model

Model setup

You can easily configure the project to work with your own model, as it was designed to be as flexible as possible for all models from the YOLOv5 family and should work fine regardless of model parameters such as the number of classes or the input size.

First, you need a trained YOLOv5 model. If you don't have one, you can follow this tutorial to train your own. Keep in mind that weights in the PyTorch .pt format won't work with CoreML, so you need to convert your model to the compatible .mlmodel format using coremltools. If you don't know how, please follow these instructions.

Once you have your trained model in CoreML format, just drag and drop it into your workspace in Xcode. By default the application will look for a Detector.mlmodel file, but if you want to use a different name you have to change the assetName value in Dogtector/ObjectDetector/ModelInfo.swift.
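
Loading the model by that asset name could look like the sketch below; the exact loading code lives in the project, so treat this only as an illustration of the naming convention:

import CoreML

// Hedged sketch: locate the compiled model by its asset name. Xcode
// compiles Detector.mlmodel into Detector.mlmodelc at build time.
let assetName = "Detector"
guard let modelURL = Bundle.main.url(forResource: assetName, withExtension: "mlmodelc") else {
    fatalError("\(assetName).mlmodelc not found in the app bundle")
}
do {
    let model = try MLModel(contentsOf: modelURL)
    print("Loaded model with \(model.modelDescription.inputDescriptionsByName.count) input(s)")
} catch {
    print("Failed to load model: \(error)")
}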

Class information

The last step is updating the class information so that detection annotations are displayed properly. This data is stored in Dogtector/ObjectDetector/en.lproj/ObjectInfo.plist. Object class information can be localized; by default there is also a Polish version of the file in the Dogtector/ObjectDetector/pl.lproj directory.

Most important requirements for ObjectInfo.plist file:

  • The data should reflect the classes used to train your model
  • The order of the items in the array should be exactly the same as in your trained model
  • Each identifier should be unique

Object format description:

  • identifier - [required] - a unique string used to find the object images to display on screen. The app will look for images named {identifier} for the larger-resolution preview and {identifier}_miniature for thumbnails used for faster rendering of annotations. Although it is not compulsory to add images for every class to the project, this field is also used internally to identify class objects, so it is required even if you don't plan to add any images.
  • name - [required] - string with the name of the class object displayed to the user.
  • alternativeNames - [optional] - array of strings with alternative class names displayed to the user. These appear in the Alternative names section of the detection information sheet; it was designed to display alternative dog breed names.
  • origin - [optional] - array of strings with the origin of the class item; it was designed to display dog breeds' countries of origin.
  • url - [optional] - string with a URL to more information about the class object; if present, a More info button is displayed on the information sheet.
  • licence - [optional] - string with information about the licence of the displayed image.

The whole object format looks like this:

<dict>
    <key>identifier</key>
    <string>UNIQUE_NAME_IDENTIFIER</string>
    <key>name</key>
    <string>OBJECT_DISPLAYED_NAME</string>
    <key>alternativeNames</key>
    <array>
        <string>ALTERNATIVE_NAME_1</string>
    </array>
    <key>origin</key>
    <array>
        <string>ORIGIN_INFO_1</string>
    </array>
    <key>url</key>
    <string>URL_TO_MORE_INFO</string>
    <key>licence</key>
    <string>INFO_ABOUT_IMAGE_LICENCE</string>
</dict>

The only required fields are identifier and name, so the simplest working example of the whole plist file would be:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <key>identifier</key>
        <string>dog</string>
        <key>name</key>
        <string>Dog</string>
    </dict>
</array>
</plist>
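
To make the expected structure concrete, here is a hedged Swift sketch of how such a plist could be decoded; the ObjectInfo struct is hypothetical and Dogtector's actual types may differ:

import Foundation

// Hypothetical model mirroring the plist format above; Dogtector's actual
// types may differ. Optional plist keys map to optional properties.
struct ObjectInfo: Decodable {
    let identifier: String
    let name: String
    let alternativeNames: [String]?
    let origin: [String]?
    let url: String?
    let licence: String?
}

// Bundle lookup resolves the localized .lproj variant automatically.
if let plistURL = Bundle.main.url(forResource: "ObjectInfo", withExtension: "plist"),
   let data = try? Data(contentsOf: plistURL) {
    let classes = try? PropertyListDecoder().decode([ObjectInfo].self, from: data)
    print("Loaded \(classes?.count ?? 0) object classes")
}

Because the root of the plist is an array, the order of the decoded entries follows the class order of the trained model, which is exactly what the requirements above demand.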