Spokestack: give your iOS app a voice interface!

Overview


Spokestack provides an extensible speech recognition pipeline for the iOS platform. It includes a variety of built-in speech processors for Voice Activity Detection (VAD), wakeword activation, and Automatic Speech Recognition (ASR).


Features

  • Voice activity detection
  • Wakeword activation with two different implementations
  • Simplified Automatic Speech Recognition interface
  • A speech pipeline that seamlessly integrates VAD-triggered wakeword detection using on-device machine learning models with utterance transcription via the platform's Automatic Speech Recognition
  • On-device Natural Language Understanding utterance classifier
  • Simple Text to Speech API

Installation

CocoaPods is a dependency manager for Cocoa projects. For usage and installation instructions, visit their website. To integrate Spokestack into your Xcode project using CocoaPods, specify it in your Podfile:

pod 'Spokestack-iOS'
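
A minimal Podfile might look like the following sketch; the target name MyApp and the platform version are placeholder assumptions, so substitute your own:

# Podfile (sketch; target name and platform version are placeholders)
platform :ios, '13.0'
use_frameworks!

target 'MyApp' do
  pod 'Spokestack-iOS'
end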

Usage

Spokestack.io hosts extensive usage documentation including tutorials, integrations, and recipe how-tos.

Configure Wakeword-activated Automatic Speech Recognition

import Spokestack
// assume that self implements the SpokestackDelegate protocol
let pipeline = SpeechPipelineBuilder()
    .addListener(self)
    .useProfile(.appleWakewordAppleSpeech)
    .setProperty("tracing", Trace.Level.PERF)
    .build()
pipeline.start()

This example creates a speech recognition pipeline using a configurable wakeword detector that is triggered by VAD, which in turn activates the native iOS ASR, returning the resulting utterance to the SpokestackDelegate observer (self in this example).

See SpeechPipeline and SpeechConfiguration for further configuration documentation.
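
As a rough sketch of the listener side (VoiceController is a placeholder name; the callbacks below follow the events referenced in this document and its debug traces, e.g. didStart/didStop, but treat the exact signatures as assumptions and consult the SpokestackDelegate documentation):

import Spokestack

class VoiceController: SpokestackDelegate {
    // Receives the final transcript once ASR completes (signature assumed).
    func didRecognize(_ result: SpeechContext) {
        print("transcript: \(result.transcript)")
    }

    // Pipeline lifecycle events, matching the "didStart"/"didStop" traces
    // mentioned in the comments below.
    func didStart() { print("pipeline started") }
    func didStop() { print("pipeline stopped") }

    // Called when a pipeline component reports an error (signature assumed).
    func failure(error: Error) { print("pipeline error: \(error)") }
}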

Text to Speech

// assume that self implements the TextToSpeechDelegate protocol
let tts = TextToSpeech(self, configuration: SpeechConfiguration())
tts.speak(TextToSpeechInput("My god, it's full of stars!"))
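
The speak call synthesizes the text and plays the result; the delegate is notified of synthesis and playback events. A minimal sketch, assuming TextToSpeechDelegate callbacks along these lines (Speaker is a placeholder name; check the API reference for the authoritative protocol):

// a minimal sketch; callback names are assumptions
class Speaker: TextToSpeechDelegate {
    // Synthesis succeeded; the result references the synthesized audio (assumed).
    func success(result: TextToSpeechResult) { print("synthesized: \(result)") }
    func didBeginSpeaking() { print("playback began") }
    func didFinishSpeaking() { print("playback finished") }
    func failure(error: Error) { print("TTS error: \(error)") }
}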

Natural Language Understanding

let config = SpeechConfiguration()
config.nluVocabularyPath = "vocab.txt"
config.nluModelPath = "nlu.tflite"
config.nluModelMetadataPath = "metadata.json"
// assume that self implements the NLUDelegate protocol
let nlu = try! NLUTensorflow(self, configuration: config)
nlu.classify(utterance: "I can't turn that light in the room on for you, Dave", context: [:])
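
Classification results arrive asynchronously via the delegate. A minimal sketch, assuming an NLUDelegate callback along these lines and an NLUResult carrying the intent, confidence, and slots (MyController is a placeholder; property names are assumptions, so see the API reference):

// a minimal sketch; the callback name and NLUResult properties are assumptions
extension MyController: NLUDelegate {
    func classification(result: NLUResult) {
        print("intent: \(result.intent) (confidence: \(result.confidence))")
    }

    func failure(error: Error) { print("NLU error: \(error)") }
}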

Troubleshooting

A build error similar to Code Sign error: No unexpired provisioning profiles found that contain any of the keychain's signing certificates will occur if the bundle identifier is not changed from io.Spokestack.SpokestackFrameworkExample, which is tied to the Spokestack organization.

Reference

The SpokestackFrameworkExample project is a reference implementation showing how to use the Spokestack library, along with runnable examples of the VAD, wakeword, ASR, NLU, and TTS components. Each component has a button on the main screen, and each can be started, stopped, or exercised (prediction, synthesis) as appropriate. The component screens have full debug tracing enabled, so the system control logic and debug events will appear in the Xcode console.

Documentation

Getting Started, Cookbooks, and Conceptual Guides

A step-by-step introduction, common usage patterns, discussions of the concepts used by the library, design guides for voice interfaces, and the Android library may all be found on our website.

API Reference

The API reference is available on GitHub.

Deployment

Preconditions

  1. Ensure that git lfs has been installed: https://git-lfs.github.com/ (example commands follow this list). This is used to manage the storage of the large model and metadata files in SpokestackFrameworkExample.
  2. Ensure that CocoaPods has been installed: gem install cocoapods (not via brew).
  3. Ensure that you are registered in CocoaPods: pod trunk register YOUR_EMAIL --description='release YOUR_PODSPEC_VERSION'
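
For precondition 1, assuming Homebrew is available (any installation method from the git-lfs site works equally well):

# install the git-lfs binary (assumes Homebrew)
brew install git-lfs
# configure git to use the LFS filters for the current user
git lfs install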

Process

  1. Increment the podspec version in Spokestack-iOS.podspec
  2. pod lib lint --use-libraries --allow-warnings, which should pass all checks
  3. git commit -a -m 'YOUR_COMMIT_MESSAGE' && git tag YOUR_PODSPEC_VERSION && git push && git push --tags
  4. pod trunk push --use-libraries --allow-warnings

License

Copyright 2020 Spokestack, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Comments
  • Example app confusion


    Is there an explanation for what "SpokeStackFrameworkExample" app is supposed to be demonstrating? I see the four options on the initial landing page, and then start/stop recording buttons on each detail page. It asks for microphone access and sometimes speech access, but otherwise nothing seems to happen. There are some debug messages depending on whether I'm running iOS 12 or 13, but it's usually just "didStart" and "didStop".

    opened by cameron-erdogan 12
  • Pipeline failure due to: Failed to create the interpreter.


    Hello,

    I'm getting the error Pipeline failure due to: Failed to create the interpreter. when trying to run my application. The app configurations are correct as they work seamlessly on the Android codebase. Can I have an elaboration on this error?

    AzureManager init
    2021-08-20 13:09:03.865741+0200 Runner[10656:128311] Metal API Validation Enabled
    2021-08-20 13:09:04.013476+0200 Runner[10656:128311] [plugin] AddInstanceForFactory: No factory registered for id <CFUUID 0x600003b21000> F8BB1C28-BAE8-11D6-9C31-00039315CD46
    2021-08-20 13:09:04.500620+0200 Runner[10656:128311] Initialized TensorFlow Lite runtime.
    Pipeline initialized.
    2021-08-20 13:09:04.676435+0200 Runner[10656:128695] flutter: Observatory listening on http://127.0.0.1:53834/L4ASPbmCbtU=/
    2021-08-20 13:09:04.690747+0200 Runner[10656:128311] Didn't find op for builtin opcode 'FULLY_CONNECTED' version '9'
    2021-08-20 13:09:04.691000+0200 Runner[10656:128311] Registration failed.
    Pipeline failure due to: Failed to create the interpreter.
    2021-08-20 13:09:22.104875+0200 Runner[10656:128608] flutter: ┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
    2021-08-20 13:09:22.105311+0200 Runner[10656:128608] flutter: │ #0   new FlutterSoundRecorder (package:flutter_sound/public/flutter_sound_recorder.dart:155:13)
    2021-08-20 13:09:22.107288+0200 Runner[10656:128608] flutter: │ #1   new ExperimentController (package:elementa/app/modules/experiment/controllers/experiment_controller.dart:29:38)
    2021-08-20 13:09:22.107543+0200 Runner[10656:128608] flutter: ├┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄
    2021-08-20 13:09:22.108070+0200 Runner[10656:128608] flutter: │ 🐛 ctor: FlutterSoundRecorder()
    2021-08-20 13:09:22.108586+0200 Runner[10656:128608] flutter: └───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
    2021-08-20 13:09:22.668322+0200 Runner[10656:129100] [aurioc] AURemoteIO.h:323:entry: Unable to join I/O thread to workgroup ((null)): 2
    iniSpokestack
    Pipeline started.
    

    It may be worth mentioning that in my Xcode debugger, I see the following error within my custom SpeechProcessor class:

    extension NoteProcessor: SpeechProcessor {
        public var context: SpeechContext {
            get {
                return self.context // ERROR: AURemoteIO::IOThread (45): EXC_BAD_ACCESS (code=2, address=0x70000aa6dff8)
            }
            set {
                self.context = newValue
            }
        }
    }

    Though I suspect this is simply due to the Pipeline never initialising, therefore the context is being sent to a deallocated Spokestack Delegate object?

    EDIT: On further investigation, it seems the errors 2021-08-20 13:09:04.690747+0200 Runner[10656:128311] Didn't find op for builtin opcode 'FULLY_CONNECTED' version '9' and 2021-08-20 13:09:04.691000+0200 Runner[10656:128311] Registration failed. are thrown due to an issue with the TensorFlow models or perhaps the runtime version of TFLite. The documentation does not mention explicitly importing the TensorFlowLiteSwift module in the Podfile; I'm assuming Spokestack adds the module itself, so it shouldn't be necessary? I will try adding this and see if it resolves the issue.

    opened by rayyan808 10
  • Long pauses (hangs) on pipeline.stop()


    Disclaimer: this is 9.0.1 (from May 6 2020) because I support iOS 12.

    I'm occasionally seeing long pauses / hangs on calling .stop(), with errors shown like this one: [aurioc] AURemoteIO.cpp:1639:Stop: AURemoteIO::Stop: error 268451843 calling TerminateOwnIOThread (port 104967) This seems to happen on iOS 12 and iOS 14 (errors are different) with similar frequency.

    The impact is minimized when I call stop off the main thread, i.e. DispatchQueue.global().async { self.pipeline?.stop() }, because at least none of the UI work is waiting for it.

    2 questions:

    1. did this issue get solved in later releases somehow - any tips? I have already forked, am happy to develop fixes manually etc if I have a direction to go in
    2. what's the recommended threading model for calling start/stop on the speech pipeline? Currently i'm calling start() from main thread, stop() off main thread. Not sure that's wise...

    Many thanks, enjoying Spokestack so far (day 3)

    opened by xaphod 10
  • 9.0.1 with iOS 12 support: AVAudioEngine errors on pipeline start on iOS 14


    Trying to get Spokestack up and running. iOS 14.2, iPad 11" Pro, spokestack-ios installed from pod, version 9.0.1 per Podfile.lock.

    I see these after calling start(), and nothing works.

    2020-12-04 18:12:14.650501-0500 booth[6325:5456020] [aurioc] AURemoteIO.cpp:1095:Initialize: failed: -10851 (enable 1, outf< 2 ch,      0 Hz, Float32, non-inter> inf< 2 ch,      0 Hz, Float32, non-inter>)
    2020-12-04 18:12:14.651563-0500 booth[6325:5456020] [aurioc] AURemoteIO.cpp:1095:Initialize: failed: -10851 (enable 1, outf< 2 ch,      0 Hz, Float32, non-inter> inf< 2 ch,      0 Hz, Float32, non-inter>)
    2020-12-04 18:12:14.652462-0500 booth[6325:5456020] [aurioc] AURemoteIO.cpp:1095:Initialize: failed: -10851 (enable 1, outf< 2 ch,      0 Hz, Float32, non-inter> inf< 2 ch,      0 Hz, Float32, non-inter>)
    2020-12-04 18:12:14.652534-0500 booth[6325:5456020] [avae]            AVAEInternal.h:109   [AVAudioEngineGraph.mm:1397:Initialize: (err = AUGraphParser::InitializeActiveNodesInInputChain(ThisGraph, *GetInputNode())): error -10851
    2020-12-04 18:12:14.652564-0500 booth[6325:5456020] [avae]          AVAudioEngine.mm:167   Engine@0x283c0f890: could not initialize, error = -10851
    
    opened by xaphod 4
  • Refactored SpeechPipeline with stages and profiles


    A complete minor rewrite of the SpeechPipeline and associated components. This rewrite has three goals:

    1. Improve testability of previously under-tested components
    2. Make the pipeline's stages completely configurable
    3. Allow the pipeline to be instantiated from a set of profiles that encapsulate configuration complexity

    The end result is a more flexible, better-tested pipeline that aligns more closely with the one in spokestack-android.

    opened by noelweichbrodt 4
  • Apple recognizer concurrency fixes


    Suggested by https://github.com/spokestack/spokestack-ios/compare/14.0.3...xaphod:xaphod/14.0.3 as fixes for #109 & #110.

    Memory usage is now constant, and repeated simulator runs suggest the concurrency issues with respect to AudioEngine are no longer impacting usage.

    Also, some minor code changes and comments to help future debugging in TextToSpeech

    opened by noelweichbrodt 3
  • Bump addressable from 2.7.0 to 2.8.0


    Bumps addressable from 2.7.0 to 2.8.0.

    Changelog

    Sourced from addressable's changelog.

    Addressable 2.8.0

    • fixes ReDoS vulnerability in Addressable::Template#match
    • no longer replaces + with spaces in queries for non-http(s) schemes
    • fixed encoding ipv6 literals
    • the :compacted flag for normalized_query now dedupes parameters
    • fix broken escape_component alias
    • dropping support for Ruby 2.0 and 2.1
    • adding Ruby 3.0 compatibility for development tasks
    • drop support for rack-mount and remove Addressable::Template#generate
    • performance improvements
    • switch CI/CD to GitHub Actions
    Commits
    • 6469a23 Updating gemspec again
    • 2433638 Merge branch 'main' of github.com:sporkmonger/addressable into main
    • e9c76b8 Merge pull request #378 from ashmaroli/flat-map
    • 56c5cf7 Update the gemspec
    • c1fed1c Require a non-vulnerable rake
    • 0d8a312 Adding note about ReDoS vulnerability
    • 89c7613 Merge branch 'template-regexp' into main
    • cf8884f Note about alias fix
    • bb03f71 Merge pull request #371 from charleystran/add_missing_encode_component_doc_entry
    • 6d1d809 Adding note about :compacted normalization
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot[bot] 2
  • Pod Install not working


    Error:

    [!] Error installing Spokestack-iOS
    [!] /usr/bin/git clone https://github.com/spokestack/spokestack-ios.git /var/folders/71/9tdn80h96cv0fgznfwlmsc0w0000gn/T/d20200815-21149-5hz3pc --template= --single-branch --depth 1 --branch 12.0.1
    
    Cloning into '/var/folders/71/9tdn80h96cv0fgznfwlmsc0w0000gn/T/d20200815-21149-5hz3pc'...
    Note: checking out '48c688301f8ee6966a011fa5cd267acc18a985e9'.
    
    You are in 'detached HEAD' state. You can look around, make experimental
    changes and commit them, and you can discard any commits you make in this
    state without impacting any branches by performing another checkout.
    
    If you want to create a new branch to retain commits you create, you may
    do so (now or later) by using -b with the checkout command again. Example:
    
      git checkout -b <new-branch-name>
    
    git-lfs filter-process: git-lfs: command not found
    fatal: the remote end hung up unexpectedly
    warning: Clone succeeded, but checkout failed.
    You can inspect what was checked out with 'git status'
    and retry the checkout with 'git checkout -f HEAD'
    
    opened by MHX792 2
  • NLU returns an empty slot instead of crashing


    Does what it says on the tin. For example, "take a selfie" in ios-studio results in NLUError.metadata("Could not find a slot called item in NLU model metadata.").

    opened by noelweichbrodt 2
  • Tracing event threaded through pipeline


    Creates a new didTrace event with variable-level logging that propagates debugging events back to the client. This approach also resulted in additional advantageous simplifications in class/protocol structure.

    opened by noelweichbrodt 2
  • CoreML wakeword detection


    Based on @kwylez's port of spokestack-android's tensorflow-lite model-based wakeword detector. There is a default model and configuration included, and an example usage for testing.

    opened by noelweichbrodt 2
  • Allow locale to be set by the developer


    As of today, the locale of the Apple Speech Recognizer is set by NSLocale.current, as defined here: https://github.com/spokestack/spokestack-ios/blob/68de84b03a0dd568e149f665e79457983d599957/Spokestack/AppleSpeechRecognizer.swift

    This is restrictive without good reason. Allowing users to switch languages within the app, regardless of the locale of the phone itself, is a fairly common use case. Are there plans on the roadmap to fix this?

    Thanks!

    opened by alikareemraja 2
Releases

Latest release: 14.2.1

Owner

Spokestack: a voice development platform that enables customized voice navigation for mobile and browser applications.