DocumentClassifier
Overview
DocumentClassifier is a Swift framework that classifies a document into one of five categories: Business, Entertainment, Politics, Sports, or Technology. It uses a Core ML model trained on 1,500 BBC news articles.
Features
- iOS 11.0+, macOS 10.13+, tvOS 11.0+, watchOS 4.0+
- 100% Test Coverage
- Best CV Score: 0.965333333333
Usage
The example below classifies the sample article "Swift 4 Released":
let text = articleText // full text of the sample article
guard let classification = classifier.classify(text) else { return }
print(classification.prediction) // Technology: 0.42115752953489294
print(classification.allResults) // Business: 0.141, Entertainment: 0.138, Politics: 0.113, Sports: 0.187, Technology: 0.421
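In the sample output above, the prediction is the highest-scoring entry of allResults. The following is a minimal sketch of that relationship, not the framework's implementation: the dictionary stands in for allResults (its real type is not shown here, so `[String: Double]` is an assumption), and `topCategory(in:)` is a hypothetical helper.

```swift
// Hypothetical stand-in for the framework's result type. The scores are the
// rounded values from the sample output above.
let allResults: [String: Double] = [
    "Business": 0.141,
    "Entertainment": 0.138,
    "Politics": 0.113,
    "Sports": 0.187,
    "Technology": 0.421
]

/// Returns the category with the highest score, or nil for an empty dictionary.
/// This helper is illustrative and not part of DocumentClassifier.
func topCategory(in results: [String: Double]) -> (category: String, score: Double)? {
    guard let best = results.max(by: { $0.value < $1.value }) else { return nil }
    return (category: best.key, score: best.value)
}

if let top = topCategory(in: allResults) {
    print("\(top.category): \(top.score)") // Technology: 0.421
}
```

Returning an optional mirrors the `guard let` in the usage example: callers have to handle the case where no result is available.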
Installation
CocoaPods
CocoaPods is a centralized dependency manager for Cocoa projects. To install DocumentClassifier with CocoaPods:
- Make sure the latest version of CocoaPods is installed.
- Add DocumentClassifier to your Podfile:
    use_frameworks!
    pod 'DocumentClassifier', '1.2.0'
- Run pod install.
Example App
NewsClassifier is an example app using the framework.
Model
- Model Link
- Best CV Score: 0.965333333333
- Trained on 1,500 BBC news articles published in 2004-2005 (see References)
- Converted from scikit-learn Pipeline using coremltools.
- Based on the LinearSVC classifier.
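To illustrate what a linear classifier such as LinearSVC does at prediction time, here is a minimal sketch: each category gets a score equal to the dot product of its weight vector with the document's features plus a bias, and the highest score wins. Everything in it is an assumption for illustration. The weights, biases, and two-category setup are made up, and plain word counts stand in for whatever feature extraction the actual scikit-learn Pipeline uses, which is not described here.

```swift
// Hypothetical per-category word weights and biases. These are made-up
// illustration values, not parameters from the shipped model.
let weights: [String: [String: Double]] = [
    "Sports": ["match": 1.2, "team": 0.9, "software": -0.4],
    "Technology": ["software": 1.5, "release": 0.7, "team": 0.1]
]
let biases: [String: Double] = ["Sports": -0.1, "Technology": -0.2]

/// Counts lowercase word occurrences (a simple bag-of-words feature vector).
func wordCounts(_ text: String) -> [String: Double] {
    var counts: [String: Double] = [:]
    let words = text.lowercased().split(whereSeparator: { !$0.isLetter })
    for word in words {
        counts[String(word), default: 0] += 1
    }
    return counts
}

/// Computes score = w · x + b for each category.
func decisionScores(for text: String) -> [String: Double] {
    let features = wordCounts(text)
    var scores: [String: Double] = [:]
    for (category, categoryWeights) in weights {
        var score = biases[category] ?? 0
        for (word, count) in features {
            // Words without a weight for this category contribute nothing.
            score += (categoryWeights[word] ?? 0) * count
        }
        scores[category] = score
    }
    return scores
}

let scores = decisionScores(for: "New software release announced")
let predicted = scores.max(by: { $0.value < $1.value })?.key
print(predicted ?? "none") // Technology
```

In the framework itself this scoring happens inside the converted Core ML model, so callers never handle weights directly.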
Author
Todd Kramer
References
- BBC Datasets
- D. Greene and P. Cunningham. "Practical Solutions to the Problem of Diagonal Dominance in Kernel Document Clustering", Proc. ICML 2006.
- Vadym Markov, SentimentPolarity
- Awesome Core ML Models
- scikit-learn
- Apple Machine Learning
- CoreML Framework
- coremltools