BudouX: the machine learning powered line break organizer tool

BudouX.swift

A Swift implementation of BudouX.

BudouX is a machine learning powered line break organizer tool.

How it works

The original BudouX uses HTML markup to ensure that lines break only between clauses. BudouX.swift instead inserts a U+2060 (word joiner) between the characters within a clause and a U+200B (zero width space) between clauses, so that Cocoa's UI components break lines properly.
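As a rough illustration of the joining step described above, the sketch below takes phrases that have already been segmented and glues them together with the two invisible characters. The helper `joinForLineBreaking` is hypothetical and is not part of BudouX.swift; the library's own implementation may differ.

```swift
// Sketch only: assumes the sentence has already been split into phrases.
let wordJoinerString = "\u{2060}"      // U+2060 WORD JOINER: forbids a break
let zeroWidthSpaceString = "\u{200B}"  // U+200B ZERO WIDTH SPACE: allows a break

/// Hypothetical helper (not a BudouX.swift API).
/// Puts a word joiner between the characters of each phrase and a
/// zero width space between consecutive phrases.
func joinForLineBreaking(_ phrases: [String]) -> String {
    let gluedPhrases = phrases.map { phrase in
        phrase.map { String($0) }.joined(separator: wordJoinerString)
    }
    return gluedPhrases.joined(separator: zeroWidthSpaceString)
}

let examplePhrases = ["あなたに", "寄り添う", "最先端の", "テクノロジー。"]
let joinedExample = joinForLineBreaking(examplePhrases)
print(joinedExample)
```

The printed string looks identical to the original sentence, because both inserted characters are invisible; only the permitted break positions change.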

A sample project, "Example.swiftpm", is included in this repository.

The CLI tool budoux-swift is included in this repository as well.

Usage

You can get a list of phrases by feeding a sentence to the parser.

import BudouX
// Load Default Japanese Parser
let parser = Parser()
// Parse
print(parser.parse("あなたに寄り添う最先端のテクノロジー。"))
// ["あなたに", "寄り添う", "最先端の", "テクノロジー。"]

You can also translate a Swift String so that it contains word joiners and zero width spaces for semantic line breaks.

import BudouX
// Load Default Japanese Parser
let parser = Parser()
let sample = "あなたに寄り添う最先端のテクノロジー。"
print(parser.translate(sentence: sample))
// あ⁠な⁠た⁠に​寄⁠り⁠添⁠う​最⁠先⁠端⁠の​テ⁠ク⁠ノ⁠ロ⁠ジ⁠ー⁠。

There is a convenience String extension method as well.

import BudouX

let sample = "あなたに寄り添う最先端のテクノロジー。"
print(sample.budouxed())
// あ⁠な⁠た⁠に​寄⁠り⁠添⁠う​最⁠先⁠端⁠の​テ⁠ク⁠ノ⁠ロ⁠ジ⁠ー⁠。
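Because the translated string contains extra invisible characters, it no longer compares equal to the original sentence. If the original text is needed again (for example for searching or comparison), the inserted characters can be stripped. The following is a minimal sketch; `strippingBreakHints` and `phrases(in:)` are hypothetical helpers written for this example and are not provided by BudouX.swift.

```swift
// The two scalars inserted by the translation step.
let breakHintScalars: Set<Unicode.Scalar> = ["\u{2060}", "\u{200B}"]

/// Hypothetical helper: removes every U+2060 and U+200B from the text.
func strippingBreakHints(_ text: String) -> String {
    var scalars = String.UnicodeScalarView()
    for scalar in text.unicodeScalars where !breakHintScalars.contains(scalar) {
        scalars.append(scalar)
    }
    return String(scalars)
}

/// Hypothetical helper: recovers the phrases by splitting on U+200B
/// and then dropping the word joiners inside each phrase.
func phrases(in translated: String) -> [String] {
    translated
        .split(separator: "\u{200B}")
        .map { strippingBreakHints(String($0)) }
}

let translatedSample = "a\u{2060}b\u{200B}c"
print(strippingBreakHints(translatedSample))
print(phrases(in: translatedSample))
```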

Install

Only Swift Package Manager is supported. There are no plans to support other package management tools at this time.

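// Add BudouX.swift to the package's dependencies
package.dependencies.append(
    .package(url: "https://github.com/griffin-stewie/BudouX.swift", from: "0.1.0")
)

// Link the BudouX library product into your own target ("Foo" is a placeholder)
package.targets.append(
    .target(name: "Foo", dependencies: [
        .product(name: "BudouX", package: "BudouX.swift")
    ])
)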
Comments
  • Fix `BudouXText` bug when using a language not supported by BudouX.

    Thank you for approving #11! I'm sorry, but there is a bug in the BudouXText that I added in that pull request, one that I didn't anticipate at the time.

    When BudouXText is used with a language that BudouX does not support (currently any language other than Japanese), the text is now left untranslated by BudouX.Parser.

    (In the example screenshot below, the line break position of "brown" is incorrect.)

    | non-ja Before | non-ja After | ja (No change) |
    |:-:|:-:|:-:|
    | Screen Shot 2022-01-17 at 1 16 31 | Screen Shot 2022-01-17 at 1 16 51 | Screen Shot 2022-01-17 at 1 20 04 |

    opened by treastrain 2
  • Add "supported natural languages" to the `Model`, and update `BudouXText` as well

    Thank you for reviewing #12 !

    Since the original BudouX doesn't have language detection, I think it is difficult to complete multi-language support in BudouXText. https://github.com/griffin-stewie/BudouX.swift/pull/12#issuecomment-1014545382

    This is a very critical point. However, since this package is the Swift version of BudouX, one can easily imagine it being used on platforms such as iOS, macOS, tvOS, and watchOS, where localization is very significant, and I believe that multi-language support is important.

    Based on the above, I would like to propose this pull request again.

    1. Change Model from struct to protocol.
    2. Model has supportedNaturalLanguages: Set<String> that the original BudouX does not have.
      • Add an option supportedNaturalLanguages for command line execution (this was added to conform to the protocol Model, and does not affect anything at this time).
    3. Change the existing jaKNBCModel: [String: Int] to struct JaKNBCModel: Model.
    4. Add an argument in BudouXText to tell BudouX's parser whether to do the translation.
      • I originally thought that "the decision whether to translate with BudouX should be made by the user of this package", but this change is highly necessary, considering that it makes the package "easy to use by simply replacing SwiftUI.Text with BudouXText".
    5. Add a property defaultBudouXTextCondition that makes the decision that BudouX's parser will do the translation only if it is in a natural language supported by the Model, based on the localization settings of the current environment in which the application is running.
      • It is now possible to get the localized settings of the execution environment in this package.
      • To use this with JaKNBCModel, which is currently included in this package, it helps to be able to determine whether the current execution environment is Japanese. For this reason, keys for en and ja have been added.
        • For example, it is expected that users will use a custom model that supports zh. In this case, the defaultBudouXTextCondition cannot be used without submitting a new pull request.
          • However, with the new argument of BudouXText, this problem can be avoided without submitting a pull request.
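    The protocol-based design in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull request: only the names `Model`, `supportedNaturalLanguages` and `JaKNBCModel` come from the description, while the `featureAndScore` property and the `shouldTranslate` function are hypothetical stand-ins.

```swift
// Assumed shape of the protocol from steps 1 and 2; the real one may differ.
protocol Model {
    /// Language codes the model can segment, e.g. "ja".
    var supportedNaturalLanguages: Set<String> { get }
    /// Hypothetical name for the feature-to-score table.
    var featureAndScore: [String: Int] { get }
}

// Step 3: the Japanese model becomes a type conforming to Model.
struct JaKNBCModel: Model {
    let supportedNaturalLanguages: Set<String> = ["ja"]
    let featureAndScore: [String: Int] = [:] // scores elided in this sketch
}

/// Hypothetical stand-in for the condition described in step 5:
/// translate only when the environment's language is supported by the model.
func shouldTranslate<M: Model>(languageCode: String?, model: M) -> Bool {
    guard let code = languageCode else { return false }
    return model.supportedNaturalLanguages.contains(code)
}

print(shouldTranslate(languageCode: "ja", model: JaKNBCModel()))
print(shouldTranslate(languageCode: "en", model: JaKNBCModel()))
```

    A custom model that supports another language, such as zh, would declare that code in its own `supportedNaturalLanguages` set.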

    I hope that this pull request will bring a better evolution to this package!

    opened by treastrain 1
  • Add methods for `SwiftUI.Text`

    Thank you for publishing a great library! I wanted to add methods to it for SwiftUI's Text, so I created this pull request.

    I have added two public methods.

    • BudouXText(verbatim:parser:threshold:)
    • BudouXText(_:tableName:bundle:comment:parser:threshold:)

    What I tried to do in the beginning

    At first, I tried to add a method of the same name to SwiftUI's Text, like the already existing String extension method budouxed(_:). But I couldn't find a way to get the String content that SwiftUI's Text has, so I gave up on that.

    Why don't the added methods use lower camel case?

    These methods simply return a SwiftUI Text whose content has been translated from a String by BudouX's parser. For that reason, I ruled out creating a new SwiftUI View called BudouXText: all we want is a SwiftUI Text.

    I was inspired by NSLocalizedString(_:tableName:bundle:value:comment:). It is not lower-camel-case, but it is a method that returns a String. I thought this would be the most suitable for this usage.

    opened by treastrain 1
  • Rename "thres" to "threshold"

    Avoid abbreviations. Abbreviations, especially non-standard ones, are effectively terms-of-art, because understanding depends on correctly translating them into their non-abbreviated forms.

    The intended meaning for any abbreviation you use should be easily found by a web search. (Swift.org, API Design Guidelines)

    opened by griffin-stewie 0
  • Add tool for generating code from BudouX repository.

    BudouX.swift needs JaKNBCModel and UnicodeBlocks to run. The original BudouX, written in TypeScript, contains tools that import a JSON file and generate code from the Python version's BudouX directory. We take the same approach.

    opened by griffin-stewie 0
Releases(v0.6.0)
Owner
griffin-stewie
iOS apps developer.