Neon
A Swift library for efficient highlighting, indenting, and querying the structure of language syntax.
It features:
- Minimal text invalidation
- Support for multiple token sources
- A hybrid sync/async system for targeting flicker-free styling on keystrokes
- tree-sitter integration
- Compatibility with lazy text data reading
- Flexibility when integrating with a larger text system
It does not feature:
- A theme system
- A single View subclass
- Low complexity
Neon has a strong focus on efficiency and flexibility. These qualities bring some serious complexity. Right now, Neon is a collection of components that can be assembled together as part of a larger text system. It does not include a single component that ties everything together.
I realize that many people are looking for exactly that. But, it's deceptively difficult, as text systems can be phenomenally complicated. I'd love to make easier-to-use parts, and that's a goal. But, it has to be done in a way that does not sacrifice flexibility.
The library is being extracted from the Chime editor. It's a big system, and pulling it out is something we intend to do over time.
Why is this so complicated?
Working with small, static, and syntactically correct documents is one thing. Achieving both high performance and high-quality behavior for an editor is totally different. Work needs to be done on every keystroke, and minimizing that work requires an enormous amount of infrastructure and careful design. Before starting, it's worth seriously evaluating your performance and quality needs. You may be able to get away with a much simpler system. A lot of this boils down to the size of the document. Remember: most files are small, and small files can make even the most naive implementation feel acceptable.
Some things to consider:
- Latency to open a file
- Latency to visible elements highlight
- Latency to end-of-document highlight
- Latency on keystroke
- Precise invalidation on keystroke
- Highlight quality in the face of invalid syntax
- Ability to apply progressively higher-quality highlighting
- Precise indentation calculation
Not all of these might matter to you. Neon's components are fairly loosely-coupled, so maybe just one or two parts might be usable without the whole thing.
Language Support
Neon is built around the idea that there can be multiple sources of information about the semantic meaning of the text, all with varying latencies and quality.
- Language Server Protocol has semantic tokens, which is high quality, but also high latency.
- tree-sitter is very good quality, and can potentially be low-latency
- Regex-based systems can have ok quality and low-latency
- Simpler pattern-matching systems generally have poor quality, but have very low latency
Neon includes built-in support for tree-sitter via SwiftTreeSitter. Tree-sitter also uses separate compiled parsers for each language. Thanks to tree-sitter-xcframework, you can get access to pre-built binaries for the runtime and some parsers. It also includes the needed query definitions for those languages. This system is compatible with parsers that aren't bundled, but it's definitely more work to use them.
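To make the "multiple sources, varying quality" idea concrete, here's a minimal sketch of a token store that lets a higher-quality source replace what a faster, lower-quality source produced, but never the other way around. Everything in it (SourceQuality, SourcedToken, TokenStore) is a hypothetical name invented for illustration; none of it is Neon's API.

```swift
import Foundation

// Hypothetical ranking of token sources, lowest quality first.
enum SourceQuality: Int, Comparable {
    case patternMatching = 0
    case regex
    case treeSitter
    case semanticTokens

    static func < (lhs: SourceQuality, rhs: SourceQuality) -> Bool {
        lhs.rawValue < rhs.rawValue
    }
}

// Hypothetical token that remembers which kind of source produced it.
struct SourcedToken {
    let name: String
    let range: Range<Int>   // UTF-16 offsets into the text
    let quality: SourceQuality
}

struct TokenStore {
    private(set) var tokens: [SourcedToken] = []

    // Ignore the token if something of higher quality already covers part of
    // its range. Otherwise, drop whatever it overlaps and store it.
    mutating func apply(_ token: SourcedToken) {
        let overlapping = tokens.filter { $0.range.overlaps(token.range) }
        if overlapping.contains(where: { $0.quality > token.quality }) {
            return
        }
        tokens.removeAll { $0.range.overlaps(token.range) }
        tokens.append(token)
    }
}
```

With a store like this, a quick regex pass can style text right away, and a later tree-sitter or LSP result quietly takes over once it arrives.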
Integration
Neon's components need to react to various events:
- the text is about to change
- the text has changed
- a text change has been processed and is now ready to be styled
- the visible text has changed
- the styling has become invalid (ex: the theme has changed)
How and where they come from depends on your text setup. And, not every component needs to know about all of these, so you may be able to get away with less.
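One way to picture those events is as a protocol with no-op defaults, so each component only implements the hooks it cares about. This is just a sketch: TextEventReceiver, its method names, and VisibleRangeTracker are all made up for illustration and are not part of Neon.

```swift
import Foundation

// Hypothetical protocol mirroring the five events listed above.
protocol TextEventReceiver {
    mutating func textWillChange(in range: Range<Int>)
    mutating func textDidChange(in range: Range<Int>, delta: Int)
    mutating func textChangeProcessed()
    mutating func visibleTextChanged(to range: Range<Int>)
    mutating func stylingInvalidated()
}

// Default no-op implementations let a component opt in to only what it needs.
extension TextEventReceiver {
    mutating func textWillChange(in range: Range<Int>) {}
    mutating func textDidChange(in range: Range<Int>, delta: Int) {}
    mutating func textChangeProcessed() {}
    mutating func visibleTextChanged(to range: Range<Int>) {}
    mutating func stylingInvalidated() {}
}

// A component that only cares about what is currently on screen.
struct VisibleRangeTracker: TextEventReceiver {
    private(set) var visibleRange: Range<Int> = 0..<0

    mutating func visibleTextChanged(to range: Range<Int>) {
        visibleRange = range
    }
}
```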
A very minimal setup could be produced with just an NSTextStorageDelegate. Monitoring the visible rect of the NSTextView will improve performance.
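The reason visibility helps is that you only have to style the part of an invalidated range that is actually on screen. Here's a tiny sketch of that idea. It assumes you have already mapped the visible rect to a range of UTF-16 offsets somewhere else, and rangeNeedingStyle is a hypothetical helper, not something Neon provides.

```swift
import Foundation

// Hypothetical helper: restrict an invalidated range to the visible text.
// Both ranges are UTF-16 offsets. Returns nil when nothing visible is affected.
func rangeNeedingStyle(invalidated: Range<Int>, visible: Range<Int>) -> Range<Int>? {
    let clamped = invalidated.clamped(to: visible)
    return clamped.isEmpty ? nil : clamped
}
```

The rest of the invalidated range still needs styling eventually, but that can wait until it scrolls into view.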
Achieving guaranteed flicker-free highlighting is more challenging. You need to know when a text change has been processed by enough of the system that styling is possible. This point in the text change lifecycle is not natively supported by NSTextStorage or NSLayoutManager. It requires an NSTextStorage subclass. But, even that isn't quite enough unfortunately, as you still need to precisely control the timing of invalidation and styling. This is where HighlighterInvalidationBuffer comes in. I warned you this was complicated.
Relationship to TextStory
TextStory is a library that contains three very useful components when working with Neon.
- TSYTextStorage gets you all the text change life cycle hooks without falling into the NSString/String bridging performance traps
- TextMutationEventRouter makes it easier to route events to the components
- LazyTextStoringMonitor allows for lazy content reading, which is essential to quickly open large documents
You can definitely use Neon without TextStory. But, I think it may be reasonable to just make Neon depend on TextStory to help simplify usage.
Components
Highlighter
This is the main component that coordinates the styling and invalidation of text.
- Connects to a text view via TextSystemInterface
- Monitors text changes and view visible state
- Gets token-level information from a TokenProvider
Note that Highlighter is built to handle a TokenProvider calling its completion block more than one time, potentially replacing or merging with existing styling.
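Here's a rough sketch of what "calling its completion block more than one time" can look like from both sides. The types below (SketchToken, SketchTokenProvider, SketchStyler) are hypothetical stand-ins. Neon's real Token and TokenProvider definitions may well differ, and this consumer simply replaces earlier results rather than merging them.

```swift
import Foundation

// Hypothetical stand-ins for a token and a provider callback.
struct SketchToken: Equatable {
    let name: String
    let range: Range<Int>
}

typealias SketchTokenProvider = (
    _ range: Range<Int>,
    _ completion: @escaping ([SketchToken]) -> Void
) -> Void

// A provider that reports twice for one request: a quick, coarse pass first,
// then a refined pass that is meant to replace it.
let twoPassProvider: SketchTokenProvider = { range, completion in
    completion([SketchToken(name: "text", range: range)])
    completion([SketchToken(name: "keyword", range: range)])
}

// A consumer that tolerates repeated calls by replacing whatever it
// previously applied inside the requested range.
final class SketchStyler {
    private(set) var applied: [SketchToken] = []
    private(set) var updateCount = 0

    func requestTokens(in range: Range<Int>, from provider: SketchTokenProvider) {
        provider(range) { [weak self] tokens in
            guard let self = self else { return }
            self.applied.removeAll { $0.range.overlaps(range) }
            self.applied.append(contentsOf: tokens)
            self.updateCount += 1
        }
    }
}
```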
HighlighterInvalidationBuffer
In a traditional NSTextStorage/NSLayoutManager system (TextKit 1), it can be challenging to achieve flicker-free on-keypress highlighting. This class offers a mechanism for buffering invalidations, so you can precisely control how and when actual text style updates occur.
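The buffering idea itself is small, even if wiring it into TextKit is not. The sketch below shows only the general mechanism: hold invalidations while a change is in flight, then deliver them in one batch. SketchInvalidationBuffer and its method names are hypothetical, and this is not how HighlighterInvalidationBuffer is actually implemented.

```swift
import Foundation

// Hypothetical buffer that holds invalidated ranges while a text change is
// still moving through the text system.
final class SketchInvalidationBuffer {
    private var pending: [Range<Int>] = []
    private var isBuffering = false
    private let handler: ([Range<Int>]) -> Void

    init(handler: @escaping ([Range<Int>]) -> Void) {
        self.handler = handler
    }

    // Call when a text change begins.
    func beginBuffering() {
        isBuffering = true
    }

    // Invalidations pass straight through unless a change is in flight.
    func invalidate(_ range: Range<Int>) {
        if isBuffering {
            pending.append(range)
        } else {
            handler([range])
        }
    }

    // Call once the change has been fully processed and styling is safe.
    func endBuffering() {
        isBuffering = false
        guard !pending.isEmpty else { return }
        let ranges = pending
        pending.removeAll()
        handler(ranges)
    }
}
```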
TextContainerSystemInterface (macOS only)
An implementation of the TextSystemInterface protocol for an NSTextContainer-backed NSTextView. This takes care of the interface to NSTextView and NSLayoutManager, but defers Token-style translation (themes) to an external AttributeProvider.
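Since theming is left to you, here's one possible shape for that translation step. It is only a sketch: NamedToken, SketchStyle, and SketchAttributeProvider are hypothetical types, and Neon's actual Token and AttributeProvider may look different. It assumes token names follow the dotted capture-name convention that is common in tree-sitter highlight queries (like keyword.function), and falls back to the parent name when there's no exact match.

```swift
import Foundation

// Hypothetical stand-ins for a token and the style it maps to.
struct NamedToken {
    let name: String
    let range: Range<Int>
}

struct SketchStyle: Equatable {
    let colorName: String
    let isBold: Bool
}

typealias SketchAttributeProvider = (NamedToken) -> SketchStyle?

// Build a provider from a theme dictionary. A name like "keyword.function"
// is tried first, then "keyword", and nil is returned if nothing matches.
func makeAttributeProvider(theme: [String: SketchStyle]) -> SketchAttributeProvider {
    return { token in
        var components = token.name.split(separator: ".")
        while !components.isEmpty {
            if let style = theme[components.joined(separator: ".")] {
                return style
            }
            components.removeLast()
        }
        return nil
    }
}
```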
TreeSitterClient
This class is an asynchronous interface to tree-sitter. It provides a UTF-16 code-point (NSString-compatible) API for edits, invalidations, and queries. It can process edits of String objects, or raw bytes. Invalidations are translated to the current content state, even if a queue of edits is still being processed. It is fully compatible with reading the document content lazily.
- Monitors text changes
- Can be used to build a TokenProvider
- Requires a function that can translate UTF-16 code points to a tree-sitter Point (line + offset)
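That translation function is worth a quick sketch, because it's the piece you have to bring yourself. The version below is deliberately naive: it scans the whole string on every lookup, and it returns a hypothetical LinePoint struct rather than SwiftTreeSitter's Point. It also measures the column in UTF-16 code units. Tree-sitter itself counts columns in bytes, so a real implementation may need to convert.

```swift
import Foundation

// Hypothetical stand-in for a tree-sitter Point.
struct LinePoint: Equatable {
    let row: Int
    let column: Int   // UTF-16 code units from the start of the line
}

// Translate a UTF-16 offset into (row, column) by counting newlines before it.
// Returns nil if the offset is out of bounds. This is O(n) per lookup; an
// editor would more likely keep a cache of line start offsets.
func linePoint(forUTF16Offset offset: Int, in text: String) -> LinePoint? {
    let units = text.utf16
    guard offset >= 0, offset <= units.count else { return nil }

    var row = 0
    var lineStart = 0
    for (index, unit) in units.prefix(offset).enumerated() where unit == 0x0A {
        row += 1
        lineStart = index + 1
    }
    return LinePoint(row: row, column: offset - lineStart)
}
```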
TreeSitterClient provides APIs that can be synchronous, asynchronous, or both depending on the state of the system. This kind of interface can be important when optimizing for flicker-free, low-latency highlighting and live typing interactions like indenting.
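To show what "synchronous, asynchronous, or both" means in practice, here's a stripped-down sketch of the pattern: answer immediately when nothing is pending, otherwise hold the callback until the pending work is done. HybridQuery is a hypothetical class made up for this example. It leaves out the background queue entirely and is not how TreeSitterClient is implemented.

```swift
import Foundation

// Hypothetical sketch of a hybrid sync/async query interface. In a real
// system, edits would be processed on a background queue. Here that is
// reduced to a pair of begin/finish calls.
final class HybridQuery {
    private(set) var hasPendingEdits = false
    private var waiting: [() -> Void] = []

    func beginProcessingEdit() {
        hasPendingEdits = true
    }

    // Marks processing as complete and runs any deferred callbacks in order.
    func finishProcessingEdit() {
        hasPendingEdits = false
        let callbacks = waiting
        waiting.removeAll()
        callbacks.forEach { $0() }
    }

    // Runs `completion` before returning when the state is up to date.
    // Otherwise it is deferred until pending edits have been processed.
    // Returns true if the completion ran synchronously.
    @discardableResult
    func execute(_ completion: @escaping () -> Void) -> Bool {
        guard hasPendingEdits else {
            completion()
            return true
        }
        waiting.append(completion)
        return false
    }
}
```

The synchronous path is what makes flicker-free styling possible: if the result is available right now, the caller can apply it within the same keystroke.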
Using it is quite involved - here's a little example:
import SwiftTreeSitter
import tree_sitter_language_resources
import Neon
// step 1: setup
// construct the tree-sitter grammar for the language you are interested
// in working with manually
let unbundledLanguage = Language(language: my_tree_sitter_grammar())
// .. or grab one from tree-sitter-xcframework
let swift = LanguageResource.swift
let language = Language(language: swift.parser)
// construct your highlighting query
// this is a one-time cost, but can be expensive
let url = swift.highlightQueryURL!
let query = try! language.query(contentsOf: url)
// step 2: configure the client
// produce a function that can map UTF16 code points to Point (Line, Offset) structs
let locationToPoint: (Int) -> Point? = { location in ... }
let client = TreeSitterClient(language: language, locationToPoint: locationToPoint)
// this function will be called with a minimal set of text ranges
// that have become invalidated due to edits. These ranges
// always correspond to the *current* state of the text content,
// even if TreeSitterClient is currently processing edits in the
// background.
client.invalidationHandler = { set in ... }
// step 3: inform it about content changes
// these APIs match up fairly closely with NSTextStorageDelegate,
// and are compatible with lazy evaluation of the text content
// call this *before* the content has been changed
client.willChangeContent(in: range)
// and call this *after*
client.didChangeContent(to: string, in: range, delta: delta, limit: limit)
// step 4: run queries
// you can execute these queries directly in the invalidationHandler, if desired
// Many tree-sitter highlight queries contain predicates. These are both expensive
// and complex to resolve. This is an optional feature - you can just skip it. Doing
// so makes the process both faster and simpler, but could result in lower-quality
// and even incorrect highlighting.
let provider: TreeSitterClient.TextProvider = { (range, _) -> String? in ... }
client.executeHighlightsQuery(query, in: range, textProvider: provider) { result in
    // Token values will tell you the highlights.scm name and range in your text
}
Suggestions or Feedback
We'd love to hear from you! Get in touch via twitter, an issue, or a pull request.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.