Python VM written in Swift

Overview

Violet

Violet is one of those Swift <-> Python interop thingies, except that this time we implement the whole language from scratch. Name comes from Violet Evergarden.

Many unwatched k-drama hours were put into this, so any would be appreciated.

If something is not working, you have an interesting idea or maybe just a question, then you can start an issue or discussion. You can also contact us on twitter @itBrokeAgain (optimism, yay!).

Requirements

  • 64 bit - for BigInt and (probably, maybe, I think) hash
  • Platform:
    • macOS - tested on 10.15.6 (Catalina) + Xcode 12.0 (Swift 5.3)
    • Ubuntu - tested on 21.04 + Swift 5.4.2
    • Docker - tested on swift:latest (5.4.2) on Ubuntu 21.04

Features

We aim for compatibility with Python 3.7 feature set.

We are only interested in the language itself without additional modules. This means that importing anything except for most basic modules (sys, builtins and a few others) is not supported (although you can import other Python files).

See Documentation directory for a list of known unimplemented features. There is no list of unknown unimplemented features though…

Future plans

Our current goal was to ramp up the Python functionality coverage, which mostly meant passing as many Python tests (PyTests) as possible. This gives us us a safety net for any future regressions.

Next we will try to improve code-base by solving any shortcuts we took:

  • New object model (representation of a single Python object in a memory) - currently we are using Swift objects to represent Python instances, for example Swift PyInt object represents a Python int instance. There are better ways to do this, but this is a bit longer conversation in Swift. For details see this issue.

  • New method representation - currently we just wrap a Swift method in a PyBuiltinFunction and put it inside type __dict__. For example: int.add (implemented in Swift as PyInt.add(:_) with following signature: (PyInt) -> (PyObject) -> PyResult<PyObject>) is put inside int.__dict__. This can be simplified a bit, but it depends on the object model, so it has to wait.

  • Garbage collection and memory management - as we said: currently use Swift class instances to represent Python objects, which means that we are forced to use Swift ARC to manage object lifetime. Unfortunately this does not solve reference cycles (which we have, for example: object type has type type and type type is a subclass of object, not to mention that type type has type as its type), but for now we will ignore this… (how convenient!).

  • V8-style isolates - currently the Python context is represented as a global static Py (something like: Py.newInt(2) or Py.add(lhs, rhs)). This prevents us from having multiple VM instances running on the same thread (without using thread local storage), which in turn makes unit testing difficult.

Sources

Core modules

  • VioletCore — shared module imported by all of the other modules.
    • Contains things like NonEmptyArray, SourceLocation, SipHash, trap and unreachable.
  • BigInt — our implementation of unlimited integers
    • While it implements all of the operations expected of BigInt type, in reality it mostly focuses on performance of small integers — Python has only one int type and small numbers are most common.
    • Under the hood it is a union (via tagged pointer) of Int32 (called Smi, after V8) and a heap allocation (magnitude + sign representation) with ARC for garbage collection.
    • While the whole Violet tries to be as easy-to-read/accessible as possible, this does not apply to BigInt module. Numbers are hard, and for some reason humanity decided that “division” is a thing.
  • FileSystem — our version of Foundation.FileManager.
    • Main reason why we do not support other platforms (Windows etc.).
  • UnicodeData — apparently we also bundle our own Unicode database, because why not…

Violet

  • VioletLexer — transforms Python source code into a stream of tokens.
  • VioletParser — transforms a stream of tokens (from Lexer) into an abstract syntax tree (AST).
    • Yet Another Recursive Descent Parser with minor hacks for ambiguous grammar.
    • AST type definitions are generated by Elsa module from Elsa definitions/ast.letitgo.
  • VioletBytecode — instruction set of our VM.
    • 2-bytes per instruction.
    • No relative jumps, only absolute (via additional labels array).
    • Instruction enum is generated by Elsa module from Elsa definitions/opcodes.letitgo.
    • Use CodeObjectBuilder to create CodeObjects (whoa… what a surprise!).
    • Includes a tiny peephole optimizer, because sometimes the semantics depends on it (for example for short-circuit evaluation) .
  • VioletCompiler — responsible for transforming AST (from Parser) into CodeObjects (from Bytecode).
  • VioletObjects — contains all of the Python objects and modules.
    • Py represents a Python context. Common usage: Py.newInt(2) or Py.add(lhs, rhs).
    • Contains int, str, list and 100+ other Python types. Python object is represented as a Swift class instance (this will probably change in the future, but for now it is “ok”, since the whole subject is is a bit complicated in Swift). Read the docs in the Documentation directory!
    • Contains modules required to bootstrap Python: builtins, sys, _imp, _os and _warnings.
    • Does not contain importlib and importlib_external modules, because those are written in Python. They are a little bit different than CPython versions (we have 80% of the code, but only 20% of the functionality <great-success-meme.gif>).
    • PyResult<Wrapped> = Wrapped | PyBaseException is used for error handling.
  • VioletVM — manipulates Python objects according to the instructions from Bytecode.CodeObject, so that the output vaguely resembles what CPython does.
    • Mainly a massive switch over each possible Instruction (branch prediction 💔 ).
  • Violet — main executable (duh…).
  • PyTests — runs tests written in Python from the PyTests directory.

Tools/support

  • Elsa — tiny DSL for code generation.
    • Uses .letitgo files from Elsa definitions directory.
    • Used for Parser.AST and Bytecode.Instruction types.
  • Rapunzel — pretty printer based on “A prettier printer” by Philip Wadler.
    • Used to print AST in digestible manner.
  • Ariel — prints module interface - all of the open/public declarations.

Tests

There are 2 types of tests in Violet:

  • Swift tests — standard Swift unit tests stored inside the ./Tests directory. You can run them by typing make test in repository root.

    You may want to disable unit tests for BigInt and UnicodeData if you are not touching those modules:

    • BigInt — we went with property based testing with means that we test millions of inputs to check if the general rule holds (for example: a+b=c -> c-a=b etc.). This takes time, but pays for itself by finding weird overflows in bit operations (we store “sign + magnitude”, so bit operations are a bit difficult to implement).
    • UnicodeData
      • In one of our tests we go through all of the Unicode code points and try to access various properties (crash -> fail). There are 0x11_0000 values to test, so… it is not fast.
      • We also have a few thousands of tests generated by Python. Things like: “is the COMBINING VERTICAL LINE ABOVE (U+030d) alpha-numeric?” (Answer: no, it is not. But you have to watch out because HANGUL CHOSEONG THIEUTH (U+1110) is).
  • Python tests — tests written in Python stored inside the ./PyTests directory. You can run them by typing make pytest in repository root (there is also make pytest-r for release mode).

    • Violet - tests written specially for “Violet”.
    • RustPython - tests taken from github.com/RustPython.

    Those tests are executed when you run PyTests module.

Code style

  • 2-space indents and no tabs at all
  • 80 characters per line
    • You will get a SwiftLint warning if you go over 100.
    • Over 120 will result in a compilation error.
    • If 80 doesn't give you enough room to code, your code is too complicated - consider using subroutines (advice from PEP-7).
  • Required self in methods and computed properties
    • All of the other method arguments are named, so we will require it for this one.
    • Self/type name for static methods is recommended, but not required.
    • I’m sure that they will depreciate the implicit self in the next major Swift version 🤞 . All of that source breakage is completely justified.
  • No whitespace at the end of the line
    • Some editors may remove it as a matter of routine and we don’t want weird git diffs.
  • (pet peeve) Try to introduce a named variable for every if condition.
    • You can use a single logical operator - something like if !isPrincess or if isDisnepCharacter && isPrincess is allowed.
    • Do not use && and || in the same expression, create a variable for one of them.
    • If you need parens then it is already too complicated.

Anyway, just use SwiftLint and SwiftFormat with provided presets (see .swiftlint.yml and .swiftformat files).

License

“Violet” is licensed under the MIT License. See LICENSE file for more information.

Comments
  • New object model

    New object model

    This issue tries to design a new way of representing a single Python object in memory.

    This may seem like a massive amount of work (we already have implemented almost all of the 100+ Python types and now we want to change this), but fortunately almost every aspect of the current model is exposed to code generation scripts (“Generated” directory inside the “Objects” module). This includes Swift type (with parent type), Swift fields, Swift methods and initializers, Python type, Python properties and Python methods.

    We also try to not rely on the current model too much, for example we do not use Swift initializers, but instead we use our own PyMemory type (which will eventually call init). We also have our own way of casting object from one type to another type via PyCast that checks the runtime Python type.

    Current - Swift class instances

    Currently we are using Swift objects to represent Python instances. For example Swift PyInt object represents a Python int instance (Sourcery annotations are explained in Violet documentation):

    // sourcery: pytype = int, isDefault, isBaseType, isLongSubclass
    public class PyInt: PyObject {
    
      // sourcery: pymethod = __add__
      public func add(_ other: PyObject) -> PyResult<PyObject> {
        // …
      }
    }
    

    This is nice because:

    • Is very easy to add new types/methods - which is important since we have more than 120 types and 780 methods to implement.
    • Free type compatibility - which means that we can cast object reference (represented as PyObject instance) from VM stack to a specific Python type, for example int (represented as PyInt instance).

    Not-so-nice things include:

    • Wasted cache/memory on Swift metadata - each Swift object holds a reference to its Swift type. We do not need this since we store a reference to Python type which serves a similar function.

    • Forced Swift memory management - ARC is “not the best” solution when working with circular references (which we have, for example: object type has type type and type type is a subclass of object, not to mention that type type has type as its type).

    • We have to perfectly reproduce Python type hierarchy inside Swift which can cause some problems if the 2 languages have different view on a certain behavior, for example:

      class PyInt {
        func and() { print("int.and") }
      }
      
      // `bool` is a subclass of `int` in Python.
      class PyBool: PyInt {
        override func and() { print("bool.and") }
      }
      
      let f = PyInt.and // This is stored inside `int.__dict__`
      f(intInstance)(rhs) // 'int.and', as expected
      f(boolInstance)(rhs) // 'bool.and'! 'int.and' was expected, since we took 'f' from 'PyInt'
      

    Pointer to struct + type punning

    The other approach would be to use struct and type punning to create object hierarchy by hand (this is more-or-less what CPython does).

    struct Ref<Pointee> {
      let ptr: UnsafePointer<Pointee>
    }
    
    struct PyObjectHeader {
      var type: Ref<PyType>
      var flags: UInt32
      private let padding = UInt32.max
    }
    
    struct PyObject {
      var header: PyObjectHeader
    }
    
    struct PyInt {
      var header: PyObjectHeader
      var value: Int
    }
    
    func newInt(value: Int) -> Ref<PyInt> {
      // Basically malloc(sizeof(PyInt)) + filling properties
    }
    
    // 'int' is a Python object representing number '2'
    let int = newInt(value: 2)
    
    // Cast it to 'PyObject' — this is really important since
    // we will need to do this to store object on VM stack.
    // (spoiler: this is not legal!)
    let object = Ref<PyObject>(ptr: int.ptr)
    

    As we can see both PyObject and PyInt start with PyObjectHeader, so one could assume that they can easily convert from Ref<PyInt> to Ref<PyObject>.

    Well… Swift does not guarantee that the memory layout will be the same as declaration order (even with SE-0210: Add an offset(of:) method to MemoryLayout). Yes, this is what happens right now, but it may change in the future. There are some exceptions, for example:

    • struct with single member will have the same layout as this member
    • enum with single case with payload will have the payload layout
    • homogenous tuples will look “as expected”
    • @frozen thingies with library-evolution enabled

    But, none of them applies in our case.

    This approach also blocks us from using flexible array member, unless we do some additional things (allocate more space, and then offset the struct pointer to get aligned address of array members).

    Related resources:

    Manual alignment

    We can just allocate a block of memory and manually assign where do each field starts and ends.

    struct PyInt {
    
      internal let ptr: UnsafePointer
    
      // `PyObjectHeader` holds `type/flags` (and maybe `__dict__`).
      internal var header: PyObjectHeader {
        return PyObjectHeader(ptr: self.ptr)
      }
    
      // Without using [flexible array member](https://en.wikipedia.org/wiki/Flexible_array_member).
      // This may also invoke needless COW when we modify the value, but there are
      // ways to prevent this.
      internal var value: Int {
        // We also need to align this 'ptr', but we will skip this in the example, 
        // since the code for this would depend on the property type.
        // Or we could craft 'PyObjectHeader', so that its 'size' is always word aligned.
        let ptr = self.ptr + PyObjectHeader.size
        return ptr.pointee
      }
    }
    
    extension PyMemory {
    
      internal func newInt(type: PyType, value: Int) {
        // Skipping alignment (again…)
        let memorySize = PyObjectHeader.size + Int.size
        let ptr = self.allocate(size: memorySize)
        let int = PyInt(ptr: ptr)
    
        int.header.type = type
        int.header.flags = // Something…
        int.value = value
    
        // register for garbage collection/arc
        self.startTracking(object: int.header)
      }
    }
    

    This is really promising since the whole concept is quite simple. The execution (as in: writing the code) is a little bit more complicated since it lacks the compiler support for proper alignment.

    Using C

    If we declare our objects in C we will get the C-layout. Then we can import them into Swift. This is nice because C layout is predictable and well described (see: cppreference.com: type).

    Unfortunately this has its own problems, most notably that some of the properties on our Swift types are not trivially representable in C.

    As far as I know this is the “official” (recommended by Swift team) way of dealing with this situation.

    Memory management

    When discussing object representation we have to bear in mind the memory management - a way of releasing unused instances.

    In CPython they use manual reference counting: each object has a reference count and an address of the previous and the next allocated object (doubly linked list). Py_INCREF is used to retain (reference count += 1) and Py_DECREF to release (reference count -= 1) an object. If the reference count reaches 0 then it is deallocated and the linked list entries are updated (having a reference to both previous and next object makes this trivial). Linked list itself is used by garbage collection algorithm to break cycles (at some point they need to iterate over all of the currently alive objects). This is how the object header looks like:

    #define _PyObject_HEAD_EXTRA            \
        struct _object *_ob_next;           \
        struct _object *_ob_prev;
    
    typedef struct _object {
        _PyObject_HEAD_EXTRA
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
    } PyObject;
    

    In Swift we have automatic reference counting for class instances (and a few others). The big catch here is that the user (programmer) is expected to take care of the strong reference cycles (to simplify things a bit: imagine 2 objects that hold a reference to each other, both have reference count 1, so neither of them will be deallocated).

    Anyway, now it's our turn to discuss this:

    • using Swift native ARC - I'm not really sure if you can just allocate a bunch of memory with native Swift ARC, but as a last resort we can use ManagedBufferPointer without elements.

      This is nice because (in theory) Swift takes case of everything. In practice however… there are some problems. First of all, you still need a some form of garbage collection to break the cycles. This will be simpler than full-on garbage collection and the requirements will be not as demanding since it will run less often, but still, it is something to keep in mind.

      The other things is that most of the garbage collection algoritms work by somehow marking all of the reachable objects and then in a 2nd phase removing all of the not-marked ones. Unfortunately to do this you need a reference to all of the reachable objects (most commonly by some sort of a linked list). The question is: what kind of reference would it be?

      • strong - always present +1 reference count, which invalidates the whole point of reference counting
      • unowned - this will allow objects to be deallocated, but we would somehow need to know which references are alive and which not
      • weak - this is an obvious choice

      Unfortunately there is a possible performance problem since having just a single the weak reference in Swift, moves the whole reference count to side-table. Unfortunately this side-table is off-line (it is an separate allocation outside of the object). This is a potential memory/cache waste, not to mention that the retain now has to fetch the object and then fetch the side-table (it is called slowRC for a reason).

      Code to test the presence of the side table
        import Foundation
      
        // Sources:
        // SWIFT_REPO/stdlib/public/SwiftShims/RefCount.h
        // SWIFT_REPO/stdlib/public/SwiftShims/HeapObject.h
      
        let strongExtraRefCountShift: UInt64 = 33
        let strongExtraRefCountBitCount: UInt64 = 30
        let strongExtraRefCountMask: UInt64 = ((1 << strongExtraRefCountBitCount) - 1) << strongExtraRefCountShift
      
        let useSlowRCShift: UInt64 = 63
        let useSlowRCBitCount: UInt64 = 1
        let useSlowRCMask: UInt64 = ((1 << useSlowRCBitCount) - 1) << useSlowRCShift
      
        func printARC(name: String, ptr: UnsafeMutableRawPointer) {
          let namePad = name.padding(toLength: 18, withPad: " ", startingAt: 0)
          // let typePtr = ptr.load(as: UInt64.self)
          let refCountBits = ptr.load(fromByteOffset: 8, as: UInt64.self)
      
          let hasSideTable = (refCountBits & useSlowRCMask) >> useSlowRCShift
          if hasSideTable == 1 {
            print("\(namePad)| sideTable")
          } else {
            let strongExtraRefCount = (refCountBits & strongExtraRefCountMask) >> strongExtraRefCountShift
            print("\(namePad)| strongExtraRefCount: \(strongExtraRefCount)")
          }
        }
      
        class Alice {
          var ptr: UnsafeMutableRawPointer {
            return Unmanaged.passUnretained(self).toOpaque()
          }
        }
      
        let alice = Alice()
        printARC(name: "1 strong", ptr: alice.ptr) // strongExtraRefCount: 0
      
        let aliceInWonderland = alice
        printARC(name: "2 strong", ptr: alice.ptr) // strongExtraRefCount: 1
      
        weak var aliceBackInLondon = alice
        printARC(name: "2 strong + 1 weak", ptr: alice.ptr) // sideTable
      
    • manual retain/release - in this approach the programmer is responsible for inserting the retain/release calls. For example: when adding an object to a list we retain it, when removing it (or deleting the while list) we release it. Everything else is the same as in the “using Swift native ARC” point.

      The main drawback is of course the manual labor of adding the retain/release calls. It is also extremely difficult to get right and even a single mistake may have consequences:

      • unbalanced retain - object lives forever (along with all of the objects it references)
      • unbalanced release - “use after free” error -> crash

      This approach is really good if you can make it right. CPython uses it, because they can afford it: they have the time and manpower to find and fix any possible errors. I don't think we could do the same.

    • manually implemented smart pointer - I don't think it is possible in Swift to write our own version of smart pointer. Even in languages which give you more control (like C++) it is an extremely hard thing to do.

    • full garbage collection without ARC - there are tons of materials about this on the internet.

    Btw. we had a similar problem when writing our implementation of BigInt: after the heap-allocated storage is not needed -> how do we release it? And how do we even know that it is no longer needed? We went with ManagedBufferPointer to get Swift ARC, but technically you can implement an BigInt with a garbage collector (although I am not sure that this would be a good idea). I think that the main difference between the BigInt and the Python object representation is that the Python objects can contain references to other objects, possibly creating a cycle. Anyway, under the hood we were dealing with an unowned pointers, so we can either find the owners (this would be on case-by-case basis, so a lot of room for mistakes) or automate things (either via ARC or garbage collector).

    enhancement 
    opened by LiarPrincess 1
  • Compile time member offsets in PyObjects

    Compile time member offsets in PyObjects

    Currently we use GenericLayout to calculate offsets of PyObject members (like this).

    Instead we could do the usual offset/size dance:

    extension PyObject {
      public static let member2Offset = calculateOffset(
        previousMemberOffset: Self.member1Offset,
        previousMemberSize: MemoryLayout<Member1Type>.size,
        alignment: MemoryLayout<Member2Type>.alignment
      )
    }
    
    func calculateOffset(previousMemberOffset: Int, previousMemberSize: Int, alignment: Int) -> Int {
      var offset = previousMemberOffset + previousMemberSize
      return round(offset, to: alignment)
    }
    

    Though, I have not checked what Swift compiler emits for the current version, so maybe it is not needed.

    enhancement 
    opened by LiarPrincess 0
  • Store `BigIntStorage.capacity` in header

    Store `BigIntStorage.capacity` in header

    From developer.apple.com/ManagedBufferPointer/Capacity:

    This value may be nontrivial to compute; it is usually a good idea to store this information in the “header” area when an instance is created.

    Under the hood

    From ManagedBuffer.swift:

    extension ManagedBufferPointer {
      @inlinable
      @available(OpenBSD, unavailable, message: "malloc_size is unavailable.")
      public var capacity: Int {
        return (
          _capacityInBytes &- ManagedBufferPointer._elementOffset
        ) / MemoryLayout<Element>.stride
      }
    }
    

    Where _capacityInBytes is defined as malloc_size:

    static inline __swift_size_t _swift_stdlib_malloc_size(const void *ptr) {
      extern __swift_size_t malloc_size(const void *);
      return malloc_size(ptr);
    }
    

    Solutions

    • Store additional member capacity: Int in Header. Header size: 2 words.
    • Pack capacity inside existing header. Header size: 1 word. Layout [1bit - isNegative][31 bits - count][32 bits - capacity]
    enhancement 
    opened by LiarPrincess 1
  • Tuples with tail allocation

    Tuples with tail allocation

    Currently we store tuple elements inside Swift array (elements: [PyObject]). The better idea would be to allocate more space after the tuple and store elements there (this is called flexible array member in C). This saves a pointer indirection and is better for cache, since we can fit a few first elements in the same line as type, __dict__ etc. We can also do this for other immutable container types:

    • str - currently native Swift String. This would force us to implement our own String type - not hard, but takes a lot of time.
    • int - currently our own BigInt implementation (which does store values in Int32 range inside the pointer).

    Calculating layout

    This is not a problem since GenericLayout.Field already supports repeatCount:

    internal struct GenericLayout {
    
      internal struct Field {
        internal let size: Int
        internal let alignment: Int
        internal let repeatCount: Int
    
        /// `repeatCount` is the number of times `type` is repeated:
        /// ```c
        /// struct DisnepPrincess {
        ///     char name[20];
        /// };
        /// sizeof (struct DisnepPrincess) // 20
        /// ```
        internal init<T>(_ type: T.Type, repeatCount: Int = 1) {
          assert(repeatCount >= 0)
          self.size = MemoryLayout<T>.size
          self.alignment = MemoryLayout<T>.alignment
          self.repeatCount = repeatCount
        }
      }
    
      internal init(initialOffset: Int, initialAlignment: Int, fields: [Field]) { things… }
    }
    

    So, for tuple with count 5 we would have following layout:

    • PyObject things
    • PyTuple things - for example count and cached hash (though we have to remember that while tuple is immutable, the elements inside are not)
    • GenericLayout.filed(PyObject.self, repeatCount: 5) <- this is the new thing

    Homo/hetero

    • Homomorphic - all allocated elements are layout compatible, both inline (for example RawPtr for storing multiple PyObjects) and on the heap (all PyObjects have the same header). This is true for tuple.
    • Heteromorphic - tail allocated elements can be different. Example for PyFrame.FastLocalsCellFreeBlockStackStorage:
      • fastLocals - PyObject?
      • cell/free - PyCell
      • blocks - PyFrame. Block - totally incompatible with PyObject? and PyCell
    enhancement 
    opened by LiarPrincess 0
  • Make unit tests run in `--parallel`

    Make unit tests run in `--parallel`

    With the new object model (see: #1) it should be possible to run unit tests with --parallel enabled (because py is no longer a global variable). Maybe. Probably…

    • UnicodeDataDoesNotCrashTests. test_iterateAll should be split by plane giving us 17 tests
    • make test (in Makefile) should be updated
    enhancement 
    opened by LiarPrincess 0
  • Garbage collection

    Garbage collection

    CPython

    In CPython they use manual reference counting:

    • each object has a reference count and an address of the previous and the next allocated object (doubly linked list)
    • Py_INCREF is used to retain (reference count += 1)
    • Py_DECREF to release (reference count -= 1) an object

    If the reference count reaches 0 then it is deallocated and the linked list entries are updated (having a reference to both previous and next object makes this trivial). Linked list itself is used by garbage collection algorithm to break cycles (at some point they need to iterate over all of the currently alive objects).

    This is how the object header looks like:

    #define _PyObject_HEAD_EXTRA            \
        struct _object *_ob_next;           \
        struct _object *_ob_prev;
    
    typedef struct _object {
        _PyObject_HEAD_EXTRA
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
    } PyObject;
    

    Violet

    In Swift we have automatic reference counting for class instances (and a few others). The big catch here is that the user (programmer) is expected to take care of the strong reference cycles (to simplify things a bit: imagine 2 objects that hold a reference to each other, both have reference count 1, so neither of them will be deallocated).

    Our options are:

    • manual retain/release - in this approach the programmer is responsible for inserting the retain/release calls. For example: when adding an object to a list we retain it, when removing it (or deleting the whole list) we release it.

      The main drawback is of course the manual labor of adding the retain/release calls. It is also extremely difficult to get right and even a single mistake may have consequences:

      • unbalanced retain - object lives forever (along with all of the objects it references)
      • unbalanced release - “use after free” error -> crash

      This approach is really good if you can make it right. CPython uses it, because they can afford it: they have the time and manpower to find and fix any possible errors. I don't think we could do the same.

    • manually implemented smart pointer - I don't think it is possible in Swift to write our own version of smart pointer. Even in languages which give you more control (like C++) it is an extremely hard thing to do.

    • full garbage collection without ARC - there are tons of materials about this on the internet.

    • (THIS ONE IS NOT POSSIBLE ACCORDING TO NEW OBJECT MODEL) using Swift native ARC - I'm not really sure if you can just allocate a bunch of memory with native Swift ARC, but as a last resort we can use ManagedBufferPointer without elements.

      This is nice because (in theory) Swift takes case of everything. In practice however… there are some problems. First of all, you still need a some form of garbage collection to break the cycles. This will be simpler than full-on garbage collection and the requirements will be not as demanding since it will run less often, but still, it is something to keep in mind.

      The other things is that most of the garbage collection algoritms work by somehow marking all of the reachable objects and then in a 2nd phase removing all of the not-marked ones. Unfortunately to do this you need a reference to all of the reachable objects (most commonly by some sort of a linked list). The question is: what kind of reference would it be?

      • strong - always present +1 reference count, which invalidates the whole point of reference counting
      • unowned - this will allow objects to be deallocated, but we would somehow need to know which references are alive and which not
      • weak - this is an obvious choice

      Unfortunately there is a possible performance problem since having just a single the weak reference in Swift, moves the whole reference count to side-table. Unfortunately this side-table is off-line (it is an separate allocation outside of the object). This is a potential memory/cache waste, not to mention that the retain now has to fetch the object and then fetch the side-table (it is called slowRC for a reason).

      Code to test the presence of the side table
        import Foundation
      
        // Sources:
        // SWIFT_REPO/stdlib/public/SwiftShims/RefCount.h
        // SWIFT_REPO/stdlib/public/SwiftShims/HeapObject.h
      
        let strongExtraRefCountShift: UInt64 = 33
        let strongExtraRefCountBitCount: UInt64 = 30
        let strongExtraRefCountMask: UInt64 = ((1 << strongExtraRefCountBitCount) - 1) << strongExtraRefCountShift
      
        let useSlowRCShift: UInt64 = 63
        let useSlowRCBitCount: UInt64 = 1
        let useSlowRCMask: UInt64 = ((1 << useSlowRCBitCount) - 1) << useSlowRCShift
      
        func printARC(name: String, ptr: UnsafeMutableRawPointer) {
          let namePad = name.padding(toLength: 18, withPad: " ", startingAt: 0)
          // let typePtr = ptr.load(as: UInt64.self)
          let refCountBits = ptr.load(fromByteOffset: 8, as: UInt64.self)
      
          let hasSideTable = (refCountBits & useSlowRCMask) >> useSlowRCShift
          if hasSideTable == 1 {
            print("\(namePad)| sideTable")
          } else {
            let strongExtraRefCount = (refCountBits & strongExtraRefCountMask) >> strongExtraRefCountShift
            print("\(namePad)| strongExtraRefCount: \(strongExtraRefCount)")
          }
        }
      
        class Alice {
          var ptr: UnsafeMutableRawPointer {
            return Unmanaged.passUnretained(self).toOpaque()
          }
        }
      
        let alice = Alice()
        printARC(name: "1 strong", ptr: alice.ptr) // strongExtraRefCount: 0
      
        let aliceInWonderland = alice
        printARC(name: "2 strong", ptr: alice.ptr) // strongExtraRefCount: 1
      
        weak var aliceBackInLondon = alice
        printARC(name: "2 strong + 1 weak", ptr: alice.ptr) // sideTable
      

    Btw. we had a similar problem when writing our implementation of BigInt: after the heap-allocated storage is not needed -> how do we release it? And how do we even know that it is no longer needed? We went with ManagedBufferPointer to get Swift ARC, but technically you can implement an BigInt with a garbage collector (although I am not sure that this would be a good idea).

    I think that the main difference between the BigInt and the Python object representation is that the Python objects can contain references to other objects, possibly creating a cycle. Anyway, under the hood we were dealing with an unowned pointers, so we can either find the owners (this would be on case-by-case basis, so a lot of room for mistakes) or automate things (either via ARC or garbage collector).

    enhancement 
    opened by LiarPrincess 2
Owner
LiarPrincess
LiarPrincess
Forblaze - A Python Mac Steganography Payload Generator

Forblaze - A Python Mac Steganography Payload Generator Author: AsaurusRex Disclaimer DO NOT use this project for purposes other than legitimate red t

null 54 Sep 5, 2022
iOS sandboxed terminal with Python, Lua and Clang

LibTerm LibTerm is a terminal for iOS with Python 3.7 and Lua 5.3. Supports iOS 13 dark mode and multi window. Features The app supports most of OpenT

Emma Cold 527 Jan 7, 2023
Swift-HorizontalPickerView - Customizable horizontal picker view component written in Swift for UIKit/iOS

Horizontal Picker View Customizable horizontal picker view component written in

Afraz Siddiqui 8 Aug 1, 2022
A utility that reminds your iPhone app's users to review the app written in pure Swift.

SwiftRater SwiftRater is a class that you can drop into any iPhone app that will help remind your users to review your app on the App Store/in your ap

Takeshi Fujiki 289 Dec 12, 2022
An extensible monitoring framework written in Swift

XestiMonitors Overview Reference Documentation Requirements Installation CocoaPods Carthage Swift Package Manager Usage Core Location Monitors Core Mo

J. G. Pusey 271 Oct 5, 2022
A simple Pokedex app written in Swift that implements the PokeAPI, using Combine and data driven UI.

SwiftPokedex SwiftPokedex is a simple Pokedex app written by Viktor Gidlöf in Swift that implements the PokeAPI. For full documentation and implementa

Viktor G 26 Dec 14, 2022
noppefoxwolf/notion is a notion.so API library written in swift.

notion noppefoxwolf/notion is a notion.so API library written in swift. Installation Xcode Project > Swift Packages [email protected]:noppefoxwolf/notion

noppefoxwolf 44 Oct 7, 2022
Just another NES emulator written in Swift

SwiftNes This repo contains all the source code for my experimental 100 days of NES challenge. What is this all about? I'm planning to build a fully w

Tibor Bödecs 44 Dec 24, 2021
A Powerful , Extensible CSS Parser written in pure Swift.

A Powerful , Extensible CSS Parser written in pure Swift.

null 273 Sep 9, 2022
A (slow) ray tracer written in Swift.

Blitzlichtgewitter Blitzlichtgewitter is a (slow) ray tracer written in Swift. It loosely follows the steps described in The Ray Tracer Challenge by J

Lennart Stolz 1 Dec 21, 2021
Readium Mobile is a toolkit for ebooks, audiobooks and comics written in Swift & Kotlin.

Readium Swift Toolkit Readium Mobile is a toolkit for ebooks, audiobooks and comics written in Swift & Kotlin. This toolkit is a modular project, whic

Readium 89 Dec 30, 2022
A simple, but efficient CSV Parser, written in Swift.

CSV CSV.swift is a powerful swift library for parsing CSV files that supports reading as [String], [String: String] and Decodable, without sacrificing

Ben Koska 4 Nov 28, 2022
Questrade API written in Swift

QuestradeAPI Getting Started The QuestAPI is made up of two main concepts: ResponseProviders API ResponseProviders The job of the provider is to retur

Eli Slade 2 Mar 15, 2022
A parser combinator library written in the Swift programming language.

SwiftParsec SwiftParsec is a Swift port of the Parsec parser combinator library. It allows the creation of sophisticated parsers from a set of simple

David Dufresne 219 Nov 6, 2022
Simple and Lightweight App Version Tracking for iOS written in Swift

AEAppVersion Simple and lightweight iOS App Version Tracking written in Swift I made this for personal use, but feel free to use it or contribute. For

Marko Tadić 12 Nov 11, 2022
Simple getter for bundle informations, written in Swift

BundleInfos Simple getter for bundle informations It's simple and static way for getting information from main bundle Requirements iOS 8.0+ Xcode 7.3

Jay Choi 1 Dec 3, 2017
Super powerful remote config utility written in Swift (iOS, watchOS, tvOS, OSX)

Mission Control Super powerful remote config utility written in Swift (iOS, watchOS, tvOS, OSX) Brought to you by Have you ever wished you could chang

appculture 113 Sep 9, 2022
SuggestionsBox helps you build better a product trough your user suggestions. Written in Swift. 🗳

SuggestionsBox An iOS library to aggregate users feedback about suggestions, features or comments in order to help you build a better product. Swift V

Manuel Escrig 100 Feb 6, 2022
FancyGradient is a UIView subclass which let's you animate gradients in your iOS app. It is purely written in Swift.

FancyGradient is a UIView subclass which let's you animate gradients in your iOS app. It is purely written in Swift. Quickstart Static gradient let fa

Derek Jones 11 Aug 25, 2022