Online / 6 & 7 February 2021

schedule

The Fuzion Language

Combining safety and analysability with high performance - while distracted by a 🐶


Fuzion is a modern general purpose programming language that unifies concepts found in structured, functional and object-oriented programming languages into the concept of a Fuzion feature. It combines a powerful syntax and safety features based on the design-by-contract principle with a simple intermediate representation that enables powerful optimizing compilers and static analysis tools to verify correctness aspects.

Fuzion was influenced by many other languages including Java, Python, Eiffel, Rust, Go, Lua, Kotlin, C#, F#, Nim, Julia, Clojure, C/C++, and many more. The goal of Fuzion is to define a language that has the expressive power present in these languages and allow high-performance implementation and powerful analysis tools. Furthermore, Fuzion addresses requirements for safety-critical applications by adding support for contracts that enable formal specification and enable detailed control over runtime checks.

The talk will explain Fuzion's motivation and present its main concepts, feature declarations and feature calls. It will not go into details of the syntax, but present Fuzion's approach to immutability, memory management and type inference.

Introduction

Many current programming language are getting more and more overloaded with new concepts and syntax to solve particular development or performance issues. Languages like Java/C# provide classes, interfaces, methods, packages, anonymous inner classes, local variables, fields, closures, etc. And these languages are currently further extended by the introductions of records/structs, value types, etc. The possibility for nesting of these different concepts results in complexity for the developer and the tools (compilers, VMs) that process and execute the code.

For example, the possibility to access a local variable as part of the closure of a lambda expression may result in the compiler allocating heap space to hold the contents of that local variable. Hence, the developer has lost control over the allocation decisions made by the compiler.

In Fuzion, the concepts of classes, interfaces, methods, packages, fields and local variables are unified in the concept of a Fuzion feature. The decision where to allocate the memory associated with a feature (on the heap, the stack or in a register) is left to the compiler just as well as the decision if dynamic type information is needed. The developer is left with the single concept of a feature, the language implementation takes care for all the rest.

Fuzion Feature Declarations

A Fuzion feature is has a name, similar to the name of a class or a function. The main operation that can be performed on a feature is a feature call. The constituents of a feature declaration are as follows:

Formal Arguments

Features may have a list of formal arguments, which are themselves features implemented as fields. On a call to a feature with formal arguments, actual arguments have to be provided to the call, unless the list of formal arguments is empty.

Feature Result

The result of a feature call is an instance of the feature. Alternatively, a feature may declare a different result type, then it must return a value of that type on a call.

Closures

Features are nested, i.e., every feature is declared within the context of an outer feature. The only exception is the universe, which is the outermost feature in any Fuzion system. A feature can access features declared in its outer feature or, recursively, any outer feature of these outer features. This means, a feature declaration also declares a closure of the feature and its context.

When calling a feature f1 declared as an inner feature of f2, the call must include a target value which is the result of a call to f2, e.g., f2.f1.

Generics

Features may have generic type parameters. E.g. a feature declaration may leave the actual type used within that feature open and to be defined by the user of the feature.

The list of generic type parameters may be open, i.e., the number of actual generic type parameters is not fixed at feature declaration. This turns out to be useful in he declaration of choice types or functions explained below.

Inheritance

Fuzion features can inherit from one or several other features. When inheriting from an existing features, all inner features of the parent automatically become inner features of the heir feature. It is possible to redefine inherited features. In particular, when inheriting from a feature with abstract inner features, one can implement the inherited abstract features.

A redefinition of an inherited feature may implement an inherited feature as a routine or as a field. An inherited feature that is implemented as a field, however, cannot be redefined.

Inheritance may result in conflicts, e.g. if two features with the same name are inherited from two different parents. In this case, the heir must resolve the conflict either by redefining the inherited features and providing a new implementation or by renaming the inherited features resulting in two inner features in the heir feature.

Inheritance and redefinition in Fuzion does not require dynamic binding. By default, the types defined by features are value types and no run-time overhead for dynamic binding is imposed by inheritance.

Contract

A feature may declare a contract that specifies what the features does and under which conditions the feature may be called. This will be handled in more detail below in the section Design by Contract.

An implementaton

Features must have one of the following implementations

  • a routine is a feature implementation with code that is executed on a call

  • a field is a memory slot that stores a value and whose contents are returned on a call

  • an abstract feature has no implementation and cannot be called directly, but can be implemented by heir features

  • an intrinsic feature is a low-level feature implemented by the compiler or run-time system, e.g., the infix + operator to add two 32-bit integer values may be an intrinsic operation.

A feature implemented as a routine can contain inner feature declarations.

Feature examples

Here is an example that declares a feature point that functions similar to s struct or record in other languages:

point(x, y i32) is # empty
p1 := point(3, 4)
stdout.println("p1.x is " + p1.x)    # will print "p1.x is 3"
stdout.println("p1.y is " + p1.y)    # will print "p1.y is 4"

The next example show a feature base that provides an innver feature plus that adds its argument to the value passed to the enclosing base:

base(v i32) is
  plus(w i32) => v + w

b1 := base(30)
b2 := base(100)
stdout.println(b1.plus(23))    # will print "53"
stdout.println(b2.plus(23))    # will print "123"

Features and Types

A feature declaration implicitly declares a type of its instances. In the example above, the feature declaration

point(x, y i32) is

declares the type point that can be used to declare a feature of field type, so we could, e.g., declare a new feature that takes an argument of type point:

draw(p point) is
  drawPixel(p.x, p.y)

Syntactic Sugar

Fuzion's grammar adds syntactic sugar that simplifies the developers work by making code more readable and easier to write. However, internally, this syntactic sugar is converted into feature declarations and feature calls. Consequently, only the front-end of the language implementation is affected, the later phases that optimize, analyse, compiler, interpret, etc. the code are not affected by this.

Loops

Fuzion has a very powerful syntax for loops, an overview is given at flang.dev. However, loops are purely syntactic sugar and get translated into feature declarations and calls. This has important impact for the immutability of variables as will be explained below. Since the compiler performs optimizations for tail recursive calls, the resulting performance will be equal to inline loops.

Choice Types

Fuzion provides choice types (also called union types, sum types, tagged types, co-product types in other languages). A choice type is a feature declared in the standard library that has an open list of generic parameters for a number of actual types a field of this choice type may hold.

The simplest example of a choice type is the type bool, which is a choice between types TRUE and FALSE. TRUE and FALSE are themselves declared as features with no state, i.e., no fields containing any data.

Another example for a choice type from the standard library is Option<T>, which is a generic choice type that either holds a value of type T or nil, while nil is a feature with no state declared in the standard library.

A match statement can be used to distinguish the different options in a choice type, e.g.,

mayBeString Option<string> = someCall()
match mayBeString
  s String => stdout.println(s)
  _ nil    => stdout.println("no string")

The ? operator allows for more compact syntax, the following code is equivalent

stdout.println(mayBeString ? s | "no string")

while a single ? may be used to return immediately from the current feature in case of an error

stdout.println(mayBeString?)    # return nil immediately if mayBeString is nil

which works only within a feature that may return the unhandled types as a result.

First-class Function

Another open generic type in the standard library is Function. This is a feature that declares an abstract inner feature call. Syntactic sugar in the Fuzion front-end enables inline declaration of functions as shown in this example:

fun_example is

  eval(msg string, f fun (i32) i32) is
    for
      x in 0..3
      m := msg, ", "
    do
      y := f(x)
      stdout.print("" + m + "f(" + x + ") = " + y)
    stdout.println

  eval("f(x) = 2*x:   ", fun (x i32) => 2*x)
  eval("f(x) = x*x:   ", fun (x i32) => x*x)
  eval("f(x) = x*x*x: ", fun (x i32) => x*x*x)

which results in

f(x) = 2*x:   f(0) = 0, f(1) = 2, f(2) = 4, f(3) = 6
f(x) = x*x:   f(0) = 0, f(1) = 1, f(2) = 4, f(3) = 9
f(x) = x*x*x: f(0) = 0, f(1) = 1, f(2) = 8, f(3) = 27

Internally, function declarations are implemented as inner features. Since a feature declaration defines a closure, a function declarations also defines the function closure.

Tuples

Tuples in Fuzion are provided by a generic standard library feature Tuple. The open list of generic parameters specifies the types of each element and their number.

Here is an examples of a feature that splits a 32-bit unsigned integer v into four bytes returned as a tuple:

bytes(v u32) => ((v >> 24) ,
                 (v >> 16) & 255,
                 (v >>  8) & 255,
                 (v      ) & 255)

A caller might directly decompose this tuple when calling bytes:

(b0, b1, b2, b3) := bytes(1234567890)

stdout.println("Bytes: " + b0 + " " + b1 + " " + b2 + " " + b3 + " ")

References

By default, Fusion types have value semantics. This means that a value cannot be assigned to a field whose type is a parent type of the value's type. However, a value can be marked as a reference by adding the keyword ref. Then, any value of a heir type can be assigned to this field.

An example is the argument of the println feature, which is a ref to an Object. Its implementation is shown here:

public println(s ref Object) void is
  for c in s.asString.asBytes do
    write(c)
  println

Object is the implicit parent feature of all features that do not explicitly declare a parent. Consequently, any value can be assigned to the argument s of println.

Object declares some basic features such as asString, which creates a string representation of the Object. Redefinitions of these functions in heir features such as integer provide useful string representations of the objects.

Note that the ref keyword does not require that the implementation would allocate the value passed as a ref on the heap and it also does not imply that we will have dynamic type information and dynamic binding at run-time. Instead, these overheads can be avoided by stack allocation and specialization.

Design by Contract

Fuzion features can be equipped with pre- and post-conditions to formally document the requirements that must be met when a feature is called and the guarantees given by a feature. An example is a feature that implements a square root function:

sqrt(a i32) i32
  pre
    a >= 0
  post
    result * result <= a,
    (result + 1) > a / (result + 1),
    result >= 0
is
  if a == 0
    0
  else
    for
      last := 0, r
      r    := 1, (last +^ a / last) / 2
    until r == last

In this case, the function defines the pre-condition that its argument a is non-negative. A call of this function with a negative value will result in a run-time error. On the other hand, its post-conditions make a clear statement about the result: The result will be the largest value that, when squared, is <= a.

Checking Pre- and Post-conditions

Pre- and post-conditions can be classified for different purposes. Default qualifiers provided in the standard library are

  • safety

    This qualifier protects pre-conditions that are required for the safety of an operation.

    An example is the index check pre-condition of the intrinsic operation to access an element of an array: Not performing the index check would allow arbitrary memory accesses and typically would break the applications safety.

    This qualifier should therefore never be disabled unless you are running code in an environment where performance is essential and safety is irrelevant.

  • debug

    This qualifier is generally for debugging, it is set iff debugging is enabled.

  • debug(n)

    this qualifier is specific for enabling checks at a given debug level, where higher levels include more and more expensive checks.

  • pedantic

    This qualifier is for conditions that a pedantic purist would require, that otherwise a more relaxed hacker would prefer to do without.

  • analysis

    This is a qualifier for conditions that are generally not reasonable as run-time checks, either because they are prohibitively expensive or even not at all computable in this finite universe. These conditions may, however, be very useful for formal analysis tools that do not execute the code but perform techniques such as abstract interpretation or formal deduction to reason about the code.

Run-time checks for pre- and post-conditions can be enabled or disabled for each of these qualifiers. This gives a fine-grain control over the kind of checks that are desired at run-time. Usually, one would always want to keep safety checks enabled in a system that processed data provided from the outside to avoid vulnerabilities such as buffer overflows. However, in a closed system like a rocket controller, it might make sense to disable checks since a run-time error would mean definite loss of the mission, while an unexpected intermediate value may still result in a useful final result of a calculation.

Immutability in Fuzion

Fuzion encourages the use of immutable data by simple syntax for the declaration of immutable fields. Also, the use of tail recursion for loops automatically converts index variables used in that loop into immutable variables with a life span of a single loop iteration.

Since immutability is essential to ensure correctness of parallel execution within threads that do not rely on locks or similar synchronization mechanisms, Fuzion's analyser will require data that is shared between threads to be immutable.

Memory Management in Fuzion

Fuzion to a large extend relies on static analysis to reduce memory management overhead. Instances are by default value instances that do not require heap allocation. Furthermore, immutability in many cases avoids the need to keep a shared copy on the heap.

For dynamic calls, heap allocation and dynamic binding overhead is avoided by specialization of calls.

Only for those instances for which all of these optimizations would fail, in particular instances shared between threads or long-lived instances with mutable fields, heap allocation will be required.

Memory allocated on the heap will be reclaimed by a real-time garbage collector, an example means to provide this is the real-time GC presented at ISMM'10: Concurrent, parallel, real-time garbage-collection.

Type Inference

Fuzion is a statically types language. However, type inference is used to avoid the need to explicitly provide types for fields, for function results and for actual generic parameters. Furthermore, the type of constant numeric expression is propagated from target the expression's value is assigned to.

For the type inference for fields and function results, a dedicated phase in Fuzion's front-end lazily determines the types of expressions. This phase also detects possible cycles in the type inferencing and flags these as errors if encountered.

For interference of actual generic parameters, the types of actual arguments passed the the generic feature are used. This, however, works only for generic features that use the generic types within their formal argument lists.

Fuzion Implementation

The Fuzion implementation consists of several, independent parts from a front-end performing parsing and syntax-related tasks and creates the intermediate representation (IR), via a middle-end that collects the part of a Fuzion application, to the optimizer and several back-ends.

Fuzion IR

Fuzion has very simple intermediate representation. The dominant instruction is a feature call. The only control structure is a conditional (if) operation. Loops are replaced by tail recursive feature calls, so there is no need in the compiler or analysis tools to handle loops as long as (tail-) recursion is provided.

The largest part of a compiler back-end consists of providing target-platform specific implementations for all intrinsic features declared in the Fuzion standard library.

Fuzion Optimizer

The Fuzion Optimizer modifies the intermediate representation of a Fuzion application. In particular, it determines the life spans of values to decide if they can be stack allocated or need to be heap allocated and it specializes feature implementations for the actual argument types and the actual generic arguments provided at each call.

This means that run-time overhead for heap allocation and garbage collection will be avoided as much as possible, most values can be allocated on the run-time stack. Additionally, run-time type information such as vtables will be required only in very few cases where dynamic binding cannot be avoided. An example is a data structure like a list of some reference type with elements of different actual types that are stored in this list.

Fuzion Back-Ends

Fuzion currently has two back-ends: An interpreter written in Java running on OpenJDK and a back-end creating C source code processed by gcc or clang. It is planned to add further back-ends, in particular for LLVM and Java bytecode.

Next Steps

The Fuzion language definition and implementation is still in an early stage. It is planned to publish a first version of the language and its implementation at FOSDEM 2021. But a lot of work remains to be done, in particular, for Fuzion to be successful, it will need

  • a powerful standard library
  • additional library modules for all sorts of application needs
  • powerful interfaces to other languages such as Java, C, Python, etc.
  • highly optimizing back-ends
  • IDE integration
  • documentation, tutorials
  • much more

Also, professional services around Fuzion will be required for acceptance outside the open-source community. The Tokiwa SW GmbH currently leads the development of Fuzion and plans to provide professional services as well.

Speakers

Photo of Fridtjof Siebert Fridtjof Siebert

Attachments

Links