DeCA partners with military to help promote su...

Image by Morning Calm News via Flickr

This post is a summary of  Eli Gottlieb’s thesis on the Deca programming language from May 2011. In short Deca is a language designed to provide the advanced features of sophisticated, high-level programming languageswhile still programming as close as possible to the bare metal. It brings in the functional, object-oriented and generic programming paradigms without requiring a garbage collector or a threading system.

Since it is a programming-language thesis, it is also dedicated to every programmer who ever wanted a better language but could not use a virtual machine or a run-time library. To them I dedicate this thesis and say: I am become type theory, destroyer of minds.

Problems of systems programming

I once contemplated doing some Linux kernel hacking, but decided to lie down until the feeling passes. That is my closest brush with systems programming and after reading this thesis – yikes.

Essentially the problems are what you’d expect when dealing with hardware-imposed limitations, lacking useful abstractions and safety features – in fact, you’re usually the one creating these.

  1. precise data representation - working so close to bare metal means datatypes must correspond directly to their hardware representations. You can’t just have a magical List datatype, you can have a block of memory though
  2. safety properties and confined unsafety - the most common form of safety is
    English: Kernel panic Magyar: "Kernel pan...

    Panic

    type safety, the idea being that the compiler makes sure you aren’t trying to multiply a sack of potates with a banana. But other safety features a good language should provide according to Gottlieb are escape analysis for pointers, region-based memory management and preservation and progress (well-typedness)

  3. abstraction, encapsulation, modularity - most modern languages provide ways of packaging code so it can be reused and encapsulated. For instance: when you are using a stack structure, you don’t really care whether it’s implemented as a list or a memory vector.
  4. extensibility - a way to extend the language itself with new features (for example, making the + operator work with new data types). So far possible solutions for this exist as OOP, ad-hoc polymorphism, macros and so on, but it remains an open question and the perfect solution might not even exist
  5. Stroustroup’s rule - the lead designer of C++ Bjarne Stroustrup once presented a rule that What you don’t use, you don’t pay for. But in many high-level languages automatic memory management runs whether you need it or not, or every function needs exception handling … all big problems in the limited confines of bare metal programming.

Deca’s solutions

English: Safety

Image via Wikipedia

  1. unboxed data types - for those of us who didn’t know, boxed types are represented by a point to an object; in Deca all types are compiled down to raw unboxed representations – just a value – so when the code is running there’s no more overhead. There are also two kinds of pointers (scoped and referenced) that allow you to use a pointer as if it was a variable, which sounds pretty cool from my experience with explicit pointers.
  2. type safety - admittedly, this section went a bit over my head, but Deca uses a magical combination of static type inference (static typing where the compiler guesses stuff for you) and bit-casting – this is a system that allows you to eschew type safety under certain conditions because systems programmers apparently need that. If you care for this sort of thing -> Deca uses a modified Hindley-Milner inference algorithm that also allows subtyping
  3. module system - just as you’d expect of any modern language, you can package things into modules and modules into modules
  4. encapsulated existential types - these are best known as the type-theoretic encoding of abstract data types – giving us the ability to use data structures without knowing all the internal logic. In Deca these exist as a language extension and the whole thing works out just like it did for Caml
  5. extensible types - Deca provides two ways of extending data types. The internal way of “open-sum variant types”, which I don’t understand and the thesis isn’t very specific as to what thi means. The other are good old friendly classes, which we all understand and love from object-oriented languages
  6. symmetric multiple dispatch - dynamic dispatch is a way to dynamically decide which method to call in order to process a particular message (polymorphism, pattern matching etc.) Deca does this by having a partially ordered list of possible methods, walking through it and when it finds something that can execute the given arguments, it is the most specific binding.
  7. low-level encodings of high-level features - this section of the thesis is a bit longer, but it essentially boils down to the idea of using the LLVM to run compiled code and making sure all the features explained above are compiled to their most basic incarnations. According to another section of the thesis this also ensures adherence to Stroustrup’s rule

An example

I would love to personally produce an example of what Deca looks like, but I’m already having enough fun learning Haskell, so here’s an official example of a List implementation.

 
module list
 
import malloc
 
type List<a> = class(e,n) extends Sequence<a> {
  element: a:= e;
  next: @List:= n;
}
 
function cons(element,next) {
  malloc.malloc(pool => new(pool)(List(element,next)))
}
 
function car(lst: @List) {
  match *lst {
    case Cons(head,tail) => Some(head)
    case Nil => None
  }
}
 
function cdr(lst: @List) {
  match *lst {
    case Cons(head,tail) => tail
    case Nil => Nil
  }
}
 
end

Conclusion

The thesis itself also compares Deca to other modern high-level languages for systems programming like Clay, BitC, Cyclone and Java “with magic”. That section didn’t feel too important, the resolution is simply that Deca is better.

Unfortunately though, Deca itself doesn’t look to be ready for real-world use just yet. Even though the language itself is pretty much defined and its grammar is known, no complete compiler yet exists. The official compiler, decac, developed by Gottlieb doesn’t yet support all the features and I’ve heard rumors it has been scrapped and is being developed anew because some fundamental issues were discovered.

All in all, this looks like an interesting language to keep an eye on if you’re a systems programmer, but I feel C will be the king for a long while yet.

Enhanced by Zemanta

Comments are closed.