2012/09/26

A 21st Century Software Stack

One thing that really bugs me is that the most pervasive programming language in "consumer" hardware (I won't even get to the billions of lines of FORTRAN persisting elsewhere) is ANSI C.  It has been 23 years since that standard was published.  I am younger than the most popular programming language in terms of deployed software.  Yet in that time, lots of excellent tools for expressiveness have emerged without sacrificing performance - objects, templates, exceptions, smarter memory management (be it a GC that consumes less CPU time overall compared to manual management, or smart pointers), compile-time behavior in the language.  C has none of that, but it is still Turing complete and you can still do anything you want in it.  But that omnipresence means that everything has to relate back to C, that derelict old language lacking so many modern conveniences, in order to be ABI compatible.  Almost every interaction between languages and executables is done through C, and almost every dynamically linked object was written in C, for just that reason.

And it drives me nuts that the foundations of modern computing persist while they are so flawed.  I have grown to despise C because I can't do anything more intuitive than write functions and use switch statements instead of templates, or spell out a name like calculator_base_computeNode in a function definition because there are no namespaces or access modifiers.  I can't package up variables and functions into objects without having a dozen function pointers in a struct.  For all its faults, at least C++ has those things, but it is too verbose, especially with its cumbersome syntax.  And all that backwards compatibility ruins the language; every semicolon after a class definition makes a kitten cry.

D is another language I fell in love with, only to realize the root problem - it also adheres to C backwards compatibility.  You can still use C libraries in it, and while that is meant as a feature, it also has to be a feature.  We are pretty much neck deep in C compatibility (at least under Unix; in Windows it's basically every man for himself) and it stifles development, because I think there is a class of people (or at least me) who won't even consider working at the hardware level on fixing many of the persistent problems in desktop Linux (because where else is hardware such a pain?) because everything is in C.  C is boring and dumb.

Now for the thought experiment, and I really like this idea - hopefully in a couple of decades, when we have our flying cars and Jaws 14, we will finally have a popular FOSS computing environment (because it definitely won't be a desktop environment anymore, it will be in your toaster, car, and probably your brain by that point) that uses a trifecta of dialects of one core language as the foundation for everything.

Maybe not even that.  But I would absolutely like to work on something like that.  We will call these three languages L, M, and H, for low, medium, and high level programming.  L would be a directly compiled, statically typed, stack based, multi-paradigm language meant to support writing efficient, brief code.  It would be a balancing act between verbosity, brevity, correctness, ease of implementation, and performance, always favoring performance but balancing it using a modern understanding of computing that includes all that we know today (for example, having a dynamically linked standard library instead of a statically linked one would reduce code size by a tremendous amount for almost every binary).  It wouldn't have a preprocessor, and you would be expected to use H for its build procedure (instead of some arbitrary language like make, shell, maven, etc. that makes learning the entire stack such a pain in the butt).

M would be a JIT-compiled bytecode language that has specific levels of privileges, would be garbage collected (L might include hooks for a GC and maybe even a standard GC, but it wouldn't be compiled in or run by default) and would be a mostly static language supporting dynamic code generation and type deduction (a la D / C# mixins) depending on the operating level.  It would share almost identical syntax with L and would share almost its entire standard library with it (maybe even the same dynamically linked files) and be designed for a business productivity / application development tier.  It would also, I imagine, be a better candidate for web programming than something like the abomination JS has evolved into, given the evolving nature of web development (which I actually think is what might one day drive something like this into being, which will be another blog post).

Like I said, M would be the rapid deployment application language.  I imagine any core application (game engine, web / file browsers, video players) would be written in L, but almost everything else (word processor, calculator, games running on a library engine) could easily be targeted at M.  It would be the "official" language for developers, in that the OS can control specific privileges of M bytecode on a case by case basis, from high security with no file access in a web browser to dynamic linking in very privileged applications.  All at the OS's discretion.
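
To make the case by case privilege idea a bit more concrete, here is a rough Python sketch of how an OS might track what each M program is allowed to do.  Everything in it is made up for illustration - the privilege names, the application names, and the policy table are assumptions of mine, not part of any real system.

```python
from enum import Flag, auto


class Privilege(Flag):
    """Hypothetical privilege bits an OS could grant to M bytecode."""
    NONE = 0
    FILE_READ = auto()
    FILE_WRITE = auto()
    NETWORK = auto()
    DYNAMIC_LINK = auto()


# Hypothetical per-application grants, decided by the OS and not by the program.
POLICY = {
    "browser_script": Privilege.NETWORK,
    "word_processor": Privilege.FILE_READ | Privilege.FILE_WRITE,
    "trusted_tool": (Privilege.FILE_READ | Privilege.FILE_WRITE
                     | Privilege.DYNAMIC_LINK),
}


def allowed(app: str, needed: Privilege) -> bool:
    """Return True only if every privilege in `needed` was granted to `app`."""
    granted = POLICY.get(app, Privilege.NONE)  # unknown apps get nothing
    return (granted & needed) == needed
```

With this table, a script running in the browser would be refused file access, while the trusted tool could link dynamically.  The point is only that the decision lives in one place outside the program.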

Speaking of which, we need more microkernels.  I don't get why device drivers still need kernel injection; it seems so silly, and it bloats what should be a very simple system (hardware probe, scheduler, memory manager) into a complex blob.

H is what I really like about all this - an easily readable scripting language that has execution privileges based on the executing environment, like M, but with the ability to probe for available modules (such as if runningInBrowser: stuff).  It would probably be whitespace significant like Python to minimize boilerplate and maximize productivity, because it would be a true "scripting" language - while its JIT can be optimized, I would imagine it being very much like modern Python - an order of magnitude slower than native / static bytecode languages but meant as glue.  Its syntax would resemble a JSON-like data serialization format (whitespace significant data serialization?) and it would be everything from the shell to the userscript language.  It could also run in the browser environment (the lack of multi-language browsers now and the coupling of the HTML parser with the JS JIT seems silly).  It could also be used for quick applications like PyGTK is now for Ubuntu applications, so you could easily use this as an intro language.  But unlike Python, module availability might change syntax - I imagine something like run() being a function to run some file as an executable (given the operating environment has permissions).
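
Since H doesn't exist, here is roughly what the runningInBrowser probe and a permission-checked run() could look like in today's Python.  This is only a sketch under my own assumptions: the module name browser_dom is a hypothetical stand-in for whatever a browser host would provide, and run() here just leans on the ordinary executable check.

```python
import importlib.util
import os
import subprocess


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be imported here."""
    return importlib.util.find_spec(name) is not None


# "browser_dom" is a hypothetical module that only a browser host would ship.
running_in_browser = module_available("browser_dom")


def run(path: str) -> int:
    """Run a file as an executable, but only if the environment permits it."""
    if not os.access(path, os.X_OK):
        raise PermissionError(f"not allowed to execute {path}")
    return subprocess.run([path], check=False).returncode


if running_in_browser:
    print("browser host detected, run() should not be offered here")
else:
    print("native host, run() is available")
```

In H itself I would want the probe to be part of the language rather than a library call, but the shape of the check would be the same.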

I'll have to think about file permissions on this one.  The executable privilege on a per-file basis in Linux bugs me sometimes.  I'd blame it more on how eccentric the system is in its usage of extensions - the absence of extensions for runnable applications, mandating the presence of an executable "flag" while the OS still uses file extensions elsewhere, makes it seem hodge-podge to me in some ways.

Anywho, point is, I'm dumb and bad at learning, so I would really like a "computing" environment with as few syntaxes and dialects in its core stack as possible, instead of the mess we have now - as beautiful as that mess may be, I know most of "us" wade through it lost and without a chance to comprehend the whole thing, and computers shouldn't be that complicated.  We did it, we can make it better.
