2012/10/16

Software Rants 3: Exploring L

I talked about L in my post about my dream operating system.  The idea was that you could take 3 decades of low level programming experience and write a language that maintains the efficiencies of C and its ilk without having the unintuitive syntax and backwards compatibility woes.

One big thing is that this is entirely opinionated.  C++11 will be the de facto low level, large scale language for many projects going forward, and my complaints about it are never that it doesn't fit its purpose.  I'll give my reasons by example - in C++11, rvalue references are denoted as T&&, since T& was already taken for references, and &thing is the address of something.  << and >> are overloaded as stream operations by <iostream>, which is actually something you can do for any object (overload operators).  I won't even decry that behavior, since it has its uses - lots of people like how many languages use += as a concatenation operator on strings instead of "App".concat("le") or some other object syntax.
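To make the overloading point concrete, here is a minimal sketch of both cases in C++11.  Point, describe and concat_demo are names I made up for illustration; only operator<< on streams and += on std::string come from the standard library.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical type, used only to illustrate operator overloading.
struct Point {
    int x;
    int y;
};

// Give << a "write to stream" meaning for Point, the way <iostream> does
// for built-in types. For plain integers, << is still a bit shift.
std::ostream& operator<<(std::ostream& out, const Point& p) {
    return out << "(" << p.x << ", " << p.y << ")";
}

// Helper that renders a Point through the overloaded operator.
std::string describe(const Point& p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

// std::string overloads += as concatenation.
std::string concat_demo() {
    std::string word = "App";
    word += "le";
    return word;
}
```

The same glyph means "shift bits" for an int and "write to stream" for a Point, which is exactly the kind of overloaded meaning I'm talking about.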

So what do you want out of a newfangled low level language?  You want all the productivity efficiencies of low level abstractions we have developed and taken to heart in recent decades, first off, and here is a list of my personal preferences:
  • Templates - I hate writing C code without them.  They are my killer C++ feature.  I can't write generic functions without them, and since C doesn't have classes (which means it doesn't have polymorphism) I can't fall back on virtual overrides through inheritance either.
  • Classes, and more importantly, polymorphism - you can emulate classes pretty reasonably in C with just structs and function pointers.  Access modifiers aren't inherent to classes either, and you can use namespaces as a replacement.  The real benefit of classes is polymorphism and virtual function lookup.  Now, an important quality in any low level language is to not hide implementation, so virtualizing a function absolutely needs to be a user decision as it is in C++.
  • Access modification - private / public / protected, and all that.  And not in the C++ hackneyed, overloaded jump-label syntax of public: stuff.  I want the C# / Java style, where each declaration carries its own access modifier.  You may have a way to declare a whole block of public or private data members, but it would need dedicated syntax, because access modifiers are nothing like block naming, and reusing label syntax for them is just another obtuse angle of C++.
  • Objects from the ground up - being able to treat anything as an object as desired, and understanding that low level objects are nothing more than a package of data members and functions.  I feel this is something sorely lacking in any low level language, including the syntax to have something like (15l).toString().  One of the greatest weaknesses of even Java is the negligence in treating primitives and objects harmoniously (reference vs copy passing, Integer and int being separate types but you can type coerce one into another implicitly, etc).
  • Function objects - in keeping with objects from the ground up, you want your functions to be regular objects just like primitives or collections of either.
  • Pointers as their own type, preferably templated - int* is not a hard concept, it is a hard syntax.  C and C++ are usually the only exposure almost any programmer gets to memory addressing, and their pointers are a completely alien ideology, where a glyph modifies what a reference to a primitive means.  I would absolutely much rather see, instead of int*, pointer<int>.  I want to elaborate on this, so let's have some examples of this idea:

C++:

int *foo, **bar, zoo = 5;
foo = &zoo;
bar = &foo;
cout << foo;  // the address of zoo
cout << *foo; // value of zoo (5)
cout << bar;  // the address of foo
cout << *bar; // address of zoo
cout << zoo;  // the integer
foo = reinterpret_cast<int*>(5000); // a bare foo = 5000 does not compile in C++
cout << foo;  // memory address 5000 (printed in hex)
cout << *foo; // segfault most likely, page for 5000 probably wasn't mapped

L:

reference<int> foo; // compile time error if pointing to a non-integer
reference genericfoo; // can point to anything, ambiguous contents
reference<reference<int>> bar; // compile time error if not pointing to an int pointer
reference<reference>genericbar; // can point to any pointer since it is generic
int zoo = 5;

foo = reference(zoo) or zoo.reference() // the OO way is to bind a reference function to the Object type, but that obscures implementation, since it looks like a function call - in truth, it is a compiler reduction to the address of zoo.  Something like reference zoo, akin to return zoo, might be more appropriate: it doesn't obscure the purpose, and it doesn't keep the glyphic overload of &

bar = reference(foo) // hey look, consistency!

//using python print syntax, just for brevity, I'd imagine the real L would require a reference to the stdout pipe to write to
print(foo) // address
print(genericfoo) //another address
print(foo.val or val()?) // print contents...

foo.val seems more appropriate, since it is a template data member.  Due to explicit templating, you know it returns an int; if it were generic, it would be an Object.

That brings up another interesting train of thought.  I wonder if there would be a way to unify the concepts of typename and the globally inherited Object? 
Traditionally, typename t fulfills nearly the same idea as referencing an Object, except that typename is a compile time generic that has been resolved to an explicit type by the time the program runs, while Object is always generic because it is used at runtime.  Actually, that seems silly - an Object is a type, but a type is not an object, it is a name.  So types are names and objects are named things.  So the type / object distinction is still useful.
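A rough sketch of that distinction in C++ terms, assuming C++17's std::any as a stand-in for a globally inherited Object (C++ has no such root type, so that mapping is my assumption).  identity and describe_object are made-up names.

```cpp
#include <any>
#include <string>
#include <typeinfo>

// Compile-time generic: T is a name that the compiler replaces with a
// concrete type at each call site, so the result type is known statically.
template <typename T>
T identity(T value) {
    return value;
}

// Run-time generic: std::any (C++17) stands in for a universal Object.
// The contained type is only discoverable while the program runs.
std::string describe_object(const std::any& object) {
    if (object.type() == typeid(int)) {
        return "int " + std::to_string(std::any_cast<int>(object));
    }
    if (object.type() == typeid(std::string)) {
        return "string " + std::any_cast<std::string>(object);
    }
    return "unknown";
}
```

identity never has to ask what it was given, because the name T was settled before the program ran.  describe_object has to inspect the thing it was handed, which is the "named thing" side of the split.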

Anywho, the point on this pointer business is that if you use a ground up language designed around complex bundling of functionality into classes, with all the benefits of templates and polymorphism involved, it seems silly not to wrap a lot of the more obscure behavior, like pointers, into an object of their own.
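As a rough sketch of what that wrapping could look like, here is a C++11 approximation of the reference<T> idea.  This is not L, just an illustration: the class, val(), address(), empty() and make_reference are all hypothetical names, val() is a member function here rather than the data member mused about above, and the untyped generic reference is left out.

```cpp
#include <cstdint>

// Hypothetical stand-in for the reference<T> type imagined above.
template <typename T>
class reference {
public:
    reference() : target_(nullptr) {}
    explicit reference(T& target) : target_(&target) {}

    // Contents of the referenced object (the post's foo.val).
    T& val() const { return *target_; }

    // The raw address, as an integer, for printing or comparison.
    std::uintptr_t address() const {
        return reinterpret_cast<std::uintptr_t>(target_);
    }

    // True when the reference has not been pointed at anything yet.
    bool empty() const { return target_ == nullptr; }

private:
    T* target_;
};

// Free function so that call sites read like reference(zoo) in the post.
template <typename T>
reference<T> make_reference(T& target) {
    return reference<T>(target);
}
```

With that in place, reference<int> foo = make_reference(zoo); gives foo.val() as the int itself, and reference<reference<int>> bar = make_reference(foo); only compiles when handed a reference<int>.  The standard library's std::reference_wrapper covers similar ground.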

Some other ideas I'm just footnoting here:

I like the idea of using : instead of = for "equality".  In JavaScript maps, you use the syntax {foo:5, bar:'bacon'}, and that uses the traditional "colon means is" reading.  The equals sign really does imply equality, and although "glyph name equals contents" is a mathematically sound statement too, it kind of borks the concept of equality in an environment where you do have explicit equality in the form of ==.  Also, in classic C++ and C, the colon is really underutilized, because it implies block declaration.  Since namespaces and blocks are effectively the same thing (except traditional C blocks don't denote namespace containment, but the point is they both represent contiguous regions) you lose value in the colon anyway.

Maybe just stick with = for backwards compatibility.  But just as a brain teaser, which of the following makes more sense?

foo is 5
foo equals 5

I'd argue the first, because you are making the claim that the integer foo is 5, not the awkward statement-not-question that integer foo equals 5 (since you are already using the interrogative form, ==, elsewhere).
