2012/10/07

Static Typing as a Religion

Typing is a pervasive concept throughout software. You want to describe data - and types are the "what" to a name's "who".  Having a very clear definition of "what" for everything you deal with makes interactions simpler and purposes more obvious.

If / when I develop with other people involved, I love statically typed languages because they mean that my code is easier for others to comprehend, and that I can more easily debug someone else's code.  The barrier to entry with static types is lower, even if the "power" of boxing means you can do some crazy things in dynamic languages.  I have a natural aversion to "crazy" things, mainly because I am both not very smart and not very good at development.  So static typing makes things easier.

Statically typed languages are also generally more performant.  So they are easier to understand and debug, and they perform better.  The only reasons to use a dynamic language are the lack of boilerplate for type conversions and the power of boxing as a language feature.

Now, that is really the end of the programming side of this static typing post.  I want to talk more about static typing elsewhere.

Windows does a lot wrong.  Its user model is awful, its VFS is crap, it has terrible abstractions of hardware, etc.  But one thing it does right is that it statically types every file in the system.  The Unix way of having lots of files without static typing (hint: file extensions) leads to a lot of ambiguity in interpreting what something is meant to do.  Windows doesn't need an executable bit, because Windows executables are .exe.  You can actually put a file extension on Linux binaries - .elf, for Executable and Linkable Format.  Linux doesn't traditionally do this because Unix didn't use many file extensions.
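
To make the contrast concrete, here is a rough sketch of the two ways of answering "can I run this?".  The function names are my own invention for illustration, and checking only for .exe is a simplification of what Windows really does.

```python
import os
import stat

def has_exec_bit(mode: int) -> bool:
    """Unix style: the answer lives in the permission bits, not the name."""
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))

def has_exe_extension(name: str) -> bool:
    """Windows style (simplified assumption): the answer lives in the name."""
    return os.path.splitext(name)[1].lower() == ".exe"

# The name alone says nothing on the Unix side; you need the mode bits,
# e.g. has_exec_bit(os.stat(path).st_mode) for a real file.
print(has_exec_bit(0o755), has_exec_bit(0o644))
print(has_exe_extension("setup.exe"), has_exe_extension("setup"))
```

With the extension approach, a directory listing is enough to know what a file claims to be; with the permission-bit approach, you have to go ask the filesystem.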

There are a lot of latent extensionless files floating around, especially on the software side of the traditional Unix ecosystem - Makefile, README, CHANGELOG, etc.  But these files are ambiguous to an external viewer, who needs more information to determine what each one is meant to be.  The default is just to treat it like a text file and hope it matches your default encoding, which nowadays is UTF-8.  And if it doesn't?  I guess you get gibberish.
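
Here is a small sketch of what "gibberish" looks like in practice.  The sample bytes are an assumed example (the word "café" saved by a Latin-1 editor), and read_as_text is a hypothetical helper, not anything a real tool exposes.

```python
# Bytes of "café" as a Latin-1 editor would write them (assumed example).
raw = "café".encode("latin-1")  # b'caf\xe9'

def read_as_text(data: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes, substituting U+FFFD for anything that is not valid."""
    return data.decode(encoding, errors="replace")

guessed = read_as_text(raw)             # the "just assume UTF-8" default
correct = read_as_text(raw, "latin-1")  # what the file actually was

print(repr(guessed))
print(repr(correct))
```

Nothing in a file called README tells you which of the two calls is the right one; you only find out when the output looks wrong.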

In Linux space, a lot of files have their type distinguished by metadata in either the filesystem or the file header.  You hope that it exists and makes sense, because the name of the file itself doesn't give you any clues.  If you don't recognize the metadata syntax of a file, you might misinterpret it as something else.  When you don't even know whether it is supposed to be interpreted in some encoding or as binary, you have little assurance to stand on.
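
The header approach boils down to sniffing the first few bytes for a known signature.  Below is a minimal sketch; the signatures for ELF, PNG, JPEG and PDF are the real ones, but the table is tiny and sniff_type is a name I made up for illustration.

```python
from typing import Optional

# A few well-known leading byte signatures ("magic numbers").
MAGIC = {
    b"\x7fELF": "elf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"%PDF-": "pdf",
}

def sniff_type(header: bytes) -> Optional[str]:
    """Return a type name if the header starts with a known signature."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return None  # unrecognized: text? some encoding? binary? no idea

print(sniff_type(b"\x7fELF\x02\x01\x01"))
print(sniff_type(b"all: build test\n"))
```

The second call is the problem case: a Makefile has no signature at all, so the sniffer can only shrug and leave the guessing to you.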

Of course, I can rename a .txt to .jpg or vice versa and get gibberish.  But I can, in software, try casting FLOAT_MAX to an int and get an overflow.  Every sound security policy can be broken.  But when you don't make assumptions, you maintain more clarity of purpose.  If I ever designed a file system, I would absolutely mandate that all files be statically typed with something akin to a file extension.  (I would argue, though, that the traditional foo.bar syntax is unintuitive, since a dot or decimal point traditionally means either "a property of" or a fractional component; it doesn't naturally mean "type of".)  I'd rather see it written foo:bar; a colon is a much better delegate sigil.  And I would enforce it - do what Windows does: if a file is extensionless / typeless, bitch about it.  Because you shouldn't be making assumptions.
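
As a rough sketch of what that mandate might look like, here is a name check for the foo:bar scheme.  Everything in it is hypothetical: no real filesystem works this way, and split_typed_name and UntypedFileError are names invented for this example.

```python
from typing import Tuple

class UntypedFileError(ValueError):
    """Raised when a name carries no explicit type."""

def split_typed_name(name: str, sigil: str = ":") -> Tuple[str, str]:
    """Split 'foo:bar' into ('foo', 'bar'); complain about typeless names."""
    base, sep, ftype = name.rpartition(sigil)
    if not sep or not base or not ftype:
        raise UntypedFileError(f"{name!r} has no type; refusing to guess")
    return base, ftype

print(split_typed_name("holiday:jpg"))

try:
    split_typed_name("README")
except UntypedFileError as err:
    print(err)
```

The point of raising instead of falling back to "probably text" is exactly the one above: the filesystem never gets to make an assumption on your behalf.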
