2012/10/12

Software Rants 2: Useful File Abstractions

Since my plans to explore Plan 9 came crashing down when the live CD wouldn't even boot (in 386 mode even!), I'm going to tackle my thoughts on what Plan 9 is all about - files.  And distributed systems.  But mostly files.

Since I can't actually boot the thing, I'm going off videos I watched, tutorials I read on the site, and a professor's PDF beginner's manual for Plan 9.

So the system has a monolithic kernel, and that kernel controls the filesystem because the filesystem is so integral to the running of everything.  Device drivers live in the kernel and are exposed as file systems, so I could rant about how microkernels are great, but this ain't necessarily the time.  The kernel of Plan 9 is less interesting than what you can do with it.

So I think procfs, the /proc filesystem, is one of the best things ever.  Exposing devices, hardware interfaces, and readable / writable buffers to interact with is amazing.  One of the best things Linux took from the Research Unix and Plan 9 lineage was the idea behind procfs.  Being able to do what would otherwise be system calls through files (since you are already blocking on reads and writes) is great.
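To make the "system information as files" idea concrete, here is a minimal Python sketch that reads a process's status through the filesystem instead of a dedicated system call.  It assumes the "Key: value" layout that Linux uses for /proc/<pid>/status (Plan 9's status file is laid out differently), and the helper names parse_status and read_status are made up for illustration.

```python
from pathlib import Path


def parse_status(text):
    """Parse 'Key:<tab>value' lines, the layout Linux uses for /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that has no colon
            fields[key.strip()] = value.strip()
    return fields


def read_status(pid="self", proc_root="/proc"):
    """Read and parse the status file for one process (assumes a Linux-style /proc)."""
    return parse_status(Path(proc_root, str(pid), "status").read_text())


if __name__ == "__main__":
    # Only meaningful where a Linux-style /proc is mounted.
    if Path("/proc/self/status").exists():
        info = read_status()
        print(info.get("Name"), info.get("State"))
```

The point is that nothing here is special: the same open and read that work on a text file on disk also work on live kernel state.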

The /net directory is also amazing.  Exposing sockets and the TCP and UDP protocols under a filesystem is just the natural way to do things.  You already read and write to sockets and streams, so having them exist in the VFS seems obvious, but only Plan 9 really commits to it.  Which is sad.  Because the way Plan 9 does it is amazing.  Open a new connection on port 80?  You open /net/tcp/clone and get a numbered connection directory with its own ctl and data files.  Since each process has its own namespace view of the filesystem, you don't have name conflicts.  This has a downside I'd imagine though, since debugging an application whose files are not obviously exposed to the user through other programs might be problematic.  It seems like a trivial problem to solve, since /proc could just provide mount points for the file systems of other running programs if the program and the OS give that application permission.
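Here is a rough Python sketch of that clone / ctl / data dance, as described in the Plan 9 documentation: open /net/tcp/clone, read back the connection number, write a connect message with the address in host!port form, then open the data file.  Since I can't run it on the real thing, the demo runs against an ordinary directory standing in for /net; dial_tcp and make_fake_net are hypothetical names for illustration, not a Plan 9 API.

```python
import os
import tempfile


def dial_tcp(addr, net_root="/net"):
    """Follow the Plan 9 clone/ctl/data sequence and return (ctl, data) file objects."""
    # Opening clone reserves a connection; the open handle then acts as its ctl file.
    ctl = open(os.path.join(net_root, "tcp", "clone"), "r+b", buffering=0)
    conn = ctl.read(32).decode().strip()       # connection number, e.g. "4"
    ctl.write(f"connect {addr}".encode())      # addr is written as host!port
    # Keep ctl open while opening data so the connection is not released in between.
    data = open(os.path.join(net_root, "tcp", conn, "data"), "r+b", buffering=0)
    return ctl, data


def make_fake_net(root, conn="4"):
    """Build a stand-in /net tree out of ordinary files (a test fixture, not real Plan 9)."""
    os.makedirs(os.path.join(root, "tcp", conn))
    with open(os.path.join(root, "tcp", "clone"), "w") as f:
        f.write(conn)
    open(os.path.join(root, "tcp", conn, "data"), "w").close()


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    make_fake_net(root)
    ctl, data = dial_tcp("10.0.0.1!80", net_root=root)
    print(data.name)  # path of the data file for the reserved connection
    data.close()
    ctl.close()
```

On a real Plan 9 system, reading and writing the data file is the TCP stream; in the stand-in tree it is just an empty file, which is enough to show the sequence of opens, reads, and writes.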

So that brings me to what I think are useful file abstractions - files on disks, obviously.  File systems to abstract physical disks and partitions from one another, absolutely.  Hardware information as live generated files to profile the environment under /proc, delightful.  The idea behind /net, having files for sockets, ports, and protocols, astonishing.  Hardware devices, such as the sound card, as readable and writable devices (maybe even providing their own file systems to control system calls and behavior), revolutionary.

Interprocess communication through file systems also works nicely with namespaces.  A process can request to share some portion of the visible file space of another process, and they can interact privately through files.  You can open sockets to one another too.  9P is amazing because it doesn't care if your file request is local or remote - which I feel is an abstraction sorely lacking in every modern OS.  We already give localhost the address 127.0.0.1 (such an arbitrary IP) and already have some processes opening socket connections to localhost for interprocess communication.  I'd figure almost anything is better than arbitrary protocols like D-Bus or more generalized RPCs.  Sharing serialized text data is really easy over a network or a file system, but harder over a signaling schema.  Signals are another interesting prospect, because you could provide per-process signaling folders where you write to a signal file and the operating system delivers a signal to the process in question with the payload you wrote.  That is amazing.
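Plan 9's /proc already has something close to this: each process directory contains a note file, and writing text to it posts that text to the process as a note.  A minimal Python sketch of the idea, run against a stand-in directory because a Linux /proc has no note file; post_note is a hypothetical helper name, and the proc_root parameter exists only so the demo can point at a temporary directory.

```python
import os
import tempfile


def post_note(pid, message, proc_root="/proc"):
    """Write a text payload to <proc_root>/<pid>/note, Plan 9 style, and return the path."""
    note_path = os.path.join(proc_root, str(pid), "note")
    # One unbuffered write, so the payload goes out as a single message.
    with open(note_path, "wb", buffering=0) as note:
        note.write(message.encode())
    return note_path


if __name__ == "__main__":
    # Stand-in for /proc: an ordinary directory with one fake process, pid 42.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "42"))
    path = post_note(42, "interrupt", proc_root=root)
    with open(path) as f:
        print(f.read())
```

The design choice worth noticing is that the payload is plain text, so the same write works from a shell redirect, a local program, or a remote machine that has the directory mounted.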

The idea of having distributed components of the operating system in Plan 9 is also neat.  I'm only saying neat here, because from what I can tell they all exist in kernel space with the OS as a giant sludge of unnecessary vulnerability, which is a little off-putting - why the hell is rio, the window manager, running in the OS?  All a kernel needs to do is provide abstractions.  It abstracts processes by allowing for scheduled execution.  It provides virtual memory so applications can exist in their own sandboxes.  It provides the means to hook into and interact with hardware devices (or in Linux's case, just handles the drivers itself, and provides software devices to interact with).  That really should be it.  If you have something that starts, runs a scheduler, starts a memory manager, and provides privileged access to control and interact with the devices it has available to it, you have all a kernel need do.

Nothing stops a sufficiently well designed kernel from giving its init payload the ability to hook in userspace software drivers for even very low-level hardware components, given certain privileges.  I really think that the problem with kernels, security, and a lot of other modern computer issues is the failure to ever conceive a really sound execution privilege hierarchy.  Linux has group permissions, but the kernel isn't providing per-device privileges for access or manipulation.

This isn't about file system privileges - groups cover that fairly well, although restricting the creation of groups to root seems obtuse.  It is about device and interface privileges, allowing some programs to hook into devices and some to be restricted in what they can interact with.  Maybe some executable, made by a privileged user, shouldn't have read access to certain folders?  Maybe it shouldn't have write access?  Maybe you don't want it to see the printer as a device file?  That kind of extremely fine-grained, per-process execution control is lacking, at least in a concise way, and that limits the security model of the modern operating environment significantly.  Plan 9 and its process namespaces do an amazing lot to move towards solving that problem, since you can decide who gets access to what folders and files based on the creator's desires and its own privileges.

So Plan 9 is awesome, but Plan 9 is ugly.  It scraps the C standard in many ways, and still uses C for everything.  It doesn't embrace static file typing, which I really feel would make an everything-in-the-filesystem approach much more palatable to people, since they could easily tell what any given file is supposed to be just from the file extension.  The ability to host pieces of the OS as servers is great, but many things shouldn't be running in such a privileged state when it isn't necessary, and as I argued, many things can be limited to userspace just fine.
