2012/10/02

Filesystems Ranting

I've been playing around for a few days learning the various virtualization solutions on Linux. After some past experience with qemu, I finally jumped on the VirtualBox bandwagon. I had really been missing out on a lot of features! I plan to try out KVM next, since it is kernel based and promises good performance, and since using anything from Sun leaves me hesitant now that Oracle is decimating all their OSS projects.

Anywho, virtualization is awesome, hard, and way beyond my intellect in general.  A lot of the meta elements of software blow my mind at their complexity, and what is effectively a machine code JIT when not running the same instruction set definitely is up there.  I want to complain a bit about the commonplace Linux VFS structure, because it makes my head hurt. 

Let us start with the greatest insult to the world - usr.  The only reason I can look at my root directory without crying is that usr would make a great mount point for user space applications, if opt weren't already used by half of commercial software.  But that is just the worst offender; let us look at the myriad of folders in / on my Ubuntu 12.04 partition (I'll switch to Arch soon, but I wasn't that hardcore 2 months ago).

bin, dev, boot, etc, cdrom, home, lib, lib32, lib64, lost+found, media, mnt, opt, proc, root, run, sbin, selinux, srv, sys, tmp, usr, var.  The whole usr and bin / sbin debacle is explained elsewhere in more respectable vocabulary so I won't even get into that, but it is stupid.  If the world made sense we would have system wide applications and user wide applications as the only application subdivisions needed.  We don't, so we have bin, sbin, usr/bin, usr/sbin, usr/share/bin, opt, and some crap that installs itself in bizarre places like ~/.app.  So that sucks.

cdrom is rubbish because a CD drive should be a device that you treat like a freaking device and mount like a device.  It should be under /mnt or /media.  Those two suck as well, because mnt is supposed to be for device mounts that are not a part of the OS FS and media is supposed to be for removable media.  Too bad 12.04 defaults to everything being freaking removable media, including SAS and SATA drives.  That is mostly GNOME's fault though, since Ubuntu is using its hardware detection defaults.

Now, Windows for comparison isn't much better.  Having the top level directory be all devices and having special behavior at the top level like it does makes it a leaky abstraction to no end.  The beauty of Linux is that it can mount arbitrary devices, be they network servers, USB sticks, or RAID drives, on mount points and abstract away the pains of D:\ and C:\.  My E: drive in Windows, for example, is /media/Storage in Ubuntu.  Which is slick.  The non-slick part is that devices (and not dev, that is different!) are by default mounted in media, but Ubuntu keeps mnt around (and I do too, it is more grammatically accurate about SATA drives than media) for posterity's sake.

Speaking of posterity, bin and sbin are rubbish everywhere.  Superuserbin or whatever you want to call it is absurd, because every Linux FS is advanced enough to have per-file permissions, which is what you *really* want with applications.  The abstraction of two folders, one for everyone and one for root, only works until you want tiers of privileged application users, and then that abstraction breaks down.  Also, most distros put sbin on the user's path anyway, even if the user can't run anything in there, so that they know blkid and fdisk require root permissions, which defeats the performance benefit of skipping a bunch of binaries a user can never run in the first place.
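
To make the per-file permissions point concrete, here is a minimal Python sketch that reads the execute bits on a single file.  The fake-fdisk file and the 0o750 mode are made up for illustration; the idea is only that an "admin group can run this, nobody else can" tier lives on the file itself, with no second directory needed.

```python
import os
import stat
import tempfile

def describe_exec_access(path):
    """Return which permission classes have the execute bit set on `path`."""
    mode = os.stat(path).st_mode
    return {
        "owner": bool(mode & stat.S_IXUSR),
        "group": bool(mode & stat.S_IXGRP),
        "other": bool(mode & stat.S_IXOTH),
    }

def demo():
    # Hypothetical admin-only tool: owner and group may execute it,
    # everyone else may not, regardless of which directory it sits in.
    with tempfile.TemporaryDirectory() as tmp:
        tool = os.path.join(tmp, "fake-fdisk")
        with open(tool, "w") as f:
            f.write("#!/bin/sh\necho pretend partitioning\n")
        os.chmod(tool, 0o750)
        return describe_exec_access(tool)
```

Putting different groups on different binaries gives you as many tiers as you like, which is exactly what the bin / sbin split can't express.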

lib, lib32, and lib64 are just hilarious.  Really.  var is another interesting one, since variable data goes there.  When I traced the folder for all files, pretty much all of them were under /var/lib or were the logs.  So application data was being stored there.  Seems dumb to dedicate a top level directory to it.  Just a reminder, usr has every one of these in duplicate because who wants to make sense.  I would imagine applications would rather store logs and vars user local with some top level system default for multi-user applications, but not in the root directory.
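
If you want to repeat that trace of /var yourself, something along these lines should do it.  This is a rough sketch and not the exact method used for the numbers above; count_files_by_subdir is a name invented here, and it simply tallies directory entries that are not themselves directories, grouped by the first folder below the root you give it.

```python
import os
from collections import Counter

def count_files_by_subdir(root):
    """Count non-directory entries under `root`, grouped by first-level subdirectory.

    Entries sitting directly in `root` are counted under ".".
    Directories that cannot be read are silently skipped by os.walk.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts
```

Calling count_files_by_subdir("/var").most_common() lists the heaviest subdirectories first.  Run it as root if you want the unreadable directories counted too.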

lost+found is really, really nuts.  A directory just for lost file recovery.  That is always there.  You couldn't have /media/lost+found or something, it needs its own top level directory that never gets used, because ext4 doesn't blow chunks as much anymore and anything lost at this point is hardware failure.  Which is good.

Anywho, I'll propose what I wish I had as my top level -

users
system
applications
devices
network

The end.  Users can contain root or superuser, and any other users made.  I'd even wager you could get rid of applications and put it in /users/all/applications or something; it would hold applications that are installed for all users, with execution permissions for everyone and write access only in their own folders (so they can modify their state, but the user running them can't arbitrarily modify an application's configuration if it doesn't want them to).  Devices would be physically attached media, be it non-mounted physical drives, USB sticks, or CD drives; they are all the storage devices that are not a part of the file system at boot and that are auto-mounted (if they are).  Network would be an abstracted file system like proc that contains network devices.  People like seeing those.  Might even put those in devices and be really pretty about it.  System would be a bulk folder containing most of what goes in / right now - I could imagine system being something like:

status (renamed proc)
dev (the old /dev raw device files)
lib
log
var
tmp
bin
sbin
boot
cfg
headers (from /usr/include)
dump (alternate lost+found)

Of course I'd love to see Plan 9 style system calls in here too.  Point is, right now we have the equivalent of the M$ Windows folder splayed over other important concepts, and this would collect it in one place.  Nothing stops you from doing the traditional VFS thing and mounting different parts of the filesystem on different disks.  The only downside here is that everything needs to be able to see inside the system folder (you'd want to hide it in a file browser from your average joe user), but if you don't put scary important stuff directly in /system you should be fine, and you can still restrict viewing of the subdirectories appropriately (i.e., everyone can read / write files they create in tmp, var, and log, but not other applications' stuff unless they have super permissions).  lib would hold libraries of course, but I would imagine we could be sane and have a file nomenclature for libs that makes sense, like glibc-x86-64-3.332.so, like the kernel is named.
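
For the curious, the proposed layout can be sketched as a lookup table from today's paths to the new ones.  This is only an illustration in Python: the table name and translate function are invented here, and several rows are guesses the lists above don't spell out (etc becoming cfg, /var/log becoming log, home landing under users, media and mnt landing under devices).

```python
# Hypothetical mapping from current top-level paths to the proposed layout.
# Rows marked "guess" are assumptions, not something the proposal states.
LEGACY_TO_PROPOSED = {
    "proc": "system/status",
    "dev": "system/dev",
    "lib": "system/lib",
    "var/log": "system/log",      # guess: logs split out of var
    "var": "system/var",
    "tmp": "system/tmp",
    "bin": "system/bin",
    "sbin": "system/sbin",
    "boot": "system/boot",
    "etc": "system/cfg",          # guess: cfg replaces etc
    "lost+found": "system/dump",
    "home": "users",              # guess
    "root": "users/root",
    "media": "devices",           # guess
    "mnt": "devices",             # guess
}

def translate(path):
    """Rewrite an absolute legacy path using the longest matching prefix."""
    parts = [p for p in path.split("/") if p]
    for n in range(len(parts), 0, -1):
        prefix = "/".join(parts[:n])
        if prefix in LEGACY_TO_PROPOSED:
            return "/" + "/".join([LEGACY_TO_PROPOSED[prefix]] + parts[n:])
    return path  # anything unmapped (usr, opt, ...) is left alone
```

So translate("/proc/cpuinfo") gives "/system/status/cpuinfo" and translate("/var/log/syslog") gives "/system/log/syslog", while anything under usr comes back unchanged, since sorting out usr is a separate fight.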

Let's talk about lib a bit more.  For one, if I ever implemented this FS, it would be on something running 64 bit anyway.  You still have a glut of 32 bit applications without 64 bit versions (it isn't as much of a problem under Linux, since most stuff is FOSS and you can recompile it in 64 bit anyway), and there are enough such programs to make it an issue.  Of course, they all expect /lib or /usr/lib or usr/share/lib and 32 bit libraries anyway, so you can stick whatever old-nomenclature .so files they use in /system/lib and let the linker deal with the conflict.  I can't justify having separate 32 and 64 bit folders when 32 bit should be going by the wayside soon and you can just deal with some old-syntax libraries cluttering up your libs.

Anywho, the idea is that you abstract away the operating system internals from the file system so average joe doesn't get scared when they open / and see everything.  Hell, you could skip the whole /system thing and put the internals at the top level - it is better than the usr mess we have now.

Of course, I also need to mention that what is in place now works, and has worked for years, and was the product of years of iteration on old and new concepts.  The mess is only superficial because nobody is supposed to even be looking at / anymore - you want your average joe to think ~ is the world and devices are magically showing up as they attach them.

I'd still argue that is dumb, because files and file hierarchies are some of the best abstractions in computing and trying to hide them makes people dumber, but to each their own, when I have my own distro and influence I'll shove people around :P
