2013/06/24

Magma Rants 5: Imports, Modules, and Contexts

One of the largest issues in almost any language is the trifecta of imports, packaging, and versioning. For Magma, I want a well-thought-out design that enables portable, compartmentalized code, interoperability between code, and the ability to import both precompiled and compiled object code.

First, we inherit the nomenclature of import <parent>:<child>, where code refers to such a module internally through the same <parent>:<child> namespacing. Imports are resolved by searching the filesystem, first locally (with a compiler-limited depth, and a blacklist and whitelist available), then on the system's import and library paths. You can never import a static filesystem object by full pathname with the import clause, but the internal plumbing in std:module includes the necessary woodwork to do raw module loading.
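To make that search order concrete, here is a rough sketch in Python (Magma has no compiler to show it in). Everything named here is an assumption for illustration: the .magma source suffix, the resolve_import function, and the idea that the blacklist and whitelist match directory names.

```python
import os
from pathlib import Path

SOURCE_SUFFIX = ".magma"  # hypothetical: the post never names a source suffix


def resolve_import(name, local_root, system_paths, max_depth=2,
                   blacklist=(), whitelist=None):
    """Return the first file matching `parent:child`, or None if not found."""
    if "/" in name or os.sep in name:
        # The import clause never accepts a filesystem path.
        raise ValueError("imports must be <parent>:<child>, not a path")
    relative = Path(*name.split(":")).with_suffix(SOURCE_SUFFIX)

    # 1. Local search, limited in depth and filtered by the two lists.
    local_root = Path(local_root)
    for current, dirs, _files in os.walk(local_root):
        depth = len(Path(current).relative_to(local_root).parts)
        # Editing dirs in place tells os.walk which subdirectories to skip.
        dirs[:] = sorted(
            d for d in dirs
            if depth < max_depth
            and d not in blacklist
            and (whitelist is None or d in whitelist)
        )
        candidate = Path(current) / relative
        if candidate.is_file():
            return candidate

    # 2. Fall back to the system import and library paths, in order.
    for base in system_paths:
        candidate = Path(base) / relative
        if candidate.is_file():
            return candidate
    return None
```

A call like resolve_import("gfx:mesh", "project", ["/usr/lib/magma"]) would check the project tree first and only then the system path (both paths made up for the example).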

The traditional textual headers and binary libraries process still works. You don't want to bloat deployment libraries with development headers, though if possible I'd make it an option. Magma APIs, with the file suffix of .mapi, are the primary way to provide an abstract view of a library implementation.

In general practice though, we want to avoid the duplicated work of writing headers and source files for every part of a program just to speed up compile times. That is mostly a build system problem, in that you want to record (via hash) a historic versioning of each module, so if its hash changes you know to recompile it. This means you should write APIs for libraries or externalized code - which is what a C++ header really should be for.
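The hash check is simple enough to sketch. This is a minimal Python illustration, not a build system design: the JSON manifest, the stale_modules name, and the choice of SHA-256 are all my assumptions.

```python
import hashlib
import json
from pathlib import Path


def file_hash(path):
    """SHA-256 of a module's bytes (the hash choice is an assumption)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def stale_modules(sources, manifest_path):
    """Return the sources whose hash changed since the last recorded build.

    The manifest is a hypothetical JSON file mapping path -> hash. It is
    rewritten on every call, so the next call compares against this run.
    """
    manifest_path = Path(manifest_path)
    recorded = {}
    if manifest_path.exists():
        recorded = json.loads(manifest_path.read_text())

    current = {str(p): file_hash(p) for p in sources}
    stale = [p for p in current if recorded.get(p) != current[p]]

    manifest_path.write_text(json.dumps(current, indent=2))
    return stale
```

On a first build every module comes back as stale; after that only the modules whose contents changed do, and those are the ones to recompile.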

In addition, an API only describes public member data - you don't need to describe the memory layout of an object in an API so that the compiler can resolve how to allocate address space, you just specify the public accessors. When you compile a shared object, the public accessors are placed in a forward table that a linker just needs to import. Note that since a library can contain multiple API declarations in one binary, the format also has a reference table to the API indexing arrays.

The workflow becomes one of importing APIs where needed, and using compiler flags and environment variables to search for and import the library describing that API. One interesting prospect might be to go the other way - to require that compiled libraries be named the same as their APIs, and to have one API point to one binary library with one allocator table. It would mean a lot of smaller libraries, but that actually makes some sense. It also means you don't need a separate linker declaration, because any imported API will have a corresponding (for the linker's sake) compiled binary of the same name in the library search path.
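With the one-API-one-library rule, finding what to link is just a name match. A tiny Python sketch, assuming a made-up .mlib suffix for compiled libraries (only .mapi comes from this post) and a search path passed in as a plain list:

```python
from pathlib import Path

API_SUFFIX = ".mapi"      # from the post
LIBRARY_SUFFIX = ".mlib"  # hypothetical: no binary suffix has been decided


def library_for_api(api_file, search_path):
    """Find the compiled library that shares its name with an API file."""
    api_file = Path(api_file)
    if api_file.suffix != API_SUFFIX:
        raise ValueError(f"{api_file} is not a {API_SUFFIX} file")

    wanted = api_file.stem + LIBRARY_SUFFIX
    for directory in search_path:
        candidate = Path(directory) / wanted
        if candidate.is_file():
            return candidate
    return None  # nothing to link against: the build should fail here
```

No linker flags name the library; importing gfx.mapi is the whole declaration.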

I really like that approach - it also introduces the possibility of delayed linking, so that a library isn't linked in until it's accessed, akin to how memory pages work in virtual memory. You could also have asynchronous linking, where accessing the library's facilities before it is pulled into memory causes a lock. Maybe an OS feature?
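Delayed linking with a lock on first access might look something like this Python sketch. The LazyLibrary class and the loader callback are stand-ins I made up; a real version would live in the linker or the OS, not in user code.

```python
import threading


class LazyLibrary:
    """Defers 'linking' a library until one of its symbols is first used."""

    def __init__(self, name, loader):
        self._name = name
        self._loader = loader        # hypothetical: returns {symbol: object}
        self._lock = threading.Lock()
        self._symbols = None         # None means not linked yet

    @property
    def linked(self):
        return self._symbols is not None

    def lookup(self, symbol):
        if self._symbols is None:
            # Callers that arrive while linking is in progress block here.
            with self._lock:
                if self._symbols is None:  # another thread may have finished
                    self._symbols = self._loader(self._name)
        return self._symbols[symbol]
```

Nothing is loaded when the LazyLibrary is created; the first lookup pays the cost, and every later lookup goes straight to the symbol table.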

As a thought experiment I'm going to document what I think are all the various flaws in modern shared object implementations and how to fix them in Altimit / Magma:

  • You need headers to a library to compile with, and a completely foreign binary linkable library or statically included library to link in at build or run time.
  • You need to describe the complete layout of accessible objects and functions in a definition of a struct or class, so that the compiler knows the final size of an object type.
  • You need to make sure the header inclusions and library search path contain the desired files, even on disparate runtime environments.
  • Symbol tables in binaries can be large and cumbersome to link at runtime and can create sizable load times.
