This is a deep subject, one that challenges the way many programmers think. It is rooted in the most fundamental assumptions about the way computers work and the way we program them. In my experience, the ideas first strike people as simple, obvious, and uninteresting. They seem irrelevant to anything important, or even like far-out crank talk.
One way to approach this is to imagine that you stop a computer and examine its memory, byte by byte. At one level, it’s all data. By definition, what’s stored in a computer’s memory is data. But in practical terms, every single byte falls into one of two categories: (1) “plain” data, and (2) instructions. Instructions are placed or loaded into a computer’s memory like any other data, but unlike plain data, the computer can be “pointed” to a starting address and told to start executing the data that is there, and good things will result if the data conform to the rules for instructions.
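To make that concrete, here is a minimal sketch that uses the Python interpreter as a stand-in for the machine. It is an analogy at the interpreter level rather than the raw byte level: a piece of source text sits in memory as ordinary data until we point the interpreter at it and tell it to start executing.

    # The string below is plain data: characters sitting in memory.
    source = "total = sum(range(10))\nprint('total =', total)"

    # As data, we can measure it, slice it, or copy it like anything else.
    print(len(source), "characters of plain data")

    # Now treat the same text as instructions: compile() turns it into a
    # code object, and exec() starts executing at its beginning.
    code_object = compile(source, "<in-memory>", "exec")
    exec(code_object)   # prints: total = 45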
Achieving Occamality through definitions essentially amounts to dividing the computer memory into three categories: in addition to instructions and “plain” data, you add “meta-data.” If you wrote a program with just data and instructions, you would have a certain amount of space devoted to each. If you write a program using meta-data, the data stays pretty much the same, but the space devoted to instructions typically shrinks by a great deal, and you have a good deal of space devoted to meta-data.
It is important to note that the meta-data should not be in the form of tokens, and while the instructions in some sense “interpret” the meta-data, the meta-data should not primarily be a directive, imperative program.
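As an illustration (a hypothetical sketch in Python, not drawn from the original), consider a shipping-cost routine. Written with instructions alone, it is a chain of special cases; written with meta-data, the special cases become a descriptive table of facts, and the instructions shrink to one small, generic loop that interprets it. Note that the table is declarative: it states what the rates are, not what to do.

    # The "all instructions" version would be a chain of special cases:
    #
    #     if weight <= 1:    return 3.00
    #     elif weight <= 5:  return 7.50
    #     elif weight <= 20: return 14.00
    #     ...
    #
    # The meta-data version moves the special cases into a description.

    RATE_TABLE = [
        # (maximum weight in lbs, price in dollars) -- invented figures
        (1,  3.00),
        (5,  7.50),
        (20, 14.00),
    ]

    def shipping_cost(weight):
        """The only instructions left: walk the declarative rate table."""
        for max_weight, price in RATE_TABLE:
            if weight <= max_weight:
                return price
        raise ValueError("no rate defined for this weight")

    print(shipping_cost(3))   # 7.50

Adding a new weight band now changes only the description, not the instructions; the three-line table can grow to hundreds of entries while the code stays the same size.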
In practical reality, what you do is create a descriptive, declarative language that is as close to the problem domain as possible. There should be a minimum of translation between the “natural” way of thinking about the problem and the way the problem is expressed in the language. A good example is a navigation system as described here – the meta-data describes the map in natural terms. If you understand language concepts at all, you will see that there is probably a one-to-one relation between elements on a visual map and elements in the map description language.
Of course, even a highly declarative approach has a directive aspect to it. My intention here is to emphasize what is usually ignored, not to present an either/or. For example, while a program that gives directions mostly consists of the map meta-data, the direction-generator itself can usually be built in a partly imperative and partly declarative fashion, and needs some plain old parameters, things like whether to avoid toll roads or interstate highways.
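Here is a hedged sketch of that split in Python, with invented place names, road names, and distances. The ROADS table is the declarative meta-data, written in roughly the terms a printed map uses; the directions() function is the imperative direction-generator, steered by a plain parameter such as avoid_tolls.

    from heapq import heappush, heappop

    # Declarative map meta-data: which roads join which places.
    ROADS = [
        # (from,        to,           road name,  miles, toll?)
        ("Elm & 1st",  "Elm & 2nd",  "Elm St",    0.4,   False),
        ("Elm & 2nd",  "Oak & 2nd",  "2nd Ave",   0.3,   False),
        ("Elm & 1st",  "Oak & 2nd",  "Turnpike",  0.5,   True),
    ]

    def directions(start, goal, avoid_tolls=False):
        """The imperative part: a shortest-path search over the map data."""
        # Build an adjacency list from the road descriptions.
        neighbors = {}
        for a, b, name, miles, toll in ROADS:
            if avoid_tolls and toll:
                continue
            neighbors.setdefault(a, []).append((b, name, miles))
            neighbors.setdefault(b, []).append((a, name, miles))

        # Dijkstra's algorithm: cheapest route by total mileage.
        queue = [(0.0, start, [])]
        visited = set()
        while queue:
            dist, place, route = heappop(queue)
            if place == goal:
                return dist, route
            if place in visited:
                continue
            visited.add(place)
            for nxt, name, miles in neighbors.get(place, []):
                heappush(queue, (dist + miles, nxt, route + [f"take {name} to {nxt}"]))
        return None, []

    print(directions("Elm & 1st", "Oak & 2nd"))                    # via the Turnpike
    print(directions("Elm & 1st", "Oak & 2nd", avoid_tolls=True))  # via Elm St and 2nd Ave

The point is the proportion: as the map grows from three roads to thousands, the meta-data grows with it, while the generator stays the same few dozen lines.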
This is not a left-field approach to programming. In fact, every method of programming computers, with the arguable exception of “raw” assembler language, starts with a “model” of the kind of program you are going to write, and a choice of how to think about that program.
One of the earliest approaches to rising above raw assembler was the language FORTRAN, short for “formula translation,” derived from the fact that its creators were scientists and engineers who wanted to put their mathematical formulas into the machine for solution.
The business programmers who wanted to use the machine for common business record-keeping functions didn’t find FORTRAN particularly helpful. So COBOL, an acronym for Common Business-Oriented Language, was invented.
While it’s reasonable to think about FORTRAN, COBOL, and the various languages that succeeded them in terms of language, and this has often been done, it is also reasonable to ask: When I sit down with language X, what kind of problem does it help (or hurt) me to think about? How much “translation” is required between the natural terms of thinking about the problem and the terms of the language? Do the very terms of the language talk about the things I think about? If not, you probably have an opportunity to create a definition domain in which to express your problem, and to write a much shorter program than you would otherwise need to get the job done.
I’m happy to say that the approach I advocate here is very close to what many people call “model-based” programming. It is ironic that this is so hard to describe, instead of just being mainstream common sense. When you think about a body of code that solves a problem, it is normally possible to write the code for each abstract function exactly once, no matter how many times the program performs it, and then reduce the things the program does uniquely to a set of meta-data. Most programmers are pretty comfortable taking this approach when defining the schema for a database-style problem. The model-based approach is really little more than taking the concept of a database schema and greatly generalizing it.
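For instance (a hypothetical sketch; the table and column names are invented), the same descriptive schema can drive more than one generic routine: one interprets it as a storage definition, another as a data-entry form layout. Neither routine knows anything about customers in particular.

    # One declarative schema, interpreted by several generic routines.
    SCHEMA = {
        "table": "customer",
        "columns": [
            {"name": "id",    "sql": "INTEGER PRIMARY KEY", "label": "Customer #"},
            {"name": "name",  "sql": "TEXT NOT NULL",       "label": "Full name"},
            {"name": "email", "sql": "TEXT",                "label": "E-mail"},
        ],
    }

    def create_table_sql(schema):
        """Interpret the schema as a database storage definition."""
        cols = ", ".join(f"{c['name']} {c['sql']}" for c in schema["columns"])
        return f"CREATE TABLE {schema['table']} ({cols});"

    def entry_form(schema):
        """Interpret the same schema as a (toy) data-entry form."""
        return [f"{c['label']}: ____" for c in schema["columns"]]

    print(create_table_sql(SCHEMA))
    print("\n".join(entry_form(SCHEMA)))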