Someday, there will be tools that actively help you build occamal software. I imagine that the tools will resemble a modern IDE, but will have assists, wizards, visual representations and other methods of helping you see actual and potential commonalities. In addition, there will be common components, both part of the development environment and part of the execution environment, which will make building occamal software easy and natural. Until such tools and components exist, the work and imagination of the developer will have to fill the gap. Remembering that we live in the real world, our goal is not to build software that is perfectly occamal; software that is nearly occamal would be wonderful, and software that is much more occamal than today’s software would be a big improvement.
A good way to think about occamality is the elimination of redundancy. It may be useful to put redundancy into categories that vary by blatancy. So we can identify:
- Simple redundancy
Simple redundancy is really, really blatant redundancy. There’s not much excuse for it, although some programming environments are shockingly encouraging of it. A great deal of the Y2K problem was due, sadly enough, to simple redundancy. Most of the relevant programs were written before databases were in common use, and there was in any case no native support for the “date” data type in the programming environments used; in the vast majority of cases, no one bothered to take the simple steps required to create the then-equivalent of a date data type. During the seventies, instead of eliminating simple redundancy, people spent their time engaged in fierce ideological arguments about whether “structured” programming represented an advance in program quality and programmer productivity or whether it was just a bunch of hoo-hah. The extent and depth of the Y2K problem told us the answer.
Simple redundancy is normally cured by replacing the redundant instances with references to a definition of whatever it is that they have in common.
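As a minimal sketch of that cure, consider the two-digit-year handling behind much of the Y2K problem. The names below (`PIVOT_YEAR`, `expand_year` and the two callers) are hypothetical, chosen only for illustration, and the pivot rule itself is an assumption. The point is simply that the rule for reading a year lives in one definition and everything else refers to it.

```python
# Hypothetical illustration: one definition of how a two-digit year is read.
PIVOT_YEAR = 50  # assumption: 00-49 means 20xx, 50-99 means 19xx


def expand_year(two_digit_year: int) -> int:
    """Turn a two-digit year into a four-digit year using the single pivot rule."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a year between 0 and 99")
    century = 2000 if two_digit_year < PIVOT_YEAR else 1900
    return century + two_digit_year


def account_age(opened_yy: int, today_yyyy: int) -> int:
    # Refers to the shared definition instead of repeating "1900 + yy".
    return today_yyyy - expand_year(opened_yy)


def is_expired(expires_yy: int, today_yyyy: int) -> bool:
    # Same reference; changing the pivot rule changes both callers at once.
    return expand_year(expires_yy) < today_yyyy
```

If the interpretation of a year ever has to change, there is exactly one place to change it.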
- Complex redundancy
Complex redundancy takes some real effort and energy to eliminate, though often not a great deal. Complex redundancy involves redundancy over multiple programming environments or some other issue that raises it above the simple. For example, you may display dates on screens and store dates in databases. Is there a truly single place where you define what you know about “date” in general and each date in particular, for both environments, including temporary variables and parameters? If not, you have a case of complex redundancy. Dates are relatively easy these days, because most systems provide some sort of native support for them. So take something more application-specific, which is unlikely to be predefined, like account number, part number, order number or something like that and ask the same questions.
Complex redundancy is normally cured by shared definitions, but those definitions need to be created and maintained outside of the individual programming environments, and generated into them as needed.
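One way to picture this is a field definition that lives outside both the database and the screen code, with each environment's version generated from it. The sketch below is only an illustration under assumed names (`FieldDef`, `ACCOUNT_NUMBER`, `to_sql_column`, `to_html_input`); a real system would need far more attributes and more target environments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDef:
    """Single, environment-neutral definition of an application field."""
    name: str
    label: str
    max_length: int


# Defined once, outside both the database schema and the screen code.
ACCOUNT_NUMBER = FieldDef(name="account_number", label="Account number", max_length=12)


def to_sql_column(field: FieldDef) -> str:
    """Generate the database column definition from the shared definition."""
    return f"{field.name} VARCHAR({field.max_length}) NOT NULL"


def to_html_input(field: FieldDef) -> str:
    """Generate the screen input from the same shared definition."""
    return (
        f'<label for="{field.name}">{field.label}</label>'
        f'<input id="{field.name}" name="{field.name}" maxlength="{field.max_length}">'
    )
```

Changing `max_length` in the one definition changes both the generated column and the generated screen field.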
- Redundancy due to incidental differences
There are many cases where there are true differences between potentially redundant items, but they are distinctions that don’t really “make a difference.” This kind of incidental difference frequently comes out of “over-determination,” for example, specifying that a certain field should be at this exact X and Y position, when what you really mean is “under” some other field.
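The field-position example can be sketched by recording the intent ("under" another field) and computing the coordinates, rather than storing the coordinates themselves. The `Field` class, the `layout` function and the default sizes below are assumptions made for illustration, and the sketch assumes the "below" relationships contain no cycles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Field:
    name: str
    height: int = 20
    below: Optional[str] = None  # the intent: "under" some other field


def layout(fields: list[Field], top: int = 0, gap: int = 5) -> dict[str, int]:
    """Compute Y positions from 'below' relationships rather than fixed numbers."""
    by_name = {f.name: f for f in fields}
    y: dict[str, int] = {}

    def resolve(name: str) -> int:
        if name not in y:
            field = by_name[name]
            if field.below is None:
                y[name] = top
            else:
                parent = by_name[field.below]
                y[name] = resolve(field.below) + parent.height + gap
        return y[name]

    for f in fields:
        resolve(f.name)
    return y
```

If the first field grows taller, every field beneath it moves automatically; with hard-coded Y values, each one would have to be found and edited.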
- Redundancy due to over-refinement
Sometimes differences that are potentially “incidental” spring from definite user requirements. Typical sources of such requirements are highly experienced and sophisticated users, experts in a field, who know how it’s best to do things. They know, for example, that at step 4 of a certain process, they have always wanted the screen to be different in this or that way, to remove or add these buttons or fields, or do something else that makes that screen different from the others. It’s something they feel strongly about. It expresses their knowledge, experience and judgment. Rejecting the requirement can be taken by such people as ignoring their experience, denying their knowledge and deprecating their judgment. In other words, it’s bad. It may be a great idea to incorporate the suggestion – or it may be yet another distinction that definitely has a cost from the beginning to the end of the software project, and in the end makes no real difference. From the point of view of Occamality, the bar has to be set very high for such things.
- Redundancy due to external software
Another prime place for complex redundancy to show its ugly face is in interfaces. Many interfaces require you to do the same thing over and over and over again; they give you no choice. If there is no way to centralize this and eliminate the repetition (see next section), you’re stuck in a highly non-Occamal situation, and there may be little you can do about it.
- Redundancy imposed by the programming environment
Some programming methods, tools, languages and environments naturally lend themselves to redundancy more than others. A good, simple example is the “Hungarian” convention of naming variables, in which the variable name includes its type: instead of just naming a variable IDENTIFIER, you would name it IDENTIFIER-INT to indicate that it’s an integer. The convention came out of a real problem – someone would apply an operation to a variable that was inappropriate for its type; wouldn’t it be a nice idea if the name of the variable itself reminded you of the type so you wouldn’t make such mistakes? Yes, of course, but then if you need to change the variable’s type, either its name misleads you or you have to find every instance of it and change it. While well-intended, it’s non-Occamal.
Once you begin to think about simple things like variable naming, you begin to wonder why, in most programming environments, the name of a variable has any significance at all. Why can’t you go to the single place where the variable is defined, and change anything at all about it – including its name?
The exercise is pretty simple. Anyplace you see repetition of any kind, you have to ask yourself, when I make a change, do I have to find all the places and make the change? The answer is probably yes, and it’s not good.
- Redundancy to which abstraction can be applied
If you’re just looking at code, redundancies may not spring out at you – in fact, at the level of the code, there may not be redundancies. But that doesn’t mean your code is Occamal! Sometimes there are common ideas that are expressed with great diversity in the code – or so you realize once you grasp the relevant abstraction. In such cases, it is typical that the people who understood the original problem and the people who wrote the code thought they were dealing with many independent things, and the abstraction that unified them simply never occurred to them.
A good example of this is a collections system I worked on. This was a huge body of code that was used to automate the work of hundreds or thousands of people in call centers whose job was to call people who owed money to an institution, for example a credit card company, and get them to pay. The code implemented a wide variety of concepts that were specific to the collections industry. I came into the situation with a broad background in multiple workflow-type applications, and quickly recognized that collections was 95% call-center-oriented workflow, with a little customization and a few specific features for collections.
It wasn’t obvious when looking at the code, but if you started with the basic workflow constructs (workstep, queue, conditional routing, etc.) and took it from there, you ended up with a much more compact and easily extensible application. The original application was simply a “collections” application with some parameters; changing anything required finding the relevant places in the code and making the changes. The new application implemented a core set of workflow abstractions, and hardly ever needed to change. It also had a set of easily editable tables that expressed the current state of the workflow, which enabled almost anything to be changed. Finally, it had a small collection of application fragments that were truly specific to collections, things like the “promise to pay” function.
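The three-part shape described above (a generic core, editable tables, and a few collections-specific fragments) might be sketched as follows. This is not the actual system; the step names, the `ROUTING` and `ON_ENTER` tables, and the `advance` function are all hypothetical and greatly simplified.

```python
# Hypothetical routing table: (current step, outcome) -> next step.
# Editing this table changes the workflow without touching the engine.
ROUTING = {
    ("call_debtor", "no_answer"): "retry_queue",
    ("call_debtor", "reached"): "negotiate",
    ("negotiate", "promise_to_pay"): "await_payment",
    ("negotiate", "refused"): "supervisor_queue",
}


def promise_to_pay(case: dict) -> dict:
    """Collections-specific fragment; everything else is generic workflow."""
    return {**case, "promised": True}


# Application fragments attached to steps by name.
ON_ENTER = {"await_payment": promise_to_pay}


def advance(case: dict, outcome: str, routing=ROUTING, on_enter=ON_ENTER) -> dict:
    """Generic engine: look up the next step and run any fragment attached to it."""
    next_step = routing[(case["step"], outcome)]
    case = {**case, "step": next_step}
    hook = on_enter.get(next_step)
    return hook(case) if hook else case
```

Here `advance` knows nothing about collections. Adding a new step or rerouting an outcome is an edit to a table, and only `promise_to_pay` is specific to the industry.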