While not much discussed or broadly recognized, the vast majority of efforts to build computer software are disasters. They take too long, cost too much, and result in varying degrees of crap. Lots of solutions have been promoted and tried. None of them have worked.
There are exceptions, of course. The exceptions prove that it really is possible to create good software quickly and efficiently. It is highly unlikely that the methods taught in academia are good but merely botched in application; far more likely, nearly everyone has it wrong, just as doctors had it wrong when they bled patients to cure them and refused to sanitize before operating.
A major cause of the dysfunction can be found in the approach to building software that was taken, for good reason, in the early days of computing. The approach was necessary for the first decades of the field. As computers grew more powerful, the necessity of doing things the same way faded and finally disappeared. Many of the early cumbersome practices were discarded, but the key focus has remained the core of software development to this day.
What is this universally accepted, unquestioned aspect of building software that does so much harm? Simple: it's obsessing over imperative programming language, relegating data definitions and attributes to a necessary annoyance, confined to a tiny island in an obscure corner of the vast sea of procedural language.
Does this sound awful? No. No one who does it thinks they're obsessing -- they're just working, writing their code! Similarly, getting water from the local community well didn't sound awful in the 1800s, until people finally found out that the water was contaminated with diseases like cholera. Even though the result of doing it poorly was death, it took decades for the need for sanitation to be taken seriously! We can hope that procedural language obsession will in the future be recognized as the main source of disease in software.
Early Computing
The roots of the language obsession are in the earliest days of computing. It was one thing to build a machine, and quite another to get it to do what you wanted it to do. The start was plugs and switches. Then the stored program computer was given step-by-step instructions in binary machine language. In the 1950s, first FORTRAN and then COBOL were invented to make the process of creating the precise instructions easier, while still enabling the computer to operate at maximum speed. Those were indeed big advances.
In the 1960s it still took a great deal of careful work to get results from computers in a timely manner. While languages like FORTRAN made writing programs easier, the fact that a compiler translated them into maximum-speed machine language made their use acceptable.
The Apollo space capsule had a custom-built guidance system that was essential to its operation. A famous photograph shows Margaret Hamilton standing next to a stack of the code she and her team wrote for the Apollo mission computers.
The Apollo guidance computer was a fast machine for its day, but the programmers had to get all the power out of it that they could to guide the capsule in real time. This is an extreme example, but 20 years into the computer revolution, everyone focused on using compiled procedural languages to get performance, and assembler language when necessary.
It was already evident that getting programs written quickly and well was incredibly hard. In fact, a big conference was held in 1968 to address what was called the "crisis" in software. Nothing got fixed. Meanwhile, efforts have continued from then to this day to invent new programming languages that would miraculously make the problem go away. Nothing has changed for the better.
Partial steps towards declarative
From the early days to the present, there have been isolated efforts to go beyond simple definitions of data for procedural commands to operate on. Generally, the idea is that procedural commands spell out HOW to accomplish a task, while data definitions and attributes define WHAT the task is. It's like having a map (what is there) and directions (how to get from place to place on the map). See this for an explanation.
The invention of the SQL database was a small but important early step in this direction. SQL is all declarative. It is centered on a schema, a set of data definitions that organizes data into tables of rows and columns. The SELECT statement states what you want from the database, but not how to get it. WHAT, not HOW!
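To make the contrast concrete, here is a minimal sketch in Python using SQLite. The table, names, and numbers are invented for illustration. The first version spells out HOW (loop, test, accumulate); the second states WHAT is wanted and lets the database engine work out the steps.

```python
import sqlite3

# Illustrative data: employees with a department and a salary.
rows = [("Ann", "hardware", 95000.0),
        ("Bob", "software", 90000.0),
        ("Cora", "hardware", 105000.0)]

# HOW: procedural code spells out every step -- loop, test, accumulate.
total = 0.0
for name, dept, salary in rows:
    if dept == "hardware":
        total += salary

# WHAT: the declarative version states the desired result;
# the database engine decides how to scan, filter, and sum.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary REAL)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
(declared,) = conn.execute(
    "SELECT SUM(salary) FROM employees WHERE dept = 'hardware'"
).fetchone()

assert total == declared  # both answers: 200000.0
```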
You would think this would have sparked a revolution, overthrowing the language (HOW) obsession. It didn't. In fact, because the language obsession stayed in charge, in some ways things got worse.
A few years after the DBMS revolution, people started putting big collections of historical data into what were called data warehouses. The idea was to make reporting easier without impacting production databases. Before long, OLAP (OnLine Analytical Processing) was invented to complement existing OLTP (OnLine Transaction Processing). While there were many differences, the core of OLAP was a schema definition in the form of a star (the star schema): a central table of transactions holding measures like sales and profits, surrounded by related tables containing the attributes (dimensions) of the transactions, typically organized in hierarchies. So there would be a time dimension (days, weeks, months), a location dimension (office, region, state), a department dimension, and others as relevant. After constructing such a thing, it was easy to get answers like the change in sales from month to month in the hardware department without writing code.
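Here is a toy version of that star, sketched in Python with SQLite; the table names, dimensions, and figures are made up for illustration. A real OLAP tool generates queries like this one from the schema, so the user writes none of it and simply navigates the dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: the attributes, organized in hierarchies.
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);
CREATE TABLE dim_dept (dept_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
-- Central fact table of transactions, holding measures like sales amount.
CREATE TABLE fact_sales (date_id INTEGER, dept_id INTEGER, amount REAL);
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, "2024-01-15", "2024-01"), (2, "2024-02-15", "2024-02")])
conn.executemany("INSERT INTO dim_dept VALUES (?, ?, ?)",
                 [(1, "hardware", "west"), (2, "garden", "west")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 1, 1000.0), (2, 1, 1250.0), (1, 2, 700.0)])

# Month-by-month hardware sales: the answer comes from navigating
# the star, joining facts to dimensions.
for month, total in conn.execute("""
    SELECT d.month, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d ON f.date_id = d.date_id
    JOIN dim_dept p ON f.dept_id = p.dept_id
    WHERE p.name = 'hardware'
    GROUP BY d.month ORDER BY d.month
"""):
    print(month, total)  # 2024-01 1000.0, then 2024-02 1250.0
```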
OLAP was and is powerful. It organized data attributes into hierarchies that unchanging programs could navigate with ease. You could add attributes, dimensions, etc. without changing code! What an idea! But the idea was strictly confined to the isolated, distant island of OLAP and had no impact on software as a whole. The procedural language obsession continued without pause.
Declarative front and center
Procedural code is necessary. Code is what makes a machine run. However, the time for near-exclusive obsession with procedural code has long since passed. Limits on computer speed and storage space were once a legitimate reason to obsess over using every bit of speed you had. Think about cars: engineers worked for decades to raise the top speed from about 10 MPH to finally breaking 100; now it's in the hundreds. In a much shorter period of time, computers have increased in speed by factors of millions. Computer speed is rarely an issue.
How can we spend a little of the mountains of excess, unused computer speed to help make creating computer software less dysfunctional? Maybe, instead of concentrating on procedural languages, there's another way to get computer software to work quickly and well?
There is a proven path. It's obsessing about WHAT is to be done: obsessing about the data and everything we know about the data -- its attributes. This means taking the fruitful but limited approach we took with OLAP and extending it as far as possible. In other words, instead of creating a set of directions from every possible starting point to every possible destination, we create a map in the form of metadata plus a tiny, rarely-changing direction-generating program that takes starting and ending points as input and generates directions. You know, like those direction-generating programs that are so handy in cars? That's how they work! See: https://www.blackliszt.com/2020/06/the-map-for-building-optimal-software.html
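Here is a minimal sketch of the map-plus-generator idea in Python, with made-up place names. The map is pure metadata; the one small program (a breadth-first search here, where a real navigation system would use a weighted graph and something like Dijkstra's algorithm) turns any start and destination into directions. To add roads or places, you change the metadata, never the program.

```python
from collections import deque

# The "map": pure metadata describing what connects to what.
# Nothing here is code; these edges could come from a file or database.
ROADS = {
    "home":    ["main_st"],
    "main_st": ["home", "oak_ave", "highway"],
    "oak_ave": ["main_st", "office"],
    "highway": ["main_st", "office"],
    "office":  ["oak_ave", "highway"],
}

def directions(roads, start, goal):
    """Tiny, rarely-changing program: breadth-first search turns the
    map (WHAT) into a route (HOW) for any start/goal pair."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in roads[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None  # no route exists

print(directions(ROADS, "home", "office"))
# ['home', 'main_st', 'oak_ave', 'office']
```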
When we do this, we'll advance from the endless complexities of the solar system as described by Ptolemy to the simple, clear, and accurate one described by Newton. What Newton did for understanding the movements of the planets, metadata obsession will do for software. See this for more: https://www.blackliszt.com/2022/10/how-to-improve-software-productivity-and-quality-code-and-metadata.html
Software development is stuck in the endlessly complex epicycles of Ptolemy; we need to get to Newton.