Components and layers are supposed to make software programs easier to create and maintain. They are supposed to make programmers more productive and reduce errors. It’s all lies and propaganda! In fact, the vast majority of software architectures based on components (ranging from tiny objects to large components) and/or layers (UI, server, storage and more) lead to massive duplication and additional, value-destroying work. The fact that these insane methods continue to be taught in nearly all academic environments and required by nearly all mainstream software development groups is cornerstone evidence that Computer Science not only isn’t a science, it resembles a deranged religious cult.
Components and Layers
It has been natural for a long time to describe the different “chunks” of software that work together to deliver results for users in dimensional terms.
First there’s the vertical dimension, the layers. Software that is “closer” to end users (real people) is thought of as being on “top,” while the closer software is to long-term data storage and farther from end users is thought of as being on the “bottom” of a software system. In software that has a user interface, the different depths are thought of as “layers,” with the user interface the top layer, the server code the middle layer and the storage code the bottom layer. Sometimes a system like this is called a “three tier” system, evolved from a two tier “client/server” system.
Second there’s the horizontal dimension, the components. These are bodies of code that are organized by some principle to break the code up into pieces for various reasons. I've given a detailed description of components. A component may have multiple layers in it or it may operate as a single layer.
Layers are often written in different languages and using different tools. JavaScript is prominent in user interface layers, while some stored procedure language is often used in the lowest data access layer. The server-side layer may be written in any of dozens of languages, including Java and Python. Sometimes there are multiple layers in server-resident software, with a scripting language like PHP or JavaScript used for server code on the “top,” communicating with the client software.
Components are self-contained bodies of software that communicate with other components by some kind of formal means. Formal components always have data that can only be accessed by the code in the component. The smallest kind of component is an object, as in object-oriented languages. The data members of the object can only be accessed by routines attached to the object, known as methods. Methods can normally be called by methods of other objects. Larger components are often called microservices or services. These are usually designed so that they could run on separate machines, with calling methods that can span machines, like old RPCs or more modern RESTful APIs. Sometimes components are designed so that instead of being directly called, they take units of work from a queue or service bus, and send results out in the same way. When a component makes a call, it calls a specific component. When a component puts work onto a queue or bus, it has no knowledge of or control over which component takes the work from the queue or bus.
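To make the two styles of communication concrete, here is a minimal sketch in Python. The names (Account, deposit, work_queue, drain) are invented for illustration, and an in-memory deque stands in for a real queue or service bus. Python only hides data by naming convention, so treat the leading underscore as standing in for the enforced privacy that other object-oriented languages provide.

```python
from collections import deque


class Account:
    """The smallest kind of component: an object whose data is reached via methods."""

    def __init__(self, balance):
        self._balance = balance  # private by convention only (hypothetical field)

    def deposit(self, amount):
        self._balance += amount

    def balance(self):
        return self._balance


# Direct call: the caller names the specific component it is calling.
account = Account(100)
account.deposit(25)

# Queue: the producer drops a unit of work and does not know who will take it.
work_queue = deque()
work_queue.append({"op": "deposit", "amount": 10})


def drain(queue, target):
    """Some consumer, unknown to the producer, takes the work off the queue."""
    while queue:
        item = queue.popleft()
        if item["op"] == "deposit":
            target.deposit(item["amount"])


drain(work_queue, account)
```

Notice that even in this toy, the queued version needs a message format ("op", "amount") that both sides have to agree on, which is one more definition of the same data.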
Layers and components are often combined. For example, microservice components often have multiple layers, frequently including a storage layer. With a thorough object-oriented approach, there could be multiple objects inside each layer that make up a service component.
Why are there layers and components? Software certainly doesn’t have to use these mechanisms to be effective. In fact, major systems and tools have been built and widely deployed that mostly ignore the concepts of layers and components! See this for historic examples.
The core arguments for layers and components typically revolve around limiting the amount of code (components) and the variety of technologies (layers) a programmer has to know about to make the job easier, along with minimizing errors and maximizing teamwork and productivity. I have analyzed these claims here and here.
Data Definitions in Components and Layers
All software consists of both procedures (instructions, actions) and data. Each of those needs to be defined in a program. When data is defined for a procedure to act upon, the data definitions are nearly always specific to and contained within the component and/or layer they’re part of. For the general concept and variations on how instructions and data relate to each other, see this.
Thinking about data that’s used in a component or layer, some of the data will be truly only for use by that component or layer. But nearly any application you can imagine has data that is used by many components and layers. This necessity is painful to object/component purists who go to great lengths to avoid it. But when a piece of data like “application date” is needed, it will nearly always have to be used in multiple layers: user interface, server and database. To be used it must be defined. So it will be defined in each layer, typically a minimum of three times!
When data is defined in software, it always has a name used by procedures to read or change it. It nearly always also has a data type, like character or integer. The way most languages define data, that’s it! But there’s more.
- When the “application date” data is shown to the user in the UI layer, it typically also needs a label, some formatting information (month, day, year), error checking (was 32 entered into the day part?) and an error message to be displayed.
- When application date is used in the server layer by some component, some kind of error checking is often needed to protect against disaster if a mistaken value is sent by another component.
- When a field like social security number is used in the storage layer, sanity checks are often applied to make sure, for example, that the SSN entered matches the one previously stored in the database for that customer.
- There need to be error codes produced if data is presented that is wrong. When the user makes an error, you can’t use a code, you have to use a readable message, which the user layer might need to look up based on the error code it gets from another component or layer.
- Each language has its own standards for defining data and the attributes associated with it. Someone has to make sure that all the definitions in varying languages match up, and that when changes are made they are made correctly everywhere.
- If the data is sent from one component to another, more definitions have to be made: the procedure that gets the data, puts it into a message and sends it, possibly on a queue; the procedure that gets the message, maybe from a queue and sends the data to the routine that will actually do the processing.
- When data is sent between components, various forms of error checking and error return processing must also be implemented for the data to protect against the problems caused by bad data being passed between components that, for example, might have been implemented and maintained by separate groups. Sometimes this is formalized into "contracts" between data-interchanging components/layers.
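The list above can be sketched in code. This is an illustration only: the field names, error code and message are made up, and everything is written in Python to keep it runnable, whereas in a real system the three definitions would typically live in three different languages (say JavaScript, a server language and SQL).

```python
from datetime import date, datetime

# UI layer: its own name, plus a label, a display format and a readable message.
UI_FIELD = {
    "name": "applicationDate",
    "label": "Application date",
    "format": "%m/%d/%Y",
    "error_message": "Please enter a valid date (MM/DD/YYYY).",
}


def ui_parse(text):
    """Return (iso_string, None) on success or (None, readable message) on error."""
    try:
        parsed = datetime.strptime(text, UI_FIELD["format"]).date()
    except ValueError:
        return None, UI_FIELD["error_message"]
    return parsed.isoformat(), None


# Server layer: a second definition, with its own name, type and error code.
SERVER_FIELD = {"name": "application_date", "type": date, "error_code": "E_BAD_DATE"}


def server_check(iso_text):
    """Re-validate what another component sent; return (date, None) or (None, code)."""
    try:
        return date.fromisoformat(iso_text), None
    except ValueError:
        return None, SERVER_FIELD["error_code"]


# Storage layer: a third definition, this time as (hypothetical) SQL DDL.
STORAGE_DDL = "CREATE TABLE application (application_date DATE NOT NULL)"
```

One piece of data, three definitions, two validators, one error code and one error message, and nothing but human diligence keeps them in agreement.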
So what did we gain by breaking everything up into components and layers? A multiplication of redundant data definitions containing different information, expressed in different ways! What we “gained” by all those components and layers was a profusion of data definitions! The profusion is multiplied by the need for components and layers to pass data among themselves for processing. The profusion can’t be avoided and can only be reduced by introducing further complexity and overhead into the component and layer definitions.
See this for another take on this subject.
I’ve heard the argument that unifying data definitions makes things harder for the specialists that often dominate software organizations. The database specialists are guardians of the database, and make sure everything about it is handled in the right way. The user interface specialists keep the database specialists away from their protected domain, because if they meddled the users wouldn’t enjoy the high quality interfaces they’ve come to expect. There is no doubt you want people to know their stuff. But none of this is really that hard – channeling programmers into narrow specialties is one of the many things that leads to dysfunction. Programmers produce the best results by thoroughly understanding their partners and consumers, which can only be done by spending time working in different roles – for example spending time in sales, customer service and different sub-departments of the programming group.
Data Access in Components and Layers
Now we’ve got data defined in many components and layers. In a truly simple system, data would be defined exactly once, be read into memory and be accessed in the single location in which it resides by whatever code needs it for any reason. If the code needs to interact with the user, perform calculations or store it, the code would simply reference the piece of data in its globally accessible named memory location and have at it. Fast and simple.
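Here is one possible reading of "defined exactly once," sketched in Python. FieldDef and the three helper functions are hypothetical names, not an existing library, and a real system would carry more attributes than this. The point is only that the label, type, storage type and error message live in a single place that every piece of code references.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FieldDef:
    """A single, shared definition of one piece of data (illustrative only)."""
    name: str
    label: str
    value_type: type
    sql_type: str
    error_message: str


# The one and only definition of "application date".
APPLICATION_DATE = FieldDef(
    name="application_date",
    label="Application date",
    value_type=date,
    sql_type="DATE",
    error_message="Please enter a valid application date.",
)


def ui_label(field):
    """Code that talks to the user reads the label from the definition."""
    return field.label + ":"


def check(field, value):
    """Code that validates reads the type and message from the same definition."""
    return None if isinstance(value, field.value_type) else field.error_message


def column_ddl(field):
    """Code that stores the data reads the storage type from the same definition."""
    return f"{field.name} {field.sql_type} NOT NULL"
```

Changing the label or the type here is one edit in one place, instead of a hunt through every layer.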
If this concept sounds familiar, you may have heard of it in the world of relational DBMSs. It’s the bedrock concept of having a “normalized” schema definition, in which each unique piece of data is stored in exactly one place. A database that isn’t normalized is asking for trouble, just like the way that customer name and address are frequently stored in different places in various pieces of enterprise software that evolved over time or were jammed together by corporate mergers.
Components and layers performing operations on global data is neither fast nor simple. Suppose for example that you’ve got a new financial transaction that you want to process against a customer’s account. In an object system, the customer account and financial transaction would be different objects. That’s OK, except that the customer account master probably has a field that is the sum of all the transactions that have taken place in the recent time period.
In a sensible system, adding the transaction amount to a customer transaction total in the customer record would probably be a single statement that referenced each piece of data (the transaction amount and the transactions total) directly by name. Simple, fast and understandable.
In a component/object system that’s properly separated, transaction processing might be handled in one component and account master maintenance in another. In that case, highly expensive and non-obvious remote component references would have to be made, or a copy of the transaction placed on an external queue. In an object system, a method of the transaction object would have to be called and then a method of the account master object.
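The contrast can be sketched side by side. All names here are invented for the example, amounts are whole cents to keep the arithmetic exact, and a deque stands in for an external queue; a real remote call or message bus would add serialization, networking and error handling on top of what is shown.

```python
from collections import deque

# The sensible system: one statement, both pieces of data referenced by name.
account = {"transactions_total": 10000}
transaction = {"amount": 2500}
account["transactions_total"] += transaction["amount"]


# The object system: a method on each object has to be called.
class Transaction:
    def __init__(self, amount):
        self._amount = amount

    def amount(self):
        return self._amount


class AccountMaster:
    def __init__(self, total):
        self._transactions_total = total

    def add_to_total(self, amount):
        self._transactions_total += amount

    def transactions_total(self):
        return self._transactions_total


txn = Transaction(2500)
master = AccountMaster(10000)
master.add_to_total(txn.amount())

# Separated components: a copy of the transaction goes onto a queue, and the
# account master component picks it up later.
queue = deque()
queue.append({"type": "transaction_posted", "amount": txn.amount()})

remote_master = AccountMaster(10000)
while queue:
    message = queue.popleft()
    remote_master.add_to_total(message["amount"])
```

All three versions end up with the same total. Only the first one does it in a single statement that anyone can read.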
It’s a good thing we’ve got smart, educated Computer Scientists arranging things so that programmers do things the right way and don’t make mistakes, isn’t it?
Conclusion
With the minor exception of temporary local variables, nearly every layer or component you break a body of software into leads to a multiplication of redundant, partly overlapping data definitions expressed in different languages -- each of which has to be 100% in agreement to avoid error. The communication by call or message between layers and components to send and receive data increases the multiplication of data definitions and references. Adding the discipline of error checking is a further multiplication.
Not only does each layer and component multiply the work and the chances of error, each simple-seeming change to a data definition results in a nightmare of redundancy. You have to find each and every place the data is defined, used or error-checked and make the right change in the right language. Components and layers are the enemy of Occamality. Software spends the VAST majority of its life being changed! Increasingly scattered data definitions make the thing that is done 99% of the time to software vastly harder and riskier to do.