On Taylorism and software development
Tags: [software] [essay] [bugs]
Published: 14 Jul 2014 23:12

Taylorism in software development

One of the core themes of Taylorism as mentioned in the Wikipedia article is that it emphasizes the “knowledge transfer between workers and from workers into tools, processes, and documentation.”

In an ideal world, software development would be a perfect fit for Taylorism: who the individual developer is at any given time no longer matters. If one developer leaves, another who can speak the (programming) language is hired and picks up exactly where the previous person left off, holding the same tools as if nobody had ever left the virtual assembly line, and armed with all of the predecessor’s experience and knowledge embedded in the source code.

Where the analogy falls apart

The ideal falls apart when developers are not perfect at embedding the totality of their knowledge and skill into the code. For Taylorism to work, every developer would have to be perfect at this.

The point of this article is to emphasize the following:

  1. All developers implicitly and explicitly build mental models related to the code they interact with.
  2. Each developer builds their own unique mental model.
  3. Mental models between developers can be largely similar but still different.
  4. Developers may not realize if and when these mental models differ.
  5. Developers write code based on assumptions which are generated from their mental models.

And perhaps most importantly:

Each developer’s mental model should be synchronized, tested, and debugged.

Problem of out-of-band information

If out-of-band information exists, knowledge transfer that relies only on reading the code cannot reach it, and a developer who reads only the code gets an incomplete picture. In reading and modifying the source code, each developer implicitly builds a mental model of that part of the code. Incomplete pictures tend to lead to incomplete, and possibly faulty, mental models of how the code is supposed to work.

An incorrect understanding of the problem domain is out of scope here: if the reality of the problem and the code’s model of the problem differ, that is a whole different class of bad news.

A developer with an incomplete or faulty mental model can still write code that works for the current test suite and range of inputs, which is dangerous. Even if the whole test suite runs clean, code built on faulty assumptions plants future bugs that unbounded input from the problem domain can trigger.
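As a minimal sketch of this situation, consider the hypothetical function below. The name `mean_response_ms` and its tiny test suite are invented for illustration. The author’s mental model says the list of samples is never empty, the tests agree, and nothing in the code records the assumption.

```python
def mean_response_ms(samples):
    """Return the mean of a list of response times in milliseconds."""
    # Latent assumption from the author's mental model: `samples` is never
    # empty. Nothing in the code or the tests records that assumption.
    return sum(samples) / len(samples)


def test_mean_response_ms():
    # The whole (hypothetical) test suite: it passes, yet it never probes
    # the assumption above.
    assert mean_response_ms([10, 20, 30]) == 20
    assert mean_response_ms([5]) == 5


if __name__ == "__main__":
    test_mean_response_ms()
    print("test suite passed")
    try:
        mean_response_ms([])  # an input the problem domain can produce
    except ZeroDivisionError:
        print("empty input crashes: the latent assumption was wrong")
```

The suite runs clean, and the bug waits for the first quiet period in which no samples were collected.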

Synchronizing mental models

Code reviews, and developer conversation in general, act as a way to synchronize different mental models. Each developer comes into the code review with their own mental model. How each model was constructed doesn’t matter, but bringing developers into a code review and asking them to review the code out loud tends to expose faulty assumptions and, eventually, faulty models.

The key is to encourage this kind of explicit synchronization, but the step after synchronizing is to encode what was learned back into the code. Code reviews and informal conversations are themselves out-of-band information (with respect to the source code), so whatever they reveal stays out-of-band until it is written into the code.
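One way to encode a review finding back into the code might look like the sketch below. The scenario is invented for illustration: a review reveals that an upstream service drops connections after 30 seconds. The names `MAX_UPSTREAM_SECONDS` and `validate_timeout` are hypothetical.

```python
# Hypothetical fact surfaced in a code review: the upstream service drops
# connections after 30 seconds, so a longer timeout can never succeed.
MAX_UPSTREAM_SECONDS = 30.0


def validate_timeout(timeout_seconds):
    """Return the timeout unchanged if it respects the upstream limit."""
    # The unit lives in the parameter name and the limit in a named constant,
    # so the next reader does not need to have attended the review.
    if not 0 < timeout_seconds <= MAX_UPSTREAM_SECONDS:
        raise ValueError(
            f"timeout must be in (0, {MAX_UPSTREAM_SECONDS}] seconds, "
            f"got {timeout_seconds}"
        )
    return timeout_seconds


def test_timeout_respects_upstream_limit():
    # The test pins the reviewed assumption down, so it fails loudly if
    # someone later changes the code without knowing the history.
    assert validate_timeout(5) == 5
    for bad in (0, -1, 31):
        try:
            validate_timeout(bad)
        except ValueError:
            continue
        raise AssertionError(f"{bad} should have been rejected")
```

The comment, the constant, and the test each carry part of what was previously only said aloud.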

Code ‘ownership’ is bad

Code should not be ‘owned’ by any developer(s). Ownership is likely an indication that knowledge exists out-of-band. Individual developers may be more familiar with particular sections of code, but every developer should be able to build an accurate mental model of any section.

If ‘ownership’ means

“go talk to this developer because they are familiar with this code section”

it means that the code is a lot healthier than if ‘ownership’ means

“this developer knows more than anyone else about this section of code, so you need to go talk to them.”

Exploring the problem domain

In the same vein as ‘synchronizing mental models’, discovering new areas of the problem domain creates new out-of-band information, which should be promptly encoded into the software. Unfortunately, simply encoding this new facet of the mental model of the problem domain does not attack the latent assumptions already present in the software. The software therefore has to be flexible in the face of new information, which is somewhat similar to the Agile requirement that software be flexible to new user demands.

This requires lower coupling in general and, more specifically, keeping things that change together located together in the code. This is one of the points of the paper ‘On the Criteria To Be Used in Decomposing Systems into Modules’ by David Parnas:

“We propose instead that one begins with a list of difficult design decisions or design decisions which are likely to change. Each module is then designed to hide such a decision from the others.”

Adding to this,

  1. an update to one module should stay hidden from the others.
  2. if possible, latent assumptions should be localized together.
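A minimal sketch of both points, assuming a hypothetical inventory domain: the class `StockLevels` and its methods are invented for illustration. The module hides one decision that is likely to change (how quantities are stored) and keeps one latent assumption (quantities are positive whole numbers) in a single place.

```python
class StockLevels:
    """Hides one design decision: how stock quantities are stored."""

    def __init__(self):
        # Decision likely to change: an in-memory dict today, perhaps a
        # database table later. Callers never touch this attribute.
        self._quantity_by_sku = {}

    def receive(self, sku, quantity):
        # Latent assumption, localized here: quantities are positive
        # whole numbers. If the domain later allows fractional units,
        # this is the one place to revisit.
        if not isinstance(quantity, int) or quantity <= 0:
            raise ValueError("quantity must be a positive integer")
        self._quantity_by_sku[sku] = self.available(sku) + quantity

    def available(self, sku):
        # Unknown SKUs report zero stock rather than raising.
        return self._quantity_by_sku.get(sku, 0)
```

Callers depend only on `receive` and `available`, so replacing the dict or relaxing the quantity rule is an update that the other modules never see.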

Encoding this new facet also requires the software itself to act as the basis of synchronization. New information pours in, and developers have to update themselves with it. Unless developers are notified outside the source code, this is pull synchronization rather than push synchronization: any given change to the code may be the result of new information, may relate to existing information, or may be just a simple bug fix.

Conclusion

From this, here are some questions to ask:

  1. Are developers encouraged to embed the totality of their knowledge and skills into the source code?
  2. Are developers’ mental models about the code and problem domain explicitly challenged, synchronized, tested and debugged?
  3. Are developers actively trying to challenge their assumptions?
  4. What does ‘code ownership’ mean to the team?
  5. Is a developer’s understanding of the problem domain shared?
  6. Does your design allow a full encoding of the problem domain?
  7. Can potentially out-of-band information be written into the code eventually, or is some of it not encodable given the current system?