Thursday, December 18, 2008

Craftsmanship

Simon Brown has been re-evaluating the role of Software Architecture in response to the question:
Why do we need this new software architecture stuff? We've managed fine until now and projects *have* been successful.

Good question. Some teams don't dwell on the subject at all, any yet have produced remarkably well designed systems that have stood the test of time. Unix (Thompson and Richie), BSD Unix (Bill Joy) and Linux (Linus Torvald), just to name a few. In the case of Linux, there wasn't an opportunity to perform Big Architecture Up Front (BAUF) as a specific activity, since it was the result of the collaboration of literally thousands of developers on the web, so the Linux architecture we see today is truly an emergent entity devoid of a master plan.

What is architecture anyway? Simon concludes:
I say that architecture is about introducing structure and guidelines to a software project, and I've come to realise just how true this is. Without formalising these good ideas, a codebase with some otherwise great ideas can appear inconsistent, with the overall system lacking coherence and clarity.

In the discussion that follows, Simon goes on to explain that the architecture is "the big picture". Although he avoids using the word "design", I interpret what he says as architecture is the highest level system software design, which with a lot of programming languages is something that is difficult to gleam from just the code. I agree. What is disturbing however is that he flirts with the idea that defining this design is some how a different activity from programming, and calls for a dedicated role with dedicated skills. So architecture and consistency is something to be imposed onto a team of developers who can only code at the detail level and require an higher thinker, the architect, to impose "structure".

I like Simon, and I think he senses the irony in this. The team can't think and can only code, so the Architect must do their thinking for them. I wonder where this mindset stems from?
We will win and you will lose. You cannot do anything because your failure is an internal disease. Your companies are based on Taylor’s principles. Worse, your heads are Taylorized too. You firmly believe that sound management means executives on the one side and workers on the other, on the one side men who think and on the other side men who only work.”

Konusuke Matsushita – Panasonic

Simon can't be blamed. The values in our industry are such, that in many instances this is the only politically acceptable way to address the problem. People who are only paid to code and not think will struggle with architectural concerns and do need help. I pointed out that these same people will struggle with other aspects of programming too like listening to customers, effective collaboration and testing for example. Andrew Walker goes on to make the point quite eloquently that programming and architecting are one activity are can't easily be separated. I agree. In fact as I become better at my craft I no longer do BAUF (I once was an Architect in a past life) and I now allow my architecture to emerge as I write code.

So de-skilling the developer role by splitting out development activities like designing, communicating, listening etc into separate roles runs counter to the idea of good craftsmanship IMO. Craftsmanship is holistic. So how did we get here? Well de-skilling the workforce does allow you to pay them less and hire and fire people at will. It also puts those pesky all powerful "programmers" in their place, and may serve to provide the management with the illusion of control. There is a whole industry out there selling "Silver Bullets", and suggesting that software development can some how be automated and made simple by the latest Framework or piece of Middleware. And finally there are a bunch of ancillary industries like Business Analysts, Systems Analysts, Enterprise Architects, Solution Architects, etc, willing to do all "the thinking" at a fee, leaving the bulk of the team to only work.

Henry Ford had huge success with this approach in the 1930s, drastically reducing the cost of motor car manufacture by employing cheap unskilled immigrant labour. In recent decades however, in the face of competition from companies whose workers are empowered to think for themselves this Taylorist management approach has been found wanting. To see evidence of this you only need to open your newspaper. The Big 3 US Motor Car Manufacturers have recently gone cap in hand to congress for a bail-out. Toyota, Nissan etc build and sell cars in the US too, but they aren't asking for a bail-out! These Japanese manufactures are Agile enough to adapt to the needs of the market and are riding out the current economic storm. Toyota are even recruiting in the US for a new plant in California I believe. So why are these Japanese companies prospering at the expense of Ford, GM and Chrysler? In the 1950's Japanese motor manufacturers took a look at Ford, GM etc and rejected their Taylorist approach in favor of the ideas of Edward Deming. Deming felt that "scientific management" wasn't enough because it undervalued people and ignored human psychology. Could the Japanese choice of Deming over Taylor have something to do with the value they place on people?
Value and respect. I work as a Coach and I would never be as arrogant to suggest that I am the only person on the team qualified to determine the "big picture". I would like to believe that I have mastered my craft, but does that mean that I am the only person qualified to think? No this is arrogant nonsense and is exactly what Konusuke is speaking about.
Every human being can think and each of us should be encouraged to think. Our brain is our most valuable asset. The issue is that we aren't born knowing how to program well, and we all need to learn (even those of use who later become architects :)). So collaboration, architecting, coding, listening, designing are all things that must be learnt. The Japanese speak of Shu-Ha-Ri (Knowledge, understanding and skill), as the three phases of learning. A computer science degree will only provide you with some basic knowledge. To understand you will need to practice, and to gain skill will require repeated application in varying contexts. And of course you can't learn on your own. To learn effectively you need a teacher.
In past times when a man wanted to learn a craft he became an apprentice, and would live and work alongside a master craftsman for several years, receiving only food and lodgings for his efforts. After years as an apprentice he was deemed qualified and would become a journeyman, allowed to charge for his labours. A journeyman would then move from workshop to workshop gaining experience as he went. The best journeyman could become master craftsman themselves, setting up their own workshop and taking on apprentices of their own. Along the way people not suited to the craft would drop out rather than waste time, ensuring that standards remain high, and people found crafts that they were suited to. So in the past the west had its own version of Shu-Ha-Ri it seems. In fact this tradition still lives on in professions, like Lawyers, Doctors, Architects (the building kind), Fashion designers etc. So it looks as though the software Industry chose an unfortunate template in the Manufacturing industries to model itself on.
I can imagine that Henry Ford, must have seen traditional Coach Building Master Craftsman as an extravagant waste of money and a threat to his profits. Surely the job could be split into a number of simple unskilled roles with much of the work automated. This proved to be true (for a while) for motor car manufacture, but the same approach has failed us when applied to software. Why? because software development is not manufacture, it is not repetitive, instead it is creative. It is new product development, akin to fashion design, car design etc.
Successful software development calls for master craftsman like the names I mentioned earlier: Bill Joy, Linus Torvald etc. These masters have a responsibility to pass on their skills to apprentices whose job it is to think and learn by example at the knee of their master. The Agile community have gone back to basics and understand that developing people is the only way to pull the software industry out of habitual failure. People over process and tools.
Infoq has been tracking the tour of a Journeyman as he visits various workshops in an attempt to learn from the masters. His blog posts make interesting reading. Especially the way the learning occurs: side-by-side at the keyboard whilst programming. The master and the journeyman discussing the work and lessons being learned on both sides. An age old tradition, you can imagine Michelangelo working with his apprentices and journeyman in the same way.
Looking at the videos and listening to the interviews it seems such a natural way to learn, and the ideal solution for teams that are struggling with architecture or anything else. Hire the best people you can afford, and have them mentor their colleagues in a learning environment.
So obvious, yet Kevin a colleague of Simon seems to have missed this possibility. The idea that programmers can be helped to think for themselves escapes him as a possible alternative to the Architect going in and doing the thinking for them.
Perhaps Konusuke Matsushita is right. We do have an internal disease.

Tuesday, December 02, 2008

Object 101 - What is an Object?

Judging from the responses to my last post on uniformity it looks as though I started off with the bar too high. So lets lower it a bit. Lets go right back to basics and try and define what I mean by Object.

Concept


Firstly, the idea of Objects is a conceptual one. Alan Kay and his team did research for over a decade trying to understand how best to structure computer programs and settled on the idea of Objects. The concept of Objects can be explained by taking a biological analogy. Alan Kay speaks of the idea of an identifiable cell, which has a membrane encapsulating its insides from the outside world. Each cell is autonomous and goes about its job independently from other cells, but cells do collaborate in a loosely coupled way by sending (chemical?) messages to each other.

So Objects are analogous to cells and the key defining characteristics of an object are identity, encapsulation and messaging. Notice I haven't mentioned classes or inheritance. These things are merely just one approach to implementing objects and defining what goes on inside the membrane. There are object orientated languages like Self which eschew classes all together.

Lets explore these fundamental characteristics in more detail:

Encapsulation

The power of encapsulation is that it hides the implementation from the outside world. In the same way that a cell membrane hides the inside of a cell. The outside world need only know the message interface of an object (known in Smalltalk speak as the message protocol). So when I send a message to an object, I have no idea how that object will process that message. The object is free to process the message in anyway it likes. This leads to the idea of polymorphism, which is a Greek word meaning "many forms". Since an objects implementation is encapsulated behind its message interface, the implementation can take any form it likes. This means that message implementations are free to perform any side effect they wish as long as they satisfy the message interface. Notice again I have not mentioned classes or subclassing. Again subclassing is just one approach to achieving polymorphism. Any object that satisfies the message protocol is viewed as a polymorphic variant, and can substitute any other object that shares the same protocol.

Messaging

During their research Alan's team looked at a number of ways of getting their objects to communicate with each other in a loosely coupled fashion. If you take the biological analogy to its extreme, then each object should be totally autonomous and share nothing with the others. This means that each object should have its own cpu, its own memory, it own code, Alan Kay as even argued that each object could/should have its own IP address as a global identity. So an object in this view of the world is a networked computer. OK how to hook these objects together. Well the natural solution is asynchronous messaging, just like you get with e-mail. Since each object has its own cpu it doesn't want to block and wait until the receiving object processes the message. So an object can send a message without blocking and the receiver will send back an answer into the objects inbox its own good time. This approach is what we call the Actor model today, and as someone kindly pointed out, Alan Kay first explored this approach in an early Smalltalk back in the 1970's. Interestingly the Actor model has come back en vogue with Erlang which has adopted it to implement "share nothing" concurrency. This is why some people say that Erlang is Object orientated.

The share nothing approach to objects could be considered to be a bit inefficient. I'm not sure of the history, but Alan Kay and his team decided to move on from asynchronous messaging to a more restricted synchronous approach. With synchronous messaging all objects share a common processor (or thread) and message sends block until the receiving object has completed processing the message and has responded with an answer. This is the messaging approach that was settled on in Smalltalk-80 and released to the world in 1983.

State and Behaviour

So we now have objects sharing cpu, but each object still encapsulating its own memory (program state) and its own code (behaviour). The state and behaviour of an object is private (encapsulated). So what happens when two objects share common behaviour, but have different state? As an example what if I have two bouncing balls, one red and one blue? Do I implement two objects separately, duplicating the code? Obviously there is an opportunity here for these two objects to share common code.

As part of the private implementation of these two objects, sharing code is desirable ( the DRY principle). One approach is to create a new object to encapsulate the common behaviour. In Self they call this a trait object. This leaves the two original objects just containing state (red for one and blue for the other). Common messages sent to the red ball and the blue ball (like the message 'bounce') are delegated to the shared trait object (through something called a parent slot) which encapsulates the common code. The red ball however may decide that in addition to bouncing, it can blink too. Blink meaning changing colour from red to white and back again repeatedly. So in addition to 'bounce' the red ball adds the message 'blink' to its message interface. This behaviour is not shared with the blue ball, so the red ball will need to have its own blink code which is not shared. So in Self objects can choose to share some behaviour or may choose not to share any behaviour at all.

In Smalltalk-80, the idea of not sharing implementation was relaxed, although conceptually the idea is still useful. So in Smalltalk-80 all objects share behaviour with other objects of the same kind, leading to the idea of classes of objects. Again I'm not sure of the history, but I believe this is a more efficient approach then the approach adopted by Self (Self came about in the 1990s many years after Smalltalk when memory was cheaper and CPUs faster). So in Smalltalk all objects share common behaviour with objects of the same kind through a common Class object.

Classes

So we finally get to the idea of a Class. A class is merely an implementation convenience, and unlike what the C++ proponents would have you think, the idea of class is not central to OO. In the same way that objects can share behaviour through a common class object, so can class objects share behaviour through a common superclass, leading to the class hierarchies we are all familiar with and tend to associate with OO programming.

Notice I use the term Class object. In Smalltalk a class is a factory object. Incidentally a lot of the so called OO patterns in the Gang of four book are really C++ patterns. So for example there is no factory object pattern needed in Smalltalk where you get factory objects for free.

A factory object is an object that creates other objects. In Smalltalk such objects are stored in a global hashmap called Smalltalk and are available throughout your program. Global objects in Smalltalk are identified with a capitalised first letter in their name by convention. Classes in Smalltalk are just global factory objects and hence all have a capital first letter. This convention has been carried forward into C++ and Java.

So how do you create a class? Well you send a message to another class you wish to subclass. It will come as no surprise to know that the message is called 'subclass'. This message answers a new class. To this class you can add class methods and instance methods again by sending messages. Instance methods are methods that you want to appear on objects that are created by this class (remember a class is a factory), class methods are methods that belong to this class itself and defines the class's behaviour. Also you may want to add class and instance variables to your class to encapsulate state.

So how do you create an object? Well you send a message to the factory responsible for generating the kind of object you want. This means that you send the message 'new' to the class object.

This is where languages like Java and C++ take a cop out. 'new' in such languages is not a message send on an object, instead it is a keyword in the language. This means that you cannot override new and hence the need for the gang of four 'factory object pattern' in these languages.

Back to Smalltalk. In response to the message 'new 'the class will answer a new instance object. So now we have a new object. Incidentally we skipped over how objects are created in Self. In Self any object can act as a factory and is able to create a copy of itself. So in Self you send the message 'copy' to the object you want to copy. The copied object is now a prototypical instance which is why Self is called a prototype based OO language (rather than a class based OO language).

MetaClass

Back again to Smalltalk. We have factory objects (classes), and we have instance objects (objects), but where do the class methods live? Where for example is the code for 'new'? As I said before a class has two roles, one to define it own behaviour (such as defining new) and the other to define the behaviour of its instance objects. I called its own behaviour class methods. This behaviour belongs in another object, the classes class (classes have a class too). I think we need an example:

aTranscriptStream := TrancriptStream new.

The 'new' message implementation is defined in the class of the TranscriptStream Class object. The class of 'TranscriptStream' is called 'TranscriptStream class' which is also known as the meta-class. 'TranscriptStream class' also has a class: 'TranscriptStream class class', and so it continues. 'TranscriptStream class' and 'TranscriptStream class class' etc are all implemented as the same object, the Meta-class object (otherwise we could go on for ever). This circularity is one of the beauties of Smalltalk. Meta-classes do not exist in C++ and Java, and hence why 'new' is a keyword in these languages.

So the 'new' message is sent to the TranscriptStream object (which is a class) and the implementation is defined in the 'TranscriptStream class' object (which is a meta-class). Now how do we end up with 'Transcript'? Remember that in Smalltalk globals all start with a capital letter. To make something global I need to add it to a global hashmap called Smalltalk along with a global identifier (symbol):

Smalltalk at: #Transcript put: aTranscriptStream.

Then I can do this:

Transcript show: '3 is less than 4'.

Uniformity

The message 'show:' in the previous example is sent to the Transcript object (which is an instance) whose implementation is defined in the TranscriptStream object (which is a class). Interestingly you cannot tell whether Transcript is a global instance or a class. I initially mistook it for a class in my discussion with Stephan until I looked it up. The thing is with Smalltalk is that it doesn't matter. Instances, classes and meta-classes are all the same thing, they are all objects. Smalltalk is uniform and everything is an object, which is where I started.

Updated 4/12
Replaced Transcripter with TranscriptStream. Instances of both these classes satisfy the 'Transcript' protocol, but in Squeak Smalltalk the global 'Transcript' object is an instance of TranscriptStream. I found this out by printing 'Transcript class' in a workspace. Another approach is to print 'Smalltalk at: #Transcript'.