Thursday, December 18, 2008

Craftsmanship

Simon Brown has been re-evaluating the role of Software Architecture in response to the question:
Why do we need this new software architecture stuff? We've managed fine until now and projects *have* been successful.

Good question. Some teams don't dwell on the subject at all, and yet have produced remarkably well designed systems that have stood the test of time: Unix (Thompson and Ritchie), BSD Unix (Bill Joy) and Linux (Linus Torvalds), to name a few. In the case of Linux there wasn't an opportunity to perform Big Architecture Up Front (BAUF) as a specific activity, since it was the result of the collaboration of literally thousands of developers on the web, so the Linux architecture we see today is truly an emergent entity devoid of a master plan.

What is architecture anyway? Simon concludes:
I say that architecture is about introducing structure and guidelines to a software project, and I've come to realise just how true this is. Without formalising these good ideas, a codebase with some otherwise great ideas can appear inconsistent, with the overall system lacking coherence and clarity.

In the discussion that follows, Simon goes on to explain that the architecture is "the big picture". Although he avoids using the word "design", I interpret what he says as architecture being the highest level of system software design, something that with a lot of programming languages is difficult to glean from just the code. I agree. What is disturbing, however, is that he flirts with the idea that defining this design is somehow a different activity from programming, and calls for a dedicated role with dedicated skills. So architecture and consistency become something to be imposed onto a team of developers who can only code at the detail level and require a higher thinker, the architect, to impose "structure".

I like Simon, and I think he senses the irony in this. The team can't think and can only code, so the Architect must do their thinking for them. I wonder where this mindset stems from?
"We will win and you will lose. You cannot do anything because your failure is an internal disease. Your companies are based on Taylor's principles. Worse, your heads are Taylorized too. You firmly believe that sound management means executives on the one side and workers on the other, on the one side men who think and on the other side men who only work."

Konosuke Matsushita – Panasonic

Simon can't be blamed. The values in our industry are such that in many instances this is the only politically acceptable way to address the problem. People who are only paid to code and not to think will struggle with architectural concerns and do need help. I pointed out that these same people will struggle with other aspects of programming too, like listening to customers, effective collaboration and testing, for example. Andrew Walker goes on to make the point quite eloquently that programming and architecting are one activity and can't easily be separated. I agree. In fact as I have become better at my craft I no longer do BAUF (I was once an Architect in a past life), and I now allow my architecture to emerge as I write code.

So de-skilling the developer role by splitting out development activities like designing, communicating, listening etc into separate roles runs counter to the idea of good craftsmanship IMO. Craftsmanship is holistic. So how did we get here? Well de-skilling the workforce does allow you to pay them less and hire and fire people at will. It also puts those pesky all-powerful "programmers" in their place, and may serve to provide the management with the illusion of control. There is a whole industry out there selling "Silver Bullets", suggesting that software development can somehow be automated and made simple by the latest Framework or piece of Middleware. And finally there are a bunch of ancillary industries like Business Analysts, Systems Analysts, Enterprise Architects, Solution Architects, etc, willing to do all "the thinking" for a fee, leaving the bulk of the team to only work.

Henry Ford had huge success with this approach in the early twentieth century, drastically reducing the cost of motor car manufacture by employing cheap unskilled immigrant labour. In recent decades however, in the face of competition from companies whose workers are empowered to think for themselves, this Taylorist management approach has been found wanting. To see evidence of this you only need to open your newspaper. The Big 3 US motor car manufacturers have recently gone cap in hand to Congress for a bail-out. Toyota, Nissan etc build and sell cars in the US too, but they aren't asking for a bail-out! These Japanese manufacturers are Agile enough to adapt to the needs of the market and are riding out the current economic storm. Toyota are even recruiting in the US for a new plant, in California I believe. So why are these Japanese companies prospering at the expense of Ford, GM and Chrysler? In the 1950s Japanese motor manufacturers took a look at Ford, GM etc and rejected their Taylorist approach in favour of the ideas of W. Edwards Deming. Deming felt that "scientific management" wasn't enough because it undervalued people and ignored human psychology. Could the Japanese choice of Deming over Taylor have something to do with the value they place on people?
Value and respect. I work as a Coach and I would never be so arrogant as to suggest that I am the only person on the team qualified to determine the "big picture". I would like to believe that I have mastered my craft, but does that mean that I am the only person qualified to think? No, this is arrogant nonsense and is exactly what Konosuke is speaking about.
Every human being can think and each of us should be encouraged to think. Our brain is our most valuable asset. The issue is that we aren't born knowing how to program well, and we all need to learn (even those of us who later become architects :)). So collaboration, architecting, coding, listening, designing are all things that must be learnt. The Japanese speak of Shu-Ha-Ri (knowledge, understanding and skill) as the three phases of learning. A computer science degree will only provide you with some basic knowledge. To understand you will need to practice, and gaining skill will require repeated application in varying contexts. And of course you can't learn on your own. To learn effectively you need a teacher.
In past times when a man wanted to learn a craft he became an apprentice, and would live and work alongside a master craftsman for several years, receiving only food and lodgings for his efforts. After years as an apprentice he was deemed qualified and would become a journeyman, allowed to charge for his labours. A journeyman would then move from workshop to workshop gaining experience as he went. The best journeymen could become master craftsmen themselves, setting up their own workshops and taking on apprentices of their own. Along the way people not suited to the craft would drop out rather than waste time, ensuring that standards remained high and that people found crafts they were suited to. So in the past the West had its own version of Shu-Ha-Ri, it seems. In fact this tradition still lives on in professions like those of Lawyers, Doctors, Architects (the building kind), Fashion Designers etc. So it looks as though the software industry chose an unfortunate template in the manufacturing industries to model itself on.
I can imagine that Henry Ford must have seen traditional coach building master craftsmen as an extravagant waste of money and a threat to his profits. Surely the job could be split into a number of simple unskilled roles with much of the work automated. This proved to be true (for a while) for motor car manufacture, but the same approach has failed us when applied to software. Why? Because software development is not manufacture; it is not repetitive, it is creative. It is new product development, akin to fashion design, car design etc.
Successful software development calls for master craftsmen like the names I mentioned earlier: Bill Joy, Linus Torvalds etc. These masters have a responsibility to pass on their skills to apprentices whose job it is to think and learn by example at the knee of their master. The Agile community have gone back to basics and understand that developing people is the only way to pull the software industry out of habitual failure. People over process and tools.
InfoQ has been tracking the tour of a Journeyman as he visits various workshops in an attempt to learn from the masters. His blog posts make interesting reading, especially the way the learning occurs: side-by-side at the keyboard whilst programming, the master and the journeyman discussing the work and lessons being learned on both sides. An age old tradition; you can imagine Michelangelo working with his apprentices and journeymen in the same way.
Looking at the videos and listening to the interviews it seems such a natural way to learn, and the ideal solution for teams that are struggling with architecture or anything else. Hire the best people you can afford, and have them mentor their colleagues in a learning environment.
So obvious, yet Kevin, a colleague of Simon's, seems to have missed this possibility. The idea that programmers can be helped to think for themselves escapes him as a possible alternative to the Architect going in and doing the thinking for them.
Perhaps Konosuke Matsushita is right. We do have an internal disease.

Tuesday, December 02, 2008

Object 101 - What is an Object?

Judging from the responses to my last post on uniformity it looks as though I started off with the bar too high. So let's lower it a bit. Let's go right back to basics and try to define what I mean by an Object.

Concept


Firstly, the idea of Objects is a conceptual one. Alan Kay and his team did research for over a decade trying to understand how best to structure computer programs and settled on the idea of Objects. The concept of Objects can be explained by taking a biological analogy. Alan Kay speaks of the idea of an identifiable cell, which has a membrane encapsulating its insides from the outside world. Each cell is autonomous and goes about its job independently from other cells, but cells do collaborate in a loosely coupled way by sending (chemical?) messages to each other.

So Objects are analogous to cells and the key defining characteristics of an object are identity, encapsulation and messaging. Notice I haven't mentioned classes or inheritance. These things are merely one approach to implementing objects and defining what goes on inside the membrane. There are object orientated languages like Self which eschew classes altogether.

Let's explore these fundamental characteristics in more detail:

Encapsulation

The power of encapsulation is that it hides the implementation from the outside world, in the same way that a cell membrane hides the inside of a cell. The outside world need only know the message interface of an object (known in Smalltalk speak as the message protocol). So when I send a message to an object, I have no idea how that object will process that message. The object is free to process the message in any way it likes. This leads to the idea of polymorphism, which is a Greek word meaning "many forms". Since an object's implementation is encapsulated behind its message interface, the implementation can take any form it likes. This means that message implementations are free to perform any side effect they wish as long as they satisfy the message interface. Notice again I have not mentioned classes or subclassing. Again, subclassing is just one approach to achieving polymorphism. Any object that satisfies the message protocol is viewed as a polymorphic variant, and can substitute any other object that shares the same protocol.
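To make this concrete, here is a minimal sketch you can run in a Squeak workspace (it uses only standard classes): three very different objects all answer the includes: message, each with its own hidden implementation, and the sending code depends only on that shared protocol, not on how any of the receivers happens to work inside.

| containers |
containers := OrderedCollection new.
containers add: #(1 2 3 4).        "an Array of numbers"
containers add: (1 to: 10).        "an Interval - computes its elements, stores nothing"
containers add: 'hello' asSet.     "a Set of characters"
containers do: [:each |
    Transcript show: (each includes: 3) printString; cr]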

Messaging

During their research Alan's team looked at a number of ways of getting their objects to communicate with each other in a loosely coupled fashion. If you take the biological analogy to its extreme, then each object should be totally autonomous and share nothing with the others. This means that each object should have its own CPU, its own memory and its own code; Alan Kay has even argued that each object could/should have its own IP address as a global identity. So an object in this view of the world is a networked computer. OK, so how do we hook these objects together? Well the natural solution is asynchronous messaging, just like you get with e-mail. Since each object has its own CPU it doesn't want to block and wait until the receiving object processes the message. So an object can send a message without blocking and the receiver will send back an answer to the object's inbox in its own good time. This approach is what we call the Actor model today, and as someone kindly pointed out, Alan Kay first explored this approach in an early Smalltalk back in the 1970s. Interestingly the Actor model has come back en vogue with Erlang, which has adopted it to implement "share nothing" concurrency. This is why some people say that Erlang is Object orientated.

The share nothing approach to objects could be considered to be a bit inefficient. I'm not sure of the history, but Alan Kay and his team decided to move on from asynchronous messaging to a more restricted synchronous approach. With synchronous messaging all objects share a common processor (or thread) and message sends block until the receiving object has completed processing the message and has responded with an answer. This is the messaging approach that was settled on in Smalltalk-80 and released to the world in 1983.
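As a tiny illustration of the synchronous style (plain Smalltalk, nothing hypothetical): each message send below blocks until the receiver has finished and answered, and that answer is used immediately by the sender.

| total |
total := 3 + 4.                           "send #+ to 3; the sender waits for the answer 7"
Transcript show: total printString; cr.   "send #show: to Transcript; wait until it is done"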

State and Behaviour

So we now have objects sharing a CPU, but with each object still encapsulating its own memory (program state) and its own code (behaviour). The state and behaviour of an object is private (encapsulated). So what happens when two objects share common behaviour, but have different state? As an example, what if I have two bouncing balls, one red and one blue? Do I implement two objects separately, duplicating the code? Obviously there is an opportunity here for these two objects to share common code.

As part of the private implementation of these two objects, sharing code is desirable (the DRY principle). One approach is to create a new object to encapsulate the common behaviour. In Self they call this a trait object. This leaves the two original objects containing just state (red for one and blue for the other). Common messages sent to the red ball and the blue ball (like the message 'bounce') are delegated to the shared trait object (through something called a parent slot), which encapsulates the common code. The red ball however may decide that in addition to bouncing it can blink too, blinking meaning changing colour from red to white and back again repeatedly. So in addition to 'bounce' the red ball adds the message 'blink' to its message interface. This behaviour is not shared with the blue ball, so the red ball will need to have its own blink code which is not shared. So in Self objects can choose to share some behaviour or may choose not to share any behaviour at all.

In Smalltalk-80, the idea of not sharing implementation was relaxed, although conceptually the idea is still useful. So in Smalltalk-80 all objects share behaviour with other objects of the same kind, leading to the idea of classes of objects. Again I'm not sure of the history, but I believe this is a more efficient approach than the approach adopted by Self (Self came about in the late 1980s, years after Smalltalk, when memory was cheaper and CPUs faster). So in Smalltalk all objects share common behaviour with objects of the same kind through a common Class object.

Classes

So we finally get to the idea of a Class. A class is merely an implementation convenience, and unlike what the C++ proponents would have you think, the idea of class is not central to OO. In the same way that objects can share behaviour through a common class object, so can class objects share behaviour through a common superclass, leading to the class hierarchies we are all familiar with and tend to associate with OO programming.

Notice I use the term Class object. In Smalltalk a class is a factory object. Incidentally a lot of the so-called OO patterns in the Gang of Four book are really C++ patterns. So, for example, there is no need for a factory object pattern in Smalltalk, where you get factory objects for free.

A factory object is an object that creates other objects. In Smalltalk such objects are stored in a global hashmap called Smalltalk and are available throughout your program. Global objects in Smalltalk are identified with a capitalised first letter in their name by convention. Classes in Smalltalk are just global factory objects and hence all have a capital first letter. This convention has been carried forward into C++ and Java.

So how do you create a class? Well, you send a message to another class you wish to subclass. It will come as no surprise to know that the message is called 'subclass'. This message answers a new class. To this class you can add class methods and instance methods, again by sending messages. Instance methods are methods that you want to appear on objects that are created by this class (remember a class is a factory); class methods are methods that belong to this class itself and define the class's behaviour. Also you may want to add class and instance variables to your class to encapsulate state.
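Here is a sketch of what that looks like in practice, using a hypothetical Ball class in Squeak-style Smalltalk (the class-creation message is really a keyword message with several parts, and the methods are shown with the usual 'Class >> selector' convention; in practice you would type them into a class browser). Every ball shares the bounce behaviour, while each instance holds its own colour.

Object subclass: #Ball
    instanceVariableNames: 'colour'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Objects101'.

Ball >> colour: aString
    "instance method: store this ball's own state"
    colour := aString.

Ball >> bounce
    "instance method: behaviour shared by every ball"
    Transcript show: 'the ', colour, ' ball bounces'; cr.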

So how do you create an object? Well you send a message to the factory responsible for generating the kind of object you want. This means that you send the message 'new' to the class object.

This is where languages like Java and C++ cop out. 'new' in such languages is not a message send on an object; instead it is a keyword in the language. This means that you cannot override 'new', and hence the need for the Gang of Four 'factory object pattern' in these languages.

Back to Smalltalk. In response to the message 'new' the class will answer a new instance object. So now we have a new object. Incidentally we skipped over how objects are created in Self. In Self any object can act as a factory and is able to create a copy of itself. So in Self you send the message 'copy' to the object you want to copy. The copied object is now a prototypical instance, which is why Self is called a prototype based OO language (rather than a class based OO language).
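Back in Smalltalk, and continuing the hypothetical Ball sketch from above: because 'new' is just an ordinary message sent to the class, the class can override it, or add factory messages of its own, which is exactly why no separate factory pattern is needed (class-side methods are shown with the 'Ball class >>' convention).

Ball class >> new
    "override 'new' itself: answer a white ball by default"
    ^ super new colour: 'white'

Ball class >> colour: aString
    "an alternative factory message"
    ^ self new colour: aString

| ball |
ball := Ball colour: 'red'.
ball bounce.         "the red ball bounces"
Ball new bounce.     "the white ball bounces"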

MetaClass

Back again to Smalltalk. We have factory objects (classes), and we have instance objects (objects), but where do the class methods live? Where, for example, is the code for 'new'? As I said before a class has two roles: one to define its own behaviour (such as defining 'new') and the other to define the behaviour of its instance objects. I called its own behaviour class methods. This behaviour belongs in another object, the class's class (classes have a class too). I think we need an example:

aTranscriptStream := TranscriptStream new.

The implementation of the 'new' message is defined in the class of the TranscriptStream class object. The class of 'TranscriptStream' is called 'TranscriptStream class', which is also known as the meta-class. 'TranscriptStream class' also has a class: 'TranscriptStream class class', which is the object 'Metaclass', and 'Metaclass' is itself an instance of 'Metaclass class', which in turn is an instance of 'Metaclass', so the chain closes in on itself rather than going on for ever. This circularity is one of the beauties of Smalltalk. Meta-classes do not exist in C++ and Java, which is why 'new' is a keyword in these languages.

So the 'new' message is sent to the TranscriptStream object (which is a class) and the implementation is defined in the 'TranscriptStream class' object (which is a meta-class). Now how do we end up with 'Transcript'? Remember that in Smalltalk globals all start with a capital letter. To make something global I need to add it to a global hashmap called Smalltalk along with a global identifier (symbol):

Smalltalk at: #Transcript put: aTranscriptStream.

Then I can do this:

Transcript show: '3 is less than 4'.

Uniformity

The message 'show:' in the previous example is sent to the Transcript object (which is an instance) whose implementation is defined in the TranscriptStream object (which is a class). Interestingly you cannot tell whether Transcript is a global instance or a class. I initially mistook it for a class in my discussion with Stephan until I looked it up. The thing with Smalltalk is that it doesn't matter. Instances, classes and meta-classes are all the same kind of thing: they are all objects. Smalltalk is uniform and everything is an object, which is where I started.
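You can see this uniformity for yourself in a Squeak workspace: the same kind of question can be put to an instance, a class and a meta-class, because they are all just objects (the answers shown in the comments are what Squeak prints, as the update note below explains).

Transcript class.               "answers TranscriptStream - so Transcript is an instance"
TranscriptStream class.         "answers TranscriptStream class - the meta-class"
TranscriptStream class class.   "answers Metaclass"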

Updated 4/12
Replaced Transcripter with TranscriptStream. Instances of both these classes satisfy the 'Transcript' protocol, but in Squeak Smalltalk the global 'Transcript' object is an instance of TranscriptStream. I found this out by printing 'Transcript class' in a workspace. Another approach is to print 'Smalltalk at: #Transcript'.

Sunday, November 30, 2008

Objects 101 - Uniformity

Some interesting comments on my last post have prompted me to expand more on what I believe is good OO.

For me getting to know good OO started out with being skeptical about Bad OO and saying "I don't get it". This is why I think I understand Joe. Healthy scepticism is a good thing, especially when something blatantly just doesn't add up. I could go into a rant about why programmers find it difficult to say "I don't know" or "I just don't get this" and blindly admire the "king's new clothes" even when the king is naked, but I'll leave that for another day :)

As an example of good OO let me reproduce the Smalltalk code example from my response to a comment by Paul Homer to my last post:

3 < 4 ifTrue: [Transcript show: '3 is less than 4'].

OK, let's try to express this in a more familiar OO language, Java:

if (3 < 4) System.out.println("3 is less than four");

What is bad about this? As Paul Homer pointed out, the instructions (if, <, ..) clearly take precedence over the data (3, 4, ..) in true procedural style. Also there is only one object here, "System.out", which, given that it is a global static object, is hardly an object at all. System.out.println() is no different than (#include <stdio.h>) printf(). So the whole statement is not OO; it is procedural and would look pretty much the same written in C.

But Java is supposedly an OO language, so surely I can express this as objects. Let's try:

(new Integer(3)).isLessThan(new Integer(4)).isTrue(new Block() {
    void execute() {
        System.out.println("3 is less than four");
    }
});

Ok. I've created a DSL here in Java to get rid of the primitives and procedures and express everything uniformly as objects. So what is so bad about this? Well it doesn't read half as well as the Smalltalk example. All those parentheses and periods tend to obfuscate the intent. Secondly I would need to write my own Integer and Boolean classes, replacing the ones in the Java standard library: Java's Integer doesn't understand the message 'isLessThan' and Java's Boolean doesn't understand the message 'isTrue'. Also the use of an anonymous inner class to simulate a block seems rather verbose, but without closures what else can I do?

So writing pure OO code in Java is difficult to say the least. Does this matter? Well if you are trying to learn OO with a hybrid procedural/OO language, then I think it does. I for one (using C++) definitely found it a challenge.

What do you think?

Paul.

Thursday, November 27, 2008

Why Bad OO Sucks

Anyone who has read my blog knows that I am an OO advocate, but I was watching Joe Armstrong, of Erlang fame, the other day and he posed the question: "In which Object should you place a given function? To me the choice seems arbitrary". Now if you've got a solid grounding in OO design principles then the answer to this one is easy: keep the function near the data. So the object with most of the data is where the function should be. I have posed this very same question at interviews and most of the time supposedly experienced OO Java developers are clueless.
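Here is a small sketch of what "keep the function near the data" means in practice, using hypothetical Order and line-item objects in Smalltalk (each line is assumed to answer #price). The total is computed by the object that already holds the line items, rather than by some external 'OrderCalculator' that would have to pull the data out first.

Object subclass: #Order
    instanceVariableNames: 'lines'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'OO101'.

Order >> total
    "the Order owns the lines, so the Order answers its own total"
    ^ lines inject: 0 into: [:sum :each | sum + each price]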

The fact that a language designer is asking this question is in itself interesting, and says a lot about how OO has been misrepresented over the years. Joe's criticism is thoughtful. He outlines why he believes OO sucks in a blog post. He then asks the question: why did OO become so popular? I repeat each of his reasons here:

Why is (Bad) OO so popular?
  • Reason 1 - It was thought to be easy to learn.
  • Reason 2 - It was thought to make code reuse easier.
  • Reason 3 - It was hyped.
  • Reason 4 - It created a new software industry

I agree with all of Joe's points.

Reason 1 - Bad OO is easy to learn because it requires no learning at all. C with Classes is still C. And a class can be used the same as a struct if you choose. The whole ORM industry is built on using classes as structs :) So a bunch of programmers have transitioned from C to C++ to Java etc without really learning anything at all.

Reason 2 - Code reuse has always been speculation. The languages where object reuse has been achieved are not the popular ones. Take a look at NeXTSTEP: they managed to produce a pretty reusable financial framework in the 1990s, but it was all written in Objective-C. The "industry" had decided at the time that C++ should be the OO language of choice and went off and tried to build "Pink", "CommonPoint", "OpenDoc" etc, which all failed. Why? Could it be that C++ is a static language and the idea of "live" reusable Objects only works with a dynamic language? Even with dynamic languages reuse is elusive. Why? Well, back to the idea of categorisation: no two things in this world are truly identical. So my Bank's idea of a Bank Account is different to your Bank's. So even with the binding problem solved by using a late bound dynamic language, it doesn't mean I can send my Account Object to your Bank and expect everything to work. Proper OO aficionados know this and for this reason alone do not seek to reuse objects out of context.

Reason 3 - Hyped. No arguments here :)

Reason 4 - OO didn't create an industry. The IT industry created an industry. In fact the father of OO, Alan Kay, has spent the last twenty years bemoaning what the IT industry has done with his baby. Reasons 3 and 4 go together. The industry hyped OO with promises of reuse etc and then felt obliged to go off and build middleware to deliver on the hype. The astute amongst us noticed pretty early on that this stuff was getting pretty complex and heavyweight. The rest merrily jumped on the bandwagon with COM, CORBA, J2EE etc. So as Joe rightly points out, the industry created a problem of its own making and then went about selling us stuff to solve it.

At this point I should go on to defend good OO. But Joe wasn't speaking about good OO. Good OO doesn't suffer from all these problems, and good OO isn't popular. In fact good OO is still in obscurity and suffering from funding problems, something that cannot be said for Bad OO.

No, I have plenty of other posts on this blog that speak to the qualities of good OO. And I am not going to say that good OO is intrinsically better or worse than good FP. I would just say that they are different approaches to modeling, each with its own sweet spot. And as always the thing that really matters is the fleshy blob behind the keyboard :)

Sunday, November 23, 2008

Small is Beautiful

Computers are getting faster and more powerful all the time, yet whilst the average mobile phone has more processing power than all the computers NASA used to put a man on the moon combined (well I'm not sure whether this is true, but if it is it wouldn't surprise me :)), our software of today is still no better than Doug Engelbart's in the 1960s.

So why are we programmers having such a hard time keeping up with the hardware? The root of the problem is complexity. Whilst hardware has got much faster, it still performs a small number of very simple things like 'load', 'add', 'subtract', 'accumulate', 'shift', 'store', etc. It is the sequence in which these simple things are performed and the speed at which they are executed that gives the illusion of the computer being clever. In actual fact the computer isn't clever at all; the clever part is arranging these simple instructions into a useful sequence, and this is where programmers come in.

So programmers are the clever ones. Yes, but our cleverness is finite. Evolution works at a very slow pace and cannot be expected to keep up with Moore's law :) So how can we expect to utilise ever more powerful computers if our brains just can't keep up? Well the starting point is realising that people are indeed the weakest link. Our cognitive abilities are finite and we need to organise our programming activities in such a way as to best exploit our inherent strengths and respect our limits as human beings. This leads naturally to the subject of programming language design, but I'm not going to go there today.

No, what I want to speak about is chunking and the magical number 7 plus or minus two. The way the theory goes is that we as human beings can hold around 7 +/- 2 things in our brains at one time. Once the number of parts in a system becomes larger than this, we are well advised to abstract and create hierarchies of things, ensuring that the 7 +/- 2 cognitive limit at any given level in our hierarchy is observed.

Anyway, back to software: 7 +/- 2 doesn't only apply to the organisation of computer programs, it also applies to teams. Small software teams tend to be able to adapt to changing requirements in a way that large teams just can't. As a strong believer in Agile development and Emergent Design, this ability to turn on a sixpence is invaluable to me, and I tend to avoid large teams.

Experience has proven to me that when it comes to software development, small is beautiful. After proclaiming this for many a year I have finally stumbled on the original paper that explains the theoretical underpinnings of this point of view:

The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information
by George A. Miller.


Enjoy.

Wednesday, November 19, 2008

Shu-Ha-Ri (Knowledge, Understanding and Skill)

InfoQ has picked up on a couple of blog posts by Jim Shore recently. One where he speculates that Kanban and Lean approaches are viable alternatives to methodologies like Scrum and XP. Another where he says that Agile is in decline.

All the terms in italics in the above paragraph are just labels. Labels are just a convenient way to categorise things. What we would all do well to remember is that, in the words of Hayakawa, "the label is not the thing". Just because something is labeled Agile doesn't mean that it is an actual instance of the thing the original advocates had in mind. The Japanese understand this, and have developed a method of learning where they recognise different levels of mastery over the thing that is being taught: Shu-Ha-Ri, or roughly translated, knowledge, understanding and skill. Just because someone knows the label it doesn't mean that they understand the thing. And just because they understand a specific instance of the thing, it doesn't mean that they have the skill to improvise and create a new instance of the same thing in a new context.

Categorising things and applying labels to them is a practice that is full of pitfalls. I will be posting more on this subject in the near future.

Saturday, November 01, 2008

America is not Voting

I came across this on the BBC. Now why on earth in the richest country on the planet are people queuing for hours just to exercise their right to vote? I am living in the US right now and watching the election as an outside spectator is very interesting.

Traditionally around a third of Americans bother to vote. In fact one of the biggest grass roots issues over here has been voter registration. So why the apathy? This time around voter turnout is expected to reach record levels. So it looks like the silent majority have finally found their voice.

Central to the idea of democracy is the responsible citizen making an informed choice. In the UK we take our voting responsibility very seriously indeed. Which is as it should be given that people have died defending our right to vote. The whole ethos of our (free) education system and our public service cultural institutions like the BBC is geared to producing well rounded and well informed citizens able to utilise their vote intelligently. Over the years this has led to consensus politics, where there is broad cross party agreement on a number of major issues. When you have vice presidential candidates who can't tell you which newspapers they read, doesn't that say something about the strength of your democracy? If the politicians aren't informed, what are the chances of the electorate being informed? Isn't this the root cause of the polarisation that is so self evident in US politics today? The informed versus the uninformed rather than left versus right?

I picked up this quote from a documentary I once saw on the plight of the American Indian. An old Indian chief on a reservation was asked what he thought of the white man:

"The white man has many great things, but he cares not whether his people are wise".

Apt words, which still have relevance today. Living here now, I can say that Americans are wonderful people and worthy of the claim to be the greatest nation on earth. With any luck they will make the right choices and begin to address the weaknesses in their democracy and become even greater still.

The defining moment in the election so far for me was Colin Powell's contribution. Colin Powell's America is a country I would be proud to be a citizen of.

Friday, October 03, 2008

Comprehending Large Software Systems

William Martinez has started blogging again. He has an interesting blog outlining the roots of emergent design. It is worth a read. One factor that limits people's ability to apply emergent design effectively in a team context is the high degree of design communication needed across the team. Since the design is always changing, maintaining a shared comprehension of the entire system design across the whole team can be a challenge.

The XP Solution
Emergent design is the design approach adopted by XP, and like other XP practices emergent design relies on the presence of other complementary practices which are cleverly woven together to reinforce each other, the whole being greater than the sum of the parts. The practices that come to mind are common code ownership, where every developer on the team "owns" all the code and hence is responsible for comprehending all the design. Another is pair programming, where the quality of the code is policed by continuous peer review and design knowledge is communicated across the team by rotating pairs. TDD helps by documenting the design specification as executable tests; the design specification is then kept up to date by refactoring the tests along with the code. Another XP practice is small teams working in a "bull pen" or some other type of congenial space. The small size reduces the number of paths of communication, so if one developer makes a design change he only needs to broadcast that change across a small number of paths to ensure that a fully connected network of design knowledge is maintained. The congenial space facilitates 'osmotic' communication, where the communication of small day-to-day design changes and refactorings occurs coincidentally, merely by people overhearing design discussions amongst pairs of programmers, or perhaps by the team breaking out to quickly discuss a design choice facing someone, coming to a joint decision, and going back to work with a new shared understanding.

Its All about People
With small teams seeded with people skilled in these XP practices, the XP approach to maintaining system comprehension amongst a team works very well. It does require a number of subtle skills though, many of which are right-brained 'soft skills'. I usually recommend a ratio of 1 to 3, skilled practitioners to novices. With such a ratio I find that the novices pick up the idea pretty quickly. Interestingly William's discussion on emergent design also identifies the need for teachers and learners when it comes to mastering the skills needed, and of course we are all learners when it comes to discovering the emerging design of a new system.

Code as a Communication Medium
In my experience once you have experienced this informal means of maintaining system comprehension, you never really see the need for the types of traditional system documentation we have all gotten used to. Novices in these informal communication tools soon become masters themselves and go on to seed other teams. Unfortunately, in many environments we do not have the luxury of creating the right conditions for informal osmotic communication. So what then?

William made the following comment which is thought provoking:
Finally: Documents per se are not evil, but I have come to realize that the documentation tools may not serve the goals. I mean, I need to be able to take a 10000 feet look to a system, to see the big picture. But, I will not be able to see it looking at just code. A word document is of no help. I need something else.

The code is not sufficient? Why? I guess it depends on the prior context of the reader, and also on the quality of the code. In many organisations, people not directly responsible for the code are called upon to comment on and review 'other people's' code. Now in this scenario the reader has very little context. The XPer would say, well, you've broken the tenet of common code ownership. If the reader was part of the team, then he would also be part of the osmotic process and would have a great deal of context to draw upon when reading any given section of code. Yes, but let's assume that the environment is not conducive to "common code ownership". What then?

Readable Code
The other issue is the readability of the code itself. This can be a self-fulfilling prophecy: people look somewhere other than the code to gain system comprehension, so the code itself need not be comprehensible. For XPers, their code is the main means of design communication, so they are very fussy about their code. For XPers production code and test code together encapsulate a body of knowledge that describes the design choices that they have made along the way. Kent Beck chooses to make a distinction between 'quality' code and 'healthy' code. 'Quality' code is code which has desirable external attributes, like a low number of bugs. Healthy code is code with desirable internal attributes that allow the code to be maintained, changed and extended even when the team loses people and new people come on board. Healthy code is easy to comprehend and well designed. I don't make this distinction myself. I see health as part of quality, but it is interesting that Kent does make a distinction, perhaps in an attempt to raise the profile of internal code quality, which in some environments is readily overlooked.

Domain driven design and the idea of a ubiquitous domain language which is carried through into the code is yet another method to improve both the comprehension and the design of code. Cryptic names in code have long been seen as a barrier to comprehension, and a lot has been written about 'self-documenting' code and how to write code which is more comprehensible. So we do know how to write readable code. The real issue is whether readable code is truly valued by the team and seen as a necessity.

My experience is that after being given an overview of a system, perhaps at a whiteboard, if the code is healthy, good package, class, method etc names are chosen, and the code has a good accompanying test suite, then with the help of an IDE, comprehending a system just from the code is possible. For me it normally takes an afternoon or two, and in the end I usually end up with a list of questions to be answered by someone who "owns" the code. Questions answered, I normally have a reasonable comprehension, and more importantly an understanding of the 'true' design: not what was originally envisioned by the architect at the beginning of the project, which has long since been superseded, but the design that actually emerged during 'implementation'.

Other Documents
Having said this, two days of my time, and perhaps a day of someone else's, is a lot to spend on comprehending a system, especially if I do not intend to work on it. If the code is not very readable it could take me a lot longer. It is ironic that the kind of organisation that does not value readable code is also likely to expect people external to the team to comprehend and evaluate the system in an instant purely from documents, forgetting that the code is the system and hence the most important document of them all. In many organisations this scenario exists, so what then? I have found the use of a Software Maintenance Handbook very useful in the past: a guide for the explorer who needs to comprehend the system. A user guide can be most informative too, especially when it comes to answering the question "what problem is this system trying to solve?" But even with these documents, assuming that they are kept up to date, there will still be gaps in understanding, so what should you do?

Magical Tools
Let's assume that I had a magic wand that would produce the 10,000 foot view of a system in an instant, just by tapping my wand on the hard disk containing the code :) What then? Would I really comprehend the system without speaking to anyone else? In my experience the answer is no. I would understand the structure of the system, the 'what', but I wouldn't know the 'why'. To know the why, I need to know about the problem domain. In a complex domain space, my lack of understanding of the domain would be a barrier to system comprehension in itself. So even with a magic wand there are no easy answers, and in the end you need to speak to someone, perhaps face-to-face.

Better languages
If your organisation doesn't allow people to take the time to explain the system to you, then perhaps the left side of our brain may be able to help out a bit. Domain driven design and object orientation both promise to help system comprehension through good naming and the use of modularity. Whilst some OO languages are turtles (objects) all the way down, most are not modular when it comes to high level 'architectural' abstractions. Newspeak is a new OO language that has borrowed ideas from Beta to allow you to define abstractions larger than a class. Once you have a language like Beta or Newspeak that allows you to capture high level design decisions in code, then you can use a code browser to project views of just the high level abstractions, filtering out all the details and providing the 10,000 foot view in an instant. Martin Fowler has been talking about such browsers under the label 'language workbenches'. Martin's focus has been DSLs and their comprehension and use, separate from the host language in which they reside, but the same idea can be applied to architectural layers, modules, components, libraries, packages etc, any abstraction that can be captured by your language and projected independently as a view. The Beta language already shows how this can be done, by allowing a graphical view of system components gleaned from the code. This idea should be familiar to anyone who has used TogetherJ, the Java/UML documentation tool.

Fix the root cause
As you can tell, my view is that system comprehension is often a people problem, born out of the ways people choose to organise themselves and the values and principles they choose to adhere to. The whys and wherefores for any given system should be tacit knowledge within the team that created that system. Tacit knowledge is born out of daily work and socialised informally across the team. It is only when we restrict the flow of tacit knowledge by creating artificial barriers within the team, or geographically splitting the team, or creating teams that are too large, that we create the need for formal communication. Formal intermediate work products are then handed-off from one part of the team to the next. These hand-offs are points of weakness and are best avoided.

Most software organisations are dysfunctional in this regard; by this I mean that their organisation is not well suited to their intended purpose, namely the production of high quality software. The solution? Change the organisation. I accept that most of us aren't empowered to make this type of change, but let's not forget that the root cause of poor comprehension is often self imposed.

Conclusion
So in summary, system comprehension is about communication. Ultimately the code is the design and hence the best medium for communication. With advances in language design the code can be made a better medium for expressing system structure at the 10,000 foot view, but comprehension is not all about structure, it is about meaning and purpose too, and comprehending these requires a grasp of the domain. Gaining this knowledge often means talking to people.

As human beings competent in applying both sides of our brains, talking and listening as a way of comprehending a system should come naturally. When it doesn't then perhaps it's a sign of a deeper problem, such as organisational dysfunction.

9th October 2008 - Expanded the section on 'Fixing the root cause' to clarify what I feel is the main barrier to system comprehension, poor organisation; in response to comments made by William Martinez.

Tuesday, September 02, 2008

Google Chrome - Villain or the next logical step?

It's a strange feeling when an idea you think is gaining momentum suddenly becomes concrete before your very eyes, confirming that you were spot on.

It is not news that Google is a clear advocate of software-as-a-service, but I'm not sure that most pundits thought that Google would be making the bold step of creating their own client platform just yet.

Well Google Chrome is out and I'm sure that there is going to be plenty said about it. Is this the browser wars all over again? Well when I first read the news (on the BBC news website), I thought so, but when I took a closer look at the details released by Google I changed my mind.

As my last post on Flex makes clear, the current browsers as a client platform just aren't up to the job. The JavaScript interpreters are slow and the whole thing is single threaded, so poorly written JavaScript can lock up your whole browser. Flex has addressed some of these problems, but anyone who remembers Windows 3.1 would agree that Flex will never be able to avoid the browser equivalent of a Windows 3.1 blue screen :)

In the noise I'm sure that Google's words are likely to get lost, so here is a link where you can read why Google chose to create Chrome, and you can decide for yourself whether they are satisfying a real need.

Should we fear Google in the same way we used to fear Microsoft? I don't think so. This snippet from the Google cartoon on Chrome sums up their position for me.

So they want to create something better, and we shouldn't be worried about vendor lock-in because "this better thing" will be open sourced. Over the years I have come to the firm opinion that innovation should never be held back by premature standards. Our current crop of browsers were originally designed to solve a different problem, and whilst they are familiar and use a standard metaphor, they cannot be said to address the needs of software as a service.

There is no reason why Google's new platform should not be open, supporting ActionScript and Flex along with ECMAScript (JavaScript). In fact given that Chrome will be open sourced, there is no reason why people can't write a Ruby or Python or even Smalltalk plug-in for Chrome themselves. This is all consistent with the RESTful idea of code-on-demand, which doesn't dictate what is meant by "code". It will be interesting to see just how open Google are with their new platform, and whether they provide extension points so that others can extend the platform to meet their own specific needs.

In fact if Chrome gains momentum, then other browsers like Firefox or Safari could choose to integrate Chrome, or even borrow ideas. We need to see what type of license Google chooses to use, but a BSD style license would allow both open sourced and closed sourced projects to embed Chrome code. Even if the code isn't directly re-used by other browsers, I'm sure the idea of a multi-thread/process, high performance, networked client software platform will be.

Chrome seems like the next logical step to me, and the fact that it is coming from Google and is open sourced, compared to a closed proprietary solution like Apple's Dashboard Widgets, is a positive step forward. We need to be vigilant and hold Google to account, but I think that Chrome should be cautiously welcomed.


Last point: in executing our responsibility to ensure that Google adheres to ethical competitive practices, why not clean up and clarify our language? I am not sure that the vague term "Cloud Computing" is helping, so why not use the term "software-as-a-service" (as Google has chosen to do)? Or better still, why not stick with the term "code-on-demand" as coined in Roy Fielding's REST dissertation? I believe language is important, and labelling these new client platforms in a precise way will provide less wiggle room for marketers looking to exploit vague terms like "cloud computing" to their own ends. Code on demand makes the purpose clear, and by using RESTful language I think it will help to tie these new platforms into open standards like HTTP, ECMAScript and APP (Atom Publishing Protocol). Just a thought :)

Wednesday, August 20, 2008

Adobe Flex - ActionScript Flexes its Muscles

Recently I've developed a keen interest in software-as-a-service and what has been coined Cloud Computing. The idea is to use current web technologies and RESTful principles to provide software services to a larger audience. This has led me to take a more detailed look at Flex and its associated technologies. You can only get the best out of Flex by using ActionScript in conjunction with it. ActionScript is a dialect of JavaScript, or to give it its proper name, ECMAScript. JavaScript is a late-bound, dynamic, prototype based OO language. It's worth stating this because although most people are familiar with JavaScript, very few know its OO origins.

JavaScript is heavily influenced by Self and Scheme, a point brought home by its original name "LiveScript". The name JavaScript was dreamt up later by Sun and Netscape during the Browser Wars. Given its roots, I have always thought that JavaScript is under utilised and unfairly maligned. Beyond DOM hacking in the browser very few people know anything about JavaScript's OO credentials, which are pretty impressive. Most people's experiences are coloured by cross-browser incompatibilities and poor performance. Recently Dan Ingalls, of Smalltalk-80 fame, has taken JavaScript and shown what it can really do. This work has resulted in the Lively Kernel. It's worth taking a look at what Dan and his team have achieved. It is pretty impressive.

Over the years ActionScript has migrated away from JavaScript in an attempt to become "Java Programmer" friendly. Well, it's succeeded. ActionScript code looks very similar to Java. You can use type annotations and classes just like in Java. It also supports packages and limits you to one class per file, so it is really home from home for the average Java programmer. The idea is to provide an easy on-ramp for developers moving to ActionScript from C++, Java and C#.

Having played with Self and seen what it can do, I was curious to know how much of Self's dynamism had made its way into JavaScript and ActionScript. Well firstly, JavaScript retains slots just like Self. So JavaScript has representation independence. Unlike Self, a JavaScript object only has a single parent slot, so no multiple inheritance. Oddly enough the parent slot is called the 'prototype', which I find rather confusing. In Self the parent slot is called the 'trait'.

Objects are created in JavaScript using constructor functions. Functions in JavaScript are first class objects, again something borrowed from Smalltalk/Self, or is it Scheme? I'm not sure, but either way functions are objects too. If you must use a C-like syntax, then JavaScript is a pretty impressive son of Self in my opinion, inheriting most of the fundamental qualities of its father.

OK, how well does ActionScript stack up? ActionScript is a difficult language to get into. Macromedia/Adobe assume that all you want to do with ActionScript is web presentation. This means that there doesn't seem to be a standalone interpreter you can use to play with ActionScript just on its own. Well, I've succumbed and installed Flex Builder, the Flex IDE, and have been rather impressed with it. The "main" for ActionScript applications is the "onCreationComplete()" method hook inside a Flex web page. Without going into the details, you need to write a Flex app to use ActionScript. If anyone out there knows how to run ActionScript standalone then I would be very grateful to find out.

Fortunately FlexUnit does all this for you and you can get playing with ActionScript very quickly by writing tests. OK, back to the ActionScript language. I started writing JavaScript and compiled it using the ActionScript compiler in Flex Builder and it works. So ActionScript does everything JavaScript does.

What they have done is extend JavaScript without fundamentally changing the internals. I concluded in my blog post on classes versus prototypes that prototypes were the more fundamental abstraction. Well, ActionScript proves this. They have added class declaration syntax mostly as syntactic sugar for the creation of prototypes. They have taken this further in ActionScript 3.0 (AS3), changing the method lookup mechanism by copying down method objects from the class hierarchy into a single 'trait' object at compile time. This speeds up method dispatch. I don't understand all the details, but the only restriction introduced is that an object's class is immutable. Classes themselves are still open, just as in Smalltalk or Ruby, so AS3 is as dynamic as these two class based languages (with the exception of 'become:' in Smalltalk). It could be argued that AS3 is not as dynamic as JavaScript. The thing to remember though is that class immutability only applies if you use the 'class' declaration. If you stick to prototypes then ActionScript is as dynamic as JavaScript and Self.

I am pretty impressed with ActionScript. Adobe/Macromedia have managed to extend a prototype based language, making it conceptually similar to C++, Java and C#, whilst retaining the power of prototypes under the covers. It is some achievement.

Even the type annotations in ActionScript are optional, just as advocated by Gilad Bracha. With or without type annotations Flex Builder does a very good job at auto-completion, using type inference no doubt. When you leave out the type annotations Flex Builder provides warnings, but these can be ignored or even suppressed. So you can save keystrokes if you wish whilst prototyping.

The main problem JavaScript has faced in the past is slow interpreters and incompatibilities across browsers. With ActionScript 3.0 Adobe has solved both of these problems. The Flash VM is ubiquitous, with 98% market penetration. The copy-down method lookup mechanism in AS3 provides vtable-like performance, whilst still retaining dynamic dispatch when needed. The history of the evolution of OO support in ActionScript and an explanation of the method dispatch mechanism in AS3 is provided here.

Flex/AS3 is definitely a web technology to look out for. It's nice to see the work of the original Self and Smalltalk researchers still living on, and taking its rightful place in the future of the web. Good ideas don't die, they just mutate :)

Sunday, July 13, 2008

Dynamic Languages - The FUD Continues...

Cedric Beust has been spreading his anti-dynamic language propaganda in a presentation that he made at JAZOON. I really find it strange that someone who is respected in the Java community would spend so much effort trying to discredit alternative languages. In an attempt to set the record straight for the "busy" Java developer I tried to enter the comment below. Unfortunately Cedric's blog wouldn't accept what it felt to be "questionable content", so I have posted my comment here instead:

Hi Cedric,

I didn't mention why I believe there is so much Fear, Uncertainty and Doubt when it comes to dynamic languages. Two reasons: Politics and Fear.

Dynamic languages have been hugely successful dating back to the 1950's so whether they are viable or not should be beyond debate.

So why are we debating?

The real issue is the right tool for the right job. The problem is that lots of programmers only know how to use one type of tool so aren't in a position to make an informed choice.

This ignorance-driven inability to choose leads to fear. The politics comes from proprietary languages (Java, C#), whose proponents have a vested self-interest in keeping us all fearful of alternatives.

I have been using dynamic languages (Smalltalk) since 1993 and Java since 1996, and I started out programming with Modula-2 and C in 1987. They all have their strengths and weaknesses and none of them are a silver bullet.

The simple truth is that for web applications dynamic approaches are massively more productive. Take a look at Seaside (Smalltalk), Grails (Groovy) or Rails (Ruby) and it's clear that Java has nothing to compare. The DSLs provided by these languages make web development a cinch. Productivity improvements of 2-3 times are not uncommon. This translates to a reduced time to market, and a better response to business needs.

So the real question is why are these languages excelling in this way? You seem never to address this issue, assuming that the people who choose to use these languages are somehow misguided or confused. Well, they've been misguided since 1956 and the advent of Lisp :) They choose dynamic languages because they value a higher level of expression, allowing them to achieve more with less. This doesn't only apply to the web, it applies to any scenario where a high level, domain specific language is applicable.

You advertise your talk as a guide for the busy Java developer, yet you do very little to educate him or allay his fears.

Let me:

1. Programming is hard and there are no Silver Bullets.

2. The biggest determining factor for success is the skill of the programmers.

3. Dynamic languages are different, requiring different skills and a different programming style.

4. If you take the time to master these skills then you are in a position to choose the right tool for the job: Either static or dynamic, or perhaps both.

5. With the right skills and the right tools you have a built-in competitive advantage. Human-centric computing applications call for higher level languages. Dynamic languages allow for the creation of higher level domain specific languages in a way that static languages don't.

The last point deserves to be backed up. Take a look at the Groovy HTML builder as an example and compare it with Java JSP. An even better (although more esoteric) example is Seaside in Smalltalk.
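
For instance, a few lines of Groovy along these lines (my own sketch, not taken from any particular tutorial) generate well-formed HTML with no template page at all:

import groovy.xml.MarkupBuilder

def writer = new StringWriter()
new MarkupBuilder(writer).html {
    head { title 'Hello' }
    body {
        h1 'Hello from a builder'
        p 'No JSP tags, no scriptlets - just nested closures.'
    }
}
println writer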

The domains where Java makes sense are shrinking. Given the performance of dynamic languages nowadays and their ability to interoperate with lower level, high performance system languages like C/C++, I see Java and C# being squeezed.

If you want productivity and a higher level domain specific language then Ruby, Groovy, Python etc. are a natural choice. If you are on the JVM or CLR then you can always fall back to Java or C# when you have to. If you are on a native VM then you can fall back to C/C++.

The right tool for the job, which may mean splitting the job in two (front-end, back-end) and using different tools for different parts. With messaging systems and SOA "splitting-the-job" is easy.

Dynamic languages will only get better, incorporating better Foreign Function Interfaces and better tooling support, in the same way Java did back in the late 90's. BTW adding type annotations is always an option if people think they are really needed, but like I say a sizeable community have thrived very well without them since the 1950's :)

Cedric. You do yourself no service by dressing up your prejudices as scientific fact. How about a balanced exposé?

Go on surprise me :)

Paul.

Friday, July 11, 2008

The Self Language

In my last post I said that I wasn't too impressed by Self's idea of using prototypes. Well, I've changed my mind. My initial complaint was a lack of structure. When you open the Self BareBones snapshot all you get is a graphical representation of a shell object and a waste bin object. You don't get to see all the classes in the image like you do with the Smalltalk Class Browser. There is no Class Browser in Self because there aren't any classes.

This doesn't mean there isn't structure though. If you look at the lobby object you will notice a slot called globals, one called traits and one called mixins. As I mentioned in my last post, traits are objects that are used to encapsulate shared behaviour (methods). Globals is a parent slot on the lobby; inside globals are all the prototypical objects. Each prototype object has a trait object. The prototype holds state whilst the trait holds behaviour, so between the two you have the same structure as a class. You create new objects by copying prototypes, which inherit shared behaviour through an associated trait object.

Since the traits slot is not a parent slot of the lobby, you must send the message 'traits' to access trait objects from the lobby. So 'traits list' gets you a reference to the list trait object, while 'list' gets you the list prototype. Why is the lobby important? Well, all new objects are created in the context of the lobby, so the lobby object acts like a global namespace.

My explanation makes it sound more awkward than it actually is in practice. The bottom line is that Self has a lot of structure, as much as Smalltalk in fact. The structure is just different and more granular. Working with this structure is actually very pleasant. You still think in terms of classes, but only after thinking about the object first. So with Self you create a prototypical instance of what you want, then you refactor it into common shared parts (traits) and an instance part (prototype).

The traits are more or less classes. Self objects support multiple parent slots, but by convention multiple inheritance is not used. Instead, there is usually one parent trait, and additional shared behaviour is achieved by adding mixins to extra parent slots.

I am beginning to agree with Dave Ungar that the Self way of thinking about objects is more natural and simpler. What convinced me is the ease with which objects can be created:

( | x <- 100. y <- 200 |)

Is an object literal which you can create at the shell and get an instant graphical representation of. The graphical representation of an object in Self is called an Outliner which is basically an editor that allows you to view, add and modify slots on the associated object. The Outliner also has an evaluator, where you can type in messages and have them sent to the target object.

So in Self you create objects by entering literal text, test them out by sending messages and extend them by adding new slots. This is all achieved with instant feedback. You then factor your objects into traits and prototypes by creating new objects and moving slots through drag-and-drop.
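
As a rough sketch (mine, not lifted from the Self snapshot, and assuming you have already created a 'traits point' object to hold the shared methods), the factored result ends up looking something like this, where the trailing asterisk marks the parent slot:

"the prototype: it holds the state, and inherits behaviour through its parent slot"
( | parent* = traits point. x <- 100. y <- 200 | )

New points are then made by sending 'copy' to this prototype; any message the copy doesn't understand itself is looked up in traits point.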

Is all this important? I'm not sure, but I think so. The fact that objects have a literal representation that includes their behaviour is quite interesting and I like the drag and drop refactoring. What I can say is that the Self approach is fun and feels more concrete, as if you are using real building blocks to create your program.

Would a Self approach lead to higher productivity? With a bunch of keyboard accelerators so that you didn't need to use the mouse much, I think so. To me feedback is king, and Self offers plenty of feedback. I think that the Self approach also leads to a more exploratory programming style, which in my opinion is a good thing. Above all, manipulating objects as if they are 'real' is a lot of fun, which has got to be worth something in itself :)

Friday, July 04, 2008

Objects revisited - Don't Generalise?

I've been playing with Self and it has got me thinking about why prototype based OO is not more prevalent. Generalising and categorising a bunch of things as all being the "same thing" is something we all do all the time. Yet we know that we shouldn't generalise this way, since each individual "thing" is unique :) I have come across this paper that takes a philosophical look at the difference between prototypes and classes. It concludes that ultimately prototypes are more expressive, but generalising into classes is "good enough" most of the time.

Just from a practical viewpoint I find classes much easier to work with thus far. This could be down to my far greater experience with classes than with prototypes. Classes impose structure, which I find aids comprehension. I need to play with Self some more, but at the moment I find myself translating Self's idea of a parent "trait object" into the more familiar concept of a class. Here is another paper that takes the opposite point of view, stating that prototypes are more useful on practical grounds.

The motivation for prototypes as I understand it was the fragile base class problem. Representational independence and mixins largely solve this problem. Bob Martin takes another slant on the idea of a brittle base class, by stating that base classes should be stable by design. So the fact that other classes depend heavily upon them should not cause a problem. Classes that change should not be base classes. So base classes should encapsulate stable policies.

One thing that is clear to me is that classification and classes can be viewed as an additional structure imposed upon an object (prototype) based environment. So prototypes are the more general mechanism. The Self image I have been playing with has an emulated Smalltalk environment built from prototypes. So based on this, prototypes are the more fundamental abstraction. Following this logic, Lisp with its multi-methods and macros (code as data) is also more fundamental and hence more expressive than a prototype based language.

So it all boils down to Lisp :) I guess what we have with OO is a language imposed structure that enforces the encapsulation of state. This structure reduces the gap between the base language (Lisp like) and the problem domain in many instances. So OO itself can be considered a domain specific language, where the domain is "the physical world". In many scenarios, classifying objects and sharing a common behaviour (class) object across a set of instance objects maps well to how we mostly think about the world, and hence is a "good enough" template with which to model our world in a number of useful ways. But we know from philosophy that classifications aren't concrete and are arbitrary to some degree. If we choose to apply this deeper appreciation of the world around us to our models then we must do away with some structure. To model a world where objects aren't constrained by classification, we can choose to use prototypical objects, allowing for object specific behaviour.

So there appears to be a trade-off between structure and expressiveness. It follows that we gain more flexibility and freedom of expression if we fall back to a language with a less rigid structure, like Lisp, where we are free to model the problem in any way we wish. The downside is that we now have to take more responsibility for structuring our program ourselves.

The bottom line is usefulness I think. For most use cases prototypes do not seem to provide any additional utility over classes. I'm curious to know whether there are uses where prototypes excel. From what I've seen so far, the Self demo could just as easily have been written in Smalltalk.

(An afterthought: Ruby also allows you to add behaviour to specific objects. This is not the same as Self, since a Ruby object must still have a class and doesn't have parent slots where it can inherit traits dynamically.)
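
For example (a trivial sketch of my own), Ruby lets you hang a method off one particular object as a singleton method:

greeter = Object.new
def greeter.hello   # defined on this one object only
  "hello"
end
greeter.hello       # => "hello"
Object.new.hello    # => NoMethodError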

Thursday, June 12, 2008

Newspeak - The Hopscotch IDE

In my last post on Newspeak, I ended with the conclusion that I needed to further my education. Well, I've broken the back of Haskell and pure functional programming so I feel confident commenting on Newspeak again.

First thing to say is that I feel silly for doubting Gilad and his team in the first place. There is a lot I could say about Newspeak, such as the benefits of using Newspeak in a pure functional style, with the associated ability to exploit multi-core processors, but I won't. There is now a paper on Newspeak and I am sure that many others will pick up on these salient points. No, the thing that has blown me away is this paper by Vassili Bykov on the Newspeak IDE interface called Hopscotch.

I'm still half way through the paper, but Vassili's style and approach remind me of when I first read the blue book by Adele Goldberg. In a world where we have all become penned in by our prior context, Vassili has the ability to see the user interface problem afresh, and has come up with an elegant approach which emphasises usability and productivity.

His argument is that the "form" metaphor that we have all become accustomed to is inherently modal and leads to a fragmented user interface. Instead Vassili chooses to borrow the document metaphor from the web, which he extends so that it is domain object aware. So in Hopscotch a class becomes a document containing other domain objects such as nested classes, methods etc., each of which has its own presentation. The IDE is aware of the structure of the underlying class, and orchestrates show/hide, and navigation between the sub components of each class. Vassili explains this a lot better in his paper.

My main reason for blogging is that I haven't read something that sounds so appealing, elegant and obviously right since I first read the Blue Book on the Smalltalk language. To be honest, I haven't felt so excited about software in a very long time indeed. I can't wait to get my hands on Hopscotch, which hopefully should not be too long now, since Newspeak will be open sourced. I have a very strong feeling that Newspeak is going to turn out to be something very special indeed.

Friday, May 09, 2008

Haskell - Academia goes Bowling

True to my promise, I've spent a little time looking into Haskell. I've read some intros on the web, I have even gone as far as to download HUGS, an interactive interpreter, and I have started to work through a tutorial. I'm still on the first chapter so it's still early days.

First impressions are that I like it. Haskell as you would expect is very mathematical, with terms like "currying", "list comprehensions" and "guards" to learn, so I had half made my mind up not to like it. Then I started to use HUGS and actually started writing some Haskell code. The code itself is quite readable, helped by prefix notation and type inferencing I think.

Keen to get to the point (and not wanting to have to learn a bunch of mathematical theories to do so), I sought out a real world example of a Haskell program. Thankfully I didn't have to look far. Ron Jeffries has produced a number of very good articles where he walks through his experiences coding up a Bowling Game program using TDD and a number of different languages. The Haskell example (produced by an academic) drew a lot of interest, especially since Ron spoke disparagingly about it.

The program uses pattern matching and recursion, avoiding loops. Ron's point was that the recursive implementation, although succinct, said nothing about the "tenness" of a game of bowling. For those who don't know, a game of bowling always consists of ten frames. This "tenness" of the problem wasn't expressed anywhere in the solution. A number of academics jumped in defending recursion and offering recursive solutions of their own, until one of them found that there was actually a bug in most of the implementations presented.

I'll leave the reader to follow the debate, but what all these academics had missed, and Ron with his practical instincts had sensed, is that the programmer is always the weak link. Any solution that doesn't fit with the way the programmer thinks about the problem will prove difficult for the programmer to verify in his own mind. Hence the missed bug, despite many "clever eyes" over the code. Ron's summary is that Haskell forces you to think the way it thinks (mathematically) rather than allowing you to express things the way you think.
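
To give a flavour of the style being argued over, here is a rough sketch of my own (not the code from Ron's articles); it counts the ten frames explicitly, so at least the "tenness" is visible:

score :: [Int] -> Int
score = go 10
  where
    go 0 _           = 0                                         -- ten frames scored; ignore any leftover bonus rolls
    go n (10 : rest) = 10 + sum (take 2 rest) + go (n - 1) rest  -- strike: add the next two rolls
    go n (a : b : rest)
      | a + b == 10  = 10 + sum (take 1 rest) + go (n - 1) rest  -- spare: add the next roll
      | otherwise    = a + b + go (n - 1) rest                   -- open frame
    go _ rolls       = sum rolls                                 -- incomplete game

A quick sanity check: score (replicate 12 10) gives 300 for a perfect game, and twenty gutter balls give 0.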

Gilad makes the same point about pure functional languages. In contrast, pure object orientated languages allow you to model the world the way it is. Haskell encourages you to create a stateless functional model, which by itself is useless for real world problems, but supposedly great for verification tools. Where your pure functional solution meets the real world, for things like user I/O, you must fall back to impure actions. Impure imperative behaviour can be isolated from the rest of your code using Monads. I don't yet fully understand Monads, but they appear to be a way of representing an underlying type (e.g. an Integer) by wrapping it in a new type (called a Monad) that provides additional values that transmit information about side effects. So for example "Nothing" (the absence of a value) is a valid value for a Maybe Monad, which can be used as a wrapper type for Integers. So functions that return Monads are able to communicate side effects to other calling functions (meaning that the side effect is no longer a "side effect" but a legitimate return value). I will blog more about Monads once I fully understand them.
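
As a small sketch of my current understanding (using the standard Maybe type from the Prelude; the function names are my own), the possibility of failure becomes part of the return type, and do-notation threads it through a chain of calls:

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing                -- the "failure" is just another return value
safeDiv x y = Just (x `div` y)

quarter :: Int -> Maybe Int
quarter n = do
    half <- safeDiv n 2              -- if either step is Nothing, the whole result is Nothing
    safeDiv half 2

So quarter 8 gives Just 2, with no exception lurking anywhere; a caller has to pattern match on Just or Nothing to get at the result.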

So where are all these great verification tools? With the exception of a type system there doesn't seem to be any built into the language itself. So you end up with a program that is hard for the programmer to check himself, and which can't be fully checked by the compiler. So what's the point?

As an academic exercise, Haskell is interesting and I guess it will float the boat of the mathematically inclined, as a practical tool though, it wouldn't be at the top of my list. I'm still going to play with it some more though, and I'll let you know if I change my mind.

Updated 11th May 2008
Corrected my description of Monads after feedback from the Haskell community. I found Daniel's comment on Monads particularly useful. It prompted me to read further and improve my own understanding. I don't pretend that my description of Monads is complete, and I would refer a reader interested in a complete explanation to look here.

Monday, May 05, 2008

Newspeak - Educating the Pilots

In my Man versus Machine post I decided to play devil's advocate and question Gilad's motives. Having seen Gilad's presentation on Newspeak, his motives are now clear to me, and as far as I understand his intent, I am in violent agreement. So why my initial discomfort?

Before I answer this I should say that I have always agreed 100% with Gilad's point on static. On his blog I questioned whether the elimination of static was just another way of reducing coupling, to which Gilad responded that yes, this is true, but that coupling is a general term and that he preferred to be specific. I think this is the root of my discomfort. By being specific, I fear that a lot of normal programmers will miss the point.

Most programmers do not understand the importance of coupling and cohesion, and most do not understand that objects are just one approach to reducing coupling and increasing cohesion amongst abstractions. Functional programming is yet another approach. The underlying problem is the same, and as old as computer science itself. In fact the whole static thing is merely a failure in abstraction. We have no way of declaring a suitable cohesive abstraction, so we make things static and have them coupled to everything. Just like dodgy globals in C when you can't think of a better way.

So what Gilad has done in Newspeak is to provide a new way of defining loosely coupled, cohesive abstractions that are larger than a single class, removing the need for dodgy globals (static). This isn't reducing the programmer's freedom, rather it is replacing a rather poor default mechanism with a more precise way of defining exactly what you want. If you wanted to you could still create "dodgy globals" in Newspeak, but you would need to define a "global" abstraction first. There is no default "global" name space in Newspeak, which is as it should be.

So this is the issue: new and improved mechanisms are fine, as long as the pilots understand why. With OO programming, lots of programmers understand the "what", but very few understand the "why". This is how languages like Java can claim to be Object Orientated whilst only delivering on half the story. It is also why many (most?) Java programmers are happy programming in a procedural style in blissful ignorance.

Smalltalk gave very little scope for misunderstanding, because there was only one way to do things. You were constrained to think in objects. Newspeak has this same property, and by replacing static with a better mechanism it will force developers to think about higher level abstractions and name spaces, rather than defaulting to static.


I guess it isn't Gilad's job, but there is a big education task needed if languages like Newspeak are ever to take off. Most people don't get the basic concepts behind OO and Smalltalk, whilst Gilad is ramping things up to the next level by borrowing from Self and Beta. My education is only partial. I have read about slots in Self and they make sense. I didn't think of them as a way of defining immutable functions though, or as a way of eliminating state. The elimination of state, like the elimination of static, is yet another way of reducing coupling. If nothing changes, then any dependent abstraction can't be adversely affected by unsuspected side effects. This intuitively makes sense.

So Gilad prefers:

identifier = letter, (letter | digit) star. (Immutable)

Over:

identifier ::= letter, (letter | digit) star. (Mutable)

In both examples identifier is a slot. This means that identifier is a method that answers the value of the given expression. There are no instance variables in Newspeak. In the first example identifier is immutable, in the second it is mutable. In this presentation at Lang.NET Gilad makes the point that if you remove the ::= operator from Newspeak, then Newspeak becomes a purely functional language with no state.

What are the implications of this? Well, this is where I hit the bounds of my education. I think I'm going to need to take a look at a purely functional language like Haskell and get to grips with Monads before I can comment further. Even with a vehicle as powerful as Newspeak, the limiting factor on success is still the pilot :)