Last updated on Mon May 31 11:12:28 CEST 2004
Lessons from ICFP 2003
Precisely because development during ICFP is so fast-paced, problems that would normally take a long time to surface in "real life" projects are exposed within the very first hours. This document summarizes the troubles faced by the Ruby team and proposes some corrections to apply in the next edition of the ICFP.

Work organization

Insufficient preparation

  • we had no significant communication before the beginning of the contest
  • no knowledge about each other
    • strong points, skills and available tools
    • commitment
  • lack of training in the tools we would use to interact

Rules of the game

These were defined implicitly by the dynamics of the group. Stating them explicitly at the beginning could have saved time in some situations. For instance, more calls for help could have been made had it been clear that they were allowed (or even encouraged). As things developed, people were not confident in their "moral authority" to make them. Another scenario is source-tree ownership: it would have been nice to know whether it was OK to claim "ownership" (exclusive modification rights) of a module (in the broad sense, meaning quite a big sub-system); we only adopted de facto rules in that regard (see Granularity).

Unless the rules of the game are clarified up-front, newly arrived team members have to guess them through interaction with the pre-existing "group core", which takes valuable time. Another side effect is that this can hinder collaboration with newcomers, as they remain second-class members until they have stayed around for some time and understood the dynamics of the group.

Lack of "whole picture"

  • no long-term planning: decisions were postponed again and again (some key decisions were taken only 4 hours before the deadline!)
  • lack of a vision to drive efforts: nobody could see where his ideas would fit into the whole picture (as it didn’t exist)

These proved to be the most important factors in our failure to submit a solution within the 72H time-frame, as the consequences were grave and persistent:

  • little knowledge of the code base (even at an abstract level). We were surprised to discover new functionality in CVS only after the fact.
  • duplication of effort: as a consequence of the above, we replicated each other’s work several times (see Communication Problems)

As we didn’t have a driving vision, it was often impossible to see the real importance of one’s work in the final solution: we didn’t know if whatever we were doing was critical for the success of the project and supported by the whole team, or only a side-development of little importance and mostly ignored by everybody else.

Wrong estimations

  • we started things we could not finish: we didn’t evaluate our skills properly
  • we were unable to estimate the resources (mainly time) required for the different tasks

Teaming

Communication problems

Feedback

We didn’t give feedback to the rest of the team about the work already done:

  • time needed
  • problems found

This information would have been valuable even within the 72H period, as several modules could pose similar problems; moreover, past experience would have allowed more precise estimates of future performance.

Project status

Moreover, we didn’t know much about what was precisely going on at any moment:

  • no information on the work being performed in other packages, other than what we could catch "on the fly" on IRC
  • other people’s activity was mostly unknown. If somebody didn’t answer on IRC, it could equally mean that he had left ICFP for a break, was sleeping, had abandoned the contest completely or was in deep hack mode. In the latter case, we didn’t know whether we could expect working code on his return, or whether it was only exploratory work.

Lexicon

Several misunderstandings were caused by the lack of a proper glossary. By the third day, it became clear that we had spent a long time explaining to each other several ideas that in the end were mostly equivalent. The lack of a common vocabulary was key in this replication of our mental processes: had we been able to really explain things the first time, we wouldn’t have kept beating a dead horse.

Different needs

The channels used proved to be insufficient for the diverse communication needs:

  • general discussion on infrastructure and vision (many to many)
  • quick technical questions (one to many plus return)
  • sub-teams, low-level coordination (see Granularity) (normally point to point)
  • announcements (new functionality or test cases implemented in CVS)
  • "virtual blackboard"
  • small talk to relax every once in a while

Logs

IRC either couldn’t provide the required functionality or was used ineffectively: almost all communication took place in a single channel, and it was impossible to tell the different kinds of messages apart. For example, low-level coordination would need exclusive attention to IRC, whereas technical questions could normally be skipped safely: somebody would answer anyway, or the question would be repeated if the need persisted. This is very different from what would happen with essential discussions on the big picture, which could easily get lost in the IRC logs. Moreover, nobody would read the logs, as

  • they were not released right away
  • only a small percentage of the whole content was important

Sub-teaming

Every once in a while, several teammates would feel the need to work more closely, effectively forming a sub-group. In the Ruby team, such groups had the following characteristics:

  • completely ad-hoc
  • established on-the-fly via IRC
  • little consideration of skills and resources

We basically assumed that everybody was equally capable at everything, which of course could not be the case.

Some of the consequences were:

  • time lost when creating sub-teams
  • criteria such as time-zone closeness not taken into account
  • inefficient usage of our skill set

Work package granularity

We failed to see that a class/module/file is not (in this case) the right granularity level to achieve parallel development.

After some time, we broadly settled on the following de facto standard:

  • work was done at the file level
  • only one person could edit a file at a time, but this wasn’t rigidly enforced, and CVS gave us no support for it beyond conflict resolution
  • if changes in one file propagated to another, the necessary modifications would be explained to the current "owner" of the destination file so he could apply them

This was extremely costly, as

  • several people were involved in such operations
  • as we used a single IRC channel for the required internal communications we had to choose between tracking IRC closely (and wasting time in context switches) or ignoring it and potentially missing important news.

Technicalities

Incorrect use of prototypes

Some people had different expectations about some parts of the code: whereas some saw them as prototypes to be thrown away later, others believed them to be the real thing. This meant prototypes would stay beyond their normal lifetime.

Over-engineering

On the first day of the contest we went through (too) long talks to get the first abstractions right. However, they later proved to be both overkill and not very useful, since the abstractions could only be validated by later ideas on how to actually solve the problem.

Instead of first thinking about what would come next (in some detail), we tried to get things "right" up-front. This entailed constant class hierarchy redesign, moving files around, modifying interfaces… These were extremely costly operations resource-wise: they took a long time and blocked most people, as everybody had something to say about the basis of the system (see Granularity).

Moving interfaces

Despite the great care we took to get the interfaces right the first time, they kept changing and causing several small glitches (this is related to Over-engineering).

We didn’t have proper integration tests or any system to ensure that interface changes were broadcast to all team members (see Communication Problems).

Excess of cleverness

We explored methods that were far too complicated, instead of sticking to a simple idea that could be refined further if time allowed.


Lessons

Most suggested practices are seemingly obvious but they won’t happen spontaneously if no care is taken.

Preparation

  • teammates should know each other before ICFP. Valuable information includes
    • commitment level
    • skills
    • timezone and possible working hours
    • available computing resources
  • core members have to reach consensus on
    • tools used
    • the rules of the game

Structure

Perhaps some organizational structure could be helpful. It’s always better to define it explicitly than to let it emerge de facto; this would at least address the "moral authority" issue.

Code ownership

One of the most important rules of the game is the policy on code ownership. Allowing people to claim ownership of a whole subsystem (so that interface changes can be propagated by a single person) could prove very useful. This "ownership" should be enforced by the supporting software or effectively communicated through a channel reserved for that purpose.
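
As an illustration of what "enforced by the supporting software" could look like, here is a minimal sketch of an ownership check that a commit hook or a small wrapper script might call before accepting a change. It is not something the team used: the table `OWNERS`, the path prefixes, the nicknames and the function names `owner_of` and `may_modify` are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical ownership table: subsystem path prefix -> owner's nickname.
# Both the prefixes and the nicknames are invented for illustration.
OWNERS: Dict[str, str] = {
    "simulator/": "alice",
    "planner/": "bob",
}


def owner_of(path: str, owners: Dict[str, str] = OWNERS) -> Optional[str]:
    """Return the owner of the longest matching prefix, or None if unowned."""
    matches = [prefix for prefix in owners if path.startswith(prefix)]
    if not matches:
        return None
    return owners[max(matches, key=len)]


def may_modify(user: str, path: str, owners: Dict[str, str] = OWNERS) -> bool:
    """Unowned files are open to everybody; owned files only to their owner."""
    owner = owner_of(path, owners)
    return owner is None or owner == user
```

Keeping the table at the subsystem level rather than the file level matches the coarse granularity argued for in this document: an interface change inside an owned subsystem is then propagated by a single person.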

Strategy

A short discussion immediately after the release of the task sheet amongst all the core members is essential to ensure that

  • no member starts out in a worse position to participate in the initial decision process because of joining the project too late
  • work division can be performed (although on a very coarse level, see Granularity) taking into account the priority of the tasks and time-zone considerations

At the beginning, time is better spent discussing the long-term direction than getting the first abstractions right. Experience shows that they won’t be perfect anyway. A common glossary can prove very useful when different solutions are being discussed.

Work should be divided at a coarser level than just classes/modules. Different work packages should be as independent as possible; in the case of the ICFP 2003, several sub-projects are easily identified and could have been developed in parallel with little need for communication.

With its 72H deadline, the ICFP doesn’t encourage the use of prototypes: you hardly have enough time to do things once, let alone twice. On the other hand, tracing bullets (as described by the Pragmatic Programmers) seem to match the requirements of such a contest perfectly: a preliminary working version of the system is quickly assembled (from this point on there should always be something to submit) and then refined as much as possible.

It isn’t possible to try several diverging, completely unrelated solutions; the main path should be chosen as soon as possible (say, within the first 18H), and if possible it should allow variations later. By all means, everybody should know where the project is heading and understand the chosen solution by the second day of ICFP.

Tools

Tools should effectively support and/or enforce a number of policies explicitly stated at the beginning of the project and allow efficient communication, both active and passive (current status, most important interface changes, etc). Possible technical solutions include

  • several IRC channels for the different communication needs, supplemented by a tool with good support for push information exchange. For instance, every module could have its own channel to notify progress or important design decisions.
  • instant messaging for tight communication in sub-teams and for notifying the status of the developers
  • a configuration management system with support for file reorganizations (for instance Subversion instead of CVS)
  • conferencing tools with graphical capabilities
  • a Wiki to store the glossary and post explanations of the different solutions to a problem.
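
The "one channel per communication need" idea can be made concrete with a small routing table. The sketch below is only an illustration: the channel names and the helper `channel_for` are hypothetical, and the "virtual blackboard" need is left out because it calls for a different kind of tool.

```python
from enum import Enum


class Need(Enum):
    """Communication needs listed in this document."""
    VISION = "general discussion on infrastructure and vision"
    QUESTION = "quick technical questions"
    SUBTEAM = "sub-team, low-level coordination"
    ANNOUNCE = "announcements"
    SMALL_TALK = "small talk"


# Hypothetical channel names; only the one-channel-per-need idea matters.
CHANNELS = {
    Need.VISION: "#team-vision",
    Need.QUESTION: "#team-help",
    Need.ANNOUNCE: "#team-announce",
    Need.SMALL_TALK: "#team-lounge",
}


def channel_for(need: Need, module: str = "") -> str:
    """Pick the channel for a message; sub-team talk goes to a per-module channel."""
    if need is Need.SUBTEAM:
        if not module:
            raise ValueError("sub-team coordination needs a module name")
        return f"#team-{module}"
    return CHANNELS[need]
```

With such a split, somebody in deep hack mode could follow only the announcement channel and the channel of his own module, and safely ignore the rest.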
 

Copyright © MJFP
batsman dot geo at yahoo dot com