The Internet is full of discussions about the importance of OpenSource: OpenSource code is available to everyone, lets you learn, lets you (depending on the license) modify and adapt the software to your needs and, if the code is Free Software, costs you almost nothing. In this post I want to point out another important reason: research activity. I don't mean the kind of research you can do with OpenSource code, but the quality of the code you produce during your research. It seems to me that almost every research activity produces code of very poor quality and shelters that code under the "prototype" umbrella. Most of the time, producing a prototype makes researchers feel that they must deliver nothing more than a proof of concept. That is true, but it leads to a great waste of resources and effort: if the proof of concept works, it needs refinement and rewriting before it goes to production; if it does not work, well, it must be rewritten anyway. Since the proof of concept (the prototype) is kept private and usually does not leave the laboratory, it is allowed to be a mess of code. Once the code is distributed outside the laboratory, it becomes progressively cleaner and more polished. Staying confined within the research lab means that the prototype has no users. This also means that the XP (Extreme Programming) paradigm cannot be applied, or that it makes no sense to apply it: there is no need to release often, and no need to keep the code clean and the repository always stable (or at least free of code that does not compile).
Why, and how, can OpenSource help researchers produce good-quality code? First of all, releasing a project or a prototype publicly means you could have users, and users can help you (or bother you) by finding problems, suggesting patches, and sometimes doing some of the work on your behalf. Most notably, releasing the code means you have to keep it documented, clean, and at least easy to compile and run. Releasing a product as OpenSource means you are signing a contract with your (possible) users, and you are expected to keep the project on track. This does not mean you have to lead the project forever, but it does mean that the project can live forever, surviving problems.
So why don't researchers release their code as OpenSource? Well, most of them do, but a lot of projects die behind laboratory walls. The main reason is that researchers are afraid someone else could steal their brilliant ideas by getting access to the implementation. This reveals two problems:
  1. the idea should be a model, not an implementation. The implementation is language and architecture dependent, and every good developer can write it;
  2. researchers do not know OpenSource licenses, which can protect and guard their work and ideas.
Just as an example: I started studying and working on JikesRVM before version 2, when it was not yet a full OpenSource project. At that time the code was essentially undocumented, organized in a single directory (for code cleanup issues), and it was difficult to understand the internals without digging through the code. Now that the project is hosted as OpenSource, its code structure is well organized, the documentation really helps new users, and the list of contributors and users has grown along with the list of features. As a counter-example, I worked on a research project about Ambient Intelligence called LAICA. In that project I saw many commercial partners producing very low-quality code, and the reason, as I perceived it, was that the whole project only had to be a proof of concept; if it proved to be good, further funds would become available to support its rewriting for production.

So, when you are starting a new research project, ask yourself whether you can release at least part of it as OpenSource and, if so, do it. I believe it's really worth it.

The article Throw your code away....or release it as OpenSource! has been posted by Luca Ferrari on February 18, 2010