Sunday 30 March 2008

Defending my case: Stop documenting your code!

Here is my response to Peter's post "Do comment code!":

First off, I think it's okay to be critical; you should not take anything for granted, and of course we all have our own ways of programming. My post was more of a tip based on my own experience of working in a larger team with little or no code documentation.
"Commenting code isn’t as much about pinning down knowledge, as it is communication. And when it comes to communication, redundancy is almost never a problem :)"
Commenting code is communication, but you are communicating knowledge. The same knowledge is always communicated through the code. If you find it hard to "read" the code, then, in my own experience, the code is not well written and/or formatted. Make sure you have a coding policy stating how you should write code so that all code looks the same and can be easily "parsed by your brain". It's amazing how much easier it is to read code that is well written and formatted.

I'm not saying you should avoid all documentation, but documenting every class and function is redundant whenever the documentation doesn't state anything new. You should always strive for code that is easy to read before putting comments around everything you do.
"And another thing, without documentation, I might misunderstand the code I’m reading and think that I find defects or suboptimal solutions."
That's true. Of course you can have comments here saying "don't use this function this or that way", but perhaps there is another way of writing the function so that it can't be used incorrectly? If you write a method and document in its header that it should only be used if some requirements are met, then I guarantee you that the function will sooner or later be misused. Make sure you add requirement checks (for example assertions) at the beginning of the function that break the program, so that you crash early if someone makes such a mistake. But always strive to write code that is impossible to use incorrectly.
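As a minimal sketch of such a requirement check, assume a hypothetical Player class whose health must stay between 0 and 100. The class, the method names and the range are invented for illustration and do not come from the discussion above.

```cpp
#include <cassert>

// Hypothetical example: the requirement lives in the code, not in a header comment.
class Player
{
public:
    void SetHealth(int health)
    {
        // Crash early (in builds where assertions are enabled) if the
        // caller breaks the requirement 0 <= health <= 100.
        assert(health >= 0 && health <= 100);
        mHealth = health;
    }

    int GetHealth() const { return mHealth; }

private:
    int mHealth = 100;
};
```

Note that assert is compiled out when NDEBUG is defined, so this catches misuse during development rather than in a release build.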
"And then there’s the case with APIs that you don’t have the code to… Ok ok, that’s another story."
If you are shipping your code to another company, or to another group of programmers, then you always need to communicate what the code does, because the code you shipped is persistent and will not change. Thus, the comments you write will always be correct (provided they were correct when shipped). And if you don't ship your implementation along, then of course you need to document your code, but more importantly, document your project as a whole. Do this by describing how it should be used, what the subparts of the project do and how they communicate with each other. Focus on the big picture.
"But hey, you can’t possibly mean that I shouldn’t document my Assembly code, do you? Super haxxor might read and understand ASM as I understand English, but the vast majority don’t."
Well, if the super haxxor wrote the code, then he or she will in most cases be the only one reading it, and then you don't need documentation. If the assembly code is only a small part of a project written in another language (say C++), extract the assembly code and put it in its own function that describes what it does. By doing so, you will also make the code easier to test and validate! A better question might be: do you really need assembly code?
"Saying that documenting/commenting code is wrong, is saying ‘I don’t care if other people are able to read my code, and if they don’t understand they’re lazy and dumb’. And that’s lazy and dumb :-)"
I'm saying that if you write your code so that it follows a well-written coding policy, you will not have any problems reading the code. If you feel that some code you have written needs to be documented, stop for a second and think about it. Is there another way of formatting the code so that it's easier to read? What if I extract a part of this function as a new function and give that function a good name, would that help? What about using coding policies for variable naming conventions? Adding prefixes such as m_ for member, p_ for parameter and l_ for local will, for example, quickly let you identify the scope of a variable.
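Here is a small sketch of what that could look like, using the m_/p_/l_ prefixes and an extracted helper with a descriptive name. ScoreBoard and its methods are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <vector>

class ScoreBoard
{
public:
    void AddScore(int p_score) { m_scores.push_back(p_score); }

    // The extracted helper's name says what a comment would otherwise say.
    bool IsNewHighScore(int p_score) const
    {
        return p_score > HighestScore();
    }

private:
    // Returns 0 when no scores have been added yet.
    int HighestScore() const
    {
        int l_highest = 0;
        for (int l_score : m_scores)
        {
            if (l_score > l_highest)
                l_highest = l_score;
        }
        return l_highest;
    }

    std::vector<int> m_scores;
};
```

With the prefixes in place, every name in HighestScore tells you at a glance whether it is a member or a local.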

Here are some final notes from me:
  • If you are working on a larger project, then your comments will become outdated sooner or later. You must always be aware of the fact that any code you write now will probably be changed in the future (no code is perfect).
  • Always strive for as little code as possible. The more code you have, the more code you need to maintain. The same goes for comments.
  • Write comments in unclear situations, but don't write trivial comments. Instead, define a good coding policy and start writing "easy to read" code!

Thursday 27 March 2008

Twitter and Counter Posts

Gustaf recommended a site called Twitter to me, which seems to be a great way of finding out what your friends are doing right now. It's like a mini-blog where each post is at most 140 characters long. I have only just tried it out, but I recommend that you all test it and find out whether you like it. Hanselman wrote a nice blog post on this Twitter thing; read it at:

http://www.hanselman.com/blog/TwitterTheUselessfulnessOfMicroblogging.aspx

And also, if you haven't read the comments on my last post: Peter posted a comment, which he also published as a "counter-post" on his own blog, describing why you should comment code. There are some nice arguments, so please read it. Of course I'm planning a "counter-counter-post" on my blog, but you'll have to wait a few more days before I release it to the public :)

Tuesday 18 March 2008

Stop documenting your code!

I've been reading an interesting book for a while now, The Pragmatic Programmer by Andrew Hunt and David Thomas. I was actually planning to blog about it only after finishing it, but my mind is going crazy about some of the stuff I've read recently, so I feel the need to blog about it now!

The book is based on "tips" that can easily be memorized and used again later when you put your programming hat on. One of these tips, which especially caught my attention, and which is going to form this post, describes the following principle:
DRY - Don't Repeat Yourself
This basically means that you should not repeat knowledge. The book further states that:
"Every piece of knowledge must have a single, unambiguous, authoritative representation within a system."
Well, you say, this is trivial: you should not write two pieces of code that do the same type of work. If you, for example, have two very similar functions, then you refactor and extract the common code into a new function. If you have two classes representing two different branches of some common structure, then you extract what is common, put it in a new base class, and let the two classes inherit from this new base class. But this you probably already know, so I will not bore you with it any more.

Instead, think about the following for a second: being a programmer, does repeated knowledge only apply to code? What about comments? What if you comment a function, doesn't this mean you repeat knowledge? For example, look at the following code:

// Sets the name of the player
// @param name The new name of the player to set
void SetPlayerName(string name)
{
    mPlayerName = name;
}

This is obviously repeated knowledge, right? Both the name of the function and the implementation perfectly describe what it does. Is there really a need for documentation here? I know from experience that in many cases you write comments like this just because you are supposed to write something.

So what about a more difficult case:

// Returns the name of the currently loaded level
// or "" if none is loaded.
string GetCurrentLevelName();

If the comment didn't state that "" is returned when no level is loaded, then you could actually think it should throw an exception when no level is loaded, or that you would get the name of the previously loaded level (or if you are using C# or Java that null should be returned). But once again, knowledge is repeated. Just look up the implementation and it will state exactly what it does.

string GetCurrentLevelName()
{
    if (mCurrentLevel != nullptr)
        return mCurrentLevel->mName;
    else
        return "";
}

Fact is, the implementation is the only "valid" documentation, because it's the only one you know for sure is true. How many times haven't you written a function, documented it, and then later in the development process realized that the function needs to be changed? You change the code but are a bit too lazy to update the documentation. Suddenly you have invalid knowledge! Isn't it better, then, just to remove the comment and let anyone who wants to know what the function does open up the implementation and actually read the code?

If you find it too hard to work out what a function does, you probably need to refactor it by giving it a more descriptive name, extracting larger independent blocks of statements, renaming your variables (members, parameters and locals) to something more descriptive (i.e. don't use "temp"!) and so on. If this doesn't help, then either you are bad at reading others' code (practicing that is a good way to become a better programmer, and I hope you know how to read your own code!? :) ) or the code you are looking at is really bad and should be removed as quickly as possible. You don't want to maintain such code.

Reading code should be like reading a book! You don't want the author of a book to add lots of comments everywhere, describing everything a second time in other words. The same goes for writing code: you should not add comments repeating what you have already written in another language.

So, my tip to you is this: refactor your code, and remove those comments!

Wednesday 12 March 2008

Gimpel Software Bug of the Month

Finally someone made games for us programmers ;) Check it out!

Gimpel Software Bug of the Month

And if you find them to be a bit too trivial, here is a harder case (actually the post where I first found out about this "Bug of the Month" site):

Visual C++ Team Blog : Diagnosing Hidden ODR Violations in Visual C++ (and fixing LNK2022)

That one was really tricky (I couldn't solve it), but it was fun to read!

Have fun :)

Saturday 8 March 2008

C++ vs. C#

For those of you who think that C# is the master language: why is it, really? C++ has been around for almost 30 years and is still considered the one language by many developers. You might say that those developers are a bit out of date or conservative, only sticking to the one language they learned several years ago, but the thing is, I really think C++ has some interesting features that other languages such as C# don't support.

Memory Control
For one thing, C++ gives you total memory control. You can allocate memory whenever you want to and then use it any way you like. Then, when you no longer need the memory, you release it manually. This of course means that it's easy to shoot yourself in the foot and forget to release the allocated memory, but also that you aren't required to have a garbage collector running in a background thread, taking up important CPU power (if the CPU power is important). Also, a garbage collector can only release an instance when nothing reachable in your program refers to it any longer. (Circular references alone are not the problem here: a tracing collector such as the one in .NET reclaims unreachable cycles, and it is reference counting that cannot.)

The garbage collector thus cannot remove something on demand. For example, at some point in your program you might know for sure that "this instance is not going to be used any more; if it is, then there is a bug", but you don't know for sure that there are no references left to it (perhaps you put it in a list somewhere and forgot to remove it?). The garbage collector then cannot remove it, and your memory consumption grows larger and larger. After running your program for some time you will notice that the memory usage has become large, and because you have no control over memory you cannot simply search the code for a missing "delete" statement (which you can in C++, or rather, search for a new with a missing delete). And because the garbage collector can clean up the memory whenever it wants to, you cannot rely on the destructor being called at an appropriate time. C# (or Microsoft) thus recommends that you don't rely on destructors, but rather implement a method such as void Dispose(), which you must call manually to dispose of any unmanaged resources held by that object (you can actually use destructors, but what if that destructor adds a reference to the object somewhere? Suddenly the object is alive again and shouldn't be destructed...). But if you forget to call Dispose in your code, you can easily lock up resources and get unpredictable behavior.

In C++ you have better control over whether to put something on the heap or on the stack. If it is on the stack, you know that the destructor will be called automatically when leaving the scope, something that is not possible in C#. In C# you instead need to implement the Dispose method and then create the instance in a using block, which automatically calls Dispose when the block is left. Short of writing try/finally by hand, there is no other way to do this in C#. If you, for example, have a lock helper class that locks a given object when constructed and releases it when destructed, you could write code like this in C++:
void doDangerousStuff(Object &a, Object &b)
{
    LockScope lockScopeA(a);
    LockScope lockScopeB(b);

    // Do dangerous stuff on a and b
}
In C# you could write this:
void doDangerousStuff(Object a, Object b)
{
    LockScope lockScopeA = new LockScope(a);
    LockScope lockScopeB = new LockScope(b);

    // Do dangerous stuff on a and b

    lockScopeA.Dispose();
    lockScopeB.Dispose();
}
But what if you got an exception in the middle of the code? Then the Dispose() method would not be called, so instead you should use the using keyword:
void doDangerousStuff(Object a, Object b)
{
    using (LockScope lockScopeA = new LockScope(a))
    {
        using (LockScope lockScopeB = new LockScope(b))
        {
            // Do dangerous stuff on a and b
        }
    }
}
But this doesn't look good, and if you need to add locks on more variables, the code just grows further and further to the right, which at least in my opinion is something you should avoid to the max. (You can stack several using statements above a single block to keep the indentation down, but you still have to write one for every resource.) C# thus doesn't give you a good solution to this problem.

Actually, in C# you can control whether to put something on the heap or on the stack. Classes are always put on the heap when allocated, while structs used as local variables are stored on the stack. This means that it should theoretically be possible to call the destructor when an instance of a struct leaves the scope, but unfortunately C# doesn't allow you to add destructors to structs, so no luck there :(
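For reference, here is a minimal sketch of what the LockScope helper from the C++ example above could look like. It is only an illustration under stated assumptions: Object is a stand-in with a plain bool flag instead of a real mutex (so it is not thread-safe), and the p_fail parameter is made up to show what happens when an exception is thrown in the middle of the function.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in for whatever is being locked; a real program would wrap a mutex.
struct Object
{
    bool locked = false;
};

// Locks the object when constructed and releases it when destructed.
class LockScope
{
public:
    explicit LockScope(Object &object) : mObject(object)
    {
        assert(!mObject.locked);
        mObject.locked = true;
    }

    ~LockScope() { mObject.locked = false; }

    // Copying a lock holder makes no sense, so forbid it.
    LockScope(const LockScope &) = delete;
    LockScope &operator=(const LockScope &) = delete;

private:
    Object &mObject;
};

void doDangerousStuff(Object &a, Object &b, bool p_fail)
{
    LockScope lockScopeA(a);
    LockScope lockScopeB(b);

    // Do dangerous stuff on a and b
    if (p_fail)
        throw std::runtime_error("something went wrong");
}   // both destructors run here, also when the exception is thrown
```

Because the release happens in the destructor, both objects are unlocked again whether the function returns normally or leaves through the exception. For real mutexes, the standard library's std::lock_guard follows the same pattern.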

Headers and Linkage Errors
Another positive thing about C++, which some of you might find confusing, is the possibility to declare a function in a header but then never implement it. This lets you design a class without actually giving an implementation. If you then try to build a program that uses those functions, you will not get any compiler errors but linker errors instead. The message is "the code looks good, but I couldn't find the definition of these functions". You cannot do this in C#. Instead you need to add a method with an empty body, and in that body throw an exception or assert false. Thus, instead of getting linker errors, you get runtime errors.
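A small sketch of this, using a hypothetical Level class (the class and its methods are invented for the example): Load is declared but never defined, and the program still builds as long as nothing calls it.

```cpp
#include <cassert>
#include <string>

class Level
{
public:
    std::string GetName() const;         // declared here, defined below
    void Load(const std::string &path);  // declared only, not implemented yet
};

std::string Level::GetName() const
{
    return "untitled";
}

std::string Describe(const Level &level)
{
    // This compiles and links, because only GetName() is used.
    // A call to Load("...") would still compile, but the linker would
    // then report an unresolved symbol for Level::Load.
    return "Level: " + level.GetName();
}
```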

The Const Keyword
I just can't figure out why languages such as Java and C# don't support a const modifier. In C++ you can declare a class method const, meaning it can't modify the state of the object. This is very useful if you, for example, want to pass an object by reference to a function but want to make sure it isn't modified. You then mark the parameter as const, and the function can only call those methods of the object that are marked const. The state of the object will thus remain unchanged.
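As a short illustration, here is a sketch with a hypothetical Player class and a free function NameLength, both invented for this example. The function takes the player by const reference and can therefore only call the const method.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class Player
{
public:
    // const method: promises not to modify the object.
    const std::string &GetName() const { return mName; }

    // Non-const method: changes the state of the object.
    void SetName(const std::string &name) { mName = name; }

private:
    std::string mName = "unnamed";
};

std::size_t NameLength(const Player &player)
{
    // player.SetName("x");  // would not compile: SetName is not const
    return player.GetName().size();
}
```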

In Java and C# this is not possible. In Java you can at most (at least in my experience) pass primitive types such as int, byte and float, which are passed by value (as copies), so any change to these values inside the function only affects the copy. In C# you can do something similar, and there even structs are passed by value, which guarantees that the caller's object is not changed. This, though, would require you to rewrite as a struct every class you want to be able to pass to a function without it being modified, which of course isn't a very pleasing solution. Quite the opposite, actually, as it also forces you to copy the object every time you want to use it in another function.

To get around this (at least this is what I think), Java and C# have made some classes immutable. One such class is the String class. If you have ever examined this class you might have realized that there is no way to modify it; it is fixed. To change it you actually have to instantiate a new string that holds the changed value. So functions taking strings as parameters cannot modify them, even though they are passed by reference. But is this the solution you want for your big universal state that you want to pass to a function which should not modify it?

Macros
One last thing to mention about C++, which is also not supported by C#, is macros. A macro can totally destroy your code, but it can also let you write complicated, beautiful code in only a few lines. If you, for example, have lots of classes that you want to register with a factory, instead of writing:
factory.RegisterClass("MyClassA", &MyClassA::Constructor);
factory.RegisterClass("MyClassB", &MyClassB::Constructor);
you can define a macro and simply write
#define REGISTER(class) factory.RegisterClass(#class, &class::Constructor)
REGISTER(MyClassA);
REGISTER(MyClassB);
Isn't that cool? You can actually do a lot more complicated stuff if you want to, and even generate default class implementations this way. With macros you can do anything! The closest thing to macros you have in C# is reflection. With reflection you can for example extract the name of a class dynamically, and get a function from a class based on a string (the name of that function), but if that class hasn't implemented that function you will instead get a runtime error. In C++ this error will appear when compiling!
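To make the registration macro concrete, here is a self-contained sketch. The Base, Factory and MakeFactory names, the Create method and the use of std::unique_ptr are all assumptions added for the example; only RegisterClass, Constructor, MyClassA and MyClassB come from the snippet above. The macro here also takes the factory as an argument and names its class parameter cls, which is a small variation on the one-argument version.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct Base
{
    virtual ~Base() {}
    virtual std::string Name() const = 0;
};

struct MyClassA : Base
{
    static std::unique_ptr<Base> Constructor() { return std::unique_ptr<Base>(new MyClassA); }
    std::string Name() const override { return "A"; }
};

struct MyClassB : Base
{
    static std::unique_ptr<Base> Constructor() { return std::unique_ptr<Base>(new MyClassB); }
    std::string Name() const override { return "B"; }
};

class Factory
{
public:
    typedef std::unique_ptr<Base> (*ConstructorFn)();

    void RegisterClass(const std::string &name, ConstructorFn constructor)
    {
        mConstructors[name] = constructor;
    }

    // Returns an empty pointer if no class was registered under that name.
    std::unique_ptr<Base> Create(const std::string &name) const
    {
        std::map<std::string, ConstructorFn>::const_iterator it = mConstructors.find(name);
        if (it == mConstructors.end())
            return std::unique_ptr<Base>();
        return it->second();
    }

private:
    std::map<std::string, ConstructorFn> mConstructors;
};

// #cls turns the class name into a string literal; cls::Constructor must exist,
// otherwise the compiler (not the running program) reports the error.
#define REGISTER(factory, cls) (factory).RegisterClass(#cls, &cls::Constructor)

Factory MakeFactory()
{
    Factory factory;
    REGISTER(factory, MyClassA);
    REGISTER(factory, MyClassB);
    return factory;
}
```

Registering a class that lacks a static Constructor function fails at compile time, which is exactly the difference to the reflection approach described above.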

Why C# Still is Cool :)
But to cross to the other side of the street, C# isn't that bad after all. Compiling C++ code will give you a headache compared to compiling C# code, which only takes a few seconds. Also, running C# code isn't as slow as you might think. In some cases you will actually find that your code is equally fast in C# as in C++! And C# has lots of cool features not supported at all by C++ (and many other languages), such as properties, attributes, partial classes, yield and lock statements and the null-coalescing operator ??, just to mention a few. And with C# 3.0 (released together with .NET 3.5) you get support for even more nice stuff, such as lambda expressions, LINQ and extension methods.

Actually, the reason why I'm not writing a post on why C# is better than C++ is that it's so trivial :) C# has so many things C++ misses out on, but I just felt I had to write this article so that people don't think that C++ is nonsense. Because it isn't! At least in the gaming industry, C++ is still the leading language, the reasons probably being "you have more control" and "it's fast"! And around 2009, the next version of C++, called C++0x, is planned to ship, which hopefully will give the language a boost to survive the future!