Problems with repetitive source code or better: Don't repeat yourself
Posted on 23/4/07 by Tim Koschützki
Introduction
As PHP programmers, we collect, organize, maintain and use knowledge and information. We document that knowledge in specifications and draw on it when we write and test the source code of our applications.
Alas, our knowledge changes every day, and quickly too. When a client wants us to add, change or remove a feature, or when the government introduces a new rule that renders our program's functionality meaningless, our knowledge changes. When our knowledge changes, we have to reorganize. Most people think that maintaining a project starts when it is released. Actually, maintenance starts with the first word written about the project anywhere. As programmers we are in a constant state of maintaining our codebase. As project managers we are in a constant state of maintaining our documentation, plans, glossaries, etc.
Avoid knowledge duplication
When the project's requirements change, we have to alter all the code that is built on top of that knowledge. We will spend a lot of time finding all that code. We will also have to write tests that verify that everything we changed behaves in the new way we want it to.
According to the Pragmatic Programmers, the principle of avoiding duplicated knowledge is called the DRY principle ("Don't repeat yourself"). What does that mean? It means that every bit of knowledge in your system, be it in the source code, its documentation or any other document, should have a single representation.
The alternative is to write down the same bit of knowledge in multiple places. When something changes, you have to update every one of those places, or else your requirements list, project documentation or source code documentation goes out of date.
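To make the idea of a single representation concrete, here is a minimal sketch. It is written in Python for brevity, but the same structure carries over to PHP. The names `VAT_RATE`, `vat_amount`, `gross_price` and `invoice_line`, as well as the 19% rate, are assumptions made up for illustration. The point is only that the rate is written down once and everything else is derived from it.

```python
from decimal import Decimal

# Hypothetical example: the VAT rate is one piece of knowledge,
# so it is written down in exactly one place.
VAT_RATE = Decimal("0.19")  # assumed value, for illustration only


def vat_amount(net: Decimal) -> Decimal:
    """Return the VAT owed on a net price, rounded to cents."""
    return (net * VAT_RATE).quantize(Decimal("0.01"))


def gross_price(net: Decimal) -> Decimal:
    """Return the net price plus VAT, reusing vat_amount()."""
    return net + vat_amount(net)


def invoice_line(description: str, net: Decimal) -> str:
    """Format one invoice line; the percentage shown is read from VAT_RATE."""
    percent = int(VAT_RATE * 100)  # whole-number rates only in this sketch
    return f"{description}: {net} + {percent}% VAT = {gross_price(net)}"
```

If the rate changes, only `VAT_RATE` has to be edited. The calculation and the text printed on the invoice follow automatically, so neither can drift out of date.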
Possible problems with repetitive knowledge
Most problems caused by knowledge repetition belong to one or more of the following categories:
- Enforced repetition - Programmers think they have no choice. Circumstances seem to enforce the duplication.
- Undeliberate repetition - We programmers don't recognize that we repeat information.
- Repetition due to impatience - We get lazy and put repetitions in because the situation seems to get easier with them.
- Repetition due to many developers - Different developers from the same team (or different teams) repeat information.
- Repetition caused by deadlines - This often happens in PHP web development. A client needs the job done today, so we hack away without thinking about repetition.
Let's look at them in more detail.
Enforced repetition
Sometimes we seem to be forced to repeat information. Project standards might require us to produce documents about our source code, which of course will contain duplicate information. What can you do to keep changes manageable and avoid knowledge repetition?
- Use scripts and generators that generate the code and all the documents.
- Write comments that explain in abstract terms why the code does something the way it does. Don't repeat what the source code is already telling the reader.
- Put the information someone needs in order to use your code in your phpdoc comments, and keep details that a user of the code doesn't need to know in comments inside the implementation.
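The first point can be sketched as follows, again in Python as an illustration (in PHP, a tool such as phpDocumentor plays this role for phpdoc comments). The functions `slugify` and `excerpt` and the generator `api_reference` are hypothetical names invented for this example. The reference document is derived from the signatures and doc comments instead of being written by hand a second time.

```python
import inspect


def slugify(title: str) -> str:
    """Turn a post title into a URL-friendly slug."""
    return "-".join(title.lower().split())


def excerpt(text: str, limit: int = 40) -> str:
    """Return the first `limit` characters of a text."""
    return text[:limit]


def api_reference(functions) -> str:
    """Build a plain-text reference from signatures and doc comments."""
    lines = []
    for func in functions:
        # The signature and the docstring are read from the code itself,
        # so the generated document cannot disagree with it.
        lines.append(f"{func.__name__}{inspect.signature(func)}")
        lines.append(f"    {inspect.getdoc(func)}")
    return "\n".join(lines)


print(api_reference([slugify, excerpt]))
```

Because the document is regenerated whenever the code changes, the project standard is still met, but the knowledge lives in only one place.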
Undeliberate repetition
Everybody has a bad day now and then. We often struggle with other problems involving family, friends and whatnot. Don't let them get too close to you!
Programming is a demanding activity that requires a lot of concentration. Program only when you are alert, feel fine and are relaxed. Try to avoid repetition. When you cannot avoid it right now, document the repetition and fix it later.
Repetition due to impatience
That's what I find myself doing quite often. I want the project to be done now and don't abide by the DRY principle. This is the wrong way to go about it. Why? Because you will end up in a repetition mess in which it is impossible to find everything you need to change to remove the repetition. Projects often fail because of such messes: nobody, not even the project leader, knows what is going on or which documents are the ones the team should use. Fixing repetition afterwards costs precious time, and a lot of it. Avoid repeating information in the first place and you will be fine. Otherwise you may save a couple of minutes today at the risk of losing hours later. Code with discipline!
Repetition due to many developers
This is the most difficult form of repetition to detect and to fix. The only thing you can do about it, in my opinion, is to foster communication within the team and across teams. Ensure everybody is working on a different piece of the system to avoid conflicts. Set up discussion forums, Trac and whatnot to avoid the repetition. Regular meetings help, too. Fix any repetition as soon as you find it.
Repetition caused by deadlines
This is another very difficult one, and I must admit I am not really sure yet what the best way to handle it is. On the one hand, I agree that when the deadline is close you must make the code work. As long as it performs well, is secure and fulfills all requirements, you are fine. On the other hand, if the deadline is still a week away, you should code with repetition in mind. Avoid it at all costs, or else it could cost you many hours, which may in turn cost you the deadline.
Conclusion
Make reusability easy. Try to build up an environment that lets you reuse and alter information easily. If something is not easy, people will not use it. Think about it.