Saturday, August 28, 2010

Software Quality (4)

In my previous post, I started a quick pass through the most common methods for ensuring quality at the lowest level: the day-to-day production activities. My assumption is that, if we are able to maintain a high level of quality throughout the entire development process, we will end up with a better, easier-to-maintain product, sustainable over several generations, and, as a side effect, a happier team.

We need to find cost-effective, lightweight processes that maximize ROI. These processes should, ideally, not increase the number of collateral activities required but, instead, provide some easy-to-implement practices, acknowledged by everyone as valuable. Only through consensus, coupled with self-discipline, will they be maintained when the pressure sets in. I am talking about a way of working that should ensure:

  • A better understanding of the quality level; clear visibility for everyone into what has been done correctly and what the current issues are
  • Predictable results
  • Increased return on investment (see footnote 1)

A healthy development process should also have positive side effects:

  • Shorter develop-prototype-test cycles (2)
  • Better spreading of knowledge throughout the team
  • Increased ownership and pride
  • Friendlier environment
  • A better distribution of effort throughout the entire length of the project

Some practices are team-wide, while others are linked to self-management and can be employed even without a team-wide, acknowledged process (3).

Working outside the product code - a workbench for rapid iterations:

Few things are as annoying as frequent interruptions or uncovering poor-quality work that you have to deal with - like digging through messy code that has side effects. These usually appear in a large code base that has high coupling between components or that starts very slowly. These two symptoms work hand in hand and are, I think, the two biggest impediments to constant refactoring and team morale (starting slowly leads to fewer runs, which means less testing and fewer recompilations, which, in turn, means that programmers won't spend time improving existing code and will only add new layers on top of it). On the other hand, small applications have a very short modify-run-test cycle that allows them to mature quickly and with fewer bugs. However, this advantage is usually lost over time, as the code and data get bigger and bigger.

Is it possible to maintain the advantages of extremely short iterations over time? One solution that can often be applied is to isolate development from the main product, something that is easier to do especially when coding a new feature from scratch. The main goal is to eliminate as many interruptions as possible by initially constructing a sandbox - sketched in code right after the list - that (9):

  • Compiles and starts fast
  • Allows you to write dirty modifications and test them incrementally
  • Allows you to constantly refactor what has been written, without impacting the rest of the code
  • Allows you to focus only on new code, without worrying about other changes
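
Here is a minimal sketch of what such a sandbox could look like - a tiny console harness that links only the new code, so it compiles and starts in seconds (SmoothSamples is a hypothetical stand-in for the feature under development):

    // sandbox_main.cpp - a console harness for code developed in isolation
    // from the main product; it links only the new feature, so it compiles
    // and starts in seconds instead of minutes.
    #include <cassert>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    // Stand-in for the new feature. In practice it would live in its own
    // module (e.g. a DLL loaded both by this harness and by the real engine).
    std::vector<int> SmoothSamples(const std::vector<int>& in)
    {
        std::vector<int> out;
        for (std::size_t i = 0; i + 1 < in.size(); ++i)
            out.push_back((in[i] + in[i + 1]) / 2); // average adjacent samples
        return out;
    }

    int main()
    {
        // Quick, disposable test cases, hacked directly into the dev code
        // and re-run after every 2-3 line change.
        assert(SmoothSamples({0, 2}) == std::vector<int>{1});
        assert(SmoothSamples({}).empty());
        assert(SmoothSamples({7}).empty()); // a single sample has no neighbor

        std::cout << "sandbox: all checks passed\n";
        return 0;
    }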

Such a sandbox should allow very short prototype-compile-start-test-modify cycles. Even one of just 2-3 modified lines of code - priceless! A programmer could hack test cases directly into the dev code and run them tens of times a day to see the effects of his latest additions. He could easily refactor everything until he is fully satisfied with the results (4):

  • Code that goes through fast prototype-improve cycles matures faster.
  • When integrated, it has very few bugs, because it was extensively tested along the way.
  • Keeping the development sandbox in sync with future additions makes it possible to quickly check future bugs and ensure consistency over time.
  • The number of dependencies on the existing system is kept to a minimum. The code becomes self-contained and reusable.
  • Having more than one client for your code (5) requires that boundaries are well cut and clear. This helps keep the code clean, as artificial dependencies cannot be created.

Sometimes it is not possible to work outside the main product. Many times, the code has been developed directly inside the main branch and, therefore, the number of dependencies on other areas is so large that they cannot be decoupled. In the rush of development, it is very easy to create artificial links and fail to maintain clean cuts between modules, simply because there is no system in place to enforce these boundaries. After all, it compiles! (6)

When developing outside the main branch, these kinds of borders are naturally enforced by the development setup, so they cannot be crossed easily. However, if the development sandbox is not kept in sync with future additions and the boundaries are not enforced further, dependencies will start to spread and the system will start to decay (7).

Test-Oriented Development

In my previous paragraphs, I basically described an approach that somewhat resembles test-driven development (TDD). It allows programmers to develop their code in separate applications (or test cases), iterate fast to get results fast, and then integrate when ready. If these test cases are run and maintained throughout development, the benefits can be even higher: they require that the code is written in a modular fashion, they serve as a test bed for new features and bug hunting and, equally important, they can be used as an entry point to understanding functionality.
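
As a sketch of what "maintained throughout development" could mean in practice (Clamp and the test names are hypothetical; this is not a specific framework, just hand-rolled cases that double as documentation):

    // mini_tests.cpp - test cases kept alive for the whole project, so they
    // serve both as regression tests and as executable documentation.
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    // Hypothetical module under test.
    int Clamp(int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); }

    bool ClampKeepsValueInside() { return Clamp(5, 0, 10) == 5; }
    bool ClampCutsValueBelow()   { return Clamp(-5, 0, 10) == 0; }
    bool ClampCutsValueAbove()   { return Clamp(50, 0, 10) == 10; }

    int main()
    {
        // Each case reads like a statement about the module's behavior -
        // a natural entry point for someone trying to understand it.
        const std::vector<std::pair<std::string, bool (*)()>> tests = {
            {"clamp keeps a value inside the range", ClampKeepsValueInside},
            {"clamp cuts a value below the range",   ClampCutsValueBelow},
            {"clamp cuts a value above the range",   ClampCutsValueAbove},
        };

        int failed = 0;
        for (const auto& test : tests) {
            const bool ok = test.second();
            std::cout << (ok ? "PASS  " : "FAIL  ") << test.first << "\n";
            failed += !ok;
        }
        return failed; // non-zero exit code makes the build pipeline fail too
    }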

I will not get into more detail regarding TDD, as I have not practiced it first-hand beyond what I've described above, but I think that great attention should be paid to integrating automated testing into the production phase and, equally important, to finding a framework and a state of mind within the team that fosters quality at this level, even in times of great pressure. Maintaining the tests as well may sometimes seem like a chore but, at least from my perspective, it is easier and cheaper to maintain tests or develop functionality outside the main product than to dive into unknown, hacky code that has side effects and unexpected links to other obscure areas.

Some may argue that this kind of test-oriented development is not suited to all sorts of tasks, and I agree. From what I've seen, however, I believe that greenfield development can be isolated from the main branch at least in its initial phases (be it UI or network or anything else). And once the initial phases have passed, why not try to keep this isolation going and maintain the sandbox in a working state? It can be used as a priceless test bed for future development and regression testing.

Other disciplines need fewer interruptions and rapid prototyping as well:

In order to spread the benefits of rapid iterations, the team should foster the development of tools that let designers, programmers, and artists restart the game (or the product at large) as rarely and as quickly as possible. The best development tool for a designer or an artist is the one that allows them to see their work directly inside the engine, live, as they make their changes. This allows them to prototype as much as possible, ideally continuously. It is essentially the same need that programmers have regarding their code. Examples of such tools include editors that have accelerated, game-identical simulation capabilities, live connections to 3DS Max, tuning tools that connect to the running game, and live editors. The shorter the produce-export-test cycle is, the more time is left for iterations and the more time the team has to creatively play with and improve the product.
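
A minimal sketch of the mechanism behind such live tools, assuming the simplest variant: the running program polls an exported file and reloads it the moment the artist saves (the asset path and ReloadTexture are hypothetical stand-ins for real engine hooks):

    // live_reload.cpp - reload an asset the moment it changes on disk, so
    // the artist sees the result without restarting anything.
    #include <chrono>
    #include <filesystem>
    #include <iostream>
    #include <thread>

    namespace fs = std::filesystem;

    // Stand-in for the real engine call that re-uploads the asset.
    void ReloadTexture(const fs::path& file)
    {
        std::cout << "reloading " << file << "\n";
    }

    int main()
    {
        const fs::path asset = "textures/hull.dds"; // hypothetical export target
        fs::file_time_type lastSeen{};

        while (true) { // stand-in for the game's frame loop
            std::error_code ec;
            const auto stamp = fs::last_write_time(asset, ec);
            if (!ec && stamp != lastSeen) { // the file was saved again
                lastSeen = stamp;
                ReloadTexture(asset);       // the change shows up live
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(200));
        }
    }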

On the other hand, a lot of bugs appear because, during frequent changes, something is lost along the way. Tools that automatically validate an asset's integrity before it gets into the game are very useful. The more checks the better; therefore, such tools should be developed and spread to other teams as well, reducing implementation cost through the sharing of technology and knowledge.
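
A sketch of such a check, run by the export pipeline before an asset is accepted (the power-of-two and size-budget rules are hypothetical examples of the kind of validation meant here):

    // validate_asset.cpp - refuse a broken asset before it enters the game.
    #include <filesystem>
    #include <iostream>

    namespace fs = std::filesystem;

    bool IsPowerOfTwo(unsigned v) { return v != 0 && (v & (v - 1)) == 0; }

    // Prints every failed rule, so the author can fix the asset immediately.
    bool ValidateTexture(const fs::path& file, unsigned width, unsigned height)
    {
        bool ok = true;
        if (!fs::exists(file)) {
            std::cout << file << ": missing on disk\n";
            ok = false;
        } else if (fs::file_size(file) > 4u * 1024 * 1024) {
            std::cout << file << ": over the 4 MB budget\n";
            ok = false;
        }
        if (!IsPowerOfTwo(width) || !IsPowerOfTwo(height)) {
            std::cout << file << ": " << width << "x" << height
                      << " is not a power-of-two size\n";
            ok = false;
        }
        return ok;
    }

    int main()
    {
        // Hypothetical usage: the exporter calls this for every texture.
        return ValidateTexture("textures/hull.dds", 1024, 1024) ? 0 : 1;
    }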

Processes similar to TDD should be implemented in the art and design departments, to ensure fewer bugs at release time, less overtime and a clearer picture of where the project stands. Activities that support high quality also help spread knowledge and encourage the reuse of existing, well-tested modules. Code / asset reviews, tools that verify content, tests, using 3rd party APIs, creating sandboxes - all may sometimes seem like an additional burden, but in the end, if applied wisely, they should decrease effort, reduce costs and unleash creativity.

Sum-up

The more iterations there are, the better the product has a chance to be. The lowest level at which a developer can iterate is source code or asset production. When people iterate very fast, they can try new things, have less fear of failure, and their work product is better tested and crafted. Moreover, the productivity loss and frustration associated with going in and out of flow are diminished. Coupled with discipline and peer reviews, constant iteration and continuous testing are key to attaining high quality, both during day-to-day activities and, at a higher level, feature- and product-wise.

Implementation

Although sometimes difficult to see from the developer's seat, a healthy development process has a tendency to accumulate value as it rolls on. I liken positive practices to a snowball: they are difficult to establish in the first place, but they grow in value faster and faster as the team acquires more and more success stories (8).

Some practices will not be seen as valuable by the whole team from the very beginning. As such, they need to be implemented in less stressful times, when people are more prepared to try new things. Their value is usually seen in time, after they have become established methods, adapted to each organization. Feedback from the team is very important, as it removes implementation barriers and increases commitment to respecting the new ways. I think that, when implementing such changes in methodology, it is important to understand that there is no one-size-fits-all solution. Therefore, management and people should invest time and thought into finding the right way of adapting industry-proven practices to their line of work, through collaboration, based on common goals. (Our Iceberg Is Melting)

In the following post I will depart from the code-level perspective and reach into some higher-level management frameworks:
  • Scrum and quality management (two previous posts here)
  • A process for quality management that includes test-driven development, white-box testing and black-box testing (mixing developers with testers and making testers part of the development team)
  • Case study of a radical approach: one day per week of developer self-managed time
  • Lean thinking
  • The importance of having all the team experience the product first hand, as its first beneficiaries (play the game!)

Footnotes:

(1) The production cost should decrease, both in the short run and in the long run, by diminishing the need for suffocating debug periods right before deadlines, and by reducing turnover. Even more, remember that 80% of programmers' time is lost digging through and trying to understand what their predecessors have done. Decreasing this 80% should definitely reduce project costs or allow for more improvements.

(2) Actually, a well-implemented quality strategy should have the side effect of prototype-ability. That is, a product that allows new changes to be integrated, tested and removed quickly, without major side effects.

To achieve that, the number of extra activities that are needed to prepare the prototype should be decreased to a minimum. That can be done only if the code is well understood by everyone: concise, clean and modular. I will not get into the details of what clean code is, but I'd like to recommend two books on the matter: one is "The Pragmatic Programmer" and the other one is "Clean Code". Why clean code? Because no stakeholder is interested in productivity drops or having a product that has sluggish performance or frequent crashes.

(3) One practice I am not discussing here is pair programming. Some of the most rewarding moments of my professional life (and also moments of intense knowledge transfer) are related to pair programming experiences I had with more experienced engineers. I believe it should be encouraged as a good practice inside a team - formally or informally.

(4) Just to give some examples I've worked on that used this kind of approach:

The character animation system from SH4 (together with a co-worker and friend): developed as a DLL loaded by a basic engine viewer. Start-up times? A few seconds. We could easily export animations from 3DS Max to the engine and test our system tens of times a day.

The pathfinding algorithm from the SH4 AddOn: I started by hacking a C# GUI app that allowed me to visually check paths on a simple world map generated from game data, without having to start the game. Again, compilation and start-up times? Seconds.

The AIFramework DLL from SH5: hacked together as a Python prototype, then coded in C++ as a DLL loaded by a console application that hosted the test cases. Restart time? Again, seconds.

(5) At least, you have the main product and the application that was used to develop that code.  

(6) Thus, even for simple changes like tuning a hard-coded parameter, the programmer suffers a compilation and a full restart, which is a very lengthy process. Sometimes, the boundaries are so poorly cut that even a slight change can trigger a major recompilation. When these cases occur, the only solution is to refactor locally every time a problem is encountered, thus slowly improving the code over time.
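
One small refactoring of this kind, sketched under hypothetical names (tuning.txt and torpedo_speed are made up): pull the hard-coded parameter into a text file the program re-reads, so tuning it no longer costs a compile and a restart.

    // tuning.cpp - a hard-coded parameter moved into a text file, so tuning
    // it no longer requires recompiling and restarting the product.
    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    // Reads "key value" lines; returns the fallback if file or key is missing.
    float ReadTunable(const std::string& key, float fallback)
    {
        std::ifstream file("tuning.txt");
        std::string line;
        while (std::getline(file, line)) {
            std::istringstream parts(line);
            std::string name;
            float value;
            if (parts >> name >> value && name == key)
                return value;
        }
        return fallback;
    }

    int main()
    {
        // Before: const float torpedoSpeed = 22.5f; // edit => recompile + restart
        // After: edit tuning.txt and simply re-read the value.
        const float torpedoSpeed = ReadTunable("torpedo_speed", 22.5f);
        std::cout << "torpedo_speed = " << torpedoSpeed << "\n";
        return 0;
    }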

(7) Adding scripting support to a project, for instance, could potentially bring this kind of benefit as well. Since the boundary is naturally enforced by the compiler, the developer is forced to pay more attention to how he organizes his code. Also, well-implemented scripting allows code modules to be developed and tested in an already started application, reducing the compile-start-test-modify overhead.
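
A minimal sketch of such a boundary, assuming an embedded Python interpreter (ai_module.py is hypothetical; the host only knows it runs a script, never its internals):

    // script_host.cpp - the C++ host runs a script file through the embedded
    // CPython interpreter; the scripted logic can be edited and re-run without
    // recompiling the host. Build against the Python headers and library.
    #include <Python.h>
    #include <cstdio>

    int main()
    {
        Py_Initialize(); // start the embedded interpreter

        // The compiler cannot see into the script, so the boundary between
        // engine code and scripted logic stays clean by construction.
        if (FILE* script = std::fopen("ai_module.py", "r")) {
            PyRun_SimpleFile(script, "ai_module.py");
            std::fclose(script);
        }

        Py_FinalizeEx(); // shut the interpreter down
        return 0;
    }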

(8) The not-invented-here syndrome should be avoided at all costs. Good tools are expensive to build and may not be financially justified if developed for a single project. However, the more sharing goes on and the more projects benefit from valuable technologies and practices, the more favorable the ratio between the initial investment and its returns becomes. The initial development cost is covered by multiple teams and, as sharing increases, technologies converge and become more and more useful and easier to use. Again, the snowball effect.

(9) This does not conflict with the continuous integration principle. The idea is to build a sandbox that allows fast prototyping and short modification times for (new) features, integrate when the feature is almost complete, and then keep the sandbox in sync with the next changes (an example: a new data import/export pipeline should be developed separately, while the previous system continues to work, and committed when ready and well tested). It does not mean working on an old branch or separated from the rest of the team. On the contrary, changes should be presented continuously, and designers and testers should be asked to validate functionality on a frequent basis.

1 comment:

Mihaela Georgescu said...

Dear, I couldn't find anywhere else to leave you a comment. I wanted to ask whether you have the book Brain Rules and whether you lend it out, because it sounds interesting :)

PS: I also read Our Iceberg Is Melting, it's really good. You can find my personal review here: http://bit.ly/dohefC