Saturday, August 28, 2010

Software Quality (4)

In my previous post, I started a quick pass through the most common methods for ensuring quality at the lowest level: the day-to-day production activities. My assumption is that, if we are able to maintain a high level of quality throughout the entire development process, we will end up with a better, easier-to-maintain product, sustainable over several generations, and, as a side effect, a happier team.

We need to find cost-effective, lightweight processes that maximize ROI. Ideally, these processes should not increase the number of collateral activities required but should instead provide some easy-to-implement practices, acknowledged by everyone as valuable. Only through consensus, coupled with self-discipline, will they be maintained when the pressure sets in. I am talking about a way of working that should ensure:

  • Better understanding of the quality level; clear visibility for everyone of what has been done correctly and what the current issues are
  • Predictable results
  • Increased return on investment (see footnote 1)

A healthy development process should also have positive side effects:

  • Shorter develop-prototype-test cycles (2)
  • Better spreading of knowledge throughout the team
  • Increased ownership and pride
  • Friendlier environment
  • A better distribution of effort throughout the entire length of the project

Some practices are team-wide, while others are a matter of self-management and can be employed even without a team-wide, acknowledged process (3).

Working outside the product code - a workbench for rapid iterations:

Few things are as annoying as frequent interruptions or uncovering poor-quality work that you have to deal with - like digging through messy code that has side effects. These problems usually appear in a large code base that has high coupling between components or that starts very slowly. The two symptoms work hand in hand and are, I think, the two biggest impediments to constant re-factoring and to team morale (starting slowly leads to fewer runs, which means less testing and fewer recompilations, which, in turn, means that programmers won't spend time improving existing code and will only add new layers on top of it). Small applications, on the other hand, have a very short modify-run-test cycle that allows them to mature quickly and with fewer bugs. However, this advantage is usually lost over time, as the code and data get bigger and bigger.

Is it possible to maintain the advantages of extremely short iterations over time? One solution that can often be applied is to isolate development from the main product, which is easiest when coding a new feature from scratch. The main goal is to eliminate as many interruptions as possible by first constructing a sandbox (a minimal sketch follows the list below) that (9):

  • Compiles and starts fast
  • Allows dirty modifications to be written and tested incrementally
  • Allows constant re-factoring of what has been written, without impacting the rest of the code
  • Allows focusing only on new code, without worrying about other changes
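
To make this concrete, here is a minimal sketch of such a sandbox in Python, in the spirit of the pathfinding workbench described in footnote 4. The pathfinder module and its Grid / find_path API are hypothetical stand-ins for whatever new feature is being developed; the point is that the harness loads only the new module, runs a handful of quick checks and restarts in well under a second:

    # sandbox.py - a minimal, hypothetical sandbox harness.
    # It loads only the new module under development (no engine, no game
    # data), so a full modify-run-test cycle takes seconds, not minutes.
    import time

    import pathfinder  # hypothetical: the new, isolated module

    def check(name, condition):
        print(("PASS" if condition else "FAIL") + " - " + name)

    def main():
        started = time.time()
        grid = pathfinder.Grid(width=8, height=8, blocked={(3, 3), (3, 4)})
        path = pathfinder.find_path(grid, start=(0, 0), goal=(7, 7))

        # Quick, dirty checks hacked directly into the dev code; they run
        # tens of times a day while the code matures.
        check("a path is found", path is not None)
        if path is not None:
            check("path starts at the start cell", path[0] == (0, 0))
            check("path ends at the goal cell", path[-1] == (7, 7))
            check("path avoids blocked cells",
                  all(cell not in grid.blocked for cell in path))

        print("sandbox run took %.3f seconds" % (time.time() - started))

    if __name__ == "__main__":
        main()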

Such a sandbox should allow very short prototype-compile-start-test-modify cycles. Even a cycle for just 2-3 modified lines of code - priceless! A programmer could hack test cases directly into the dev code and run them tens of times a day to see the effects of his latest additions. He could easily re-factor everything until he is fully satisfied with the results (4):

  • Code that goes through fast prototype-improve cycles matures faster.
  • When integrated, it has very few bugs, because it was extensively tested and went through many iterations.
  • Keeping the development sandbox in sync with future additions makes it possible to quickly check future bugs and to ensure consistency over time.
  • The number of dependencies on the existing system is kept to a minimum. The code becomes self-contained and reusable.
  • Having more than one client for your code (5) requires that boundaries are well cut and clear. This helps keep the code clean, as artificial dependencies cannot be created.

Sometimes it is not possible to work outside the main product. Often the code has been developed directly inside the main branch and, therefore, the number of dependencies on other areas is so large that they cannot be decoupled. In the rush of development, it is very easy to create artificial links and fail to maintain clean cuts between modules, simply because there is no system in place to enforce those boundaries. After all, it compiles! (6)

When developing outside the main branch, these kinds of boundaries are naturally enforced by the development setup, so they cannot be crossed easily. However, if the development sandbox is not kept in sync with future additions and the boundaries are not enforced further, dependencies will start to spread and the system will start to decay (7).

Test-Oriented Development

In the previous paragraphs, I basically described an approach that somewhat resembles test-driven development (TDD). It allows programmers to develop their code in separate applications (or test cases), iterate fast to get results fast, and then integrate when ready. If these test cases are run and maintained throughout development, the benefits can be even higher: they require that the code be written in a modular fashion, they serve as a test-bed for new features and bug-hunting and, equally important, they can be used as an entry point for understanding functionality.
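
As a rough illustration of what such maintained test cases might look like, here is a sketch using Python's standard unittest module. The ai_framework module and its Blackboard class are hypothetical; the point is that the tests become a second client of the code, document its expected behavior and catch regressions after every change:

    # test_ai_framework.py - a hypothetical sketch of maintained test cases.
    import unittest

    import ai_framework  # hypothetical module under test

    class BlackboardTests(unittest.TestCase):
        def test_set_then_get_returns_value(self):
            bb = ai_framework.Blackboard()
            bb.set("target", (10, 20))
            self.assertEqual(bb.get("target"), (10, 20))

        def test_missing_key_returns_none(self):
            bb = ai_framework.Blackboard()
            self.assertIsNone(bb.get("unknown"))

    if __name__ == "__main__":
        unittest.main()

Run after every change, such a suite also doubles as executable documentation for newcomers to the module.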

I will not go into more detail on TDD, as I have not practiced it first-hand beyond what I've described above, but I think that great attention should be paid to integrating automated testing during the production phase and, equally important, to finding a framework and a state of mind within the team that fosters quality at this level, even in times of great pressure. Maintaining the tests may sometimes seem like a chore but, at least from my perspective, it is easier and cheaper to maintain tests or develop functionality outside the main product than to dive into unknown, hacky code that has side effects and unexpected links to other obscure areas.

Some may argue that this kind of test-oriented development is not suited to all sorts of tasks, and I agree. From what I've seen, however, greenfield development can be isolated from the main branch at least in its initial phases (be it UI, networking or anything else). And once those initial phases have passed, why not try to keep this isolation going and maintain the sandbox in a working state? It can serve as a priceless testbed for future development and regression testing.

Other disciplines need fewer interruptions and rapid prototyping as well:

In order to spread the benefits of rapid iterations, the team should foster the development of tools that let designers, programmers and artists restart the game (or the product at large) as rarely, and as quickly, as possible. The best development tool for a designer or an artist is the one that lets him see his work live, directly inside the engine, as he makes his changes. This allows him to prototype as much as possible, ideally continuously. It is essentially the same need that programmers have regarding their code. Examples of such tools include editors with accelerated, game-identical simulation capabilities, live connections to 3DS Max, tuning tools that connect to the running game, and live editors. The shorter the produce-export-test cycle, the more time is left for iterations and the more time the team has to creatively play with and improve the product.
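
As a sketch of the idea, assuming (hypothetically) that the game reads its tuning values from a small JSON file, live tuning can be as simple as polling that file from the main loop and reloading it whenever a designer saves new values - no restart required:

    # live_tuning.py - a minimal sketch of live parameter tuning.
    import json
    import os
    import time

    TUNING_FILE = "tuning.json"  # hypothetical, e.g. {"enemy_speed": 4.5}

    def poll_tuning(params, last_mtime):
        # Called from the product's main loop; reloads the file only
        # when its modification time changes.
        try:
            mtime = os.path.getmtime(TUNING_FILE)
            if mtime != last_mtime:
                with open(TUNING_FILE) as f:
                    params.update(json.load(f))
                print("tuning reloaded:", params)
                return mtime
        except (OSError, ValueError):
            pass  # keep the last good values if the file is missing or mid-save
        return last_mtime

    if __name__ == "__main__":
        params, last = {}, 0.0
        while True:  # stand-in for the game's main loop
            last = poll_tuning(params, last)
            time.sleep(0.5)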

On the other hand, a lot of bugs appear because, during frequent changes, something is lost along the way. Tools that automatically validate integrity before an asset gets into the game are very useful (a minimal sketch follows). The more checks the better; such tools should therefore be developed and spread to other teams as well, reducing implementation cost through the sharing of technology and knowledge.
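
Here is a hedged sketch of such a check, assuming assets are described by JSON metadata; the required fields and limits below are purely illustrative. Run as an export step or a pre-commit hook, the validator rejects incomplete assets before they can break the game:

    # validate_asset.py - a minimal, hypothetical asset validator.
    import json
    import sys

    REQUIRED_FIELDS = {"name", "mesh", "textures", "collision"}
    MAX_TEXTURES = 8  # illustrative project limit

    def validate(path):
        errors = []
        with open(path) as f:
            asset = json.load(f)
        for field in sorted(REQUIRED_FIELDS - set(asset)):
            errors.append("missing required field: " + field)
        if len(asset.get("textures", [])) > MAX_TEXTURES:
            errors.append("too many textures (max %d)" % MAX_TEXTURES)
        return errors

    if __name__ == "__main__":
        problems = validate(sys.argv[1])
        for problem in problems:
            print("ERROR:", problem)
        sys.exit(1 if problems else 0)  # non-zero blocks the export / commit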

Processes similar to TDD should be implemented in the art and design departments, to ensure fewer bugs at release time, less overtime and a clearer picture of where the project stands. Activities that support high quality also help spread knowledge and encourage the reuse of existing, well-tested modules. Code and asset reviews, tools that verify content, tests, using 3rd-party APIs, and creating sandboxes may sometimes seem like an additional burden but, in the end, if applied wisely, they should decrease effort, decrease costs and unleash creativity.

Sum-up

The more a team iterates, the better the product's chances are. The lowest level at which a developer can iterate is source code or asset production. When people iterate very fast, they can try new things, have less fear of failure, and their work product is better tested and crafted. Moreover, the productivity loss and frustration associated with going in and out of flow are diminished. Coupled with discipline and peer reviews, constant iteration and continuous testing are key to attaining high quality, both during day-to-day activities and, at a higher level, feature- and product-wise.

Implementation

Although sometimes difficult to see from the developer's seat, a healthy development process tends to accumulate value as it rolls on. I liken positive practices to a snowball: they are difficult to establish in the first place, but they grow in value faster and faster as the team acquires more and more success stories (8).

Some practices will not be seen as valuable by the whole team from the very beginning. As such, they need to be introduced in less stressful times, when people are more prepared to try new things. Their value is usually seen over time, after they have become established methods, adapted to each organization. Feedback from the team is very important, as it removes implementation barriers and increases commitment to respecting the new ways. I think that, when implementing such changes in methodology, it is important to understand that there is no one-size-fits-all solution. Therefore, management and people should invest time and thought into finding the right way of adapting industry-proven practices to their line of work, through collaboration, based on common goals. (Our Iceberg Is Melting)

In the following post I will depart from the code-level perspective and reach into some higher-level management frameworks:
  • Scrum and quality management (two previous posts here)
  • A process for quality management that includes test-driven development, white-box testing and black-box testing (mixing developers with testers and making testers part of the development team).
  • Case study: a radical approach - one day per week of developer self-managed time
  • Lean thinking
  • The importance of having all the team experience the product first hand, as its first beneficiaries (play the game!)

Footnotes:

(1) The production cost should decrease, both in the short run and in the long run, by diminishing the need for suffocating debug periods right before deadlines and by reducing turnover. Even more, remember that up to 80% of programming time is lost digging through code, trying to understand what predecessors have done. Decreasing this 80% should definitely reduce project costs or leave room for more improvements.

(2) Actually, a well-implemented quality strategy should have prototype-ability as a side effect - that is, a product that allows new changes to be integrated, tested and removed quickly, without major side effects.

To achieve that, the number of extra activities needed to prepare a prototype should be reduced to a minimum. That can be done only if the code is well understood by everyone: concise, clean and modular. I will not get into the details of what clean code is, but I'd like to recommend two books on the matter: "The Pragmatic Programmer" and "Clean Code". Why clean code? Because no stakeholder is interested in productivity drops, sluggish performance or frequent crashes.

(3) One practice I am not discussing here is pair programming. Some of the most rewarding moments of my professional life (and also moments of intense knowledge transfer) are related to pair-programming sessions with more experienced engineers. I believe it should be encouraged as a good practice inside a team - formally or informally.

(4) Just to give some examples from my own work that used this kind of approach:

The character animation system from SH4 (built together with a co-worker and friend): developed as a DLL loaded by a basic engine viewer. Start-up times? A few seconds. We could easily export animations from 3DS Max to the engine and test our system tens of times a day.

The pathfinding algorithm from the SH4 AddOn: I started by hacking a C# GUI app that allowed me to visually check paths on a simple world map generated from game data, without having to start the game. Again, compilation and start-up times? Seconds.

The AIFramework DLL from SH5: first hacked as a Python prototype, then coded in C++ as a DLL loaded by a console application that hosted the test cases. Restart time? Again, seconds.

(5) At a minimum, you have the main product and the application that was used to develop that code.

(6) Thus, even for simple changes like tuning a hard-coded parameter, the programmer suffers a compilation and a full restart, which is a very lengthy process. Sometimes the boundaries are so poorly cut that even a slight change can trigger a major recompilation. When these cases occur, the only solution is to re-factor locally every time a problem is encountered, thus slowly improving the code over time.

(7) Adding scripting support to a project, for instance, could potentially bring this kind of benefit as well. Since the boundary is naturally enforced by the compiler, the developer is forced to pay more attention to how he organizes his code. Also, well-implemented scripting allows code modules to be developed and tested in an already-running application, reducing the compile-start-test-modify overhead.
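
A minimal Python sketch of this idea (the enemy_behavior script module and its update() function are hypothetical): an already-running host application reloads an edited script on demand, without a restart:

    # host.py - a hypothetical host that reloads a script module live.
    import importlib

    import enemy_behavior  # hypothetical script module with an update() function

    def main():
        while True:
            enemy_behavior.update()  # exercise the scripted code
            if input("Enter to reload, q to quit: ") == "q":
                break
            importlib.reload(enemy_behavior)  # pick up the edited script

    if __name__ == "__main__":
        main()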

(8) The not-invented-here syndrome should be avoided at all costs. Good tools are expensive to build and may not be financially justified if developed for a single project. However, the more sharing goes on and the more projects benefit from valuable technologies and practices, the more favorable the ratio between the initial investment and its returns becomes. The initial development cost is covered by multiple teams and, as sharing increases, technologies converge and become more and more useful and easier to use. Again, the snowball effect.

(9) This does not conflict with the continuous-integration principle. The idea is to build a sandbox that allows fast prototyping and short modification times for (new) features, to integrate when the feature is almost complete, and then to keep the sandbox in sync with subsequent changes (an example: a new data import/export pipeline should be developed separately, while the previous system continues to work, and committed when ready and well tested). It does not mean working on an old branch or separated from the rest of the team. On the contrary, changes should be presented continuously, and designers and testers should be asked to validate functionality on a frequent basis.

Wednesday, August 25, 2010

Software Quality (3)

In my previous two posts (here and here), I briefly touched on some hot subjects:

  • Quality from the user's perspective and how important it is to sales. I presented some types of issues that hinder a smooth customer experience.
  • Internal quality as a requirement for sustainable product development. I asserted that customer-perceived quality cannot be sustained for long on shaky ground.
  • A nightmare scenario that can happen if a well-defined and rigorously followed quality management process is not in place. I showed how this can lead to excessive overtime, slowed-down development, higher costs and lower morale - a bill many times paid by future development.
  • In the footnotes, I showed that quality is a focus of all true professionals and how true professionals openly discuss and defend their principles.

In this post I will discuss (self) discipline as a basis for building great products and productive environments. I will also describe some practices that the software industry at large uses to attain sustainable quality, thus increasing employee self-esteem and customer happiness.

As software applications become more and more pervasive, penetrating deeply into our lives, concern for high reliability and an excellent experience grows as well. The software market is a very mature place, with many competitors striving to sell roughly similar products to increasingly demanding customers. Concern with quality began a few decades ago, when human life was put for the first time in the hands of software-driven machines: think airplanes, rockets and cars, to name only a few. Since then, the talk about reliability has moved to desktop applications, servers, phones - basically everything you can think of. Of course, the processes needed to ensure bug-free, software-driven life-support systems are different from those needed to rapidly put a state-of-the-art entertainment application on the market, but they all have something in common: attention to detail, discipline and commitment to excellence.

Discipline:

Maintaining quality is a long-running, tedious process that demands the strength to resist the temptation to bend the rules in order to finish your work faster. It is very difficult to achieve, as the pressure is high from all sides and the rationale behind sticking to processes is not always apparent or understood by everyone. Even more, the results of applying quality management principles cannot be predictably foreseen: if everything goes smoothly, no problems arise, and people may feel they spent extra work and money on something that cannot actually be measured. Disaster strikes only if quality is not managed properly, but it strikes later, after glorious results have been shown off, and by then, of course, something else can be found to blame. (I use the word "managed" but I don't refer only to managers. Indeed, they should create a framework that gives people the courage to defend and encourage the quality of their work, but everyone should be held accountable for his/her deeds.)

Discipline manifests in four ways:

  • Sticking to tedious processes that, at times, seem only to make the development harder.
  • Taking time to challenge our work and the processes even in times of crisis (Kaizen). 
  • Sticking to your principles and processes even when external pressure occurs.
  • Working constantly, planning and following plans throughout the entire duration of the project. Indeed, it seems very hard to focus right after a difficult project ending, when the next deadline is two to three years away, requirements will definitely change, and people fear their effort will be in vain. It is difficult to explain, and to find the heart for, hard work from day one, yet this is precisely why many projects are delayed, quality decreases and frustration accumulates. It is very tough and requires thinking ahead, a strong heart and discipline.

In the decades since software management established itself as a full-fledged science, processes have matured and many companies have found ways to ensure the enduring quality of their products. More and more, developers are becoming aware of the impact their production discipline has, and are starting to take pride in the sustained quality of their work instead of in quick hacks towards a local solution. As success stories hit the headlines, we start asking ourselves how they did it.

In the next paragraphs I will speak mostly from the programmer's perspective, but the same rules can be extrapolated to both design and art, and even to project management itself.

Where does the programming time go?

Here is an astonishing fact that every developer can confirm: each programmer spends up to 80-90% of his/her time not writing code, but trying to understand what he/she or his/her predecessors have done before. Many times, he/she fails to understand all the facets and introduces subtle bugs.

Digging through sources is, most of the time, a tedious and unrewarding activity. Thus, cutting the time spent searching by 50% is something that could radically improve programmers' lives and have visible results on the budget. Yet this is hard to recognize because, after spending so much time understanding what others have done, programmers feel pressure to get out of there quickly by hacking their way to a solution, so they can report a success to their managers (see also The Broken Window Effect). In a culture where re-factoring is not understood, it is difficult to explain why we need the extra time. After all, the results may be even worse, with more spaghetti layers added to the already overwhelming complexity. I can even say that fully re-engineering a poorly written module that has thousands of lines of code is so difficult that we'd better not touch it. Is there an escape? I think so.

Testing and re-factoring:

Some time ago, I read a paper from Microsoft which said that there are only two effective ways to increase the quality of code: one is peer code review and the other is permanent testing. Many procedures have been created to make room for these and, as time passes, more and more companies embrace test-driven development (TDD), pair programming, Scrum and other agile methodologies. Even in companies known for not applying agile principles, code reviews are performed by peers or by an external audit committee. The outcome of these practices is constant re-factoring. In a word, the code is not left to rot. It is constantly updated and adjusted to the latest specifications.

Is there a point in undertaking massive re-factorings all at once? I'd be very cautious about that. The risk of creating a bigger mess, or of exchanging one mess for another, is extremely high. However, I strongly advocate constant, small, incremental updates. When a programmer finds something that is starting to rot, he must update it on the spot, not hack something else on top of it. This constant improvement will not turn the software into something better overnight, but it will cultivate a sense of pride and a mentality of continuous improvement, shifting the focus away from hacking. After all, constant improvement is the second most important trait of real professionals, after sharing knowledge with their peers. (The two are very much linked anyway - roughly two facets of the same trait.)

Code reviews:

The vehicle for encouraging re-factoring and sharing is the peer code review. Two or more developers gather in front of a computer, and the programmer about to commit his/her sources explains to the rest what he has done and why he took the decisions he took. The audience then suggests a number of changes, and the code is committed after they have been made.

It is very important to understand the benefits of code reviews; the programmer should not take them as an inspection. He is not being verified. He simply shares his solution with the world and spreads his knowledge. It is a time of joy, socialization and pride. He is challenged on points he might not have thought of. He is challenged to respect coding standards and, by verbalizing his solution while going through his work once again, he often discovers bugs he might not easily find otherwise. (I have experienced code reviews myself and was surprised to uncover bugs I had introduced - bugs that would otherwise have haunted me later, when found by QA.)

Code review is very powerful but, unfortunately, some people are stopped by their egos from practicing it. In the professional world such egos should not surface but, since ego is a human trait, it should be taken into consideration. Code reviews are neither places for showing off nor evaluation sessions. They are fun gatherings, since they are basically just a bunch of professionals discussing what they do and love most. They also create personal bonds and disseminate knowledge.

Regarding the time investment, a code review usually lasts between 10 and 30 minutes when conducted by 2-3 programmers. At an average of 15 minutes, that means 3 * 15 = 45 person-minutes, or 3/4 of a man-hour per review. Consider that if reviews uncover a bug every 2-3 sessions, finding, testing, fixing and regressing that bug later would be far more expensive - without even counting the other benefits (clean-up, knowledge transfer, bonding), which are harder to measure.

Sum-up:

Up until now, we've seen why work discipline is important, what the most difficult aspects of creating a disciplined working environment are, and how it positively impacts production. We've also touched on re-factoring code, bringing it up to date, and code reviews. In the next chapters, I will talk about other vehicles for ensuring the enduring quality of our products, starting from the base level - production itself:
  • Working outside the main branch
  • The benefits of test-driven development (two clients for the same code); white/black-box testing
  • Short compilation and start-up times
  • Prototype-and-improve cycles
  • Scrum and quality management
  • A process for quality management
  • Case study: a radical approach - one day per week of developer self-managed time
  • Lean thinking
GO TO NEXT CHAPTER

Software Quality (2)

In my previous post, I talked about software quality from the user's perspective. In the final paragraph, I briefly exposed my view on the general framework for achieving such quality - the well-known KISS (keep it small and simple) approach to software development.

General Process (statement):

Managing quality on a grand scale is best achieved through:

 1) Start with a small set of features.
 2) Make sure they form a consistent core.
 3) Implement, adapt, test, polish, bring to perfection.
 4) Incrementally add new elements; perceived quality comes first (usability, accessibility, wow moments, visual polish); experience is the product.
 5) Resume from 2; redesign and re-implement as needed. Change is good and healthy.
 6) When the time, the budget and the quality are up to par, deliver.
 7) Offer support, talk.

The beautiful part is that quality management scales down from the whole to each of the components, to the very last line of code, design document or model. In a word, you can't cheat quality. You can't build high perceived quality on a rotten core. It might work for one product, but at the expense of future development and constantly increasing development costs. True professionals have a good, pragmatic understanding of quality and take great pride in applying these principles throughout their work. They stick to their guns and defend quality in all its dimensions when challenged* (see footnotes).

In my future posts, I will drill down to see how we can deliver a high-quality feature, then extrapolate to the whole project in the form of a proposed process. Before that, however, I will describe a scenario that can unfold when quality management is not seriously considered from the early stages of product development - excessive overtime due to not meeting quality standards**.

The problem with overtime due to quality issues:

1) A faulty quality management procedure - or none at all - leads to:
       a) A product that looks better on paper than in reality (everything appears done, yet the project's actual quality status is unknown)
       b) A lack of visibility and no clear estimate of the remaining work

2) Because of the overly optimistic view of the project's status, new features creep in. In fact, it is very difficult to freeze development, because no solid argument can be given for doing so. The papers look good; the project seems to be working and roughly healthy. It is at this moment, when the game starts to shape up, that designers and artists push to add more and more features, without much consideration for possible inconsistencies***. After all, we still have time to tune and polish and then tune again, right? Not quite!

3) Close to the deadline, as the team moves slowly from features to bug fixing, more and more bugs are discovered. This usually coincides with the moment when the test team is ramped up and the product is increasingly benchmarked against quality standards. Slowly, the real picture creeps in.

4) A vicious circle becomes apparent: more resources are added at the end of the project, the effort of integrating new people is unknown and unmeasured, programmers start hacking through the code, new and reopened issues pile up, more overtime is needed to fix them, and tired people make new mistakes while morale goes down. Although the bug-fixing rate can be positive and trending well, the hidden code quality is decreasing fast, at the expense of future development****.

An even worse consequence is possible: at a certain point, all development is frozen or heavily slowed down for days or weeks, because the build is unplayable and a large part of the team cannot work.

Footnotes:


* Through civilized communication, active listening and understanding of all opinions. Professionals think in terms of benefit for the customer, cost, industry standards, future investments and return on investment. They don't tell themselves, "I am a professional, therefore I know what quality means." On the contrary, they are willing to challenge their current perceptions and improve them continuously. Professionalism means modesty and openness to dialogue.


** Overtime can also happen out of the desire to meet a certain market opportunity, when the team willingly commits to seemingly impossible goals, but I will not refer to that kind of enthusiasm here. Most commonly, however, overtime stems from a mixture of quality issues, excitement, not finding what is fun soon enough to fit the budget (in the game industry), or other scope management issues, all to various degrees. It can be light - a few extra hours, maybe a weekend or two once in a while - or it can be worse.


*** It is normal for this to happen because, now, they finally have live feedback on their creative effort. These guys are very passionate and take a lot of pride in their work.


**** Since some of the main ingredients for attaining quality are professionalism, excitement and commitment from all the people involved, excessive overtime is an enemy of quality. While we can use previous records to estimate overtime needs in terms of budget buffers, it is important that:
  • managers and teams proactively try to estimate and diminish the need for long hours at the office, keeping the project on track from the conception phase onward.
  • stabilization and debug periods are planned throughout the course of the project, in order to keep the build as close as possible to "release quality" and maintain good visibility on how it is doing.
  • a process is in place to ensure that quality from the user's perspective is attained on solid ground and that the project is not rotting inside - that quality is sustainable in the long run.
While quality and deadlines are the responsibility of the manager, the whole team should actively participate in meeting them. After all, it is a sign of professionalism on all sides involved.


GO TO NEXT CHAPTER

Tuesday, August 24, 2010

Software Quality (1)

Foreword:

I am starting this thread to lay down some thoughts that have haunted me over the past period. I would like to convince the reader that quality matters very much, that one cannot obtain customer satisfaction unless industry-proven practices are employed throughout the entire course of the project, and that failing to implement solid quality management procedures can have very bad consequences. These consequences are reflected in development costs, sales and trust, and can span multiple generations of the same product. In the end, I'd like the reader to retain the idea that focus on quality is the responsibility of all team members. Some hints will be given on how teams achieve excellence in their work, to their own benefit and that of the customer. I write these posts based on my own experiences, articles, books, discussions with friends who also work in the software industry throughout the world, and some projections I have made for the future.

Customer Perceived Quality:

Every (software) project has to maintain an optimum balance between quality, cost and time. Provided that we keep cost and time fixed (which is the case for many projects), we have only one degree of freedom left. We need to understand what "quality" is in order to know how to handle it properly.

To discuss quality from the user's perspective is to discuss project scope and the number of issues customers encounter (including documentation). On the other hand, quality is a general term that also encompasses internal knowledge (code, processes, graphical assets), internal and external communication, management, and the sense of self-fulfillment of all stakeholders (the team, all parties). In a broader sense, I would like to link quality to professionalism and success, excluding the financial part* (see footnotes).

In this post I will refer to customer perceived quality:

  • Project scope: arguably, a project with more features should provide more usage scenarios and therefore a higher (potential) value for the customer (personally I like simple stuff, but that is another story).
  • Broken features ("hard" bugs): features that don't function the way they were obviously designed to function (like crashes or a button that does nothing when pressed)
  • Design errors ("soft" bugs): features that function according to design, yet they are hard to use, have uncertain value, don't really satisfy user needs, are inconsistent, etc.
  • Perceived quality - degree of polish, nice touches, attention to detail, wow moments and, very importantly, smooth performance
  • Product documentation - should be large enough to cover all aspects, yet concise and easy to read by the target audience - I will not cover documentation here, as it is a vast subject.

If the bare minimum functionality is not implemented, the project is no good. Imagine a submarine without a periscope: it's inconceivable and useless. On the other hand, a submarine crew that has fewer animations than initially planned could be considered acceptable, as long as the set is consistent and the number is above the bare minimum needed to provide some immersion and give a clue of what the crew is doing. In this case, people have a fuzzy understanding of what the optimum number of animations is (of course, the more the better, though with diminishing returns), and fewer animations are not true barriers to functionality. (Personally, I would choose to have fewer functions, but highly polished. Given the animation example, I would think very carefully about whether I want the feature at all if I cannot bring it close to perfection.)

The key for "hard bugs" is to obtain a low reproduction rate, no game breakers and no fully broken functionality. There is a threshold that must not be crossed, but a random crash once in a (very rare!) while is acceptable, as long as load times are short and no (significant) progress is lost. Also, these kinds of bugs are easily caught by the test teams, so they are usually fixed before release. The least damaging are bugs in the "flickering textures far away when the light falls from a certain angle" class. They don't disturb much, don't affect progress in any way and, as long as they happen randomly, can be considered almost a non-issue.

Design deficiencies are: inconsistencies, lack of usability, lack of accessibility, and crappy functionality that no one understands. Some design errors are quick hacks around a deeper (technical / concept / budget) problem. Everybody knows about them; nobody is proud of them. A second category of design deficiency is what I call "elitist design": a good core that is misunderstood and poorly explained (at large or to less experienced players). At the opposite extreme lies the "dumbed-down" symptom. Generally speaking, a large project scope fosters design inconsistencies. Some design errors are hard to spot and require extensive play / usability tests. The problem is that the team knows the product so well that they become blind to these kinds of issues. Therefore, external help is needed.

As long as the above deficiencies are kept at an acceptable level and the product has optimum performance in terms of speed and smoothness, small enhancements - like flooding inside the submarine or washed periscope lenses - provide real value and increase the perceived quality. On the other hand, easily observable visual or audio bugs, although they may not deeply hinder game-play mechanics, dramatically decrease the sensation of quality. Very importantly, a visible and firm support policy is a certain way to improve how customers perceive us.

To conclude, my feeling is that, in order to maximize the quality of a software product, the best approach is to "stay small and continuously polish". This leads to a product with fewer but better-polished features. Keeping the number of elements low maintains a higher degree of manageability and saves time for iterations.

Quality is very expensive yet, today, quality sells. Due to increased market competition, customers have higher-than-ever quality demands, and not meeting them certainly affects sales both in the short and in the long run - through a tainted image and brand. Thanks to the ubiquity of the Internet, customers easily form an image of the product before purchase, and this image sticks**. Web pages and reviews remain for years.

As the marketing people say, the experience is the product, so a lot of effort has to be put into perceived quality: visual / audio polish, accessibility, beauty, usability, and minimum workload for the user. Starting from a small core of must-haves and then adding incremental enhancements leaves time for perfecting the product. Late cutting results in hard-to-eliminate inconsistencies, panic and, in the end, the shipping of a less-than-par experience.

Footnotes:

* It can happen that a project with a high quality standard and very happy customers does not perform well financially. However, a project that does not meet quality standards yet performs well financially cannot be categorized as a success. The cost is merely transferred to the next generation, both in terms of perception (brand, company image) and in terms of internal quality (code, documentation, graphical assets, processes, etc.).


** Managing expectations and communities is more important than ever in the days of the Internet. After all, unhappy customers are the most vocal online, as they find in the web an accessible, wide-audience platform on which to express their frustration. Actively managing the product's image through open communication channels with customers is a way to ensure that the product is evaluated at its true level and is not drowned in discontent.

GO TO NEXT CHAPTER