The Applied Theory of writing bug-free code (sites.google.com)
22 points by clawrencewenham on Oct 10, 2009 | hide | past | favorite | 31 comments


I stopped reading at "strong typing". (well, not really... but I started typing this comment roundabout there)

First of all, what he describes is not called strong typing, it's called static typing. Ruby, for example, has strong typing (every object is one type and one type only), but not static typing (you don't declare the type of an object when you declare a variable).

Secondly, there are many powerful languages that don't have static typing. Is the author implying that those languages are inherently buggy?

Thirdly, the author is not advocating what he declares in the title. He proposes methods to catch bugs, not to avoid them. His methods are also flawed in that redundancy does not ensure bug-free code - only that you haven't made any silly mistakes in your coding. Bugs come from all sorts of sources, and "code typos" are only one minor source. Other sources include: maintenance changes (which cause unforeseen effects that were not previously covered by tests), design flaws, and changed external circumstances. None of those are covered by the technique he lists.

Sorry, but this is just a poor attempt to capitalise on a catchy title. Nothing to see.


Static typing doesn't necessarily mean type declarations, either. See OCaml. What it does mean is that the compiler knows the type of all variables at compile time, but it's OK if it figures that out for itself.


Static typing gives you a 'redundancy of meaning' solution to a fairly rare error that is neither especially harmful nor hard to detect by other means: holding data of one type under an identifier you meant to hold values of a different type.

It's a heavyweight solution (see generics and all the redundant type annotations) to a minor problem, and it can't justify its existence on that basis.

To me, static typing is just a hack by the compiler writers to make their own work easier. The benefits for programmers were only claimed later.


Have you written in anything from the ML family? I strongly recommend you take a look at OCaml or Haskell. I believe you will find that static typing, when done right, is immensely helpful.


I'm using language without static typing for most of my work and I can hardly remember any bug that was caused by having wrong type of data in unexpected place. I remember very well redundancy of using language with static typing, I know how much boilerplate code is needed in languages such as java, and how complicated can generics become in C# if you want to achieve fairly simple architectural things that can be concisely expressed in dynamically typed languages.


Ok. Those languages do static typing incorrectly. Try a language that sports type inference. It isn't redundant. There isn't boilerplate. The type system is far more powerful than in Java or C# (polymorphic types, polymorphic variant types, algebraic types and constructors, functors, etc).

You owe it to yourself to learn these systems. In many cases, there are type-side symptoms of logic-side bugs. If Haskell's laziness and purity scare you, try OCaml -- it's fast and straightforward.


You're right, though you have to grant him the point that a lot of the static type systems around by now are bad. Haskell is approaching a really nice, hassle-free type system, but take, e.g., Java's type system? Bleh.


You're right, I meant static and not strong typing.


I'm forever intrigued by the mind's ability to discern patterns. Today I realized that just from the few examples that appeared on HN, my mind has - quite unconsciously - locked on the combination of google.com in the site slot and a title about programming and its methodology. "Oh, it's that yacoset person", my mind tells itself without words, and I feel a little jolt of recognition-cum-disappointment (for the pieces are usually fluffy and fussy). I distrust myself, and so I hover over the link to make sure, yet so far the pattern has held without fail.


Ouch.


What he is really arguing for is more redundancy in programming: strong typing (he really means static typing), so you type (with your fingers) things more than once and can put constraint-enforcing wrappers around your values; assertions and method contracts; TDD and unit tests; and finally NASA-style parallel development.

Except for unit testing, I question most of it. The problem with too much redundancy is that it slows you down. If you create too much ceremony, it takes 20 mediocre programmers to do poorly what 5 good programmers can do well.

I would much prefer good tools to redundancy, for example type inference over Java-style type repetition. Maybe you can use asserts as a programming aid, but not in production. If you may get null values, deal with them; try not to create an exception for someone else to deal with. And if we used NASA-style coding, it would take 5 years to get a web site up.


In my view, contracts can do the "redundancy" thing much better than typing can. It's explicit, it's optional, it's a better guarantee that you're getting what you want, it's more flexible, etc. Contracts and TDD are my favorites right now, i guess.
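For a concrete sketch of contracts in a dynamic language, here is a minimal precondition/postcondition decorator in Python. The `contract` helper and the `isqrt` example are invented for illustration, not taken from any library; dedicated contract systems (like Eiffel's built-in one) are richer than this.

```python
# A minimal sketch of precondition/postcondition contracts as a decorator.
# contract() and isqrt() are invented names for illustration only.
import functools

def contract(pre=None, post=None):
    """Check an optional precondition on arguments and postcondition on the result."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise AssertionError(f"precondition failed for {fn.__name__}")
            result = fn(*args, **kwargs)
            if post is not None and not post(result):
                raise AssertionError(f"postcondition failed for {fn.__name__}")
            return result
        return wrapper
    return decorate

@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def isqrt(x):
    """Integer square root by simple upward search."""
    n = 0
    while (n + 1) * (n + 1) <= x:
        n += 1
    return n
```

Calling `isqrt(-1)` now fails loudly at the boundary instead of silently returning a wrong answer.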

TDD is particularly great, though, as it's an in-code description of what the program should do. If it doesn't pass the tests it doesn't do what you thought it should do--ie, there are bugs.

(Of course, then we have to figure out how to write the tests... but hey...)
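As a sketch of "tests as an in-code description of what the program should do", assuming a made-up `slugify` function and behavior:

```python
# A sketch of tests as an in-code description of intended behavior.
# slugify() and its spec are invented for illustration.
import unittest

def slugify(title):
    """Lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())

class TestSlugify(unittest.TestCase):
    # Each test pins down one piece of intended behavior; if slugify
    # stops doing what we thought it should, a test fails -- i.e., a bug.
    def test_lowercases(self):
        self.assertEqual(slugify("Hello"), "hello")

    def test_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

if __name__ == "__main__":
    unittest.main()
```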


this might be appropriate for NASA level programs, which i suspect are heavily spec'd and designed before any code is written, and then they want the code to work, probably without much iteration. i'm not sure if this is completely wise, since iteration on implementation can be useful, but if the client's specs don't change and you have very experienced coders, then iteration may not be that useful. very correct code is.

on the other hand, for (young) programmers learning the latest web 2.0 tools and tweaking if not changing the product frequently, iteration, and thus speed, is incredibly useful.

i think both approaches, heavily simplified here, could work--assuming the folks involved aren't just following a recipe but engaging with the process and committed to making it work for their particular circumstance.


NASA has some rather particular requirements, though. If their systems fail their spaceships crash and explode. Possibly killing people.

It's a much stricter requirement than most people will ever deal with, but wow do you not ever want to get that wrong.


He begins with an argument for redundancy in such a general sense that it's almost a truism. One way or another, since you don't know for certain that your code works, you should add code (ie checks) to make sure that it does.

BUT, that general argument, in and of itself, doesn't say testing or strong typing or asserts or whatever is the answer. It seems to me that what is needed is checks that are highly tuned to the situation at hand. Just as simple unit tests are limited, strong typing is limited to catching a small portion of bugs. I want a language in which I can add only the verifications I need, when I need them, and skip the rest. I've never tried it myself, but the automatic typing of some functional languages seems like the appropriate next step (and I'm currently programming in a strongly-typed language that is generally a pain in that way).
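A sketch of what "only the verifications I need" might look like in Python: runtime assertions attached to the one function whose invariant actually matters, instead of types everywhere. The `merge_sorted` function is an invented example.

```python
# Checks tuned to the situation: the sortedness invariant is asserted only
# where it matters, not enforced globally by a type system.
# merge_sorted() is an invented example function.
def merge_sorted(a, b):
    """Merge two ascending lists; the sortedness checks live only here."""
    assert all(a[i] <= a[i + 1] for i in range(len(a) - 1)), "a is not sorted"
    assert all(b[i] <= b[i + 1] for i in range(len(b) - 1)), "b is not sorted"
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    return out + a[i:] + b[j:]
```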


i don't know. in college i wrote static and dynamic program analyzers, and was in a very pro-java-typing group. when i moved to python i was sure the lack of typing would be problematic.

the lack of static typing has NEVER caused a bug. a few times i will have a bug because i misspell a variable name, thereby inadvertently creating a new variable, but i have always found those problems on the next run of the program.

vastly more useful are creating tests and using them often.

to be fair to my college days, i still create object models (ish...i have my own version) and other diagrams, which i find crucial for design and frequent reference afterwards.

the whole "type" thing must be important for other problem domains.


Pretend there was a type "numbers between 1 and 4." Then, when you try to cast a 5 to that type, it would fail. Types are just sets of possible instance values; that most languages limit them to reasoning only about the outermost features of the type (e.g. capacity of 1 vs N, mutability, signedness or realness) makes them orders of magnitude less useful than they could be.
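Python has no subrange types, but the idea can be approximated with a class that validates at construction time, a rough stand-in for the "cast" described above. The `OneToFour` class is invented for illustration.

```python
# A rough stand-in for a "numbers between 1 and 4" type: the check happens
# at construction time rather than at a cast. OneToFour is an invented name.
class OneToFour:
    """Holds an int guaranteed to be between 1 and 4 inclusive."""
    def __init__(self, value):
        if not 1 <= value <= 4:
            raise ValueError(f"{value} is outside 1..4")
        self.value = value
```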


>> Pretend there was a type "numbers between 1 and 4."

Pascal and Ada have this I think. I think Eiffel has contracts to enforce that type of thing instead.


I've written a lot of Java, C++, and C#. I currently write a lot of MSBuild (not by choice) and Python. Static vs dynamic typing is neither a mutually exclusive choice, nor a choice which can occur in a vacuum. Use the right tool for the job, or else one of these things will happen:

Java/Swing: You will be woefully unequipped for the natural cascading property nature of GUI applications; forcing you to jump through giant hoops to get anything to not look awful.

C++/COM: You'll cast IUnknown to ISomethingElse so many times that it will lose meaning.

MSBuild: You will make a typo in a parameter name, but won't find out until two hours into your build when you get an error that could have been caught by a compiler.

--

Static typing works for problems that demand rigor, but may be hard to unit test. That includes most larger projects worked on by companies of overworked and slightly disinterested developers.

Dynamic typing works when you can iterate as fast as you can refresh the browser, or when your application is small enough that it is reasonable to full text search the entire thing anytime you refactor anything.

You can build dynamic type systems on top of static ones, but it is ugly; witness WPF's cringe-inducing DependencyObject system. It is safe, however, to layer dynamic code above static code. Feel free to use a dynamic language for scripting a statically built application.

You can build static type systems on top of dynamic ones, but they will leak. It is also awkward to layer static code above dynamic code. Odds are your code will be dynamic, but more verbose because of the type system getting in your way.


>>a few times i will have a bug because i mispell a variable name

Huh, that spelling problem would give me errors on every page.

Does Python have no warnings for variables that are created or used only once? Is there no way to demand that variables be declared before use?

Python is quite easy to parse(?), so the IDEs should catch this, at least?


Python requires you to initialize a variable before reading it, so:

  foo = object()
  ofo.__call__()   # typo: raises NameError at runtime
will cause a (runtime) NameError. Thus, if you read your variables, typos will trigger errors (unless, of course, the typo happens to match another defined name, which would be a smell in itself: think of variable names being too close to each other).


OK, what if it is a flag variable that is either undefined or an integer? That is, do you need to initialize all variables?

Also, how about this?

  foo = 10
  ...
  if ...
    ofo = 20
  ...
  formula = bah * ( ... foo ... )
Edit: Please tell me the second case isn't as catastrophic as it looks? That would give at least me no ends of problems -- I thought modern languages left that kind of sh-t in the 1990s? Better have a good culture for testing!
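For a concrete, runnable version of the second case: the typo creates a brand-new variable instead of updating `foo`, so the bug never raises an error at all and shows up only as wrong output. The `discount` function and its numbers are invented for illustration.

```python
# A runnable version of the silent-typo bug: the misspelling binds a new
# local variable, so Python raises nothing and the result is simply wrong.
# discount() is an invented example function.
def discount(price, on_sale):
    foo = 10
    if on_sale:
        ofo = 20  # typo: meant foo; Python silently binds a new local
    return price - foo  # always subtracts 10, even when on_sale is True
```

Only a test (or a linter flagging the unused `ofo`) would catch this.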


1) If you want to read the variable in some way, you need to initialize it. If you want to do weird things involving setting random attributes on an object and checking whether they exist... well, you're on your own. Generally, I'd recommend giving the flag variable a definite uninitialized value (say, None), as this is far easier to check.

2) The opinion of a lot of Python people is that the lack of such (impossible[1]) compile-time checks requires you to replace them with good unit tests (which should be written anyway). So yes, it is as bad as it looks, but you are expected to test such things :) Also, there are tools like pychecker and pyflakes, which do sanity checks that would catch these errors (even though the flagged code might be a valid, if insane, Python program[1]).

[1] Full checking is impossible, because a nested call can set variables in earlier stack frames, or fiddle with the locals of an earlier call to get at data it wants. Certainly, no one should do this (and whoever does should be stabbed multiple times if the hack persists longer than an hour (hey, I have built such a thing myself to debug a deadlock over resources: it printed the direct caller of the allocation and deallocation methods, so I could quickly check for unmatched alloc calls :) )), but it is possible, and thus the compiler has to assume it happens.
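A simplified illustration of footnote [1], using `globals()` as an easy stand-in for fiddling with another frame's variables: names can come into existence at runtime in ways no static scan of the source would reveal.

```python
# Why a Python checker cannot prove a name is undefined: names can be
# created dynamically at runtime. globals() is used here as a simple
# stand-in for the stack-frame fiddling described in the footnote.
def inject():
    globals()["ofo"] = 20  # creates a module-level name out of thin air

inject()
print(ofo)  # runs fine, though any static checker would flag 'ofo' as undefined
```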


My second example would bite me quite often.

Do I get this right?

You have to do really hard-line testing to avoid simple problems that could be prevented if variables just had to be declared at first use? :-(

But like old C compilers, there is "lint"?

I'm all for nostalgia for old C compilers (except a buggy one, that stole weeks of my life). But that is strange.

I thought Python was all about safety and no problems because of a well designed syntax and coding standards akin to bondage?

Ah well, thanks for information!


the type thing is important to the kinds of people who get a woody thinking about the maths that deal not in numbers.


the implicit assumption in all of these arguments is that it's more important to ship bug-free (or as close as you can get to bug-free) code than to ship quickly. In the real world, that is almost never the case. In fact, I can't think of a single situation in which it is ever the case.


Wow, are we really that dominated by the startup culture here? You can't think of a single situation where correctness is more important than speed?

I've spent most of my career working on systems handling money, at jobs where we actually bought and sold things. When there are bugs in the code, we lose real, quantifiable money. Sometimes being first-to-market with a product or feature is more valuable than the cost of bugs. Sometimes a buggy feature makes more money than it loses. But it's not a given, and it's naive to say that speed is more important than correctness in all (or even most) situations.


No, that isn't the assumption. And where the article does discuss project deadlines, it most certainly--and with cited references--talks about the trade-offs present in the real world.

The article was written to discuss ways of reducing bugs independent of any other need.

Furthermore, although you "can't think of a single situation in which it is ever the case", I myself actually picked one where a $4 billion machine with six lives at stake was an excellent example of when it DOES matter what the quality of the software is.


ok then, in 99.999% of all applications I would claim that shooting for virtually bug-free code is a sure-fire prescription for business failure.

BTW, why did NASA downgrade their redundancy to 2 systems in the Shuttle? I recall reading that during the lunar missions, they had 5 computers on board, all independent implementations of the same spec.


I understand your hyperbole, but the stress isn't necessary. The market will reveal where it wants quality and where it doesn't.

Also, the Shuttle has always had 2 independently written systems, never five. There are five computers running software written to the same spec, but the physical redundancy wasn't the scope of the article.


the redundancy idea scales as much as you want it to; this post remixed: http://socialfact.com/reader/seemore-how-much-are-you-willin...



