Can anyone explain this, from the first paragraph, to a bog standard PHP-for-the...

scanr · on April 3, 2012

Here's my attempt:

- "Programming in the large" - programming produced by lots of people and/or meant to last a long time.

- "creating and maintaining boundaries" - when you have a lot of people working on code, it's very useful to be able to isolate code and define the boundaries between code explicitly. Things like interfaces and type safety tend to help do this (the "abstract" in "abstract and operational"). On the "operational" side, having modules that produce dynamic libraries with good versioning semantics helps.

- "that preserve large-system integrity, availability" - again, type safety, in-built null pointer protection and garbage collection all help make sure that code doesn't fall over as much as it could without those (there are some philosophical arguments in here that I'm not exploring).

- "concurrency" - Consensus appears to be forming around the idea that a good way to manage concurrency is with lightweight processes and message passing rather than threads and shared state. Rust does the former. Rust also encourages the use of immutable state which also helps avoid concurrency issues.
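
To make the message-passing point concrete, here is a minimal sketch in present-day Rust. That is an assumption: the task API available when this thread was written looked different, and `std::thread` spawns OS threads rather than lightweight tasks. The function name `sum_of_squares` is made up for illustration. Each worker owns its input and sends its result over a channel, so no mutable state is shared.

```rust
use std::sync::mpsc;
use std::thread;

/// Squares each input in its own worker thread and adds up the results
/// received over a channel. (Hypothetical helper, for illustration only.)
fn sum_of_squares(inputs: Vec<u64>) -> u64 {
    let (tx, rx) = mpsc::channel();

    for n in inputs {
        let tx = tx.clone();
        thread::spawn(move || {
            // The worker owns `n` and its own sender; nothing is shared.
            tx.send(n * n).expect("receiver is still alive");
        });
    }

    // Drop the original sender so the receiving iterator ends once
    // every worker has finished and dropped its clone.
    drop(tx);

    rx.iter().sum()
}

fn main() {
    println!("{}", sum_of_squares(vec![1, 2, 3, 4]));
}
```

The only way data crosses between threads here is through `send`, which is the property the "message passing rather than shared state" argument relies on.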

tomjen3 · on April 3, 2012

One of the original ideas of OOP was that it would be possible to encapsulate how part of a program worked from the person who used it. You shouldn't have to care about the inner workings of the storage layer, say, or the GUI toolkit.

Unfortunately, in reality things are usually not that well encapsulated. Things like threads, integration with the system, etc. often interfere, as do things like the desire to test a particular piece of software.

Basically, think of it as enabling the same benefits you get from having a general count function over having specific functions for arrays, directories, lists, etc., but applied to an entire system.
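
The "general count function" idea can be sketched roughly like this in present-day Rust. The name `count` and the sample data are assumptions for illustration, not anything from the Rust docs of the time. One generic function works for any collection that can be iterated, so callers never need to know how each collection is stored.

```rust
use std::collections::HashSet;

/// A single `count` for anything iterable, instead of one function
/// per collection type. (Hypothetical helper, for illustration only.)
fn count<C: IntoIterator>(collection: C) -> usize {
    collection.into_iter().count()
}

fn main() {
    let array = [10, 20, 30];
    let list = vec!["a", "b"];
    let set: HashSet<char> = "hello".chars().collect();

    // Same call for an array, a vector, and a hash set.
    println!("{} {} {}", count(array), count(&list), count(&set));
}
```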

bitcracker · on April 3, 2012

I would recommend reading this story about Ada, which is the preferred language for the highest-quality software:

http://www.adacore.com/home/ada_answers/business_benefits

dkarl · on April 3, 2012

Some of the difficulties in large projects include:

1. It's easy to break things accidentally, and it's hard to track down all the usages of something when you change it. This is especially true when functions and types can be aliased under different names, changing a class affects all subclasses, etc. The more complex the application, the more you get pervasive usage of types defined within the application. This is a real change from smaller applications where most of the pervasive types probably come from the language's standard library, which can be trusted to be stable between major releases.

2. A large codebase means that whatever finicky housekeeping the language requires is almost guaranteed to get screwed up somewhere in your application. If this affects the integrity of the whole app, it can be very bad. Invalid memory accesses are an infamous example. It doesn't matter if you have a million lines of awesome, feature-filled code; if one buggy line of code crashes your app every ten minutes by dereferencing an invalid pointer, your app is unusable.

3. Fragmentation of types and libraries. Programmer X adds a Flugelhorn library on one end of the app, programmer Y adds a Flugelhorn library on the other end of the app, and nobody realizes it until incompatible Flugelhorn types start bumping into each other in the middle. This is less of a problem for a community where "programming in the large" means many people writing many small independent projects, but it could be a major annoyance for a language where the goal is to write large, complex applications such as web browsers.

4. Concurrency. Concurrency can be a coupling factor that requires programmers to know too much about global design and global state. It can also destroy performance and stability. Finally, given principle #2 above (every large running application contains screwed-up code somewhere) runaway tasks are a threat to the whole system if not contained. When done well, however, concurrency can reduce coupling and improve stability.

5. It can take a long time to rebuild a large codebase and run its tests.

I don't know much about Rust so far, but here's my stab at matching the features of Rust against those five problems:

1. Compile-time type checking is a big help. Rust also has some ways to enhance types without subclassing.

2. Memory safety, garbage collection, errors that propagate upward by default, isolated tasks. Error handling is unappreciated in this category, I believe. It's important that an application not accidentally suppress an error and continue, because it could corrupt data, return incorrect results, or behave insecurely. A language like C where errors can be swallowed through oversight or programmer laziness is dangerous in that regard. In Rust, errors propagate upward, unwinding the stack until they terminate a task or are explicitly handled.

3. Generic types help, and built-in support for Unicode text is a necessity. It also helps that the organization shepherding the language is likely to be the biggest user of the language. It's important to note that anything that helps with #3 also helps with #1. For example, if the language has a standard string type you can use everywhere, then your tests don't need to protect against Doug down the hall (or across the world) changing the string type and not realizing he broke your code.

4. Immutability, message passing, isolated tasks, per-task GC. It sounds like Rust is trying to make task isolation the default way to firewall errors off from the rest of your app. For a web programmer, this is exactly like the way a Java web container is supposed to work. If one web app in the container goes haywire, the container and all the other web apps should ideally be able to continue running without their stability being compromised.

5. I don't know what techniques are used by Rust to speed up compilation. A compiled language is at a disadvantage on this count, but compiling to object code is necessary for a systems language. Also, static type checking means drastically fewer unit tests (I don't want to start a flamewar about whether this is a good thing; it's just the way people code) so you get a little bit of your lost compile time back when you run tests.
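
As a rough illustration of points 2 and 4 (errors unwinding until they terminate a task, and task isolation as a firewall), here is a sketch in present-day Rust. It is an assumption-laden analogy: it uses OS threads from `std::thread` rather than the tasks described above, it relies on the default unwinding panic behavior, and `run_isolated` is a made-up helper name.

```rust
use std::thread;

/// Runs `job` in its own thread and turns a crash in that thread into an
/// ordinary error value for the caller. (Hypothetical helper.)
fn run_isolated<F>(job: F) -> Result<u32, String>
where
    F: FnOnce() -> u32 + Send + 'static,
{
    thread::spawn(job)
        .join() // Err(..) if the thread panicked and unwound
        .map_err(|_| "task panicked".to_string())
}

fn main() {
    let ok = run_isolated(|| 6 * 7);

    let bad = run_isolated(|| {
        let empty: Vec<u32> = Vec::new();
        empty[0] // out-of-bounds index: panics and unwinds this thread only
    });

    // The main thread keeps running and can inspect both outcomes.
    println!("{:?} {:?}", ok, bad);
}
```

The failure cannot be silently swallowed: the caller either gets a value or an explicit `Err` it has to deal with, and the rest of the program keeps running either way.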

kibwen · on April 3, 2012

  > I don't know what techniques are used by Rust to speed up compilation.

From what I've observed, forcing the developers to compile the language using itself is a great way to keep the programmers mindful of compilation speed. :)

In Rust's case, I believe the bottleneck is LLVM. You can disable optimization if you just care about fast turnaround times, but you're always at the mercy of LLVM to do the actual code generation. To that end, the Rust devs have recently begun an initiative to profile the amount of LLVM IR that they generate in the compiler, and they've made some good strides so far just from picking the low-hanging fruit.

As far as comparisons to other languages go, I'm not sure if Rust's compilation model is set up to be as fast as Go's, although (IIRC) Rust avoids the template-generation step that bogs down C++.

pcwalton · on April 3, 2012

We'll never be as fast as Go in compile speed: just to name a few issues, we do flow analysis (so we check whether your variables are initialized before you use them), our typechecking proceeds by unification (so you can say stuff like [] and the typechecker will figure out what the type of that empty vector is), and we have an extra IR-generating step (so we can take advantage of LLVM optimizations). It's a question of different goals; one of Go's primary goals was to compile quickly (which is a fine goal!), which isn't one of the goals of Rust (although we definitely want to compile fast, of course, and we have continuous integration benchmarks to that effect).
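
For readers unfamiliar with those two checks, here is a small sketch in present-day Rust syntax. That is an assumption: the `[]` vector literal mentioned above is written `Vec::new()` or `vec![]` today, and the function names are made up for illustration.

```rust
/// Flow analysis: `label` is declared without a value, and the compiler
/// accepts this only because every path assigns it before it is read.
fn classify(n: i32) -> &'static str {
    let label;
    if n < 0 {
        label = "negative";
    } else {
        label = "non-negative";
    }
    label
}

/// Inference by unification: the element type of the empty vector is
/// never written out; the compiler works it out from later uses.
fn inferred_len() -> usize {
    let mut v = Vec::new(); // element type not yet known
    v.push(7u8);            // this use fixes the element type to u8
    v.push(9);              // so this literal is inferred as u8 too
    v.len()
}

fn main() {
    println!("{} {}", classify(-5), inferred_len());
}
```

Removing the `else` branch in `classify` turns the final `label` into a compile error, which is the kind of work the compiler does on every build.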

jentulman · on April 3, 2012

Thank you all for your replies. I'm entirely self-taught and attempting to become a 'better coder' in the abstract whenever possible, and things like this definitely help along the way.