iOS 5 has garbage collection. Here comes MacRuby/iOS? (pogodan.com)
139 points by themgt on June 7, 2011 | hide | past | favorite | 94 comments


iOS 5 doesn't have garbage collection, it has a new compiler-level feature called Automatic Reference Counting which does pretty much what you think it does. It's basically retain/release without you having to explicitly write retain/release instructions.
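Conceptually, ARC automates the ownership bookkeeping you'd otherwise write by hand. A toy model of that bookkeeping (a Python sketch with made-up names; nothing like Apple's actual implementation):

```python
# Toy ownership model: all names here are made up, not Cocoa APIs.
class Obj:
    def __init__(self):
        self.refcount = 1        # alloc/init hands the caller an owned reference
        self.deallocated = False

    def retain(self):
        self.refcount += 1

    def release(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.deallocated = True

def use(o):
    o.retain()                   # pre-ARC you write this pair yourself;
    # ... use o ...
    o.release()                  # with ARC the compiler inserts both calls

o = Obj()
use(o)
assert not o.deallocated         # balanced retain/release: still alive
o.release()                      # drop the owning reference
assert o.deallocated
```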

(I don't think I'm violating any NDAs here, since ARC was up pretty big on one of the slides at the keynote. In any case, anybody who cares about this stuff is either at WWDC or at least has a dev account and has been poring over the beta docs since they've gone online.)


ARC is touted on this public page: https://developer.apple.com/technologies/ios5


That's a... peculiar choice of icon.

edit: ah, now I get it. Cute.


I don't get it :( please explain.


When electricity moves through the air between two points like that, it's called an arc. (Look up "electric arc" on YouTube for examples.)


Oh, so it isn't a visualization of how ARC works. It's just an image of something else that coincidentally has the same name. I would've never figured that out. Thanks.


It's an arc.


I think this is actually a bigger deal from a developer's perspective than most of the other announcements. Since it's a compiler feature, does anyone know if it's compatible with older OS releases?

Once you're familiar with the manual memory management conventions, it's not difficult, but this should make iOS development more approachable.


"ARC is supported in Xcode 4.2 for Mac OS X v10.6 and v10.7 (64-bit applications) and for iOS 4 and iOS 5. (Weak references are not supported in Mac OS X v10.6 and iOS 4)."

But how about looking at the docs yourself?


Aren't they behind a password/NDA wall? Perhaps I didn't see that information when I was looking at it?


I'm sorry, I didn't mean to be snarky. (Although if you're not a member of the dev program, I don't see why you care about this information?)


Any new platform that isn't ultradoomed is interesting. I might have been waiting for iOS development to become less brittle before getting involved (that is, if their policies towards developers and users weren't so repellent).


I'm not sure why you would say this isn't garbage collection. It performs automatic management and reclamation of resources without programmer intervention. It doesn't operate like the GCs of the JVM or .NET CLI, but then neither do the GCs for other languages like Lua, Python, or Perl.


> I'm not sure why you would say this isn't garbage collection.

Because it isn't. Garbage collection is something that happens at run time by a garbage collector via an algorithm like mark-and-sweep, tri-colour marking, etc, and as a block of code that's called at runtime it A) has a lot of information about your program and B) has a nontrivial performance impact.
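To see the contrast, here is what a runtime mark-and-sweep pass roughly looks like (a toy Python sketch, illustrative only, with no relation to any real collector):

```python
# Toy mark-and-sweep. Each "object" is a dict with a 'refs' list
# of other objects it points to. Names here are illustrative.

def mark_and_sweep(roots, heap):
    marked = set()
    stack = list(roots)
    while stack:                          # mark: trace everything reachable
        obj = stack.pop()
        if id(obj) in marked:
            continue
        marked.add(id(obj))
        stack.extend(obj["refs"])
    # sweep: everything never marked is garbage
    return [obj for obj in heap if id(obj) not in marked]

a = {"refs": []}
b = {"refs": [a]}        # b -> a, both reachable from the root
garbage = {"refs": []}   # nothing points here
swept = mark_and_sweep(roots=[b], heap=[a, b, garbage])
assert len(swept) == 1 and swept[0] is garbage
```

Note that this has to run periodically at runtime and needs a view of the whole object graph, which is exactly the kind of cost a compile-time feature avoids.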

Unfortunately I can't talk specifics (stupid NDA), but schrototo has said that it's a compiler-level feature, which would automatically disqualify it as being a bona fide garbage collector.

Speaking generically, any static feature is going to be a great deal less powerful than its dynamic equivalent. Thinking of compile-time memory management as garbage collection is not just a leaky abstraction, it's a broken water main flooding into the street.

Think less "auto pilot" and more "cruise control". You still have to understand the accelerator and the brake, you just don't have to use them as much.


A garbage collector doesn't need to use an algorithm like mark-and-sweep, or two space copying. It can be something as simple as a system that scans a block of memory looking for any sequence of bytes that appears to be a pointer into managed memory. You don't need special compiler support or extra program knowledge for garbage collection -- conservative collectors can be added to any language, like using the Boehm-Demers-Weiser GC with C or C++.

Automatic garbage collection doesn't even require a garbage collector -- the monolithic piece of code that performs the scanning and reclamation of objects -- it simply requires freeing the programmer from the task of managing the storage of an object. Fancy collector algorithms are meant to improve throughput and latency -- you could definitely create an implementation of Java that uses automatic reference counting but you'd be searching for something faster really quickly.

If I have a language where I create a number of managed objects and the compiler's escape analysis determines that these objects never leave the function's scope and so changes them to be an alloca then you would have garbage collection without involving a garbage collector.

Again, if that's what automatic reference counting is as sold by Apple then I'd consider it to be garbage collection.


For what it's worth, Apple sees this as distinct from garbage collection. For new projects they recommend using ARC, even on the Mac where "real" GC is (and will continue to be) available.


Are you willing to elaborate on this? Recommending it over garbage collection on OS X would run contrary to their previous evangelism of GC and its concurrent scalability.


I can't, since I don't know any more than the next guy reading the docs. I'm sure there'll be an in-depth WWDC session on this and I'm eagerly awaiting the videos.

(I haven't done any Mac OS programming with GC, so I really don't know anything about it, but I've always heard people say the GC is not all that great. So my own uneducated guess is that while GC is messy and complicated, ARC is simple & doesn't have any runtime overhead (they say in the docs that retain/release is now much faster, as is autorelease using the new syntax).)


> Automatic garbage collection doesn't even require a garbage collector -- the monolithic piece of code that performs the scanning and reclamation of objects -- it simply requires freeing the programmer from the task of managing the storage of an object.

Can you back that up? I'd like to see a source that disconnects garbage collection the feature from a runtime garbage collector the implementation of that feature.

Independent of the semantic dispute, I don't think you understand what ARC, the iOS 5 feature, actually is. It's not a technology that "frees the programmer from the task of managing the storage for an object". Just to start with, that is a decision that can only be made at runtime, and as has been discussed ARC is a static-time feature.


Reference counting is very definitely a form of automatic memory management, aka GC. Every GC expert will disagree with your colloquial definition. There are two sides to every reference - the guy who has the pointer and the thing pointed to - and the most efficient GCs use a combination of both. The difference is that the "logical ref count" is distributed for mark-sweep etc, but centralized in object for refcounting.


We're talking past each other. There's "Automatic reference counting, the dynamic garbage collection algorithm that runs periodically at runtime during your program's execution", and "Automatic reference counting, the static compile-time feature that ships in iOS5."

These two things unfortunately share a name, but absolutely nothing else. The iOS 5 feature is not, and has no relationship to, and does nothing remotely like, the garbage collection algorithm with the same name.


One reason is that reference counting, without some other backup reclamation mechanism, isn't guaranteed to collect all the garbage. For example: let's say you have two DOM nodes, A and B, where B is A's firstChild and A is B's parentNode, with both attributes implemented as references in the obvious way. Even if all other references to these objects disappear, making them both inaccessible, the reference from child to parentNode will keep the parent's reference count from going to zero until the child node gets reclaimed --- but the child node can't be reclaimed until the parent is gone, because the firstChild reference in the parent keeps _its_ reference count from going to zero. So neither object can be reclaimed first (or at all), and the storage for both leaks.

(The same logic applies to longer "cycles" of references --- if A refers to B refers to C refers to D refers to A, then none of the objects in the cycle can be collected first, and they all stick around.)

In CPython, by the way, there is a separate garbage collector, which runs periodically to detect and mop up cyclic structures. Other implementations --- Jython and PyPy --- don't use reference counting to begin with.
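You can watch CPython's cycle collector do exactly this mopping-up (the Node class here is just a stand-in for the DOM example above):

```python
import gc

class Node:
    def __init__(self):
        self.parent = None
        self.first_child = None

# Build the parent <-> child cycle from the example above.
parent, child = Node(), Node()
parent.first_child = child
child.parent = parent

gc.collect()                 # start from a clean slate
gc.disable()                 # pretend we only have refcounting
del parent, child            # drop all outside references...
# ...the cycle keeps both alive: pure refcounting never frees them.
found = gc.collect()         # CPython's cycle detector mops them up
gc.enable()
assert found >= 2            # both Nodes were found unreachable
```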


And if the values being garbage collected have no pointers in them, or the language makes cycles impossible (think functional, immutable), then refcounting is all you need. It's still bona fide GC.

(This is a point of definition independent of the subject at hand. Refcounting simply is a form of GC. Imperfect for most languages and many datatypes but GC all the same.)


Python has a proper GC to remove cycles in reference counting, it's just not the "main" one.

http://docs.python.org/library/gc.html

  Since the collector supplements the reference counting already used in Python, you can disable the collector if you are sure your program does not create reference cycles. Automatic collection can be disabled by calling gc.disable().


Garbage collection specifically refers to a collector that reclaims resources at run-time (the "garbage"). According to the article, this is static analysis that inserts retain/release into the code at compile-time.


Isn't "automatic reference counting" just a specific type of "garbage collection"?

But yes, I think this feature really slipped through the cracks in the announcements I've seen so far (even Ars seemed confused about it)


I think this is done at compile time, so it wouldn't be considered garbage collection.


The compiler is LLVM though, "By enabling ARC with the new Apple LLVM compiler", which begins to blur the difference


LLVM is being used as a static compiler. There is no LLVM virtual machine - the compiler emits native machine code.


MacRuby does support AOT compilation, although I don't think it's a very production-ready feature yet and would likely be incompatible with a fair percent of existing ruby code


This isn't in MacRuby — it's in Objective-C. And it's explicitly distinct from garbage collection. This is an alternative to garbage collection that has generally been considered kind of impractical.

(And MacRuby's AOT compilation works pretty well from what I've seen.)


I don't think that makes much difference, as long as it's performed at compile time.


It depends. If it copes with circular references, then yes, it's sufficient to use for GC.

If not then it's just what COM had.


Unfortunately, it doesn't handle cycles in the object graph[1], which was almost the entire reason I wanted GC. Manually dealing with cycles is one of the worst parts of working with blocks (the other is the lack of parameterized types).

Assuming that ARC works reliably and consistently, it'll be nice to lose some boilerplate, but the worst facet of reference counting -- object graph cycles -- still isn't fixed.

[1] There is support for zeroing weak references, but this is a very manual solution to the cycle problem.
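Python's weakref module demonstrates the same zeroing-weak-reference idea (a stand-in Node class, not Cocoa):

```python
import weakref

class Node:
    pass

parent = Node()
child = Node()
parent.first_child = child           # strong reference downward
child.parent = weakref.ref(parent)   # weak reference upward breaks the cycle

assert child.parent() is parent      # dereference the weak ref while alive
del parent                           # the only strong reference is gone...
assert child.parent() is None        # ...and the weak ref has been "zeroed"
```

The manual part the parent comment complains about is that you have to decide, per reference, which direction is weak.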


That basically makes it useless. It would be an utter pain in the arse to find leaks in. I don't know why they can't chuck a generational collector in, such as the one in the last ObjC drop. Even Microsoft, with the woefully inadequate WM7 platform, has a proper GC.


I wouldn't say useless -- just a lot less useful. Maybe they'll add cycle handling on top of it later.


The arguments over performance, compiler features, and the definition of garbage collection aside, I'm curious what members of the HN community think about the chances of Ruby becoming an iOS-approved programming language.

It seems as though the groundwork has been laid, and from an aesthetic perspective, Ruby seems like a nice fit for Apple. Plus, it's not Python, which is arguably Google's baby.


I think it's a certainty that it's a feature in the pipeline. The only question is when it will happen, which is really a function of performance and memory usage. I suspect this is a big step forward.


I don't think it's quite the certainty you're making it out to be. Ruby is a step sideways, if not backwards, from Apple's current strategy. We're far more likely to see an Objective-C without the C than we are to see Ruby as a first class development target for iOS.


Ruby essentially is Objective-C without the C.

If Apple was looking for a language that moved away from the low level C, what benefit would an "Objective" language bring over MacRuby, which is already using the native Objective-C system frameworks?


A non-runtime type system.

First class familiar Objective-C-style messaging and block syntax.

Stable, mature, well defined language invariants on par with Apple's requirements for its own APIs and languages.

"Objective-C without the C" would look more like Smalltalk or Strongtalk with a near identical syntax, not Ruby.


I may be misunderstanding you, but:

When you remove the C bits, Objective-C and Ruby use the same type system, conceptually speaking. They are both descendant from Smalltalk; the main feature differences really come down to the syntax alone.

I really like the idea of a higher level Objective-C-based language, I just don't necessarily see the business appeal of creating and maintaining a brand new language that only brings a more familiar, to Objective-C developers at least, syntax. Especially given the amount of effort Apple has been putting into MacRuby.


When you remove the C bits, Objective-C and Ruby use the same type system, conceptually speaking. They are both descendant from Smalltalk; the main feature differences really come down to the syntax alone.

Objective-C is typed -- not just the C part, but the 'Objective' part too. You can cast around the type system, but it's there.

The new compiler even uses inference of those types in order to implement ARC.

Especially given the amount of effort Apple has been putting into MacRuby.

Not Apple, just a few people that also work for Apple.


The type system exists, but when working with the Objective half of the language, it is mostly meaningless, at least from a developer's point of view. With the C parts removed, you could replace every type definition with id and your program would run just fine.

The article says that MacRuby is bundled with Lion as a private framework. Surely they wouldn't bundle it if they weren't using it? And being a private framework, it is not there for the benefit of third-party developers.


Objective-C is typed, and that's considered (by Apple, and most developers) as a feature, not just a legacy inheritance from C.


Objective-C types are optional and the language is dynamic, and that's considered (by Apple and most Objective-C developers) a feature.

    id something = nil;
    [something countForObject: nil];
Completely valid, won't crash your program, and needs only enough type information to satisfy the compiler (but largely meaningless for anything but the most basic static analysis). The only requirement for the above to compile is that countForObject: is a selector defined somewhere in the include path for the file. Even that is a relatively soft requirement since you can pass arbitrary selectors to any object.

And none of this has anything to do with ARC, as far as I can tell.

There was a first class language on Mac OS X which was fully statically typed. It was deprecated with Leopard and never introduced on iOS. The dual nature of Objective-C is one of its attractive properties.


Most of what you just said is simply not true. Without the method types, the compiler will print a warning, infer the wrong ABI and generate the wrong code. If an ambiguous match is made, the wrong code will be generated. What you just wrote may work, but only because the compiler works to match against defined method types, and even then it can and will get it wrong.

The support for 'id' is only intended to serve as a mechanism to get around the lack of parameterized types, and as part of ARC, the compiler does now infer the types for alloc/init.


> Without the method types, the compiler will print a warning, infer the wrong ABI and generate the wrong code.

Did you even try my example? No compiler warnings are generated (nor should there be). What do you mean by defined method types? It is simply looking for any selector which matches on any class because there is not enough statically available information to know any different. Messages are always passed dynamically.

Are we talking about Objective-C? Are you familiar with NSInvocation? Or performSelector:, performSelector:withObject:, performSelector:withObject:withObject:? Or NSNotificationCenter's addObserver:selector:name:object:? This is all done at runtime. No special type information is available to the compiler when using these. Objective-C messages are always sent dynamically, so the only ABI concerns are how the stack is prepared, and not the interface of the class of an object. You can define methods and swap them out at runtime; this feature would be useless if everything had to be known at compile time.

ARC needs to know the types of Objective-C objects; id still works fine. Beyond that, it needs no other type information from what I can tell.

It seems we are talking past each other. Objective-C is not like C++, though. All methods are virtual, always. The runtime goes to great lengths to make that efficient and still allow complete dynamism. This is orthogonal to ARC.


> Did you even try my example? No compiler warnings are generated (nor should the be).

Only because it managed to match on a defined method type. If a class declaration hadn't been found at compile time with the given declared method, it would have issued a warning.

If the match was ambiguous and the types incorrect, it would have emitted incorrect code, and possibly a warning (or always, with -Wstrict-selector-match).

> What do you mean by defined method types? It is simply looking for any selector which matches on any class because there is not enough statically available information to know any different.

By 'defined method types', I mean methods declared on visible classes that match the given selector.

If it matches on the wrong one, the wrong dispatch function and/or the wrong function call epilogue will be emitted.

Method calls are ABSOLUTELY NOT ABI identical for all possible types. I can't possibly emphasize this enough.

For example:

  - (void) performWithObject: (NSObject *) object;

  - (void) performWithObject: (NSObject *) firstObj, ...;
The instructions emitted for a vararg dispatch ARE NOT the same as the non-vararg dispatch on all platforms, and incorrect method selection will result in undefined behavior on dispatch.

> Are you familiar with NSInvocation? Or performSelector:, performSelector:withObject:, performSelector:withObject:withObject:? Or NSNotificationCenter's addObserver:selector:name:object:? This is all done at runtime.

> No special type information is available to the compiler when using these.

Yes, it is. Methods have associated type encodings that describe the return and argument types, and that's used to perform runtime dispatch with NSInvocation. This is why NSInvocation is so slow -- similar to libffi, it must evaluate the types and construct the call frame at runtime. It does this by evaluating the type data associated with method implementations by the compiler.

Methods such as performSelector rely on specific type conventions (such as void return, optional single object argument) and will fail if used with targets that do not match the expected convention.


You're right. I hadn't realized the compiler not only ensures the selector exists, but does C-style type checking on dynamic calls as well. I was surprised to see that two messages with the same selector but different parameter types required a type cast to use.

Of course IMPs aren't identical if they take different parameters. This doesn't affect interchanging Objective-C types though. Yes, the arity and order are important, but the compiler doesn't enforce anything beyond that a pointer is passed for id types.


Cheers to the peer comment regarding HN discourse. Unfortunately (?) I have more :)

> This doesn't affect interchanging Objective-C types though. Yes, the arity and order are important, but the compiler doesn't enforce anything beyond that a pointer is passed for id types.

This is true prior to ARC: all ObjC pointers are the same size, and hence ABI-compatible given equivalent arity/order. It's theoretically possible that a future ABI could be incompatible between two methods returning void vs pointer return value, but currently, all supported ABIs return pointer-sized values in a register.

However, with ARC, this changes. The type system has been effectively extended to denote the required referencing behavior for calling code. This means that for a given arity/order, you must also have equivalent referencing attributes.


I still find it disconcerting on HN when an argument ends with someone saying, "you're right". It's like the normal rules of the Internet just don't apply here. It's one of my favourite things about this community. To you, personally: kudos for maintaining that spirit!



Yeah, sorry to rain on your parade, bro. But these kinds of ‘confirmed’ articles, in the end, help nobody. Users will be sad when it doesn’t happen, whereas neither Apple nor the MacRuby team have actually confirmed/promised this. Sorrow all around isn't good publicity.

Thanks for the update, though! :)


Automatic Reference Counting is not Garbage Collection, but it's a start, and can loosely serve as a sort of garbage collection. It is up on the iOS 5 page, however, so that part is confirmed for now.

http://developer.apple.com/technologies/ios5/


I’m talking about “MacRuby on iOS 5 - CONFIRMED!!!”.


I don't like the subject line either, but he did specify early on that it was a tongue-in-cheek quip and didn't actually confirm anything.

Thanks for clarifying this solidly, though.


Very true. However, judging by the chatter about this article on, for instance, Twitter [1], many don’t really seem to get that the ‘confirmed’ part in the title isn’t true. So I simply wanted to emphasize this point for others, not the author as he knows it already :)

[1] http://twitter.com/#!/search/macruby


Yeah, sorry for the confusion, I'll update the post title. I thought the "confirmed" meme was fairly well-known in the Mac community (e.g. http://arstechnica.com/civis/viewtopic.php?f=19&t=114278...), but I guess not so much.


He didn't say "no". He said "not yes". There's still hope, man! #glassHalfFull


Well, I simply can’t say ‘it will not happen’, because it is open-source software, so for all I know you are currently implementing it. However, I wouldn't hold your breath on someone else stepping up to make this work, because we have been saying this for a while now… C’est la vie.


Although I empathize with your wishes, your article is way too sensationalistic. Let me confirm (I’m not with Apple); MacRuby on iOS has _NOT_ been confirmed.


Agreed. ARC is not garbage collection, nor is it an indication in any way that MacRuby on iOS is coming.


Someone on HN has made a comment like this before: if the title of a submission is a question, the answer is most likely 'no'.


You don't need OS support for GC to do GC in a scripting language. Wax (https://github.com/probablycorey/wax), a Lua binding for UIKit, does all the GC for you. The framework just needs to do all the bookkeeping for you behind the scenes.


When the platform's native data structures are explicitly managed but not garbage collected, the language runtime can't just steamroller them; it has to notify the platform about each one individually becoming disused. It's the worst of both worlds: you pay O(n) work to handle garbage as well as not reclaiming memory immediately. And it's worse with an API that takes graphs of first-class objects rather than buffers which happen to contain the bits you want (which are incredibly risky, but the language runtime can handle that once and for all).


More important for MacRuby on iOS seems to be the "static" compile option (not functioning yet); see http://bostonrb.org/presentations/102 (http://vimeo.com/25213324) @ 32:03-32:48, in combination with http://bostonrb.org/presentations/106 (http://vimeo.com/25232412) @ 06:15-07:40. With this "static" option some Ruby features will not be supported. Also note @ 05:26-05:58 in the first video :-)


I hope this will be better than the LLVM Analyzer's warnings about missing retain/release, which in my experience catch most, but not all, missing retains/releases.


I wonder what effect this will have on HTML5 app development.

Would the Ruby developers who are considering building HTML5 apps go for native Ruby instead if given the option?


HTML5 apps are going to be heavy on JavaScript. Personal preference for, or dislike of, JS is a big factor here.

More important, though, is how performant and tight the app is. I think most developers who want the best performance will choose native.


The effect WHAT will have on HTML5 development?

And would ruby developers use ruby if given the option? Somehow, I think so.


Not on topic, but does anyone know why Lion comes with Ruby 1.8.7 and not the 1.9 branch?



Does this mean that we can write apps that target iPhone and iPad with MacRuby?


This is probably going to hurt performance a whole lot more than proper GC, because now the compiler throws in a whole bunch of retain/release statements where they aren't really needed.

Unless they're doing some pretty strong analysis to remove redundant reference operations.


I don't know about you, but 99% of my Objective-C code is quite formulaic with regards to memory management.


Sure, but can the compiler prove that?


I'm fairly sure that the LLVM compiler is smarter than the smartest Objective-C code I could write, but your mileage may vary.


I'm fairly certain that it doesn't do cross-translation-unit interprocedural analysis, so it can't tell if, for example, the function from another file that you call does anything special that would make the removal of a retain before passing the object to the function an invalid optimization.

It doesn't know that the object is only used in a thread-local context, so in many cases (eg, function args) it might not be able to assume that there is no concurrent refcounting action going on elsewhere, which eliminates other optimizations.

By definition, compilers have to be conservative about what callers can do, when humans can simply say "This function is not thread safe", or similar.

This is one reason that "real" GC is generally used instead of reference counting in all high-performance garbage-collected languages. (In throughput-oriented systems, the GC is even a stop-the-world GC, since that removes quite a bit of concurrency/atomic operation overhead.)

The other big one is that you need a garbage collector anyways if you want things to work in the presence of cycles.

EDIT: Perhaps someone with a Mac and access to this compiler can post some output of code with this feature enabled. I'd be curious to see what sort of output the compiler actually gives.

This speculation is annoying :)


Disclaimer: I haven't actually tried playing with this yet, or read the docs, so this is pure speculation:

It may not need to do any kind of complicated proving or cross-unit analysis, and instead opt to simply rely on the fact that every good Objective-C citizen follows the same social naming conventions. Things coming back from methods with "alloc", "copy", or "new" in the name are owned by the current method and need to be disposed of properly (release it at the end of the method, or if it escapes this method, add it to the autorelease pool). Everything else can be assumed to be taken care of and we need only bump the retain count if it's getting assigned to something with a scope larger than the current method.
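That ownership convention is mechanical enough to write down (a hypothetical Python helper for illustration; the compiler's actual rules are more detailed than this):

```python
def caller_owns_result(selector):
    """Per the Cocoa naming convention: methods whose first word is
    'alloc', 'new', 'copy', or 'mutableCopy' return an object the
    caller owns and must release. This helper is illustrative only."""
    for prefix in ("alloc", "new", "copy", "mutableCopy"):
        if selector.startswith(prefix):
            rest = selector[len(prefix):]
            # The prefix must be a whole camelCase word:
            # 'copyItems:' matches, 'copyright' does not.
            if not rest or not rest[0].islower():
                return True
    return False

assert caller_owns_result("alloc")
assert caller_owns_result("newObjectWithName:")
assert caller_owns_result("mutableCopy")
assert not caller_owns_result("copyright")
assert not caller_owns_result("objectAtIndex:")
```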

It wouldn't catch everything, but it could be made predictable in behavior and eliminate a large percentage of the boilerplate.


For 99% of the 99%, yes.

  self.blueView = [[BlueView alloc] init];
  [self.blueView release];
etc


If you took advantage of properties you could do self.blueView = nil; and have the release be done for you.


I believe he IS taking advantage of an auto-retained property, which is why he needs the release... because the property setter has issued its own retain, which will be released when the property is set again...


With synthesized ivars, you must do:

  self.blueView = [[BlueView alloc] init];
  [self.blueView release];
When you allocate to get the proper retain count; I can demonstrate why by using a temporary variable:

  BlueView* bv = [[BlueView alloc] init];
   // Retain count = 1
  self.blueView = bv;
   // Retain count = 2
  [self.blueView release];
   // Retain count = 1
And then in dealloc

  [self.blueView release];
   // Retain count = 0
  
So in dealloc I could call your function, but there are things in various object packing schemes (KVO most importantly) where calling the setter like that screws stuff up. It's better to just call the getter (which should be side effect free) and release the returned object.

So looking at retain counts in the original example:

  self.blueView = [[BlueView alloc] init];
  //Retain count 2
  [self.blueView release];
  //Retain count 1
and in dealloc

  [self.blueView release];
  //Retain count 0
If I just did what you said, I'd get:

  self.blueView = [[BlueView alloc] init];
  //Retain count 2
  
and in dealloc

  self.blueView=nil;
  //Retain count 1, rut ro, memory leak
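The retain-count arithmetic above can be modeled in a few lines (a toy Python model of a retaining property setter, not real Cocoa):

```python
# Toy model of a retained property; names are illustrative, not Cocoa APIs.
class BlueView:
    def __init__(self):
        self.refcount = 1            # +1: alloc/init gives the caller ownership

class Owner:
    def __init__(self):
        self._blue_view = None

    def set_blue_view(self, new):    # models a retaining setter
        if new is not None:
            new.refcount += 1        # setter retains the incoming object
        old = self._blue_view
        self._blue_view = new
        if old is not None:
            old.refcount -= 1        # ...and releases the one it replaces

owner = Owner()
bv = BlueView()                      # retain count 1
owner.set_blue_view(bv)              # retain count 2
bv.refcount -= 1                     # the [self.blueView release] step: 1
owner.set_blue_view(None)            # the dealloc-time release: 0
assert bv.refcount == 0
```

Omitting the middle release leaves the count at 1 after teardown, which is the leak described above.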


Which for KVO reasons is a really bad way of releasing objects when 'self' is being deallocated.


Hmm, I don't quite follow. I'm new to Cocoa programming and just followed the advice from the book Cocoa: Up and Running.


If anyone does KVO on your object, they will be sent a "property changed" notification when you call self.myvar = nil in your dealloc method.

It's quite likely that whoever gets notified of the property change will attempt to access your object, but now your object is in a half-destroyed state.


I think it's still easier to ensure performant code, as reference counting is more deterministic than GC. I assume you can turn off ARC for a compilation unit if you want, so performance-critical code can be manually managed.


Naive reference counting incurs a bus lock across all processors on every assignment, more or less. That's quite painful.


95% of code can run an order of magnitude slower than "optimal" with no discernible degradation of the user experience.

If a bus lock degrades your performance too much, either change your algorithm, or isolate that section of code and do it in another language that doesn't suffer the same performance issues.


How many processors (cores) do most iOS devices have?

GC is still supported on MacOS X.


One core, but the Apple A5 in the iPad 2 and iPhone 4GS/5 (the flagship devices for iOS 5) has two cores.


Also autorelease only applies to NS objects, not plain C data.


I hope this is true



