hi, obviously you have more experience here--I have just provided details straight from memory and did not want to disseminate misinformation and apologize if you or anyone else got that impression.
Many of the lock-free algorithms use CAS native machine instructions, however. What exactly do you mean by "reasoning about them is hard"? Do you mean in formal methods sense, or just difficult to grasp in general? I am just glancing over your ISMM'06 paper which looks very interesting--it seems you are using CAS instructions there, too (which makes me wonder that I was probably too unspecific in my "simplifying locking", which at least for my sloppy thinking includes "lock-free" algorithms too, sorry for that...)
Reasoning about lock-free algorithms is hard because there is no mutual exclusion. Locks enforce a critical section; you guarantee that only a single thread will execute in a critical section at any given time. This allows you to change state local to that critical section without worrying about other threads interfering.
Lock-free algorithms have no critical sections. All threads can execute all instructions at all times. This means that when you update shared state, you have to think about the implications of what happens when another thread touches the same state this thread is trying to update. Keeping this level of concurrency in your head at all times is difficult, which means that reasoning about the correctness of an algorithm is difficult.
Many of the lock-free algorithms use CAS native machine instructions, however. What exactly do you mean by "reasoning about them is hard"? Do you mean in formal methods sense, or just difficult to grasp in general? I am just glancing over your ISMM'06 paper which looks very interesting--it seems you are using CAS instructions there, too (which makes me wonder that I was probably too unspecific in my "simplifying locking", which at least for my sloppy thinking includes "lock-free" algorithms too, sorry for that...)