First we get awesome CPython C-extension emulation and now this? Thanks, team!
Completely serious: when can PyPy obsolete CPython? Why don't more CPython core devs work on PyPy? Besides perhaps some remaining gap for C-extensions support, what else is missing?
C-extension support is still not complete and comes with massive overhead. That's better than no support at all, but the performance impact may be a problem in some cases.
PyPy is also still several versions behind implementing the latest Python 3.
So while PyPy is definitely a great project, groundbreaking in many ways and very useful for a lot of applications, it's not a replacement for all applications, at least for now.
Also, while the actual PyPy Python interpreter is reasonably simple, overall the project is incredibly complex. It at least appears as if there are parts of the PyPy project that not even all PyPy developers understand. Making PyPy the reference implementation is probably questionable for that reason alone.
Perhaps because not everyone is writing long-running services in Python?
PyPy 5.3.0 is about twice as slow as Python 3.5.0 for some work I regularly deal with (1.1s for PyPy, 0.6s for CPython). In another project it's ~40% slower (7.7s vs 5.6s).
Additionally, PyPy just released an alpha with compatibility with Python 3.3 (which was released four years ago). There are some nice additions to the language in 3.4 and 3.5.
I do write some code using FFI, but I have a cffi layer in there for PyPy. I do not interact with the CPython API.
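For what it's worth, such a cffi layer can stay quite thin. A minimal sketch (assuming cffi is installed; loading the standard C library this way works on Linux, on both CPython and PyPy):

```python
# Minimal cffi sketch: call libc's abs() through the ABI-level interface.
# This avoids the CPython C API entirely, so the same code runs on PyPy.
from cffi import FFI

ffi = FFI()
ffi.cdef("int abs(int);")   # declare the C function we want to call
libc = ffi.dlopen(None)     # None loads the standard C library
print(libc.abs(-5))         # → 5
```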
Currently I don't have an interest in taking time to try and make PyPy less slow for my use case, for many reasons that I'm not going to get into here. (But I will note I'm primarily writing code that focuses on being easy to distribute. Most users aren't going to have PyPy installed anyway.)
Anyway, the main thrust of my post is just that Python isn't a monoculture in which PyPy is always a better solution.
I have some data generation scripts in a bash pipeline where pypy is significantly slower due to startup overhead. When you run anything in a loop ten thousand times, the startup time matters a lot for some people.
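To make that concrete, here's a rough way to measure per-launch startup cost (the binary names are assumptions; pypy3 may not be on your PATH, and absolute numbers will vary by machine):

```python
# Rough per-launch startup overhead of an interpreter: run a no-op
# script many times and average the wall-clock time.
import subprocess
import time

def startup_cost(binary, runs=20):
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run([binary, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs

print(f"python3: {startup_cost('python3') * 1000:.1f} ms per launch")
# print(f"pypy3:   {startup_cost('pypy3') * 1000:.1f} ms per launch")
```

In a pipeline that launches the interpreter ten thousand times, even a few extra tens of milliseconds per launch adds minutes to the total runtime.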
I remember hearing Guido comment that CPython is much easier to hack on compared to PyPy. I think this reason alone is why other Python implementations will remain separate from the 'reference' CPython.
The majority of PyPy development is done in RPython, which probably makes hacking on the runtime accessible to something like 100x as many developers. The only limiting thing is knowledge and exposure, but mostly exposure.
Sorry, but CPython is much easier to hack on than PyPy. To start with, every change requires a recompile that takes 4 GB of memory and half an hour to complete.
This is a rather slow feedback cycle.
Secondly, while the syntax of RPython may be clean, the incantations necessary to actually achieve something aren't at all obvious.
Platforms, I guess. Plain C code is much more portable than a JIT, and PyPy implements its own JIT rather than using something like LLVM, so it's much more limited in terms of supported target platforms.
Oh, my mistake -- I had assumed that it did use LLVM. Good point about portability, CPython has tons of supported targets and PyPy just a handful. Also CPython is really easy to build IMO.
LLVM isn't as widely portable as generic C, either. ARM and x86 are well supported in LLVM, but CPython runs fine on MIPS, Sparc, PPC, etc (via GCC or other C compilers).
In addition to the other stuff, I believe CPython also fits in much less memory, since it uses reference counting for its first layer of garbage collection, and doesn't have to store JIT output anywhere.
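A quick illustration of that first layer on CPython (sys.getrefcount is CPython-specific; its result may not be meaningful on PyPy):

```python
# CPython keeps a reference count in every object header; when it
# reaches zero the object is freed immediately, and the tracing GC
# only has to handle reference cycles.
import sys

x = []
# getrefcount reports one extra reference: the temporary argument binding.
print(sys.getrefcount(x))   # typically 2: 'x' plus the argument

y = x                       # a second name for the same list
print(sys.getrefcount(x))   # typically 3
```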
Great to see. I'm biased of course, but reversible debugging is the future (or should that be the past? Sorry.) If you'll excuse the shameless plug, http://undo.io/ for the same with C/C++. I'd like to think we were at least a little influential in this! :) https://vimeo.com/160863576
Is the tech world really holding back pypy3 for a lack of $40k? I would expect there are multiple individuals in this thread who could top that off without feeling much sting. Let alone the tech companies they work for who would benefit greatly from it.
Just out of curiosity, at what level of wealth or yearly income would you think someone would have to have in order to drop $40k on a project like that?
I meet a lot of different engineers in different situations around SF all the time. Recently met one who is at a point that he owns several (>2) Teslas... Someone with $400K in liquid assets could do it without being embarrassed. I would be surprised if there wasn't at least one person reading these comments in that situation. Would that person care deeply enough to say "Here! Just finish it!"? Not likely. However, I'm very surprised that there isn't at least one multi-million dollar company who would value this higher than a few person-months of internal engineering time.
If I had to guess it just simplifies the implementation. If you have deterministic load addresses then you can just reload the program on replay. If you have ASLR on, then you need to record where each library/stack/heap is and arrange for them to be placed at the same place when you replay. It's doable (though I don't know offhand how to specify the stack/heap location from userspace) but extra work.
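Recording where each library/stack/heap landed could start from the kernel's own view of the address space. A Linux-only sketch (the parsing here is illustrative, not what any particular tool actually does):

```python
# Parse /proc/self/maps (Linux-only) to capture where each region of
# the address space was placed -- the data a replayer would need in
# order to recreate the same layout under ASLR.
def read_mappings(path="/proc/self/maps"):
    regions = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            start, end = (int(x, 16) for x in fields[0].split("-"))
            perms = fields[1]
            name = fields[5] if len(fields) > 5 else "[anon]"
            regions.append((start, end, perms, name))
    return regions

for start, end, perms, name in read_mappings()[:5]:
    print(f"{start:#x}-{end:#x} {perms} {name}")
```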
I've done similar things before – in PANDA we don't strictly need to snapshot the full device state when creating a recording; it would be enough to just keep track of which memory regions are I/O and reconstitute that mapping. But QEMU's savevm/loadvm saves and restores that mapping as a side effect so it's easier to just let that happen.
Edit: also, instead of disabling ASLR system-wide, it might be better to just use "setarch `uname -m` -R <pypy>", which disables it for just a single process.
If the implementation records the state of your program at each time interval as "the values stored at particular memory locations", it's much simpler to put those values back at those locations than to parse the memory map and decompose the state so it can be played back at the arbitrary offsets established by ASLR.
As a practical matter, ASLR should have little if any impact on the design of nearly all programs, so disabling it at development/debug time should not come at a big cost.
BTW this post states "There is no fundamental reason for [ASLR] restriction, but it is some work to fix."
EDIT: I see, your focus is specifically on recording. Disregard this comment, good point.
> Only works on Linux, and only with Address Space Layout Randomization (ASLR) disabled. There is no fundamental reason for either restriction, but it is some work to fix.
Reversible debugging (also known as record and replay) requires making the program behave exactly the same way every time. That way you can "execute backward": to get from statement n of the program to statement n-1, restart it from the beginning and stop at statement n-1 (in practice you take snapshots of the program state along the way so you don't have to re-execute all the way from the beginning of the program).
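The snapshot-and-re-execute idea can be sketched in a few lines (a hypothetical ReplayDebugger, assuming a deterministic step function and copyable state; real tools snapshot whole processes, not Python objects):

```python
# Sketch of snapshot-based reverse execution: periodically snapshot the
# state, and "step backward" by restoring the nearest earlier snapshot
# and deterministically re-executing forward to the target step.
import copy

class ReplayDebugger:
    def __init__(self, initial_state, step_fn, snapshot_every=10):
        self.step_fn = step_fn
        self.snapshot_every = snapshot_every
        self.snapshots = {0: copy.deepcopy(initial_state)}
        self.state = initial_state
        self.t = 0

    def step(self):
        self.state = self.step_fn(self.state)
        self.t += 1
        if self.t % self.snapshot_every == 0:
            self.snapshots[self.t] = copy.deepcopy(self.state)

    def goto(self, target):
        # Restore the nearest snapshot at or before the target, then
        # replay forward -- this is what makes "backward" steps cheap.
        base = max(t for t in self.snapshots if t <= target)
        self.state = copy.deepcopy(self.snapshots[base])
        self.t = base
        while self.t < target:
            self.step()

# The "program" is a counter; going from step 42 back to 41 only
# replays from the snapshot at step 40.
dbg = ReplayDebugger(0, lambda s: s + 1)
for _ in range(42):
    dbg.step()
dbg.goto(41)
print(dbg.state)   # → 41
```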
In the example given here, you could get away with it since control flow doesn't depend on the value of id(object()). But for example, if you had:
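The snippet that followed didn't survive here, but a plausible (hypothetical) example of control flow depending on an address would be:

```python
# Branch on a bit of the object's address. With ASLR enabled, the
# address -- and therefore which branch runs -- can differ from run
# to run, which breaks naive record-and-replay.
obj = object()
if id(obj) & 0xFFF000:
    print("branch A")
else:
    print("branch B")
```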
Ah, you've caught me out on my sloppy use of English.
What I should have said is that the address space layout is a source of non-determinism that would otherwise need to be accounted for in the recording of a program.
Further, I hold to the view that given a program, knowledge of the thread execution schedule/scheduler and the results of external calls, a program's execution should be deterministic.
Thinking about this now, I realize disabling ASLR shouldn't be enough to fix this problem since recording and replaying are sufficiently different.
> What I should have said is that the address space layout is a source of non-determinism that would otherwise need to be accounted for in the recording of a program.
Ah yes, it needs to be deterministic in the context of a time-traveling debugger, I was thinking about the general case, sorry.
> Further, I hold to the view that given a program, knowledge of the thread execution schedule/scheduler and the results of external calls, a program's execution should be deterministic.
Surely it is: if you precisely know all timings and I/O and can replay them, there's nowhere for non-determinism to creep in.
That's not really an acceptable scale of required knowledge for a time-traveling debugger though.
I believe in PTVS you can set the currently executing line higher up in a function and execute those lines again, thus further mutating the state of the program. But with this, you can roll back to earlier points of execution, thus un-mutating the state of the program. You can continue to roll backwards in execution to before the function was called, further and further, until you observe some earlier code making a change that eventually led to the problem you are debugging.
If you logged the stack at every step along with every variable in memory, yeah. The point of debugging like this is to not have to insert 40 log.trace() statements with 30 variables just to track it all down. It's interactive and does it all for you.