Reverse Debugging for Python (morepypy.blogspot.com)
169 points by scribu on July 8, 2016 | 53 comments


First we get awesome CPy C-extension emulation and now this? Thanks, team!

Completely serious: when can PyPy obsolete CPython? Why don't more CPython core devs work on PyPy? Besides perhaps some remaining gap for C-extensions support, what else is missing?


C-extension support is still not complete and comes with massive overhead. That's better than no support at all, but the performance impact might be a problem in some cases.

PyPy is also still several versions behind the latest Python 3.

So while PyPy is definitely a great project, groundbreaking in many ways and very useful for a lot of applications, it's not a replacement for all of them, at least for now.

Also, while the actual PyPy Python interpreter is reasonably simple, the overall project is incredibly complex. It at least appears that there are parts of the PyPy project not even all PyPy developers understand. Making PyPy the reference implementation is probably questionable for that reason alone.


Perhaps because not everyone is writing long-running services in Python?

PyPy 5.3.0 is twice as slow as Python 3.5.0 for some work I regularly deal with (1.1s for PyPy, 0.6s for CPython). On another project it is ~40% slower (7.7s vs 5.6s).

Additionally, PyPy just released an alpha containing compatibility with Python 3.3 (which was released four years ago). There are some nice additions to the language in 3.4 and 3.5.


> PyPy 5.3.0 is twice as slow as Python 3.5.0 for some work I regularly deal with

Is this because you are using ctypes or a C extension written against the CPython API? If not, have you filed a bug report?


I do write some code using FFI, but I have a cffi layer in there for PyPy. I do not interact with the CPython API.

Currently I don't have an interest in taking time to try and make PyPy less slow for my use case, for many reasons that I'm not going to get into here. (But I will note I'm primarily writing code that focuses on being easy to distribute. Most users aren't going to have PyPy installed anyway.)

Anyway, the main thrust of my post is just that Python isn't a monoculture in which PyPy is always a better solution.
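For readers unfamiliar with the distinction: ctypes resolves C calls dynamically through machinery PyPy has to emulate, while cffi declares signatures up front in a form PyPy's JIT handles natively. A minimal ctypes call for illustration (libm's sqrt is standard, but library lookup differs across platforms):

```python
import ctypes
import ctypes.util

# Load the C math library and declare sqrt's signature explicitly;
# without argtypes/restype, ctypes would assume int and return garbage.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(2.0))  # 1.4142135623730951
```

The cffi equivalent declares the same signature via ffi.cdef(), which is what lets PyPy compile the call down to a direct C invocation.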


Thanks for providing context.


I have some data generation scripts in a bash pipeline where pypy is significantly slower due to startup overhead. When you run anything in a loop ten thousand times, the startup time matters a lot for some people.

That said, PyPy is awesome. :-)
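The startup cost is easy to measure yourself; a rough sketch (the pypy path is an assumption — uncomment that line only if pypy is on your PATH):

```python
import subprocess
import sys
import time

def startup_time(interpreter, runs=5):
    """Average wall-clock time to start an interpreter and exit immediately."""
    start = time.time()
    for _ in range(runs):
        subprocess.check_call([interpreter, "-c", "pass"])
    return (time.time() - start) / runs

print("this interpreter: %.3fs" % startup_time(sys.executable))
# print("pypy: %.3fs" % startup_time("pypy"))  # uncomment if pypy is installed
```

Multiply the per-launch difference by ten thousand loop iterations and the JIT's runtime wins can disappear entirely.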


Why not write the loop in Python itself?


I remember hearing Guido comment that CPython is much easier to hack on compared to PyPy. I think this reason alone is why other Python implementations will remain separate from the 'reference' CPython.


The majority of PyPy development is done in RPython, which probably makes hacking on the runtime accessible to something like 100x as many developers. The only limiting factor is knowledge and exposure, but mostly exposure.


Sorry, CPython is much easier to hack on than PyPy. To start with, every change you make requires a recompilation that takes 4GB of memory and half an hour to complete.

This is a rather slow feedback cycle.

Secondly, while the syntax of RPython may be clean the necessary incantations to actually achieve something aren't at all obvious.


Lots of RPython development can be done in the repl, and code can be debugged with pdb. It's not nearly as difficult as it is often made out to be.


I don't know much about pypy, but do you need to compile it each time? Just use pypy to live interpret the interpreter.


Platforms, I guess. Plain C code is much more portable than a JIT, and PyPy implements its own JIT rather than using something like LLVM, so it's much more limited in terms of supported target platforms.


Oh, my mistake -- I had assumed that it did use LLVM. Good point about portability, CPython has tons of supported targets and PyPy just a handful. Also CPython is really easy to build IMO.


LLVM isn't as widely portable as generic C, either. ARM and x86 are well supported in LLVM, but CPython runs fine on MIPS, Sparc, PPC, etc (via GCC or other C compilers).


You can compile PyPy without its JIT.


Yes, although without JIT it's slower than CPython. So, what's the point?


If you are happy using CPython, you can't be terribly concerned with performance. The point is, pypy will run on more platforms than its JIT targets.


Sure. But if you're already using CPython (as most users are), and you know Pypy will be 3x slower on your platform, why switch?


You have no reason to. We're talking about a hypothetical future where PyPy has become the reference implementation.


In addition to the other stuff, I believe CPython also fits in much less memory, since it uses reference counting for its first layer of garbage collection, and doesn't have to store JIT output anywhere.
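That first layer is visible from Python itself; a small illustration (CPython-specific — sys.getrefcount is not meaningful on PyPy, whose GC doesn't count references):

```python
import sys

x = object()
before = sys.getrefcount(x)   # counts x plus the temporary argument reference
y = x                         # binding a second name bumps the count by one
after = sys.getrefcount(x)
print(before, after)          # on CPython, after == before + 1
```

When the count hits zero the object is freed immediately, so peak memory stays close to the live working set, with a cycle collector as the second layer.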


Great to see. I'm biased of course, but reversible debugging is the future (or should that be the past? Sorry). If you'll excuse the shameless plug, http://undo.io/ for the same with C/C++. I'd like to think we were at least a little influential in this! :) https://vimeo.com/160863576

Greg


Also worth noting that Undo is hiring: http://undo.io/about-us/careers/


No pricing on the website, so it's likely to turn off many people here.


amazing as usual :)

no python 3 as usual :( 99% of my production code is python 3.


> no python 3 as usual :( 99% of my production code is python 3.

Contribute work or funding[0] to pypy3?

[0] http://pypy.org/py3donate.html

[1] http://doc.pypy.org/en/latest/release-pypy3.3-v5.2-alpha1.ht...


Is the tech world really holding back pypy3 for a lack of $40k? I would expect there are multiple individuals in this thread who could top that off without feeling much sting. Let alone the tech companies they work for who would benefit greatly from it.


Just out of curiosity, at what level of wealth or yearly income would you think someone would have to have in order to drop $40k on a project like that?


I meet a lot of different engineers in different situations around SF all the time. Recently I met one who has reached the point of owning several (>2) Teslas... Someone with $400K in liquid assets could do it without being embarrassed. I would be surprised if there wasn't at least one person reading these comments in that situation. Would that person care deeply enough to say "Here! Just finish it!"? Not likely. However, I'm very surprised that there isn't at least one multi-million-dollar company that would value this more highly than a few person-months of internal engineering time.


Cool, but why do you have to disable ASLR for recording to work?


If I had to guess it just simplifies the implementation. If you have deterministic load addresses then you can just reload the program on replay. If you have ASLR on, then you need to record where each library/stack/heap is and arrange for them to be placed at the same place when you replay. It's doable (though I don't know offhand how to specify the stack/heap location from userspace) but extra work.

I've done similar things before – in PANDA we don't strictly need to snapshot the full device state when creating a recording; it would be enough to just keep track of which memory regions are I/O and reconstitute that mapping. But QEMU's savevm/loadvm saves and restores that mapping as a side effect so it's easier to just let that happen.

Edit: also, instead of disabling ASLR system-wide, it might be better to just use "setarch `uname -m` -R <pypy>", which disables it for just a single process.


If the implementation records the state of your program at each time interval as "the values stored at particular memory locations", it's much simpler to put those values back at those locations than it would be to recognize the memory map and decompose the state in order to play it back at the arbitrarily established offsets used with ASLR.

As a practical matter, ASLR should have little if any impact on the design of nearly all programs, so disabling it at development/debug time should not come at a big cost.

BTW this post states "There is no fundamental reason for [ASLR] restriction, but it is some work to fix."

EDIT: I see, your focus is specifically on recording. Disregard this comment, good point.


Cf "Current issues" section:

> Only works on Linux, and only with Address Space Layout Randomization (ASLR) disabled. There is no fundamental reason for either restriction, but it is some work to fix.


Here's a python program that should be deterministic, but isn't without disabling ASLR:

    print id(object())

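Running that one-liner in fresh processes shows the effect — CPython's id() is the object's memory address, so with ASLR enabled the printed value typically differs between runs (this sketch uses Python 3 syntax; whether the values actually differ depends on your OS's ASLR settings):

```python
import subprocess
import sys

# Launch the same one-liner in two fresh interpreter processes.
one_liner = "print(id(object()))"
runs = [
    subprocess.check_output([sys.executable, "-c", one_liner]).strip()
    for _ in range(2)
]
print(runs)  # with ASLR on, the two addresses usually differ
```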

Why should this program be deterministic?


Reversible debugging (also known as record and replay) requires making the program behave exactly the same way every time. That way you can "execute backward": at statement n of the program, restart it from the beginning, and stop at statement n-1 (in practice you would take snapshots of the program state along the way so you don't have to re-execute all the way from the beginning of the program).
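The snapshot scheme can be sketched as a toy (this is an illustration, not PyPy's actual mechanism — it assumes a deterministic step function and a deep-copyable state):

```python
import copy

def record(state, step, n_steps, interval=10):
    """Run forward, keeping periodic deep-copied snapshots of the state."""
    snapshots = {0: copy.deepcopy(state)}
    for i in range(1, n_steps + 1):
        step(state)
        if i % interval == 0:
            snapshots[i] = copy.deepcopy(state)
    return snapshots

def state_at(snapshots, step, target):
    """'Step backward': restore the nearest earlier snapshot, replay forward."""
    base = max(i for i in snapshots if i <= target)
    state = copy.deepcopy(snapshots[base])
    for _ in range(target - base):
        step(state)
    return state

# Example: a counter dict standing in for program state
inc = lambda s: s.__setitem__("n", s["n"] + 1)
snaps = record({"n": 0}, inc, 100)
print(state_at(snaps, inc, 95))  # {'n': 95}
```

Because replay only covers the gap since the last snapshot, "stepping back" costs at most `interval` re-executed steps instead of a full rerun — but it only works if every step is deterministic.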

In the example given here, you could get away with it since control flow doesn't depend on the value of id(object()). But for example, if you had:

    if (id(object()) >> 12) & 1:
        print "foo"
    else:
        print "bar"

You would need to guarantee that id(object()) always returned the same value so that the execution follows the same path.


Ah, you've caught me out on my sloppy use of English.

What I should have said is that the address space layout is a source of non-determinism that would otherwise need to be accounted for in the recording of a program.

Further, I hold to the view that given a program, knowledge of the thread execution schedule/scheduler and the results of external calls, a program's execution should be deterministic.

Thinking about this now, I realize disabling ASLR shouldn't be enough to fix this problem since recording and replaying are sufficiently different.


> What I should have said is that the address space layout is a source of non-determinism that would otherwise need to be accounted for in the recording of a program.

Ah yes, it needs to be deterministic in the context of a time-traveling debugger, I was thinking about the general case, sorry.

> Further, I hold to the view that given a program, knowledge of the thread execution schedule/scheduler and the results of external calls, a program's execution should be deterministic.

Surely it is, if you precisely know all timings and IO and can replay them there's nowhere for non-determinism to creep in.

That's not really an acceptable scale of required knowledge for a time-traveling debugger though.


> Surely it is, if you precisely know all timings and IO and can replay them there's nowhere for non-determinism to creep in.

This was my original point, the address space layout information leaks into python land via things like id.

> That's not really an acceptable scale of required knowledge for a time-traveling debugger though.

That's what recording captures (though, you don't need to know the times at which things happened).


"no longer necessary from revision ff376ccacb36."


Shouldn't this be called a "reversible debugger"? Or a time-travel debugger, as the Elm debugger is called?


No, the way things work is that people name things whatever they want.


How is this more powerful than what I can already do in PTVS or PyCharm? I can move around the current executing line freely back-and-forth!


I believe in PTVS you can set the current executing line higher up in a function and execute those lines again, thus further mutating the state of the program. But, with this, you can roll back to earlier points of execution. Thus, un-mutating the state of the program. You can continue to roll backwards in execution to before the function was called, further and further until you observe some earlier code making a change that eventually led to the problem you are debugging.


Then I guess http://www.pythontutor.com could do this long before!


Can it do this for arbitrary programs running on your command line?


Not sure; it seems to run only in the browser.


Python Tutor is different; it records a very small subset of the information available. You can't, say, reverse and then run some introspection code.


Does the author provide no link to the project source or have I missed it?


Isn't logging enough to track down a lot of such problems?


If you logged the stack at every step along with every variable in memory, yeah. The point of debugging like this is that you don't have to insert 40 log.trace() statements with 30 variables just to track it all down. It's interactive and does it all for you.


Usually I do not need every variable in memory. Often the relevant spot can be very well circumscribed.



