Two simple criticisms or potential misunderstandings:
The three nulls in the load segment (between the code and the data) are included in both the code and data highlights as well as in their own highlight, which is a bit unintuitive: the start of the string looks like \0\0\0Hello. It looks like these were supposed to sit between spans like the other non-highlighted nulls, but they are included in a parent span, bin_segment0. (Issue submitted.)
Also, I wish the arrow heads did not opaquely overlap the numbers. Adding opacity="0.3" to the svg tag fixes this for me.
This is cool for a variety of reasons: It makes parsing readelf output a bit easier, it's a nice/small/functional Rust demo (for Rust idiots like me), and the output can be incorporated into HTML-based documentation more easily than a command-line tool's output.
This is great. I wish every binary format had a visualizer like this. Or maybe even a generic tool that can take in descriptions of binary formats to create new annotations on the fly (like Wireshark).
I've seen similar tools that annotate other binary formats, like GPG and ASN.1.
Thanks for writing objdump! I'm probably one of those people.
The most recent thing I did with objdump was use it to patch in a signature that needed to be computed over the text/data/bss segments after compilation. This was for a bare metal embedded system, and being able to do this at the elf level (1) made it easier, and (2) allowed me to use the same setup for multiple targets that each had their own oddball binary/ihex conversion and flashing machinery.
Just tried it, does what it promises.
It’s more than a basic hex viewer because of the extra information displayed when hovering over the different bytes, and the arrows linking the pointers/offsets to their targets.
(I'm surprised this is written in Rust and doesn't use the object crate--did the author do this in part to learn how elf works?)
Speaking of visualizing virtual memory, one of the things that I haven't seen a nice prior tool for is breaking down the memory map of a process on a per-section basis--/proc/pid/maps only tells you which libraries are providing which sections. I've built something like that for my own needs, but it's the sort of thing that I would have expected would easily come out of some other tool.
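As a starting point for the kind of tool described above, here's a minimal sketch (assumptions: ELF64 on Linux, Rust 2021) of parsing a /proc/pid/maps line into its mapping range, permissions, and backing path. Splitting each mapping further by ELF section would additionally require reading the section table of the mapped file; this only recovers the per-file ranges.

```rust
// Parse one line of /proc/<pid>/maps into (start, end, perms, path).
// Format per proc(5): "start-end perms offset dev inode [pathname]".
fn parse_maps_line(line: &str) -> Option<(u64, u64, String, Option<String>)> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?.to_string();
    let (start, end) = range.split_once('-')?;
    let start = u64::from_str_radix(start, 16).ok()?;
    let end = u64::from_str_radix(end, 16).ok()?;
    // Skip offset, dev, and inode; the pathname is the optional last field
    // (absent for anonymous mappings).
    let path = fields.nth(3).map(str::to_string);
    Some((start, end, perms, path))
}
```

From here, one would iterate the lines of /proc/pid/maps, then cross-reference each file-backed range against that file's section headers.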
(Not the author) I’ve learned to not get my hopes up about the capabilities of external format-parsing libraries when building tools like this (to the point of often not bothering to evaluate them for fit-for-purpose any more), as they often expose only a high-level fully-decoded representation that’s unsuited to examination of “why” something decoded to what it did.
As such, they often won’t be of any help in a situation where you’re trying to use them to diagnose where exactly a corrupted piece of data is going wrong — which is one of the biggest use-cases for such tooling!
I've been using the object and gimli crates extensively, and I can assure you that "expose only a high-level fully-decoded representation" is the exact opposite of what they do. In fact, my biggest criticism (of gimli in particular) is that they lack a sufficient high-level representation.
The object crate exposes all of the ELF types directly. The one thing it doesn't do is give you this from its format-agnostic object::read::File type, you have to start from object::read::elf::ElfFile instead. As a bonus, it also gives you all of the processor-specific defines so you don't have to look up what the value of, say, the x86-64 relocations are: https://docs.rs/object/0.25.3/object/elf/index.html#constant...
My point wasn't about whether you can get into the nitty-gritty leaf nodes of complex structures in the fully-decoded data; it was about whether the representation it outputs losslessly preserves unparseable data elements while still making a best effort to decode what it can. You want to end up with a representation that was decoded "as much as possible", where the decoded parts can be used to figure out why the non-decoded parts didn't decode, without obscuring what the non-decoded parts "say".
I'm using "high-level" here to mean "was successfully transformed through all the decoding/lexing/parsing/cross-reference stages", and "low[er]-level" to mean "failed to be transformed by some of those stages." Which is non-normative, I guess, but this sort of "layers of decoded-ness" representation is what you expect from tools like Wireshark or binwalk.
Yes, object/gimli are also this kind of low-level. Basically, when you parse the file, you're not actually parsing the whole file; you're parsing each element of the structure one piece at a time.
So parsing FileHeader will make sure a) the data is correctly aligned [since it's UB in Rust to have underaligned data] and b) that the e_ident bytes actually contain the magic number for an ELF file. Want to list all the sections? That's when it will actually check that a) the section header offset points to valid data and b) the stated number of section headers exists and is sane, but again, it doesn't verify that the section headers themselves make any sense whatsoever.
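The layered, lazy validation described above can be sketched in a few lines (this is an illustrative sketch, not the object crate's actual code; `check_magic` and `section_header_offset` are hypothetical helpers, assuming a little-endian ELF64 file):

```rust
// Layer 1: only the magic number is checked when "parsing the header".
fn check_magic(data: &[u8]) -> Result<(), &'static str> {
    if data.len() < 16 {
        return Err("file too short for e_ident");
    }
    if &data[0..4] != b"\x7fELF" {
        return Err("bad ELF magic");
    }
    Ok(())
}

// Layer 2: only when the caller asks for sections do we read e_shoff
// (offset 0x28 in the ELF64 header) and bounds-check it. The section
// headers themselves remain unvalidated at this point.
fn section_header_offset(data: &[u8]) -> Result<u64, &'static str> {
    check_magic(data)?;
    if data.len() < 0x40 {
        return Err("file too short for the ELF64 header");
    }
    let e_shoff = u64::from_le_bytes(data[0x28..0x30].try_into().unwrap());
    if e_shoff as usize > data.len() {
        return Err("e_shoff points past end of file");
    }
    Ok(e_shoff)
}
```

The payoff of this structure is exactly the earlier point about diagnosability: a corrupt section table fails at layer 2 with the header's decoded fields still available for inspection.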
You could try to build it on top of Rizin[1][2] library. In particular see the `dm` commands and subcommands. Let us know if something is unclear or missing or doesn't work as you would expect.
> I'm surprised this is written in Rust and doesn't use the object crate--did the author do this in part to learn how elf works?
No. When I started the project I was expecting to just read data into the ELF structs, C-style. (Un)fortunately, that's not possible to do safely, so I started looking into crates for it and stumbled upon data-deserialization ones. In particular, the first attempt used nom. In hindsight, that wasn't particularly smart, and purpose-built object-file-parsing crates would have been better. I don't regret implementing the reading manually, despite it looking pretty ugly, because indulging NIH syndrome is fun.
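For the curious, the "manual reading" approach looks roughly like this (a hedged sketch, not the project's actual code): instead of transmuting bytes into a `#[repr(C)]` Elf64_Ehdr, which runs into alignment and endianness questions in safe Rust, each field is read out explicitly.

```rust
// Hypothetical helper: read a little-endian u16 at a byte offset,
// returning None if the buffer is too short.
fn read_u16_le(data: &[u8], off: usize) -> Option<u16> {
    data.get(off..off + 2)?.try_into().ok().map(u16::from_le_bytes)
}

// e_type sits at offset 0x10 and e_machine at 0x12 in both ELF32 and ELF64.
fn e_type_and_machine(data: &[u8]) -> Option<(u16, u16)> {
    Some((read_u16_le(data, 0x10)?, read_u16_le(data, 0x12)?))
}
```

It's verbose per field, but every read is bounds-checked and the endianness is explicit, which is the safety the C-style approach gives up.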
> Speaking of visualizing virtual memory, one of the things that I haven't seen a nice prior tool for is breaking down the memory map of a process on a per-section basis
That is planned. It's noted in the readme, and in issue #3 I go over what it could look like[1].
Hello, author here (I edited the bio on github to show). Don't know how I missed it on HN, must have been the grey link.
First I'd like to say that right now this is just the first release, and it's a bit raw so far. That's why I was hesitant to post it on HN yet, expecting a harsher but merited critique. I am reading through the thread for bugs and suggestions. Thanks for that.
"Violet Orange Gradient" -> Executable and data, I think (It looks like this is copied to address 0x10000+0x80 and executed, but I'm not familiar with x86.)
Another way to frame this is that the primary viewing format is portable html, and can easily be viewed locally or shared with someone else or incorporated into a blog post, etc.
The true value here is the backend and its ability to return data in a form that can be visualized, and the output format/UI can be adapted/enhanced as the project matures.
Yeah, if you just need a single page visualisation like this then HTML is hard to beat. It's easy, cross platform and you don't need any libraries to use it.
Doing this with Qt or GTK would be much, much more work, especially in Rust, which doesn't have any really good GUI options yet.
Completely unrelated rant. Before Rust really takes off, can someone please, pretty please, for the love of "Bob" and all that is marginally donut-shaped, can somebody pleeeeeease tell the Rust community to adopt inheritance-focused hierarchical package naming conventions???
Way back in the day, in like, 1995, there was this thing called Perl. Perl was awesome. You'd think, "I want to make a custom LDAP client". And you'd look in CPAN for an LDAP module, and it'd be there (https://metacpan.org/pod/Net::LDAP), and you'd use it. A while later you realize you want to manipulate Active Directory SIDs. So you create a new module that inherits Net::LDAP, and publish it (https://metacpan.org/pod/Net::LDAP::SID).
On the plus side: everyone can install your module to get extra functionality with their existing code; people don't need to reinvent the wheel completely to do something a little different; and it's easy to see which module provides what/inherits from what.
On the down side: boring names for modules. (is that a downside?)