Baochip-1x: What it is, why I'm doing it now and how it came about (crowdsupply.com)
330 points by timhh 1 day ago | hide | past | favorite | 71 comments



Hello wonderful people! I'm bunnie - just noticed this is on HN. Unfortunately due to timezones I'm about to afk for a bit. I'll check back when I can, and try to answer questions that accumulate here.


To anyone from crowdsupply listening, please turn down your VPN check. I am not stripping my privacy protection to use your site.

*edit, Crowdsupply does a full block on multiple VPN providers. There is no way to access their site without turning off your VPN.


You do realize that violating export regulations is a much bigger risk than losing a few individuals relying on snake oil security?

CrowdSupply checks at purchase time and will withhold goods until you, whether an individual or another entity, confirm that you comply with export regulations.

I'm not saying it's totally unrelated, only that they do have a dedicated legal (non-technical) check.


Can you explain what is the connection between closing the site to VPN users and violating export regulations?

I'd assume the point is that they think that the possibility of serving the website to an individual physically within a prohibited country constitutes unacceptable liability.

Digikey and Mouser do not do this.

> few individuals relying on snake oil security

Please don't.


VPNs for desktop users have very few security use cases now that most traffic is HTTPS, but they're very useful for evading geoblocks.

My last mile is hostile, the VPN is very important.

Wouldn’t it require making a purchase and providing a shipping address? How would a VPN get in the middle of checking the physical address?

Wait a minute, why can't I reply to bunnie's top-level comment? Anyway, here's what I wanted to say:

Adding your CPU to another company's silicon is a genius move, well done. I wonder why companies don't sell their spare die space to others, is it because of trust/risk?


Crossbar is unusual to start with in that they wanted to do open RTL - so for starters, to a first order there are no companies even willing to discuss open RTL designs. Beyond that - mainly risk. I had to pinky swear that whatever I added would not break the chip, cause timing closure issues, delay the schedule, consume too much area or power, or impact yield; I had to run my own validation and review program while meeting their dev methodology, etc. etc. I had to exercise an enormous amount of self-restraint not to push harder and do more interesting things as it was. It's very hard to build up inter-personal trust, and they had to take a calculated risk letting a schmuck like me potentially foul up a multi-million dollar mask set. Hats off to them for making that bold decision; it would have been easier to say nope, too risky, no benefit, cut it from the code base.

Having paid for multi-million dollar mask sets for ASICs before, I can confirm that this would take a lot of trust on Crossbar’s side. Great job on working with them.

Yeah, true, it's all downside for them for this, basically. Still, there must be some price for which companies will let other companies use die space, but maybe that price is higher than just doing the thing yourself...

In the space of possibilities this can be abstractly thought of as a Caravel [1] harness gone wild. But if you had to price access to the project in a commercial sense, then, the pricing is going to be quite high. Because it's not just the cost of the masks - there's a whole lot of talent and skill in the team that does the "backend" processing. That is, once the RTL is done, it goes through multiple passes of place/route/timing, ATPG, DRC, LVS...and that's just to get to the tape-out. After that there's still more to do with the chip probe, packaging and reeling.

The open-source argument is that if we could make that back-end part more transparent, then, we could improve the tooling and thus decrease the labor. But, even a single mistake at these backend steps can scuttle a whole mask set. The methodology is incredibly incremental; scripts are handed down for generations, and there are magic settings in them that make things "just work" - nobody quite remembers why or how, but it was probably a lesson learned the hard way, so we just leave it that way. And it's not just the money - the iteration time through a fab is months. So you have to be a bit careful about prioritizing your experiments and your risk budget when trying to make progress in this field.

I am lucky in my case because what I want to do aligns with their original commercial interests, so the strategic benefit makes things worth the tactical risk. Frankly a big part of the project overall was just figuring out how to scope things so that we both came away reasonably satisfied in terms of risk and outcomes. Would I like things to be more open? yes. Would I have liked to put an opentitan core in there? yes. Would I have been able to take advantage of more back-end support to do a faster CPU? yes. But, we had to constantly balance tactical risks, and even if I don't agree with all their decisions, I have to respect their experience.

[1] https://github.com/efabless/caravel


That's very informative, thank you.

I'm basically ignorant of this entire space - I have mostly worked on SaaS products - so please forgive the question if it's too naive, but as someone who has just experienced (as the first?) this new and rare way of bringing a design to life: are there any obvious process/tooling/whatever improvements you noticed that might make it less risky (and therefore less rare)? Reading your blog posts, the Crowd Supply materials, Xous docs, etc., the burning thought at the front of my mind has been "there needs to be a lot more of this". Is there a path towards that?

There's actually a whole space of shared-mask tapeouts. You might have heard of TinyTapeout [1]/LibreLane [2] and the general concept of "MPW" masks - multi-project wafer masks. These effectively share cost among hundreds of developers, bringing the cost of a tape-out down.

If you're lucky enough to have an affiliation with certain institutions, there are programs that basically give academics the experience I had for a nominal fee. TSMC has a FinFET program [3] which powers Soclabs [4] to provide an environment that exceeds Baochip's capabilities. If you look through [4] notice the block that says "Users' HW circuits" - that's basically what my logic is on Baochip. The problem with these is you need to be an academic, and I think there isn't a clear path to commercialization, and of course lots of NDAs. China also has a program called "One Student One Chip" [5] where students can tape out quite sophisticated SoCs as part of their course work.

It's probably just a matter of time before these academic programs yield a commercially compelling chip, and then that would pave a path for a transition program from the academic program to industry.

Another option is, if Baochip is quite successful, it in itself could serve as a "proof point" that may encourage other companies to allow hitchhikers. When the co-designed IP works, then it's a sales upside for the company, so there is some incentive alignment.

The trick is figuring out how to mitigate the possibility that the IP doesn't work, and bridging the gap between people with ideas and people with tape-out experience. I'm lucky in that in my first jobs out of college I did a deep dive into silicon, even designing custom transistors and standard cells for a bespoke nanophotonics PDK that I helped to develop, so I had the shared language to communicate with both classic chip companies and the open source community.

There's an enormous cultural gap between the chip community and the open source community, but everyone's curiosity in this thread, and participation in this dialog with questions like yours, helps close that gap and thus manifest more hitchhiking opportunities in the future.

[1] https://tinytapeout.com/

[2] https://github.com/librelane/librelane

[3] https://www.tsmc.com/english/dedicatedFoundry/services/unive...

[4] https://soclabs.org/project/tsri-arm-cortex-m55-aiot-soc-des...

[5] https://ysyx.oscc.cc/en/project/intro.html


Thank you for the references this is absolutely fascinating stuff.

> Wait a minute, why can't I reply to bunnie's top-level comment?

The powers that be here think they've found a bunch of "hacks" to curb low-quality comments.


This is wonderful! Also what a fantastic partnership that allowed adding a new CPU to that die. Kudos to them.

I had a lot of trouble finding out which open source license applies. Wikipedia’s RISC-V page doesn’t seem to say; its citation for being released under open source doesn’t seem to say which one either.[0] Could be wrong. Exhausted after working all day. But it’s not front and center…

On the RISC-V site I thought it might be more prominent too but if it is I missed it. I found some docs there licensed Creative Commons. Is that the license for the entire CPU? Even layouts and everything that is past the ISA to actual silicon?

[0] https://www.extremetech.com/computing/188405-risc-rides-agai...


RISC-V is a family of instruction sets (which have various chips implementing them). Think "X86-64". It looks like the baochip-1x is using the VexRiscv CPU. The HDL is available here under MIT: https://github.com/SpinalHDL/VexRiscv

Thanks - so that one is MIT licensed. Is that the license for all RISC-V, ie ISA plus actual silicon designs?

To clarify - RISC-V is an architecture, and that is an open specification. However, as an architecture it only specifies things like, what the instructions are and their encodings. It doesn't actually give you a CPU that does anything, just an abstraction of how to describe a CPU to a common standard.

Anyone is permitted to implement a RISC-V CPU, which would then involve coding something up in an RTL. The resulting RTL artifact may be open or closed source depending upon the developer's preference. In the case of the Vexriscv, that particular implementation is MIT licensed. There are other implementations that also have MIT licenses, but because it is up to the core's implementer to pick a license, not all RISC-V cores are open source.

In fact, some of the most commercially successful RISC-V cores are closed source.


> Those with a bit of silicon savvy would note that it’s not cheap to produce such a chip, yet, I have not raised a dollar of venture capital. I’m also not independently wealthy. So how is this possible?

What kind of order of magnitude of cost are we talking about?

What are the next steps - is there some service to cut the wafer and put into a package for you?


The masks alone are single digit millions, but with all the design tools and staff costs typically tens of millions is the benchmark number for a tape out in this node.

After coming out of the fab, the chips go through probing, packaging and reeling.


> The masks alone are single digit millions,

Ah, another reason why hardware errata get fixed so rarely (I assume - along with retesting, of course).


Yes, exactly. A lot depends on your expected volume. Essentially, masks are your fixed tooling cost for chips. You then amortize that over your full volume. It’s easier to justify another mask set to fix bugs if you are going to be selling oodles of chips and the cost ends up being negligible and much harder to justify it if the volume is low. Years ago, I was CTO at a startup when our chips came back from fab. Everything looked good except for a silly error that our chief architect had made. He felt horrible for a couple weeks. He was a great architect (meticulous and precise) and I kept telling him that it was no use crying over spilled milk. Engineering is hard. But there went another few million dollars of precious venture capital up in smoke for the replacement mask sets.
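The amortization math above can be sketched with hypothetical numbers (the $2M mask-set cost below is an illustrative assumption, not any real quote):

```python
# Amortizing a fixed mask-set cost over production volume.
# The $2M figure is an illustrative assumption, not a real quote.
MASK_SET_COST = 2_000_000  # dollars, hypothetical

def mask_cost_per_chip(volume: int) -> float:
    """Fixed tooling cost spread across every chip sold."""
    return MASK_SET_COST / volume

# At high volume the respin cost is negligible per unit...
high_volume = mask_cost_per_chip(10_000_000)   # $0.20/chip
# ...at low volume it dominates the bill of materials.
low_volume = mask_cost_per_chip(50_000)        # $40.00/chip

print(f"10M units: ${high_volume:.2f}/chip, 50k units: ${low_volume:.2f}/chip")
```

This is why a respin to fix a bug is an easy call for a high-runner part and an agonizing one for a low-volume startup chip.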

I knew the masks were expensive, but not that they were that expensive. Of course it's all a question of total quantity you use that mask for, but still...

It all depends on the node. Masks in 130nm are maybe in the $10k's-$100k's range. Masks for the latest TSMC nodes might cost you $30-40 million per set. The masks are pretty much a modern marvel in their own right - I'd wager they are some of the most precisely manufactured human objects in existence.

Most chips have basically one revision after first tapeout, because it's hard to get everything right first time. Small revisions can sometimes be done in the metal layer only, which is cheaper.

Can you share something about the subsequent per-chip manufacturing costs?

Rule of thumb is that a processed wafer from 28nm and older is around $3k/wafer and the cost goes up kind of exponentially towards the smaller nodes. Also, in general, the fab wants you to order a "FOUP" of wafers at a time - that's 25 wafers at a go.
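The ~$3k/wafer rule of thumb above translates into a per-die cost once you pick a die size; the die area and yield below are hypothetical, chosen only to illustrate the arithmetic:

```python
import math

# Rough per-die cost from the ~$3k/wafer ballpark quoted above.
# Die size and yield here are hypothetical, for illustration only.
WAFER_COST = 3_000        # dollars, 28nm-and-older ballpark from the comment
WAFER_DIAMETER_MM = 300   # standard 300 mm wafer (assumption)
DIE_AREA_MM2 = 10.0       # hypothetical die size
YIELD = 0.9               # hypothetical fraction of good dies

def gross_dies(wafer_d_mm: float, die_area_mm2: float) -> int:
    """First-order gross die count: wafer area / die area, minus a
    common back-of-envelope edge-loss correction term."""
    d = wafer_d_mm
    return int(math.pi * (d / 2) ** 2 / die_area_mm2
               - math.pi * d / math.sqrt(2 * die_area_mm2))

dies = gross_dies(WAFER_DIAMETER_MM, DIE_AREA_MM2)
cost_per_good_die = WAFER_COST / (dies * YIELD)
print(f"~{dies} gross dies, ~${cost_per_good_die:.2f} per good die")
# A minimum order of one FOUP (25 wafers) is then 25 * $3k = $75k.
```

The striking part is how cheap the silicon itself is at volume: the masks, backend labor, packaging, and test dominate, not the wafer.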

bunnie, your book "Hacking the Xbox" taught me how to get started on reversing electronics, took the fear out of the process, and replaced it with fun. Thanks for the decades-long effort you've made to make these tools available and accessible and approachable; your contributions to the hacker community are immeasurable and I cannot say thank you enough.

Thanks man!


Thank you for sharing! Comments like this make all the effort worthwhile. <3

Bunnie did a really good talk a couple months ago that has more of the background beyond what's in the blog post:

https://www.youtube.com/watch?v=H5CR-7TJtm0


> What’s a banker going to do with the source code of a chip, anyway?

Hand it to someone who does know what to do with it. It's not as important who initially gets the source so much as having it available when it is needed.


Great work on the chip, I’m really onboard with the trusted computing aim!

Is there a way to bootstrap binary code into the reram? I’m thinking being able to ‘hand-type’ in a few hundred byte kernel rather than use a flashing tool


The chip comes from the factory with a boot0/boot1 chain that is fully reproducible and buildable from source. Developers can replace boot1 with their own version, where you could add the feature you're thinking about.

Cool project. Why is it called the Baochip/Dabao?

Is it big Bao? Or take-away (just learnt the second meaning), or something else?


Personally, I love eating "bao" (a style of dumplings), but also coincidentally, a homophone of "bao" in Chinese (different character 保, similar sound) has a meaning of "protect; defend. keep; maintain; preserve. guarantee; ensure". So it means both things to me - one of my favorite foods, and also describes the technology.

"dabao" is just a pun on that - means "take-away" or "to-go". The dabao evaluation board is basically a baochip in a "to-go" package.


That would explain the naming of OpenBao, a fork of Hashicorp Vault. Goes with the other fork's name (OpenTofu) as well as the meaning you just mentioned.

I think it’s take-away, or to go. Like when you order some food to go.

This is about transparency just like the Precursor, right? How can I know that my Baochip-1x is really what it says it is?

The Baochip is packaged in a form that is inspectable using IRIS. [1] It does not give perfect verification, but it's the best I can offer until we have more open PDKs.

[1] https://bunnie.org/iris


Very cool! So there’s 5x riscV cores available?

Yes, 1x Vexriscv RV32-IMAC + MMU, and 4x PicoRV32's as RV32E-MC for I/O processing, configured with extensions to enable deterministic, real-time bit-banging without having to count clocks.
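The "deterministic bit-banging without counting clocks" point can be made concrete with some cycle-budget arithmetic; the 100 MHz I/O-core clock below is a hypothetical figure for illustration, not the chip's actual spec:

```python
# Why deterministic execution matters for bit-banged protocols: each bit
# must be held for an exact number of core cycles. The 100 MHz clock is
# a hypothetical figure, not the Baochip's actual I/O-core frequency.
CORE_HZ = 100_000_000  # hypothetical I/O core clock

def cycles_per_bit(baud: int, core_hz: int = CORE_HZ) -> int:
    return core_hz // baud

for name, baud in [("UART 115200", 115_200),
                   ("SPI 10 MHz", 10_000_000),
                   ("WS2812 800 kHz", 800_000)]:
    n = cycles_per_bit(baud)
    # A single cache miss or interrupt costing tens of cycles would blow
    # the timing budget for the faster protocols, which is why a core
    # with deterministic timing (no cycle counting in software) helps.
    print(f"{name}: {n} cycles/bit")
```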

That reminds me a lot of the xmos xcore mcus with 8 cores. I am curious what kind of synchronization primitives have you added and why?

I'm actually working on a comprehensive write up on exactly this topic that should be out sometime next week!

Just ordered 2 to play with!

thank you~~

Sounds like the Parallax Propeller 1/2 as well.

It's a good model for MCU stuff. There were people pushing Chip Gracey (Parallax) to use RISC-V instead of his custom ISA when he designed the P2 a few years ago, but he chose to do his own thing. Which has made compiler development difficult.


This seems more on the RPI side rather than propeller, propeller was never a really good choice for production integration. This looks like it could hold its own in many contexts.

Nice! I love the specialized io processors. Fantastic work!

I run a hardware company now (thankfully in the age of AI), as a direct consequence of reading Bunnie's book "The Hardware Hacker".

Thank you Bunnie.


<3 makes all the effort worth it to hear stories like this. Thanks for sharing!

Why the few closed-source components on the system? You mention the bus, USB PHY etc -- are those things harder to design than the CPU core?

In general, things that are not strictly digital (PHYs, regulators, PLLs, ADCs) contain significant amounts of foundry IP that would be hard to release as open source. But also, some parts of the chip, for example the AXI bus fabric, were licensed simply as a risk reduction measure. If the bus fabric is bad, you've wasted millions of dollars on a mask set with little recourse. I tried to pull in some open source AXI fabrics and it wasn't pretty...a lot of rework required and even then still some bugs made it through to tape-out. Over time more and more of this can be opened but it all takes time, money, and people willing to do it.

Can I ask how bad it is with lower speed I/O? Less than 20Mbaud, so RS232, RS485, CAN, USB PD/Type-C, 10base-T, ...? The transceivers are rarely integrated, do they need different processes or do they just prefer flexible I/O pins or ...?

Hmm...it's not just the speed. Actually, the I/O pads themselves are closed source because there's a lot of process magic in them - from the ring seals to the ESD protection, the foundries consider these to be part of what makes them different from each other, so they protect those designs.

So for example, many projects bitbang USB full-speed using plain old 3.3V I/Os but by the spec the signals have to have some slew rate limiting in a form that isn't found on standard I/Os. And also, if you're doing it right, you're taking the differential signals in on USB and not just reading them into two separate single-ended pads but you're actually subtracting the analog values to get the full benefit of differential signaling's common mode rejection properties. Thus even a lower speed USB PHY has some specialty circuits in it to achieve these nuances.
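The differential-receive point can be shown numerically: noise coupled equally onto D+ and D- cancels in the subtraction. The voltages below are illustrative values, not USB-spec numbers:

```python
# Why a true differential receiver beats two single-ended reads:
# common-mode noise (coupled equally onto D+ and D-) drops out of the
# subtraction. Voltages are illustrative, not USB-spec values.
def differential_receive(d_plus: float, d_minus: float) -> float:
    return d_plus - d_minus

signal = 0.4          # hypothetical differential swing, volts
noise = 0.25          # common-mode noise coupled onto both lines

clean = differential_receive(+signal / 2, -signal / 2)
noisy = differential_receive(+signal / 2 + noise, -signal / 2 + noise)

print(f"differential value without/with noise: {clean:.2f} V / {noisy:.2f} V")
```

Two independent single-ended pads sampled at slightly different thresholds or times would not get this cancellation, which is part of why even a full-speed PHY is a specialty circuit.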

As another example, RS232, by the spec, would be a +/-3V to +/-15V driver, which is actually really specialized in the chip world and quite uncommon due to the negative voltages. PHYs that drive I/Os is one of the enduring pain points for open source PDKs - they are hard to develop, "boring" because they are "just wires", but absolutely essential to get right and bring into existence if you want to talk to anything interesting.


They are likely licensed IP.

It's pretty exciting to see a small chip with an MMU. I wonder if it would be possible to get sel4 running on this?

I'm also curious about the current draw, but I couldn't find anything?


I imagine sel4 could be possible, but I haven't done any specific checking for compatibility.

Current draw - depends on the operating mode, etc. A dabao board with all its regulators and overhead draws around 30mA @ 5V. The CPU in "WFI sleep" (clock stopped, instant wake-up, all memory preserved) will draw about 12mA @ 0.85V. There's a "deep sleep" mode, which requires an effective reboot to come out of (clock stopped, no memory preserved), where it's down to under 1mA @ 0.7V. These latter low-power modes require an external power management architecture that can vary the core voltage so you can achieve lower leakage states.

I think comparatively speaking, the Baochip doesn't have strong low power numbers. I have always imagined it as more of a chip that gets stuck into a USB device, so it's plugged into a host with a fairly ample power reserve, and not a coin cell battery.
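A back-of-envelope runtime calculation using the figures quoted above supports that conclusion; the ~225 mAh CR2032 capacity is a typical datasheet value (an assumption here), and this ignores that a coin cell struggles to source double-digit mA at all:

```python
# Back-of-envelope coin-cell runtime from the currents quoted above.
# The 225 mAh CR2032 capacity is a typical datasheet value (assumption);
# real coin cells also sag badly at double-digit mA draws.
COIN_CELL_MAH = 225.0

def runtime_hours(draw_ma: float, capacity_mah: float = COIN_CELL_MAH) -> float:
    return capacity_mah / draw_ma

wfi_sleep = runtime_hours(12.0)   # 12 mA WFI sleep: under a day
deep_sleep = runtime_hours(1.0)   # ~1 mA deep sleep: roughly 9 days
print(f"WFI sleep: {wfi_sleep:.1f} h, deep sleep: {deep_sleep / 24:.1f} days")
```

Days rather than months or years of standby is indeed USB-dongle territory, not wearable territory.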


I didn't know there were partially open source RISC-V. I might have missed it in the article, but what was the reason for having some parts closed source?

It’s not the RISC-V core itself, it’s just some of the surrounding architecture to support the CPU, to turn it into an SoC: the USB drivers, the AXI memory interfaces, and the analog components, like PLLs for generating clocks, or even the I/O pad drivers. These components take the fully open RISC-V core, which works in a simulator, and make it work like a normal physical chip would.

MMUs have held sway for nearly 60 years, but I wonder whether, in ten years' time, when AI is the whole stack/runs the whole stack and the majority of us won't be running anything but prompts, they will still be required. I have a big interest in how AI will penetrate the hardware level, not just as a sci-fi fan/author but as an electronics engineer/programmer. I should add that I doubt AI hardware will penetrate much into the embedded market due to cost.

Big fan of this project by the way.


It seems it had hardware support for secure mesh. Anyone know what that is?

With the right equipment it is possible to probe the inside of a chip, allowing an attacker to measure or even alter internal signals down to the transistor level. Expensive, but very useful if it lets you extract a crucial shared secret.

The traditional defense against this kind of invasive attack is to put a grid of sense wires on the outermost metal layer and measure whether it has been tampered with: you can't get to the important bits without cutting through the security grid, but any kind of modification to the security grid triggers a self-destruct.
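The sensing idea can be sketched in a few lines. Real meshes drive pseudorandom patterns through the wires; this toy model deterministically exercises both logic levels so the behavior is easy to follow. It illustrates the general technique only, not the Baochip's actual circuit:

```python
# Conceptual model of an active tamper-sense mesh: patterns are driven
# through the grid of sense wires and compared at the far end; any cut
# changes what comes back. Real designs use pseudorandom patterns; this
# sketch exercises both logic levels deterministically. Illustrative
# only - not the Baochip's actual circuit.
class SecurityMesh:
    def __init__(self, n_wires: int):
        self.n_wires = n_wires
        self.cut_wires: set[int] = set()  # wires an attacker has milled through

    def tamper(self, wire: int) -> None:
        self.cut_wires.add(wire)

    def intact(self) -> bool:
        for level in (0, 1):
            driven = [level] * self.n_wires
            # A cut wire reads back stuck-at-0 in this simple model.
            sensed = [0 if i in self.cut_wires else v
                      for i, v in enumerate(driven)]
            if sensed != driven:
                return False  # in silicon: trigger key zeroization
        return True

mesh = SecurityMesh(n_wires=64)
print("fresh mesh intact:", mesh.intact())
mesh.tamper(wire=17)
print("after probing:", mesh.intact())
```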



A bit sad to see another famous hacker turning to the "dark side" --- as "security chips" are a treacherous slippery slope, no matter who controls them. Just because it's "open source" doesn't mean it's a good thing.

Edit: give Stallman's "Right to Read" another read.


On the other hand, Pandora’s box has been opened, and the double-edged sword of cryptography has been unleashed on the world. Having open source security/trust systems is valuable.


