>> RISC-V isn't magically immune to spectre for no reason.
Indeed. RISC-V is an ISA and says little about how any given implementation may accelerate branches or load cache lines. It's entirely feasible to create a CPU based on the RISC-V ISA that is vulnerable to these attacks. There are proprietary RISC-V implementations in the wild today that may or may not have such properties, and while these deeply embedded implementations are likely to be simple, in-order non-speculative processors we can't really know.
Recently the RISC-V ISA was found to have an under-specified memory model[1] that could manifest as memory ordering problems in more complex (and as yet hypothetical) implementations. This was discovered by Princeton researchers using a formal verification system. Since then RISC-V developers have been working[2] toward correcting this by fully specifying the RISC-V memory model (including two variants with different degrees of rigor).
To my mind this specific case argues that open design may see real benefits from public scrutiny.
Neither the article nor the title implies that. They are saying open is better, collaborative is better, and that RISC-V is a chance to start over with security at the forefront.
For example, to fix the Spectre attack you would need some extra "speculative cache lines", which you would then copy to real cache lines once you decided they really need to be loaded, or dump if not (having not flushed any observable cache lines). That would cost you a lot of extra silicon, which would cost you dollars. Alternatively, you could just reduce speculation, which would cost you performance. The third alternative is what we're doing now: nothing. That costs you security.
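The "speculative cache lines" idea can be sketched as a toy model. This is purely illustrative (all names and the structure are invented, not any real microarchitecture): speculative fills go into a shadow buffer, and only a commit promotes them into the observable cache, so a squashed speculation leaves no trace an attacker could probe.

```c
/* Toy model of "speculative cache lines": fills land in shadow storage
 * and become observable only if the speculation commits. Illustrative
 * sketch only; names and structure are invented for this example. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LINES 8

typedef struct {
    int  cache[LINES];         /* observable state: what a cache-timing attacker can probe */
    int  shadow[LINES];        /* speculative fills, invisible until commit */
    bool shadow_valid[LINES];
} toy_cache_t;

/* speculative fill: touches only the shadow storage */
void spec_fill(toy_cache_t *c, int line, int value) {
    c->shadow[line] = value;
    c->shadow_valid[line] = true;
}

/* branch resolved, speculation was correct: promote fills to the real cache */
void commit(toy_cache_t *c) {
    for (int i = 0; i < LINES; i++) {
        if (c->shadow_valid[i]) {
            c->cache[i] = c->shadow[i];
            c->shadow_valid[i] = false;
        }
    }
}

/* misprediction: drop the fills; observable cache state is untouched */
void squash(toy_cache_t *c) {
    memset(c->shadow_valid, 0, sizeof c->shadow_valid);
}
```

The property the toy enforces, that a squashed speculation changes nothing observable, is exactly what real speculative hardware violates in Spectre; the extra shadow storage is the "extra silicon" cost mentioned above.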
You outline only a single solution of which there are many. The thing RISC-V gets you is the ability to iterate on an implementation while still running the body of existing code. As ridiculous as it seems, this hasn't been possible before. RISC-V didn't technically need to exist but it did politically and legally.
I don't think your equation is fact or rigorous. I respect your work, but I call bull on it. And there are way more than three options; this is a false dichotomy.
I think this equation actually makes sense when only the same people/minds are changing the factors. If the same team is given the task of improving a product, chances are they will be close-minded and only able to tweak, say, security at the expense of the other two.
To break this equation, you need a new paradigm, which often comes from outsiders adding their perspective. And open source is a great way to do this.
In fact, in this case it's probably immune precisely because it's less mature. Only with advanced prefetching and speculative execution do you hit the Spectre of data that could have been.
Either way, this is the time everyone and their brother, sister, mother and child will have something to sell you, so it's not surprising the RISC-V people are out to sell their architecture now.
> this is the time everyone and their brother, sister, mother and child will have something to sell you
One can only hope! At this point we could stand to have some architectural fragmentation. I'm pretty happy to see RISC-V blatantly capitalizing on this situation.
Yes. Only when you completely eliminate the conventional kind of speculative execution [0] can you have a hope of being immune to these sorts of problems.
Why do you think such an architecture is completely invulnerable? The Summary #2 slide lists the following as a benefit of the Mill:
- Can load across protection boundaries
Valid data is usable; invalid data cannot be seen
At the very least, this probably makes ASLR pretty easy to defeat across protection boundaries. Are you really willing to bet that there will never be a more clever exploit that can determine more information? In slide 51 of the IPC talk[0] they explicitly say that they put permissions checking after the cache, and that this is essential for their performance and power claims.
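The ASLR concern above can be made concrete with a toy model. This is my own sketch, not Mill-specific: if a load across a protection boundary reports "valid" vs "invalid" instead of faulting, an attacker can simply scan the victim's address space for mapped pages and recover the randomized layout.

```c
/* Toy sketch of probing ASLR via "valid data is usable; invalid data
 * cannot be seen". Illustrative only; the page count and the layout
 * model are invented for this example. */
#include <assert.h>
#include <stdbool.h>

#define PAGES 16

/* pretend victim layout: which pages are mapped (randomized by "ASLR") */
static bool mapped[PAGES];

/* the primitive the slide describes: a cross-boundary load that either
 * yields valid data or an "invalid" marker, with no fault or trap */
bool probe_load(int page) {
    return mapped[page];
}

/* attacker: scan for the first mapped page, recovering the randomized base */
int find_victim_region(void) {
    for (int p = 0; p < PAGES; p++) {
        if (probe_load(p)) {
            return p;
        }
    }
    return -1;
}
```

In a real system the search space is much larger, but non-faulting cross-boundary probes turn ASLR from "one wrong guess crashes you" into a quiet linear scan, which is the worry stated above.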
Not really. What nobody has said AFAICT is that the attacks all stem from the problem of running untrusted code on your processor. It's not just apps or websites; we let JS code run on our machines for every little ad banner. We allow our cloud infrastructure to share cache with someone else's. These things are all done in the name of convenience. Only when security is considered a top priority will these problems really go away. We must stop running untrusted code. Even. In. A. Sandbox.
We will always have bugs. The amount of money and resource efficiency we gain from sharing is too big to ignore. These issues will be addressed, and yes, more issues will be uncovered. Our sandboxes will get better, our kernels will harden and our CPU designs will improve. It’s actually not possible to throw our hands in the air and stop as you suggest. From every mistake we grow stronger, because we don’t run away. It’s great that these issues have been uncovered. It sucks dealing with the fallout, but that is only short term.
While this is acceptable for the average application, Meltdown and Spectre should have made it clear to even the last idiot that shared infrastructure is. not. suitable. for security critical applications. Especially not for critical cryptographic applications.
And we’d have less innovation... Perhaps at some point that is what we decide we need. Well, that is one reason for those of us in the technology space to take security seriously: to help keep regulation out of the industry. As they say, it only takes one asshole to screw it all up for the rest of us...
It isn't well argued, but the point is that if you have two chip designers/ISA shops in the world, both may make the same mistake. If the ISA is open, you (hopefully) have a hundred, and not all will fool themselves the same way.
Also, if you do come up with a better way, you've a much greater chance of persuading someone to implement it.
> No announced RISC-V silicon is susceptible, and the popular open-source RISC-V Rocket processor is unaffected as it does not perform memory accesses speculatively.
True but misleading. Rocket is an in-order chip. Not only does it not perform memory accesses speculatively; it does not speculate at all.
Presumably an in-order chip is slower than the sophisticated modern out-of-order processors around nowadays, and faces marketing headwinds accordingly. It's only fair to gain a marketing tailwind from the problems caused by aggressive speculation.
A pipeline flush and an unprefetched instruction fetch on each branch (20-25% of instructions) would dramatically slow down the CPU. With branch prediction you can potentially have correctly predicted branches take zero cycles, even without speculative execution.
In established computer architecture terminology, speculative execution is reserved for speculating past branches. (Speculation meaning something that may not pan out...)
This shouldn’t be downvoted. The reason to have a branch predictor is so you can speculatively execute down the predicted path when you hit a branch. Although I think the Spectre attack requires not only speculation, but the ability to perform loads out of program order.
The direction of a branch isn't going to be known until a couple stages into the pipeline ("EX" in the classic RISC version). Ideally, the preceding stages in the pipeline will be fed correctly - that can either be the instructions immediately following the branch (as in the "branch delay slot(s)" found in some architectures) or the instructions at where the CPU thinks the branch will lead.
No one has fabricated a BOOM processor. It's a synthesizable core that runs on FPGA.
>> It definitely seems vulnerable to Spectre.
The question has been asked but there is no answer yet; it depends on the details of BOOM branch prediction and no one has made any claims yet about BOOM one way or the other.
> The question has been asked but there is no answer yet; it depends on the details of BOOM branch prediction and no one has made any claims yet about BOOM one way or the other.
It's a pretty standard out-of-order design that executes loads speculatively. The BOOM spec has a sentence beginning
"As BOOM will send speculative load instructions to the cache…"
How would such a design not be vulnerable to Spectre?
> It's a pretty standard out-of-order design that executes loads speculatively. The BOOM spec has a sentence beginning
> "As BOOM will send speculative load instructions to the cache…"
> How would such a design not be vulnerable to Spectre?
Hypothetically (I'm not familiar enough with BOOM to say whether this applies in this case or not), if the "speculation load depth" is limited to 1, then it seems it could be invulnerable? From the spectre paper, the exploit uses code like
  if (x < array1_size)
      y = array2[array1[x] * 256];
So if the speculative load depth were limited to 1, the CPU could speculatively load array1[x] and compute "array1[x] * 256", but not load "array2[array1[x] * 256]" until the branch has been resolved?
Now, I don't know enough about micro-architecture to say whether such an idea makes any sense in general. Secondly, it seems Spectre is only the first in a long series of side channel attacks that target speculation, so it might be that the hypothetical "speculation load depth limited" design I envisioned above is vulnerable to some other variation of the attack.
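The "speculation load depth = 1" policy above can be sketched as a toy issue check. This is my own interpretation, not how BOOM (or any real core) works: past an unresolved branch, the core may issue one speculative load, but any further load is held until the branch resolves, so the cache-encoding load of the gadget never executes speculatively.

```c
/* Toy model of a "speculation load depth" policy. Illustrative sketch
 * only; the struct and the one-counter policy are invented here. */
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int spec_loads_issued;  /* loads already issued past the unresolved branch */
    int max_spec_depth;     /* policy knob: 1 would block the gadget above */
} core_t;

/* returns true if a load may issue right now */
bool may_issue_load(core_t *c, bool branch_resolved) {
    if (branch_resolved) {
        return true;                       /* no speculation involved */
    }
    if (c->spec_loads_issued < c->max_spec_depth) {
        c->spec_loads_issued++;            /* e.g. the load of array1[x] */
        return true;
    }
    return false;  /* array2[array1[x] * 256] stalls until the branch resolves */
}
```

With depth 1, the first dependent load (array1[x]) still issues, but the second load, the one that encodes the secret into observable cache state, waits for the bounds check, which is the property the comment above is after.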
What does "synthesizable core" mean here? Does it simply mean that someone has written the Verilog that could actually produce a working core and made it available?
That means two things. One, that the design description ("RTL code") can be pushed through synthesis and place-and-route tools and turned into gates. This is useful for putting the design onto an FPGA or getting a piece of silicon (ASIC chip).
But it also means "not custom", as in, the gates are auto-generated from the "synthesizable RTL" rather than laid out by hand. Full custom can eke out some improvements over synthesizable logic, but it's also more expensive and can preclude testing out designs on FPGAs before manufacture.
Most modern chips are mostly synthesized with some of the more critical components (and more "regular" structures) handled using custom or semi-custom methodologies.
Verilog has some constructs that can't actually be synthesized (i.e. converted into configurations of gates) by tools, as well as coding patterns that can result in problems for synthesis. Unsynthesizable code is still used for things like test harnesses.
Perhaps they meant "announced RISC-V silicon" as in something you can buy (if not right away, at least at some specified time in the future)? AFAIU the BOOM tapeout was a test run, with no intention of giving or selling the chips to outsiders.
If we take this stricter definition, is there any RISC-V hardware out there except the SiFive ones, which are apparently simple enough that they are indeed not susceptible to these issues?
This is a piece without substance since it basically says "Oh hey, competing chips actually selling to customers have some issues. We promise we'll be better when our lab designs finally end up in your hands. Trust and fund us now!" We'll see.
Anyone interested in what a better hardware/software-security architecture for RISC-V might look like should check out these exemplary projects:
CHERI already runs FreeBSD on a FPGA machine. I previously suggested merging Rocket with a version of CHERI as a start for that reason. Then, Draper is modifying or has modified RISC-V to use SAFE architecture. They're creating a brand around it as below.
> This is a piece without substance since it basically says "Oh hey, competing chips actually selling to customers have some issues. We promise we'll be better when our lab designs finally end up in your hands. Trust and fund us now!" We'll see.
I think they're fairly clearly stating that the licensing and community arrangements around RISC-V make it an excellent testbed for high performance mitigations to these problems.
Are you concerned with Spectre 1+2 across protection domains, or in the context of running untrusted code within the same process (e.g. a JS VM)?
Untrusted code within the same process seems like it'd be harder for the CPU to do something about, since it can't know which portions of the address space are trusted and which are not.
SiFive seems to be selling the only publicly available chips as of yet. The FE310 (~300MHz 32-bit microcontroller, FreeRTOS capable) can be bought on the HiFive1 and one other dev board, or as bare chips. They say they'll ship a FU500 (~1.5GHz quad-core application processor with supervisor mode and memory protection, capable of booting Linux and FreeBSD) dev board (board called HiFive Unleashed) this quarter.
> Cache and out of order buffers (speculation) is what is behind latest security issues.
That's one way to look at it. Another is that opaque CPU designs are a big issue (at least for meltdown, which is a bug; spectre is more of a broken design philosophy).
Today there is a benefit from the higher code density of CISC: 1. Less memory bandwidth expended moving instructions into the I-cache. 2. Less die area spent on I-cache per unit of "code functionality", however you measure that.
RISC only made sense when logic cycle times, memory cycle times, and I/O cycle times were at rough parity.
If you read the rest of the thread, you'd find that both of your points are moot, as RISC-V code density is no worse than x86-64.
As for your claims on RISC not making sense, I'll just say it makes more sense than ever, as complexity leads to issues, which could affect security, and security is important in the present networked world.
Having complex instructions is not the only way to make a complex processor. With a very simple ISA, to even come close to what a modern processor can do with a certain amount of time and power requires a ridiculously complicated design.
This is why Spectre is as ubiquitous as it is. Speculative execution is not a feature someone implements just for the sake of it.
>To even come close to what a modern processor can do with a certain amount of time and power REQUIRES a ridiculously complicated design.
Requires is debatable. If someone manages to match it with a simpler design, that's an improvement. Open RISC-V implementations are already surprisingly competitive.
>With a very simple ISA,
Seems to imply that this is a minus, but what CISC really means in the present world is that there's a CISC->RISC translation layer, which is not free, and a lot of work passed down to toolchain developers, as they have to generate code for a CISC ISA.
In x86/amd64, the complexity is such that no individual can be expected to understand the whole thing, and CPU designers, low level software and compiler developers have to deal with this (and will inevitably screw up). Why this isn't seen as a massive problem baffles me.
More like a mixture of something like CISC->VLIW and CISC->"micro-ops". Remember x86 implementations can actually combine x86 ISA instruction pairs into a micro-op [0].
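The micro-op decode plus fusion described above can be sketched as a toy decoder. This is purely illustrative (the opcode names, the two-instruction window, and the enum are all invented): adjacent compare and conditional-branch instructions get combined into a single fused micro-op, the classic x86 fusion case.

```c
/* Toy macro-op fusion: decode a two-instruction window into micro-ops,
 * fusing an adjacent cmp + jcc pair into one. Illustrative sketch only;
 * opcode names and the window size are invented for this example. */
#include <assert.h>
#include <string.h>

typedef enum { UOP_CMP, UOP_JCC, UOP_CMP_JCC, UOP_OTHER } uop_t;

/* decode two adjacent instructions; returns how many micro-ops were emitted */
int decode_window(const char *i0, const char *i1, uop_t out[2]) {
    if (strcmp(i0, "cmp") == 0 && strcmp(i1, "jcc") == 0) {
        out[0] = UOP_CMP_JCC;  /* one fused micro-op for the pair */
        return 1;
    }
    out[0] = strcmp(i0, "cmp") == 0 ? UOP_CMP : UOP_OTHER;
    out[1] = strcmp(i1, "jcc") == 0 ? UOP_JCC : UOP_OTHER;
    return 2;
}
```

The payoff is that the fused pair occupies one slot in the out-of-order machinery instead of two; the cost, as noted later in the thread, is extra decoder complexity.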
RISC is just a concept that was a great fit to the constraints the industry had in the nineties. It was great for achieving high clock speeds back then. The constraints are just different nowadays.
In RISC-V parlance, it'd be the new vector extensions. They're pretty cool as they've made the width variable. Refer to 7th workshop proceedings.
>micro-ops
Are just a fancy word for a (hidden) RISC architecture. There are a bunch of RISC architectures out there, but they all get down to doing more or less the same things.
Most RISC and CISC architectures in use today are evolved from ones existing back in the 80s, and carry a lot of baggage. RISC-V benefits from having a clean slate at this, using the experience of existing ISAs.
>can actually combine x86 ISA instruction pairs into a micro-op
More complexity, and more tickets for the implementation bug lottery.
> >can actually combine x86 ISA instruction pairs into a micro-op
> More complexity, and more tickets for the implementation bug lottery.
Well, the slides (and accompanying paper) you linked to earlier in this thread make the case for using macro-op fusion in high-performance RISC-V designs.