Thanks! What kind of performance do you get with some of those more intricate pointer structures? My own research right now is focused on how to build some common abstract data types with less pointer indirection, so as to be more easily vectorizable.
Lately I've been more focused on getting the features running and less on performance, so I haven't completely explored the impact of using these structures.
I'd be interested to hear more about what you're doing. Do you have any links you could share?
I've actually only just started - I'm less than a year into my PhD - so I don't have anything published yet. My current project involves doing some non-trivial calculations on a large matrix of multi-precision integers. The obvious way to lay out the data (an array of dynamically resizable vectors) is dead wrong for the GPU - the memory bandwidth is completely swamped by fetching from a different pointer for each thread, and I got a 100x slowdown relative to the CPU reference implementation - so right now I'm redoing the experiments based on a vector of fixed-length arrays (the 0th elements of all the vectors from the first formulation go in one array, the 1st elements in the next, and so forth).
I guess if you generalize this transformation sufficiently you end up doing whole-program flattening like NESL and DPH. Or is there a less intrusive way to pull that off?
Sounds like in his case the unboxed-vector SoA representation might be enough, though I'd enjoy hearing more about this too. (I'll be doing a bunch of GPU work with Haskell this fall.)
I don't think there's a less intrusive way to pull it off, but with sufficient compiler support you should be able to make it fairly transparent to the programmer.