[Bf-committers] SSE/AVX in cloth simulator
plucinski.mariusz at gmail.com
Thu Dec 19 18:28:36 CET 2019
Thanks for the answer.
On Thu, 19 Dec 2019 at 16:44, <darkdefende at gmail.com> wrote:
> > 2. I tried explicitly enabling auto vectorization in GCC, but it didn't
> > change much. Is that normal? If not, which flags should be used?
> The biggest issue with the current cloth sim is that it computes the
> simulation in a strictly single-threaded way.
> Other programs and games use newer methods that are multi threaded and
> can in some cases even be run on the GPU with large speed gains.
> Our method, while slow, is not bad in terms of simulation accuracy. In some
> cases it is even better than some of the multi threaded methods. But
> I don't really think many people care about simulation accuracy. They
> just want nice looking simulations.
From the user PoV, I agree that accuracy seems to be pretty okay,
especially since the improvements introduced in 2.8. But to achieve good
results you either need an unnaturally big collision distance (although that
is acceptable in many situations) or have to increase the collision and
simulation quality (which is where performance issues show up).
> > 3. If there's no other way, I may be ready to try to rewrite critical parts
> > of the simulator for SSE/AVX. In such case, could you give me a guideline
> > on how to do it correctly (with ultimate merge into master in mind)? I
> > noticed Cycles code has #ifdefs for SSE support inside its mathematical
> > functions, would similar approach be useful here?
> While I think you might be able to speed this up a bit by doing this, I
> don't think this is the way to go. The bottleneck will still be the
> inherent single-threaded nature of our cloth solver.
I definitely agree that replacing the algorithm by one able to utilize
multiple cores or the GPU would be more beneficial. At the same time, it's
simply much more work. My idea was rather aimed at a low-hanging fruit:
better utilization of a single core with the algorithm we have right now,
instead of the bigger effort of a total overhaul.
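
To make the idea concrete, here is a minimal sketch of the kind of #ifdef
guarded code path I have in mind. It is not taken from Blender or Cycles: the
function name integrate_positions and the flat float-array layout are
assumptions for illustration only, and the __SSE__ check relies on the macro
that GCC and Clang define (other compilers would need a different test).

```cpp
#include <cassert>
#include <cstddef>

#ifdef __SSE__
#  include <xmmintrin.h>  // SSE intrinsics (_mm_loadu_ps, _mm_mul_ps, ...)
#endif

// Hypothetical helper, not part of the cloth simulator:
// computes pos[i] += vel[i] * dt for n floats.
void integrate_positions(float *pos, const float *vel, float dt, std::size_t n)
{
  assert(pos != nullptr && vel != nullptr);
  std::size_t i = 0;
#ifdef __SSE__
  const __m128 vdt = _mm_set1_ps(dt);
  // Process four floats per iteration. Unaligned loads and stores keep the
  // sketch simple and make no assumption about buffer alignment.
  for (; i + 4 <= n; i += 4) {
    __m128 p = _mm_loadu_ps(pos + i);
    const __m128 v = _mm_loadu_ps(vel + i);
    p = _mm_add_ps(p, _mm_mul_ps(v, vdt));
    _mm_storeu_ps(pos + i, p);
  }
#endif
  // Scalar tail, which is also the whole loop when SSE is not available.
  for (; i < n; ++i) {
    pos[i] += vel[i] * dt;
  }
}
```

A loop this simple is something the compiler may already vectorize on its
own, so an explicit version would only pay off in places where
auto-vectorization gives up.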
> I've been looking into integrating one of the newer multi-threaded cloth
> solvers. Namely https://www.cs.utah.edu/~ladislav/liu13fast/liu13fast.html
> and a modified version that builds on top of the previous paper:
> For the first paper (Fast Simulation of Mass-Spring Systems), I've
> noticed that quite a few people have implemented this and uploaded the
> code under the MIT licence to GitHub:
> I was planning to see if it is able to simulate cloth fast and with
> nice results.
> If you want to help me with this I would be really happy.
That sounds like an interesting idea! As usual, time may be a problem, but
I would be happy to help anyway.
Another problem on my side is an almost total lack of knowledge in the area
of simulations, which is also why I was thinking about low-level
optimizations of the existing algorithm in the first place.
How far along are you with your solution?