[Bf-cycles] correlation between GTX680 dual-precision FP and cycles performance?

Brecht Van Lommel brechtvanlommel at pandora.be
Tue Mar 27 20:14:32 CEST 2012


I basically don't know at this point, it requires careful analysis of
the code running on such a GPU. Maybe it is a matter of tweaking some
parameters, or maybe the new scheduling really is a problem that is very
difficult to overcome. At least two other GPU raytracers seem to run
slower on it than the GTX 580, so that is worrying, but I just have
not done any testing or analysis of Cycles running on this GPU.

We also don't use 64-bit precision, so that should have no influence.


On Tue, Mar 27, 2012 at 4:26 PM, Matt Gray <mjg at see3d.co.uk> wrote:
> Hi Brecht,
> Hoping for your comment on my speculation as to why the GTX680 is not
> the Cycles powerhouse a lot of people hoped it would be:
> http://blenderartists.org/forum/showthread.php?239480-2.61-Cycles-render-benchmark/page17
> It is possible the performance deficit results from nothing more than a
> lack of software optimisation (CUDA 3.0 for the GTX680?), but the
> anandtech review of the card listed quite a few other examples where the
> 680 has been 'detuned' as far as compute is concerned.
> The brush-stroke summary being that the GTX 680 is Nvidia's first
> 'efficient' architecture in a long time, at least for gaming, precisely
> because a lot of the heavy-lifting silicon for compute purposes was
> removed. This would not be unexpected on what is supposed to be a
> mid-range GPU like the 460/560 generation, as it was rumoured to be
> before a spot of re-branding occurred.
> A few quotes from anandtech:
> " The CUDA FP64 block contains 8 special CUDA cores that are not part of
> the general CUDA core count and are not in any of NVIDIA’s diagrams.
> These CUDA cores can only do and are only used for FP64 math. What's
> more, the CUDA FP64 block has a very special execution rate: 1/1 FP32.
> With only 8 CUDA cores in this block it takes NVIDIA 4 cycles to execute
> a whole warp, but each quarter of the warp is done at full speed as
> opposed to ½, ¼, or any other fractional speed that previous
> architectures have operated at. Altogether GK104’s FP64 performance is
> very low at only 1/24 FP32 (1/6 * ¼), but the mere existence of the CUDA
> FP64 block is quite interesting because it’s the very first time we’ve
> seen 1/1 FP32 execution speed. Big Kepler may not end up resembling
> GK104, but if it does then it may be an extremely potent FP64 processor
> if it’s built out of CUDA FP64 blocks."
> "So NVIDIA has replaced Fermi’s complex scheduler with a far more
> simpler scheduler that still uses scoreboarding and other methods for
> inter-warp scheduling, but moves the scheduling of instructions in a
> warp into NVIDIA’s compiler. In essence it’s a return to static
> scheduling. Ultimately it remains to be seen just what the impact of
> this move will be. Hardware scheduling makes all the sense in the world
> for complex compute applications, which is a big reason why Fermi had
> hardware scheduling in the first place, and for that matter why AMD
> moved to hardware scheduling with GCN."
> "What makes this launch particularly interesting if not amusing though
> is how we’ve ended up here. Since Cypress and Fermi NVIDIA and AMD have
> effectively swapped positions. It’s now AMD who has produced a higher
> TDP video card that is strong in both compute and gaming, while NVIDIA
> has produced the lower TDP part that is similar to the Radeon HD 5870
> right down to the display outputs."
> So my questions:
> 1. Does Cycles use dual precision FP (FP64?)?
> 2. If not, does the poor performance result from the scheduler and other
> architectural deficiencies?
> 3. If yes, how much of the poor performance derives from the lack of
> dual-precision grunt?
> 4. Or, are we jumping the gun branding the GTX680 as poor, and optimised
> builds will surprise?
> Many thanks
> mjg
