[Bf-cycles] CMJ causing an infinite loop

Lukas Stockner lukas.stockner at freenet.de
Mon Apr 14 17:37:39 CEST 2014


The new backtrace is below. In it, p0 is the p that was passed to the 
cmj_sample_1D/2D routines; it is renamed because p was already used in 
the cmj_sample arguments.
I just tried the exact same scene with 2000000 samples instead of 
2000000000, and it worked perfectly. I had set the sample count that 
high so that rendering would only end through adaptive stopping, but 
2000000000 is ridiculously high, so this issue probably won't ever 
affect a regular user.
Regarding the adaptive sampling: I think the problem is that 
quasi-random numbers aren't completely random. With the base-2 
Van der Corput sequence, for example, every even sample is < 0.5 and 
every odd sample is >= 0.5, so taking only the even samples would give 
badly biased results. For now, I've solved it by adding a separate PRNG 
dimension that decides whether a sample goes into the "even" buffer 
(>= 0.5) or not (< 0.5). As long as the individual Sobol dimensions are 
uncorrelated, it should work fine now. Indeed, from what I can tell, it 
works just as well as CMJ sampling (without the hangs, of course).
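
To illustrate the even/odd problem, here is a minimal standalone sketch. 
It is not the actual Cycles code: compare_splits and SplitMeans are 
hypothetical names, and std::mt19937 only stands in for the separate 
PRNG dimension. It compares the mean of the even-indexed Van der Corput 
samples with the mean of a half chosen by an independent random value.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Base-2 Van der Corput radical inverse: mirror the bits of i around
// the binary point, so bit 0 of i becomes the 0.5 digit of the result.
static double van_der_corput(uint32_t i)
{
    i = (i << 16) | (i >> 16);
    i = ((i & 0x00ff00ffu) << 8) | ((i & 0xff00ff00u) >> 8);
    i = ((i & 0x0f0f0f0fu) << 4) | ((i & 0xf0f0f0f0u) >> 4);
    i = ((i & 0x33333333u) << 2) | ((i & 0xccccccccu) >> 2);
    i = ((i & 0x55555555u) << 1) | ((i & 0xaaaaaaaau) >> 1);
    return i * (1.0 / 4294967296.0);
}

struct SplitMeans {
    double all;         // mean over every sample
    double even_index;  // mean over samples with an even index
    double random_half; // mean over samples picked by an independent value
};

// Hypothetical helper for illustration only; not part of Cycles.
static SplitMeans compare_splits(uint32_t num_samples, uint32_t seed)
{
    assert(num_samples >= 2);

    std::mt19937 rng(seed); // assumption: stand-in for the extra PRNG dimension
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    double sum_all = 0.0, sum_even = 0.0, sum_random = 0.0;
    uint32_t n_even = 0, n_random = 0;

    for (uint32_t i = 0; i < num_samples; i++) {
        const double x = van_der_corput(i);
        sum_all += x;

        // Split by index parity: only ever sees values below 0.5.
        if ((i & 1u) == 0) {
            sum_even += x;
            n_even++;
        }

        // Split by an independent random value: unrelated to the value of x.
        if (uniform(rng) >= 0.5) {
            sum_random += x;
            n_random++;
        }
    }

    SplitMeans m;
    m.all = sum_all / num_samples;
    m.even_index = sum_even / n_even;
    m.random_half = n_random > 0 ? sum_random / n_random : 0.0;
    return m;
}
```

Calling compare_splits(1u << 16, 12345u) should give a mean near 0.5 for 
all samples and for the random half, but near 0.25 for the even-indexed 
half, which is why the even buffer and the full buffer never converge to 
the same result when the split follows the sample index.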

(gdb) info threads
...
   32   Thread 0x7fffc038f700 (LWP 15038) "blender" cmj_permute 
(p0=1808237007, N=2000000000, s=1750424544, p=333699077, l=44721, 
i=1416588325)
     at 
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:101
   31   Thread 0x7fffc0e96700 (LWP 15037) "blender" cmj_permute 
(p0=1163952146, N=2000000000, s=1460230019, p=2102387334, l=44721, 
i=2571159536)
     at 
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:108
   30   Thread 0x7fffc199d700 (LWP 15036) "blender" cmj_permute 
(p0=-291720934, N=2000000000, s=1897020005, p=3686984926, l=44721, 
i=2529640769)
     at 
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:105
   29   Thread 0x7fffc24a4700 (LWP 15035) "blender" cmj_permute 
(p0=-919820061, N=2000000000, s=856720172, p=2893564769, l=44721, 
i=684596378)
     at 
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:109
...
(gdb) thread 29
...
(gdb) backtrace
#0  cmj_permute (p0=-919820061, N=2000000000, s=856720172, p=2893564769, 
l=44721, i=684596378) at 
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:109
#1  ccl::cmj_sample_2D (s=856720172, N=N@entry=2000000000, p=-919820061, 
fx=fx@entry=0x7fffc24852a0, fy=fy@entry=0x7fffc24852e0) at 
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:174
#2  0x00000000036358cf in path_rng_2D (fy=0x7fffc24852e0, 
fx=0x7fffc24852a0, dimension=20, num_samples=2000000000, 
sample=<optimized out>, rng=0x7fffc2496b00, kg=0x7fffc2497610)
     at 
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_random.h:149
#3  path_state_rng_2D (state=0x7fffc24861c0, state=0x7fffc24861c0, 
state=0x7fffc24861c0, fy=0x7fffc24852e0, fx=0x7fffc24852a0, dimension=0, 
rng=0x7fffc2496b00, kg=0x7fffc2497610)
     at 
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_random.h:279
...


> I haven't seen this before, can you figure out the values s, N and p
> that are causing this problem? Or attach a .blend file with the
> problem? I can't see it from your backtrace. As long as 0 <= s < N it
> should not hang, but this permutation function is of course pretty
> tricky so it's difficult to spot the bug from just the code.
>
> For adaptive sampling, Sobol is supposed to be good at this I thought.
> But I guess you look at the even samples because that's easy to do
> memory efficient compared to looking at the first half of the samples,
> and makes it possible to stop at any time rather than each time the
> number of samples doubles. Perhaps you can probabilistically use
> either the even or odd sample, with the random number based on a hash
> of the sample number and pixel xy? CMJ should be fixed of course, this
> is just an idea.
>
>
>
> On Sat, Apr 12, 2014 at 6:36 PM, Lukas Stockner
> <lukas.stockner at freenet.de> wrote:
>> Hi,
>> while working on adaptive stopping I noticed that sometimes tiles just
>> freeze when using CMJ sampling. They are marked as active, but don't
>> return from the kernel.
>> After running in GDB, stopping and printing out the thread states, it
>> seems that they enter an infinite loop in cmj_permute(). The entire
>> thread states are below, 5 rendering threads work fine, 3 are hanging.
>> The "kernel_write_pass_data" is one of my changes, but it's completely
>> unrelated to sampling and works correctly with Sobol.
>> Is this a known problem? I can't imagine that my changes cause it to
>> break since I changed nothing related to CMJ. Sadly, I can't just use
>> Sobol instead since it gives correlation problems (the stopping works on
>> the difference between even passes and all passes, and with Sobol they
>> don't converge to the same result. A solution would be to decide whether
>> to add a sample to the "even buffer" with a new RNG sample).
>>
>> Lukas Stockner
>>
>> (gdb) info threads
>>     Id   Target Id         Frame
>>     44   Thread 0x7fffb1e21700 (LWP 25975) "blender" cmj_permute
>> (p=1102306825, l=44721, i=1576503139) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:107
>>     43   Thread 0x7fffb2eb6700 (LWP 25974) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     42   Thread 0x7fffb39bd700 (LWP 25973) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     41   Thread 0x7fffb44c4700 (LWP 25972) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     40   Thread 0x7fffb4fcb700 (LWP 25971) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     39   Thread 0x7fffb5ad2700 (LWP 25970) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     38   Thread 0x7fffb65d9700 (LWP 25969) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     37   Thread 0x7fffb70e0700 (LWP 25968) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     36   Thread 0x7fffbd3cf700 (LWP 25967) "blender" 0x0000000003637a1d
>> in kernel_write_pass_data (L=..., evenSample=true, writeVarData=true,
>> writeConstData=false, weight=1, sample=258,
>>       buffer=0x60a2028558d0, pd=0x7fffbd3c1d20, kg=0x7fffbd3c2610) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_passes.h:148
>>     35   Thread 0x7fffbded6700 (LWP 25966) "blender" cmj_permute
>> (p=2243010963, l=44721, i=3758030672) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:104
>>     34   Thread 0x7fffbe9dd700 (LWP 25965) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>>     33   Thread 0x7fffbf4e4700 (LWP 25964) "blender" 0x00000000034ee00f
>> in cmj_permute (p=3837528855, l=44721, i=3522537535) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:102
>>     32   Thread 0x7fffbffeb700 (LWP 25963) "blender" 0x00000000034f24f2
>> in ccl::bvh_intersect_instancing (kg=kg@entry=0x7fffbffde610,
>> ray=ray@entry=0x7fffbffccd50, isect=isect@entry=0x7fffbffcc8d0,
>>       visibility=visibility@entry=256) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_bvh_traversal.h:194
>>     31   Thread 0x7fffc0af2700 (LWP 25962) "blender" 0x00000000034ee00f
>> in cmj_permute (p=2537997320, l=44721, i=1494394943) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:102
>>     30   Thread 0x7fffc15f9700 (LWP 25961) "blender" 0x0000000003532c5d
>> in fetch (this=0xc110, index=<optimized out>) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_compat_cpu.h:43
>>     29   Thread 0x7fffc1dfa700 (LWP 25960) "blender" cmj_permute
>> (p=3687792016, l=44721, i=4285244509) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:98
>>     28   Thread 0x7fffd0f27700 (LWP 25958) "blender" 0x00007ffff2ef71f8
>> in pthread_join () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     27   Thread 0x7fffc8e63700 (LWP 25957) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     26   Thread 0x7fffc996a700 (LWP 25956) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     25   Thread 0x7fffca471700 (LWP 25955) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     24   Thread 0x7fffcaf78700 (LWP 25954) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     23   Thread 0x7fffcba7f700 (LWP 25953) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     22   Thread 0x7fffcc586700 (LWP 25952) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     21   Thread 0x7fffcd08d700 (LWP 25951) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>     20   Thread 0x7fffd256e700 (LWP 25950) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>>     19   Thread 0x7fffd3075700 (LWP 25949) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>>     18   Thread 0x7fffd3b7c700 (LWP 25948) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>>     17   Thread 0x7fffd4683700 (LWP 25947) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>>     16   Thread 0x7fffd518a700 (LWP 25946) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>>     15   Thread 0x7fffd5c91700 (LWP 25945) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>>     14   Thread 0x7fffd6798700 (LWP 25944) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>>     12   Thread 0x7fffdcc00700 (LWP 25941) "blender" 0x00007ffff2efd41d
>> in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     11   Thread 0x7fffe1a0e700 (LWP 25940) "threaded-ml"
>> 0x00007fffeef1af7d in poll () from /lib/x86_64-linux-gnu/libc.so.6
>>     9    Thread 0x7fffe253c700 (LWP 25938) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     8    Thread 0x7fffe3043700 (LWP 25937) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     7    Thread 0x7fffe3b4a700 (LWP 25936) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     6    Thread 0x7fffe4651700 (LWP 25935) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     5    Thread 0x7fffe5158700 (LWP 25934) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     4    Thread 0x7fffe5c5f700 (LWP 25933) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     3    Thread 0x7fffe6766700 (LWP 25932) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>     2    Thread 0x7fffe726d700 (LWP 25931) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> * 1    Thread 0x7ffff7fb87c0 (LWP 25927) "blender" 0x00007fffdb6bc36c in
>> ?? () from /usr/lib/x86_64-linux-gnu/dri/nouveau_dri.so
>>
>> _______________________________________________
>> Bf-cycles mailing list
>> Bf-cycles at blender.org
>> http://lists.blender.org/mailman/listinfo/bf-cycles