[Bf-cycles] CMJ causing an infinite loop
Lukas Stockner
lukas.stockner at freenet.de
Mon Apr 14 17:37:39 CEST 2014
The new backtrace is below. p0 is the p passed to the cmj_sample_1D/2D
routines, since p was already used in the cmj_sample arguments.
I just tried the exact same scene with 2000000 samples instead of
2000000000, and it worked perfectly. I had set the sample count that high so
that rendering would only stop through adaptive stopping, but 2000000000 is
ridiculously high, so this issue probably won't ever affect a
regular user.
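For readers unfamiliar with the loop in question: the sketch below shows the general "cycle walking" pattern, assuming cmj_permute follows the permutation scheme from Kensler's CMJ construction (a hash on the next power-of-two range, repeated until the result is below l). The mix() function here is a hypothetical stand-in and is not the hash used in kernel_jitter.h; the names next_pow2_mask, mix and permute are made up for illustration.

```python
def next_pow2_mask(l):
    """Smallest mask of the form 2**k - 1 that covers l - 1 (for l <= 2**32)."""
    w = l - 1
    for shift in (1, 2, 4, 8, 16):
        w |= w >> shift
    return w


def mix(i, p, w):
    """Hypothetical keyed hash: every step is a bijection on [0, w]."""
    p &= 0xFFFFFFFF
    i ^= p & w                      # xor with a constant is invertible
    i = (i * 0x9E3779B1 + p) & w    # odd multiplier plus add, mod 2**k
    i ^= i >> 3                     # right xorshift is invertible
    i = (i * 0x85EBCA6B) & w        # another odd multiplier, mod 2**k
    i ^= i >> 5
    return i


def permute(i, l, p):
    """Map i in range(l) to a permuted index in range(l), keyed by p."""
    if not 0 <= i < l:
        raise ValueError("index must satisfy 0 <= i < l")
    w = next_pow2_mask(l)
    while True:                     # the loop that must eventually exit
        i = mix(i, p, w)
        if i < l:
            return i


if __name__ == "__main__":
    print([permute(i, 10, 12345) for i in range(10)])
```

For a loop of this shape, termination rests on two things: mix() must be a bijection on [0, w], and the starting index must already lie in [0, l). If the start lies in [l, w], the walk can land on a cycle that never enters [0, l) and then never exits.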
Regarding the adaptive sampling: I think the problem is that
quasi-random numbers aren't completely random. With the
van der Corput sequence, for example, every even sample is < 0.5 and
every odd sample is >= 0.5, so taking only the even samples would give
badly wrong results. For now, I've solved it by adding a separate PRNG
dimension that decides whether the sample goes into the "even"
buffer (>= 0.5) or not (< 0.5). As long as the individual Sobol
dimensions are uncorrelated, it should work just fine now. Indeed, from
what I can tell, it works just as well as CMJ sampling (without the
hangs, of course).
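A small sketch of the parity problem described above, and of the random assignment that avoids it. Everything here is illustrative: the integrand f(x) = x, the sample count, and the use of Python's random module as a stand-in for the extra PRNG dimension are assumptions, not the code from the patch.

```python
import random


def van_der_corput(i):
    """Base-2 radical inverse of the non-negative integer i, in [0, 1)."""
    result, scale = 0.0, 0.5
    while i:
        if i & 1:
            result += scale
        i >>= 1
        scale *= 0.5
    return result


def mean(values):
    return sum(values) / len(values)


NUM_SAMPLES = 4096
samples = [van_der_corput(i) for i in range(NUM_SAMPLES)]

# Estimate the integral of f(x) = x over [0, 1), which is 0.5.
all_mean = mean(samples)

# "Even buffer" chosen by index parity: sees only samples below 0.5.
parity_mean = mean(samples[0::2])

# "Even buffer" chosen by an independent random number per sample.
rng = random.Random(0)  # stand-in for the separate PRNG dimension
random_buffer = [x for x in samples if rng.random() >= 0.5]
random_mean = mean(random_buffer)

if __name__ == "__main__":
    print("all samples:     ", all_mean)
    print("parity buffer:   ", parity_mean)
    print("random buffer:   ", random_mean)
```

The parity-based buffer converges to about 0.25 rather than 0.5, so comparing it against the full buffer never reports convergence, while the randomly assigned buffer estimates the same value as the full set.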
(gdb) info threads
...
32 Thread 0x7fffc038f700 (LWP 15038) "blender" cmj_permute
(p0=1808237007, N=2000000000, s=1750424544, p=333699077, l=44721,
i=1416588325)
at
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:101
31 Thread 0x7fffc0e96700 (LWP 15037) "blender" cmj_permute
(p0=1163952146, N=2000000000, s=1460230019, p=2102387334, l=44721,
i=2571159536)
at
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:108
30 Thread 0x7fffc199d700 (LWP 15036) "blender" cmj_permute
(p0=-291720934, N=2000000000, s=1897020005, p=3686984926, l=44721,
i=2529640769)
at
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:105
29 Thread 0x7fffc24a4700 (LWP 15035) "blender" cmj_permute
(p0=-919820061, N=2000000000, s=856720172, p=2893564769, l=44721,
i=684596378)
at
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:109
...
(gdb) thread 29
...
(gdb) backtrace
#0 cmj_permute (p0=-919820061, N=2000000000, s=856720172, p=2893564769,
l=44721, i=684596378) at
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:109
#1 ccl::cmj_sample_2D (s=856720172, N=N@entry=2000000000, p=-919820061,
fx=fx@entry=0x7fffc24852a0, fy=fy@entry=0x7fffc24852e0) at
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:174
#2 0x00000000036358cf in path_rng_2D (fy=0x7fffc24852e0,
fx=0x7fffc24852a0, dimension=20, num_samples=2000000000,
sample=<optimized out>, rng=0x7fffc2496b00, kg=0x7fffc2497610)
at
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_random.h:149
#3 path_state_rng_2D (state=0x7fffc24861c0, state=0x7fffc24861c0,
state=0x7fffc24861c0, fy=0x7fffc24852e0, fx=0x7fffc24852a0, dimension=0,
rng=0x7fffc2496b00, kg=0x7fffc2497610)
at
/home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_random.h:279
...
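As a quick sanity check on the values in the frames above: l=44721 is consistent with the floor of the square root of N, which is what one would expect if the samples are laid out on a roughly square m x n grid as in Kensler's CMJ construction (that layout is an assumption here, not something the backtrace proves). The sketch also checks that each s shown in the four frames satisfies 0 <= s < N.

```python
import math

N = 2000000000

# Grid dimensions, assuming m = floor(sqrt(N)) and n = ceil(N / m).
m = math.isqrt(N)
n = (N + m - 1) // m

# s values copied from the four cmj_permute frames in the thread list.
observed_s = [1750424544, 1460230019, 1897020005, 856720172]
all_in_range = all(0 <= s < N for s in observed_s)

if __name__ == "__main__":
    print("m =", m, "n =", n, "m * n >= N:", m * n >= N)
    print("all observed s in [0, N):", all_in_range)
```

So the reported values do not violate the 0 <= s < N precondition; this only rules out that one explanation and says nothing about the actual cause.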
> I haven't seen this before, can you figure out the values s, N and p
> that are causing this problem? Or attach a .blend file with the
> problem? I can't see it from your backtrace. As long as 0 <= s < N it
> should not hang, but this permutation function is of course pretty
> tricky so it's difficult to spot the bug from just the code.
>
> For adaptive sampling, Sobol is supposed to be good at this I thought.
> But I guess you look at the even samples because that's easy to do in a
> memory-efficient way compared to looking at the first half of the samples,
> and it makes it possible to stop at any time rather than only each time the
> number of samples doubles. Perhaps you can probabilistically use
> either the even or odd sample, with the random number based on a hash
> of the sample number and pixel xy? CMJ should be fixed of course, this
> is just an idea.
>
>
>
> On Sat, Apr 12, 2014 at 6:36 PM, Lukas Stockner
> <lukas.stockner at freenet.de> wrote:
>> Hi,
>> while working on adaptive stopping I noticed that sometimes tiles just
>> freeze when using CMJ sampling. They are marked as active, but don't
>> return from the kernel.
>> After running it in GDB, stopping and printing out the thread states, it
>> seems that they enter an infinite loop in cmj_permute(). The full
>> thread states are below: 5 rendering threads work fine, 3 are hanging.
>> The "kernel_write_pass_data" is one of my changes, but it's completely
>> unrelated to sampling and works correctly with Sobol.
>> Is this a known problem? I can't imagine that my changes cause it to
>> break, since I changed nothing related to CMJ. Sadly, I can't just use
>> Sobol instead, since it gives correlation problems: the stopping works on
>> the difference between even passes and all passes, and with Sobol they
>> don't converge to the same result. (A solution would be to decide whether
>> to add a sample to the "even buffer" with a new RNG sample.)
>>
>> Lukas Stockner
>>
>> (gdb) info threads
>> Id Target Id Frame
>> 44 Thread 0x7fffb1e21700 (LWP 25975) "blender" cmj_permute
>> (p=1102306825, l=44721, i=1576503139) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:107
>> 43 Thread 0x7fffb2eb6700 (LWP 25974) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 42 Thread 0x7fffb39bd700 (LWP 25973) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 41 Thread 0x7fffb44c4700 (LWP 25972) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 40 Thread 0x7fffb4fcb700 (LWP 25971) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 39 Thread 0x7fffb5ad2700 (LWP 25970) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 38 Thread 0x7fffb65d9700 (LWP 25969) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 37 Thread 0x7fffb70e0700 (LWP 25968) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 36 Thread 0x7fffbd3cf700 (LWP 25967) "blender" 0x0000000003637a1d
>> in kernel_write_pass_data (L=..., evenSample=true, writeVarData=true,
>> writeConstData=false, weight=1, sample=258,
>> buffer=0x60a2028558d0, pd=0x7fffbd3c1d20, kg=0x7fffbd3c2610) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_passes.h:148
>> 35 Thread 0x7fffbded6700 (LWP 25966) "blender" cmj_permute
>> (p=2243010963, l=44721, i=3758030672) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:104
>> 34 Thread 0x7fffbe9dd700 (LWP 25965) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> 33 Thread 0x7fffbf4e4700 (LWP 25964) "blender" 0x00000000034ee00f
>> in cmj_permute (p=3837528855, l=44721, i=3522537535) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:102
>> 32 Thread 0x7fffbffeb700 (LWP 25963) "blender" 0x00000000034f24f2
>> in ccl::bvh_intersect_instancing (kg=kg@entry=0x7fffbffde610,
>> ray=ray@entry=0x7fffbffccd50, isect=isect@entry=0x7fffbffcc8d0,
>> visibility=visibility@entry=256) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_bvh_traversal.h:194
>> 31 Thread 0x7fffc0af2700 (LWP 25962) "blender" 0x00000000034ee00f
>> in cmj_permute (p=2537997320, l=44721, i=1494394943) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:102
>> 30 Thread 0x7fffc15f9700 (LWP 25961) "blender" 0x0000000003532c5d
>> in fetch (this=0xc110, index=<optimized out>) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_compat_cpu.h:43
>> 29 Thread 0x7fffc1dfa700 (LWP 25960) "blender" cmj_permute
>> (p=3687792016, l=44721, i=4285244509) at
>> /home/lukas/BlenderSrc/blender/intern/cycles/kernel/kernel_jitter.h:98
>> 28 Thread 0x7fffd0f27700 (LWP 25958) "blender" 0x00007ffff2ef71f8
>> in pthread_join () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 27 Thread 0x7fffc8e63700 (LWP 25957) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 26 Thread 0x7fffc996a700 (LWP 25956) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 25 Thread 0x7fffca471700 (LWP 25955) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 24 Thread 0x7fffcaf78700 (LWP 25954) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 23 Thread 0x7fffcba7f700 (LWP 25953) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 22 Thread 0x7fffcc586700 (LWP 25952) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 21 Thread 0x7fffcd08d700 (LWP 25951) "blender" 0x00007fffee61fb42
>> in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> 20 Thread 0x7fffd256e700 (LWP 25950) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> 19 Thread 0x7fffd3075700 (LWP 25949) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> 18 Thread 0x7fffd3b7c700 (LWP 25948) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> 17 Thread 0x7fffd4683700 (LWP 25947) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> 16 Thread 0x7fffd518a700 (LWP 25946) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> 15 Thread 0x7fffd5c91700 (LWP 25945) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> 14 Thread 0x7fffd6798700 (LWP 25944) "blender" 0x00007ffff2ef9c84
>> in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> 12 Thread 0x7fffdcc00700 (LWP 25941) "blender" 0x00007ffff2efd41d
>> in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 11 Thread 0x7fffe1a0e700 (LWP 25940) "threaded-ml"
>> 0x00007fffeef1af7d in poll () from /lib/x86_64-linux-gnu/libc.so.6
>> 9 Thread 0x7fffe253c700 (LWP 25938) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 8 Thread 0x7fffe3043700 (LWP 25937) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 7 Thread 0x7fffe3b4a700 (LWP 25936) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 6 Thread 0x7fffe4651700 (LWP 25935) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 5 Thread 0x7fffe5158700 (LWP 25934) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 4 Thread 0x7fffe5c5f700 (LWP 25933) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 3 Thread 0x7fffe6766700 (LWP 25932) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> 2 Thread 0x7fffe726d700 (LWP 25931) "blender" 0x00007ffff2efbf60
>> in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> * 1 Thread 0x7ffff7fb87c0 (LWP 25927) "blender" 0x00007fffdb6bc36c in
>> ?? () from /usr/lib/x86_64-linux-gnu/dri/nouveau_dri.so
>>
>> _______________________________________________
>> Bf-cycles mailing list
>> Bf-cycles at blender.org
>> http://lists.blender.org/mailman/listinfo/bf-cycles