GPU Cache Population Crashing with 4 GPUs... fine with just 1

3,213 Views
30 Replies

Message 1 of 31

Eden_Soto2
Not applicable

4x 1080 Ti's... C4D R21 up-to-date, latest NVIDIA Studio Driver, latest C4DtoA plugin, here's the log...

gpu_cache_population_log.txt

I've deleted the cache in AppData\Local\NVIDIA\OptixCache\arnold-6.0.1.0_driver-441.66, but each time I try to run it with all four GPUs in the system, it fails (see screenshot below for final crash entry)... I delete the cache before each new attempt.
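
If it helps to see exactly what I'm doing, here's roughly what that cache wipe looks like scripted in Python (just a sketch; the folder name comes from the path above and will change with the Arnold and driver versions):

import shutil
from pathlib import Path

# OptiX kernel cache for this Arnold/driver combo (path from above)
cache_dir = Path.home() / "AppData" / "Local" / "NVIDIA" / "OptixCache" / "arnold-6.0.1.0_driver-441.66"

# Wipe the cached kernels so the next attempt starts from a clean slate
if cache_dir.exists():
    shutil.rmtree(cache_dir)
    print(f"Deleted {cache_dir}")
else:
    print("Cache folder not found, nothing to delete")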

If I run it with just one GPU in the machine, it completes successfully... with all four in, it's a guaranteed crash.

For what it's worth, I can render with all 4 GPUs no problem with Redshift, so the GPUs themselves are fine.

Would love to hear if anyone knows why this could be happening and/or how to get around it.

5604-annotation-2020-01-07-154700.png

0 Likes
Accepted solutions (1)
Replies (30)
Message 21 of 31

Eden_Soto2
Not applicable

Thanks, will try it out now

0 Likes
Message 22 of 31

Eden_Soto2
Not applicable

Opened an existing project I'd rendered with the CPU and switched it to GPU... after a long period of seemingly nothing happening, C4D quit to the desktop. Emptied the NVIDIA cache folders and tried again... another crash. So it's still a no-go for me using the GPUs in C4DtoA.

0 Likes
Message 23 of 31

thiago.ize
Autodesk
Autodesk

Thanks for trying it out. It looks like this is likely a different bug from the one we fixed -- that fix should help Dante out.

0 Likes
Message 24 of 31

thiago.ize
Autodesk
Autodesk

I don't know if this will help, but NVIDIA released a new driver yesterday (442.19), so it might be worth a shot.

0 Likes
Message 25 of 31

Anonymous
Not applicable

Hi everyone - just to add my experience again. It seems the new GPU code still isn't handling textures as well as CPU mode. I pulled my 2080 Tis and went down to just two RTX 2080 Tis (with NVLink to double the texture memory pool -- will that work?) and large scenes still crash the renderer 100% of the time. I was under the impression that the new GPU code would MIP and tile down all textures and load them on demand - is that still the case?
I have all the latest drivers (including the new 442.19). All the debug modes work fine (subd, displacement, etc.), but anything with more than a few textures is a guaranteed crash. The CPU handles them great.
I just saw that a new Arnold version is available to download - I'll try this ASAP!
Also of note, I'm using textures on references and reference instances. I'll post a few logs tonight. If anyone has any ideas, I'd love to hear them - thanks!

0 Likes
Message 26 of 31

Anonymous
Not applicable

Hello Thiago! I'll try this out ASAP - thanks!!

0 Likes
Message 27 of 31

thiago.ize
Autodesk
Autodesk

The GPU will still use more texture memory than the CPU. This is a known issue and one we're actively working on. Last I checked, NVLink does not double the texture memory available, though I think one of the future NVIDIA driver updates should fix that.

Yes, GPU loads on demand, but it currently loads much more texture data than CPU. For now you'll have to use the workaround of setting a max texture resolution in the render settings.

Hopefully with this new build you'll be able to render on multiple GPUs with textures without problems, provided everything fits in memory.

0 Likes
Message 28 of 31

Anonymous
Not applicable

Thiago-
First, thanks for being so responsive, answering these threads, and being awesome. All of my RTX 2080 Tis are the 11GB variety. Sorry for being a bit clueless, but is there a specific line in the log files where the user can see the texture memory hit on the card? I tried the same scene last night with the max texture size in the Systems tab capped to 256 - it actually got to first pixel before crashing, which was encouraging.
Does the GPU engine (minus overhead) still MIP the TXs down by judging distance from the camera?

0 Likes
Message 29 of 31

thiago.ize
Autodesk
Autodesk

Make sure you enable sufficient verbosity in the log file (see the link to the right of this page for help on log files). Then, at the end of a successful render (we still need to add a way to report memory used during a failed render), you can see the memory used -- in this case, 3.5GB of texture memory on the GPU:

3:30 164MB | peak GPU memory consumed 5197.00MB
3:30 164MB |  output buffers            79.33MB
3:30 164MB |  geometry                 133.37MB
3:30 164MB |   polymesh                133.37MB
3:30 164MB |  texture cache           3552.00MB
3:30 164MB |  unaccounted             1432.30MB
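
If you want to pull those numbers out of a batch of logs, a quick Python sketch like this works (it just scans for the summary lines shown above; adjust the patterns to taste):

import re
import sys

# Matches lines like: "3:30 164MB |  texture cache           3552.00MB"
pattern = re.compile(r"\|\s+(peak GPU memory consumed|texture cache)\s+([\d.]+)MB")

for path in sys.argv[1:]:
    with open(path) as log:
        for line in log:
            match = pattern.search(line)
            if match:
                print(f"{path}: {match.group(1)} = {match.group(2)} MB")
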
0 Likes
Message 30 of 31

thiago.ize
Autodesk
Autodesk

Yes, the GPU uses mipmapping, but it's missing some other important optimizations, which is why it's not as efficient as the CPU. We are in the process of adding the missing optimizations, so in the future we hope memory usage will be more comparable between CPU and GPU.
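
To give a rough sense of the numbers involved: texture memory grows with the square of the resolution, and a full mip chain only adds about a third on top of the base level, which is why the max texture resolution cap mentioned above makes such a big difference. A back-of-the-envelope Python sketch (assuming uncompressed 8-bit RGBA and ignoring tiling and padding overhead):

# Approximate memory for a single texture with a full mip chain
def mip_chain_mb(resolution, bytes_per_pixel=4):
    total = 0
    while resolution >= 1:
        total += resolution * resolution * bytes_per_pixel
        resolution //= 2
    return total / (1024 * 1024)

for res in (8192, 4096, 1024, 256):
    print(f"{res:>5} px cap: ~{mip_chain_mb(res):.1f} MB per texture")

# Roughly: 8192 -> 341 MB, 4096 -> 85 MB, 1024 -> 5.3 MB, 256 -> 0.3 MB per texture,
# so capping the resolution can be the difference between fitting in 11GB and crashing.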

0 Likes
Message 31 of 31

thiago.ize
Autodesk
Autodesk

We've now released Arnold 6.0.2.0, which fixes a hang/crash when doing the GPU cache pre-population on machines with lots of cores. I suspect this might have been the problem Eden was experiencing.

Hopefully this, plus the previous Arnold release which fixed the multi-GPU texture hang, should solve the multi-GPU issues reported in this Arnold Answers question.

0 Likes