Using Deadline render manager for burn (or any 3rd party render manager)

danny_chambers · ‎01-14-2025

Why we did it?

We were having a lot of stability issues with Backburner at scale.

Deadline gives far more control over resource sharing than the simple Backburner groups.

It allows other software to use the expensive GPU nodes for rendering when they are idle.

It can run post jobs after a burn, e.g. making a QT from an EXR render.

How we did it?

We use a Flame batch hook that runs when a burn job is submitted. It grabs the burn tar file, extracts it, and submits a Deadline job.
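A minimal sketch of the "grab the tar and extract it" step, using only the Python standard library. The function name, the per-job destination folder, and the assumption that the job XML can be found by searching the extracted files are illustrative and are not taken from the attached scripts.

```python
import tarfile
from pathlib import Path


def extract_burn_tar(tar_path, dest_root):
    """Extract a burn job tar into its own folder and return any XML files found."""
    tar_path = Path(tar_path)
    # One folder per job, named after the tar file without its suffix.
    dest = Path(dest_root) / tar_path.stem
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as tar:
        tar.extractall(dest)
    # Assumption: the job description is an .xml file somewhere in the archive.
    return sorted(dest.rglob("*.xml"))
```

The returned XML path is what would then be handed to the Deadline submission.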

Backburner is still set up and running as normal, and it handles all the background caching. Burn jobs still get submitted to it, using a burn group with no nodes, and every night we clear out the BBM queue.

How AD can make it better, in order of importance:

Have burn_gpu always complete with an exit code of 0 or 1 for success or failure. Currently we have to tail the burn shell log file to catch errors and fail tasks.

Right now we haven't figured out how to catch background caching jobs or timeline rendering jobs. For timeline renders we do see the same burn tars, and we are able to run burn_gpu with them, which looks like it renders frames, but it never updates the Flame timeline.

Have burn_gpu output its log to stdout. Logs are a bit messy in Deadline: they contain the current task's log plus each previous task's.

Add the ability to disable Backburner completely.
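For anyone wondering what "tailing the log to catch errors" can look like, here is a rough sketch. The error patterns are hypothetical placeholders, not strings known to appear in burn logs, and the function name is made up for illustration. It reads only the bytes appended since the last check, so it can be called repeatedly while a task runs.

```python
import re

# Hypothetical failure markers; replace with strings seen in your own burn logs.
ERROR_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (r"\berror\b", r"\bfailed\b")]


def find_log_errors(log_path, start_offset=0):
    """Return (matching lines, new offset) for text appended since start_offset."""
    with open(log_path, "rb") as handle:
        handle.seek(start_offset)
        data = handle.read()
    new_text = data.decode("utf-8", errors="replace")
    hits = [line for line in new_text.splitlines()
            if any(p.search(line) for p in ERROR_PATTERNS)]
    # The caller passes this offset back in on the next poll.
    return hits, start_offset + len(data)
```

A wrapper could fail the task as soon as `hits` is non-empty. This is inherently fragile, which is why a reliable exit code is first on the list above.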

To set up:

Standard Deadline setup, nothing special needed on the service side.

(scripts attached)

flame.py can live on central storage for burn nodes to call.

batch_hook.py and deadline_chunks.py live in the Flame Python folder.

On all hosts that will run burn:

Add an ACL to /opt/Autodesk/log/ so the calling user can write to it.

On the burn manager, edit this script to open up the Jobs folder so the calling user can pull burn tars. An ACL wasn't enough.

/opt/Autodesk/backburner/scripts/.systemd/adsk_backburner_manager.sh

......

chmod -R 777 /opt/Autodesk/backburner/Network/Jobs

exit 0
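A quick way to confirm the permissions above took effect is to run a small check as the user Deadline runs under. This is only a sketch: the function name is made up, and the two default paths are the ones mentioned in this post.

```python
import os


def check_burn_access(log_dir="/opt/Autodesk/log",
                      jobs_dir="/opt/Autodesk/backburner/Network/Jobs"):
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = []
    # Writing new log files needs write and search permission on the folder.
    if not os.access(log_dir, os.W_OK | os.X_OK):
        problems.append("cannot write to %s" % log_dir)
    # Pulling burn tars needs read and search permission on the Jobs folder.
    if not os.access(jobs_dir, os.R_OK | os.X_OK):
        problems.append("cannot read %s" % jobs_dir)
    return problems


if __name__ == "__main__":
    for problem in check_burn_access():
        print(problem)
```

Run the log check on each burn host and the Jobs check wherever the Jobs folder is reachable.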

Hoping to rally some support to make this function better. With the added ability to easily share render resources w/ other apps, any size company could make use of this.

slabrie · ‎01-15-2025

WOW! This is very impressive! That would make a great Logik Live event!

Stéphane Labrie
Senior Product Owner
Flame Family

AlanINS · ‎02-18-2025

Hi Danny..... this is my response from the Logik forum. I presented some of our in-house infrastructure tools centered around BackBurner at the Flame UG meeting last week.

Interesting… but we also use Backburner to render Nuke, Maya, and other things, all of which can also utilize the render nodes and with GPU. Now yes, Backburner doesn’t have much fancy resource utilization, and we aren’t huge scale. ~20 render nodes and ~12 workstations, and all can participate in a render job. I get why Danny did that, but he also has much more financial/development resources than most other places. I really appreciate BackBurner’s simplicity and flexibility, even if it is missing some fancier features. With our in-house toolset, we’ve been able to expand it just enough, to make it kind of perfect for our use case.

richard_kimUJXAV · ‎02-28-2025

Is the flame.py file only for flame2025 or can it run on flame2024?

danny_chambers · ‎02-28-2025

Hey Richard, you can mod the files to work w/ 2024, or 2026.

just change this to match whatever burnGPURenderer you have in /opt/Autodesk/backburner/Adapters/
# Set the burnGPURenderer file to use burnGPURenderer_2025.1
adapter_dir = '/opt/Autodesk/backburner/Adapters/'
renderer_file = os.path.join(adapter_dir, 'burnGPURenderer_2025.1')
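If you are not sure which adapter versions a host has, something like the following lists them and picks one by version string. The helper names are made up for this example. Only the directory and the burnGPURenderer_ prefix come from the snippet above.

```python
import glob
import os

ADAPTER_DIR = "/opt/Autodesk/backburner/Adapters/"


def find_burn_adapters(adapter_dir=ADAPTER_DIR):
    """Return the burnGPURenderer_* adapter paths found in adapter_dir, sorted by name."""
    return sorted(glob.glob(os.path.join(adapter_dir, "burnGPURenderer_*")))


def pick_adapter(version, adapter_dir=ADAPTER_DIR):
    """Return the adapter path for an exact version string such as '2025.1'."""
    wanted = os.path.join(adapter_dir, "burnGPURenderer_" + version)
    if not os.path.exists(wanted):
        available = [os.path.basename(p) for p in find_burn_adapters(adapter_dir)]
        raise FileNotFoundError("%s not found; available: %s" % (wanted, available))
    return wanted
```

Failing loudly with the list of installed adapters makes a version mismatch obvious in the Deadline task log.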

aoflame · ‎03-05-2025

@danny_chambers Thank you for this.

to clarify:

Place deadline_chunks.py and batch_hook.py in /opt/Autodesk/shared/python.
1. change path
  1. deadline.py
    1. last_update_file to '/INSTALLS/ADSK/.flamestore/userdata/%s/%s.deadline.*' (any dependencies?)
    2. os.system('rm /INSTALLS/ADSK/.flamestore/userdata/%s/%s.deadline.*' % (user_id, project))
  2. batch_hook.py
    1. is looking for post_job.py (missing from your .zip)
2. flame.py
  1. is placed in a centralized location where a burn node can access it
    1. run flame.py
      1. syntax: python3 /path/path/flame.py [batch] [first frame] [last frame]
        What is batch? Can you define that argument?

danny_chambers · ‎03-05-2025

Ah dang, yeah, forgot to remove that post_job option. I'll include that file here. We were using it to create a post-render job: if certain criteria were met, it would create a QT from an EXR write render on the SAN.

safe to remove that from the batch_hook.py though

-prop PostJobScript=/Volumes/MY_SAN/Resources/Engineering/Flame/post_job.py

last_update_file = '%s/.flamestore/userdata/%s/%s.deadline.*' % (server, username, project_name)

this is a central file location that holds the deadline "frames per task". when using deadline_chunks.py in flame, it sets the "frames per task" to a file in the shared location.

The first %s is the shared location root, set earlier in the file (this exists since we have multiple locations / SAN vols):

if "-location2" in host:
    server = "/Volumes/MY_SAN2"
else:
    server = "/Volumes/MY_SAN"

which also correlates to deadline_chunks.py, where it sets up that file:
if dialog == "Confirm":
    user_id = os.getlogin()
    # The chosen "frames per task" value becomes the file's suffix.
    last_update_file = '/Volumes/MY_SAN/.flamestore/userdata/%s/%s.deadline.%s' % (user_id, project, info['name'].split()[0])
    try:
        os.mkdir(os.path.dirname(last_update_file))
    except OSError:
        # Folder already exists.
        pass
    # Clear any previous marker for this project, then write the new one.
    os.system('rm /Volumes/MY_SAN/.flamestore/userdata/%s/%s.deadline.*' % (user_id, project))
    os.system('touch %s' % last_update_file)
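On the reading side, the batch hook has to turn that marker file back into a chunk size. A hedged sketch of how that could be done: it assumes the suffix after ".deadline." is a plain number, and both the function name and the fallback of 5 are made up for illustration.

```python
import glob
import os


def read_frames_per_task(server, username, project_name, default=5):
    """Return the chunk size encoded in the newest <project>.deadline.<N> marker."""
    pattern = "%s/.flamestore/userdata/%s/%s.deadline.*" % (server, username, project_name)
    # Newest marker last, in case a stale one was left behind.
    markers = sorted(glob.glob(pattern), key=os.path.getmtime)
    if not markers:
        return default
    suffix = markers[-1].rsplit(".", 1)[-1]
    # Assumption: the suffix is numeric; anything else falls back to the default.
    return int(suffix) if suffix.isdigit() else default
```

For example, a marker named `MyProject.deadline.10` under the user's folder would give a chunk size of 10.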

flame.py
1. is placed in a centralized location where a burn node can access it
  1. run flame.py
    1. syntax: python3 /path/path/flame.py [batch] [first frame] [last frame]
      1. what is batch or define the argument

So this flame.py command is actually what deadline is going to run on the burn node. the batch_hook.py submits the job like this:
os.popen("/opt/Thinkbox/Deadline10/bin/deadlinecommand -SubmitCommandLineJob -executable /Volumes/MY_SAN/Resources/Engineering/Flame/flame.py -arguments '<QUOTE>%s<QUOTE> <STARTFRAME> <ENDFRAME>' '-frames' '0-%d' -chunksize %s -name '%s' -pool %s -group %s -prop SecondaryPool=%s -prop TaskTimeoutMinutes=15 -prop PostJobScript=/Volumes/MY_SAN/Resources/Engineering/Flame/post_job.py -prop ExtraInfo0='%s'" % (xml, int(frames), chunk, description, pools[os.uname()[1]], groups[os.uname()[1]], secondary_pool, os.path.basename(batch[0]))).read()

/opt/Thinkbox/Deadline10/bin/deadlinecommand -SubmitCommandLineJob -executable /usr/local/bin/flame.py -arguments '<QUOTE>/Volumes/MY_SAN/JOBS/Engineering/.burn/Burn_GPU_mvp-flame-flame14_250305_16.06.35/Burn_GPU_mvp-flame-flame14_250305_16.06.35.xml<QUOTE> <STARTFRAME> <ENDFRAME>' '-frames' '0-178' -chunksize 05 -name 'BURN_TEST_v002' -pool pool3 -group flame -prop SecondaryPool=all -prop TaskTimeoutMinutes=15 -prop PostJobScript=/Volumes/MVP/Resources/Engineering/Flame/post_job.py -prop ExtraInfo0='5EBCE4B2 Burn_GPU_mvp-flame-flame14_250305_16.06.35'
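The same submission can be built as an argument list and run with subprocess instead of one long formatted string, which avoids shell quoting problems with paths and job names. This is a sketch, not the attached batch_hook.py: the function names are made up, only flags that appear in the command above are used, and the optional PostJobScript and ExtraInfo0 properties are left out.

```python
import subprocess

DEADLINECOMMAND = "/opt/Thinkbox/Deadline10/bin/deadlinecommand"


def build_submit_args(xml, frames, chunk, name, pool, group, secondary_pool,
                      executable="/usr/local/bin/flame.py"):
    """Build the deadlinecommand argument list for one burn job."""
    return [
        DEADLINECOMMAND, "-SubmitCommandLineJob",
        "-executable", executable,
        # Deadline substitutes <QUOTE>, <STARTFRAME> and <ENDFRAME> per task.
        "-arguments", "<QUOTE>%s<QUOTE> <STARTFRAME> <ENDFRAME>" % xml,
        "-frames", "0-%d" % int(frames),
        "-chunksize", str(chunk),
        "-name", name,
        "-pool", pool,
        "-group", group,
        "-prop", "SecondaryPool=%s" % secondary_pool,
        "-prop", "TaskTimeoutMinutes=15",
    ]


def submit(args):
    """Run deadlinecommand and return its text output; raises if it exits non-zero."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout
```

Because no shell is involved, a job name containing spaces or quotes is passed through as a single argument.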

deadline will then run this command on the burn node for ex:

/usr/local/bin/flame.py "/Volumes/MY_SAN/JOBS/Engineering/.burn/Burn_GPU_mvp-flame-flame14_250305_16.06.35/Burn_GPU_mvp-flame-flame14_250305_16.06.35.xml" 110 114
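So the three arguments are the job XML path, the first frame of the task, and the last frame of the task. A sketch of how a wrapper script could parse them with argparse follows. The argument names are illustrative and are not taken from the attached flame.py.

```python
import argparse
import sys


def parse_task_args(argv):
    """Parse the three positional arguments Deadline passes for each task."""
    parser = argparse.ArgumentParser(description="Per-task burn wrapper (sketch)")
    parser.add_argument("xml", help="path to the extracted burn job XML")
    parser.add_argument("start_frame", type=int, help="first frame of this task")
    parser.add_argument("end_frame", type=int, help="last frame of this task")
    args = parser.parse_args(argv)
    if args.end_frame < args.start_frame:
        parser.error("end frame is before start frame")
    return args


if __name__ == "__main__":
    task = parse_task_args(sys.argv[1:])
    print("burn %s frames %d-%d" % (task.xml, task.start_frame, task.end_frame))
```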

I hope that helps!

aoflame · ‎05-02-2025

@danny_chambers thanks for this.

danny_chambers · ‎05-16-2025

Made some updates to the scripts. Biggest one being no longer having to tail the burn logs to try and catch errors. It can now pull a proper 0/1 exit from the burn_gpu process for failures. Otherwise, cleaned up the batch hook to provide a bit more commenting, uses functions and more error reporting/catching.
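For readers adapting this to another render manager, the general pattern of passing a child process's exit code through looks roughly like the sketch below. It is not the updated script itself: the function name is made up, and the real burn_gpu command line is not shown here. It assumes the render manager marks a task as failed when the wrapper exits non-zero.

```python
import subprocess
import sys


def run_and_propagate(command):
    """Run command, stream its output to our stdout, and return its exit code."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    # Echo each line as it arrives so it shows up in the task log.
    for line in proc.stdout:
        sys.stdout.write(line)
    return proc.wait()
```

A wrapper would end with something like `sys.exit(run_and_propagate(burn_command))`, where `burn_command` is the argument list for the burn process.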
