Optimizing an MPxLocator

jmreinhart · ‎06-19-2020

I have been working on a locator that draws a subset of faces from a mesh.

It's working fairly well, but it's slower than I would like and I'm not sure what I can do to optimize it. There is one part of the code that I know can be faster: at the moment I need to convert the MPointArray of vertex positions into an MVectorArray and then back again, because pulling an MPointArray out of the MUserData causes a crash (which seems to me like a Maya bug). I've also tried using the pre-evaluate method so that I can use parallel evaluation, but I run into crashes when it tries to compute a plug with a giant index (even though there's no connection to such a plug). Any help would be greatly appreciated.

jmreinhart · ‎06-20-2020

I was able to solve the pre-evaluate issue.

https://forums.autodesk.com/t5/maya-programming/parallel-evaluation-evaluating-the-wrong-plug/td-p/9...

The MVectorArray to MPointArray conversion is still present though, and now that I'm using Parallel evaluation I'm trying to suss out some viewport update issues. I won't mark this as solved until I get those all ironed out and then I'll post the optimized code.

jmreinhart · ‎06-20-2020

I have attached a video showing the update issue. The "WorldMatrixModifiedCallback" that I was using to draw the locator in worldSpace instead of having it move with the locator is not being called when in parallel evaluation mode.

jmreinhart · ‎06-21-2020

When I use a control not in the EG then it uses DG evaluation, and you can see in the script editor that it runs prepareForDraw and addUIDrawables every frame, and worldMatrix callback is called properly and does trigger a redraw.

void meshLocator::trigger_redraw(MObject &obj, MDagMessage::MatrixModifiedFlags &modified, void* clientData)
{
	MGlobal::displayInfo("redraw triggered by worldMatrix callback");
	MHWRender::MRenderer::setGeometryDrawDirty(obj);
}

If I use a control that has keys and is therefore part of the EG, then it uses parallel evaluation and runs prepareForDraw and addUIDrawables only once; the worldMatrix callback is called properly but does NOT trigger a redraw.

No idea how to solve this, but hopefully since I've narrowed it down someone with more experience with this part of Maya will have a fix.

jmreinhart · ‎06-21-2020

http://download.autodesk.com/us/company/files/2019/UsingParallelMaya.html

Handles requests for evaluation at all levels of the plug tree. While the DG can request plug values at any level, the EM always requests the root plug. For example, for plug N.gp[0].p[1] your compute() method must handle requests for evaluation of N.gp, N.gp[0], N.gp[0].p, and N.gp[0].p[1].

So during parallel evaluation the root array plug trianglePoints was being requested, not the individual elements. I thought this could be causing the problem, because I wasn't recomputing the output unless the plug being computed was an element. However, after changing the compute method to compute all the element plugs when the root plug is requested, the problem still persisted.

jmreinhart · ‎06-21-2020

Another potential source of the problem (from the MPxDrawOverride documentation)

If true, this override will always be updated (via prepareForDraw() or addUIDrawables()) without checking the dirty state of the Maya object. To avoid any unnecessary performance overhead due to the frequency of calling the update methods, the flag can be set to false. In this case the update methods will only be called when the Maya object is marked dirty via DG evaluation or dirty messages. To explicitly mark an object as being dirty the MRenderer::setGeometryDrawDirty() method can be used. Default is true.

I tried setting isAlwaysDirty to true, and that does make Maya call the prepareForDraw and addUIDrawables methods on every frame, but the data that is pulled out of the plug in the prepareForDraw method does not change. This is definitely turning out to be a stumper.

jmreinhart · ‎06-22-2020

So I found another forum post with the same sort of plug update problem.

https://forums.autodesk.com/t5/maya-programming/dg-vs-parallel-evaluation-update-issues/td-p/9018971

I implemented the workaround from that post (adding a dummy output, making the input I need to update affect the dummy output, and querying that input in the data block).

MStatus meshLocator::compute(const MPlug &plug, MDataBlock &data)
{

	data.inputValue(trianglePoints).data();
	
	data.setClean(plug);
	return MS::kSuccess;
}

This workaround does make everything function, but pulling the data out of the dataBlock on every frame is something I'd like to avoid if possible, since I'm already getting the data via a plug in the prepareForDraw method every frame.

Also this method still requires me to have isAlwaysDirty set to true, which is undesirable for performance.

jmreinhart · ‎06-23-2020

So another limitation of the workaround from the above post is that you need to have the dummy output connected to another DG node, which is extremely undesirable.

jmreinhart · ‎06-23-2020

So after some more testing, it appears that the node the locator is connected to evaluates in DG mode on the first frame (causing the update we see), but the locator does not update once evaluation switches to parallel on subsequent frames.

This seems to be confirmed by comparing the value being output with the value being input using getAttr:

print cmds.getAttr('subMesher1.trianglePoints[0]')[0][1]
print cmds.getAttr('meshLocator1.trianglePoints')[0][1]

123.97666204
129.275381505

Highlighting the plug in the node editor also shows the two different values.

jmreinhart · ‎06-23-2020

The destination plug is the one that is always out of date. I would have expected that this compute method (using the dummy output) would get me the up-to-date value.

MStatus meshLocator::compute(const MPlug &plug, MDataBlock &data)
{
	MStatus status;
	if (plug == dummy)
	{
		MDataHandle test = data.inputValue(trianglePoints, &status );
		MObject trianglePoints_val_mObj = test.data();
		MFnPointArrayData trianglePoints_getter(trianglePoints_val_mObj);
		MPointArray trianglePoints_val = trianglePoints_getter.array();
		if (trianglePoints_val.length() > 0)
		{
			MGlobal::displayInfo(MString("calculate value:") + trianglePoints_val[0][1]);
		}
		MDataHandle outputHandle = data.outputValue(plug);
		outputHandle.setBool(false);
	}
	data.setClean(plug);
	return MS::kSuccess;
}

That expectation was based on this part of the MDataBlock documentation:

Gets a handle to this data block for the given plug's data. The data represented by the handle is guaranteed to be valid for reading. If the data is from a dirty connection, then the connection will be evaluated. If no connection is present, then the value that the plug has been set to will be returned. If the plug has not been set to a particular value, then the default value will be returned

But I guess that is not true when dealing with parallel evaluation.

jmreinhart · ‎06-24-2020

The plug on the destination node is marked clean before I query the data using inputValue in the compute method, and it is clean afterward as well.

If I duplicate the destination node and connect the same output to the same input on the duplicate node, the value is correct.

How could meshLocator be getting an out of date value from the dataBlock?

They are evaluating in the proper order according to the profiler.

jmreinhart · ‎06-24-2020

I found a solution to that issue, thanks to the help of a coworker. I was setting the dummy output clean in the MPxLocator's compute method. By removing this and instead just returning kUnknownParameter, it works as intended. This does still require the locator to be set to isAlwaysDirty (which I'm hoping to avoid for performance), but it does not require the dummy output.

So there are two remaining avenues for optimization that I know of.

1. Resolving the MPointArray to MVectorArray conversion bug. I've submitted a bug report to Autodesk regarding this issue.

2. Getting the MPxLocators to draw in parallel.

Right now the computing of the nodes is very fast, but the drawing is slow. Resolving issue 1 will significantly increase performance of the drawing, but if I could make them draw in parallel that would be great. I did a quick look in the documentation for parallel drawing but didn't find anything and I did a simple test with two meshes that suggests it's not possible. Maybe I could draw them all with one DG node but that would break selection.

It's faster than before, but I wanna go faster.

jmreinhart · ‎06-25-2020

Found another performance improvement by switching to getRawPoints instead of getPoints.

I don't know if the pointer would still be valid on the next frame or if I do need to get it every frame like I do now.

There is a getRawNormals method but because it returns per-vertex per-face normals instead of just per-vertex normals, it seems faster to stick with getNormals (for my node specifically). If my understanding is incorrect please let me know.

Drawing in parallel appears to be a dead end unfortunately. There's no mention of it in the documentation and I can't get it to occur even in a simple scene with no custom nodes.

One idea I came across was using a kdTree

https://vimeo.com/311008901

I very vaguely understand what a kdTree is for, but I don't understand it enough to see how it could be implemented here. I guess it would replace the std::vector?
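For reference, here is a minimal sketch of what a k-d tree does in general: it reorders a std::vector of points so that nearest-point queries can skip most of the data. It is not taken from the video or from my node; `KdTree`, `Point3` and `squaredDistance` are hypothetical names, and plain std::array stands in for Maya's point types.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Point3 = std::array<double, 3>;

double squaredDistance(const Point3& a, const Point3& b) {
    double sum = 0.0;
    for (int i = 0; i < 3; ++i) sum += (a[i] - b[i]) * (a[i] - b[i]);
    return sum;
}

// Hypothetical helper: an implicit k-d tree stored inside a reordered std::vector.
class KdTree {
public:
    explicit KdTree(std::vector<Point3> points) : pts_(std::move(points)) {
        build(0, pts_.size(), 0);
    }

    // Returns the stored point closest to `query`. The tree must not be empty.
    Point3 nearest(const Point3& query) const {
        assert(!pts_.empty());
        std::size_t best = 0;
        double bestDist = std::numeric_limits<double>::max();
        search(0, pts_.size(), 0, query, best, bestDist);
        return pts_[best];
    }

private:
    // Move the median of [lo, hi) along the current axis to the middle, then recurse.
    void build(std::size_t lo, std::size_t hi, int depth) {
        if (hi - lo <= 1) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        const int axis = depth % 3;
        std::nth_element(pts_.begin() + lo, pts_.begin() + mid, pts_.begin() + hi,
                         [axis](const Point3& a, const Point3& b) { return a[axis] < b[axis]; });
        build(lo, mid, depth + 1);
        build(mid + 1, hi, depth + 1);
    }

    // Visit the half containing the query first; visit the other half only if
    // the splitting plane is closer than the best match found so far.
    void search(std::size_t lo, std::size_t hi, int depth, const Point3& q,
                std::size_t& best, double& bestDist) const {
        if (lo >= hi) return;
        const std::size_t mid = lo + (hi - lo) / 2;
        const double d = squaredDistance(pts_[mid], q);
        if (d < bestDist) { bestDist = d; best = mid; }
        const int axis = depth % 3;
        const double diff = q[axis] - pts_[mid][axis];
        if (diff < 0.0) {
            search(lo, mid, depth + 1, q, best, bestDist);
            if (diff * diff < bestDist) search(mid + 1, hi, depth + 1, q, best, bestDist);
        } else {
            search(mid + 1, hi, depth + 1, q, best, bestDist);
            if (diff * diff < bestDist) search(lo, mid, depth + 1, q, best, bestDist);
        }
    }

    std::vector<Point3> pts_;
};
```

So in this form it does not really replace the std::vector; it is a std::vector kept in a particular order. Whether that helps here depends on whether the node actually does nearest-point searches, which this sketch simply assumes.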

Multi-threading seems like a dead end unfortunately.

There's a few tweaks I know I can make just to clean everything up since there's probably bits of code left over from all my testing, but that won't change the performance much.

jmreinhart · ‎06-26-2020

I was looking at Stephen Candell's implementation and wondering if switching to deforming a mesh instead of drawing a mesh might actually be faster. The general logic is that I would be dealing with fewer points, because I would not need to get the position for each point of each triangle but only for each vertex. This can make a big difference: a patch of four faces has only 9 vertices, but 4 faces * 2 triangles per face * 3 points per triangle = 24 points (more than twice as many point calculations). The difference keeps increasing the more faces you have, roughly following the function (6*(x-1)^2)/(x^2), where x is the number of vertices along one side of a square patch, which means it's 1.5-6.0 times slower.
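The arithmetic above can be checked with a small sketch. It assumes a square patch with x vertices per side (so (x-1)^2 quad faces, each split into two triangles); the function names are made up for illustration.

```cpp
#include <cassert>

// Unique vertices in a square patch with x vertices per side.
constexpr int sharedVertexCount(int x) { return x * x; }

// Points fetched when every triangle gets its own three points:
// (x-1)^2 faces * 2 triangles per face * 3 points per triangle.
constexpr int perTrianglePointCount(int x) { return 6 * (x - 1) * (x - 1); }

// How many times more point queries the per-triangle approach needs.
double queryRatio(int x) {
    assert(x >= 2);  // need at least one face
    return static_cast<double>(perTrianglePointCount(x)) / sharedVertexCount(x);
}
```

For the four-face patch, x = 3 gives 24 points against 9 vertices. A single face (x = 2) gives a ratio of 1.5, and the ratio approaches 6 as the patch grows.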

Before I threw out all my work for that sweet 33% speed increase I checked the mesh draw method to make sure there wasn't a way to use the same point twice. In fact, there is!

The optional index array specifies the order in which the vertex positions (and their corresponding normals and colors) should be drawn. Vertices can be reused by having their indices appear multiple times, so the index array may be longer (or shorter) than the other three arrays.

If the index array is not provided then the vertices will be drawn in the order in which they appear in the positions array.

So now I'll store a non-repeating set of vert IDs for each set of faces so I don't query the same points for adjacent triangles. And I'll add an output attribute that contains the list of ids that the draw method needs.
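A minimal sketch of that bookkeeping, assuming the faces arrive as a flat list of mesh vertex IDs with three entries per triangle. `IndexedTriangles` and `buildIndexedTriangles` are hypothetical names, and std::vector stands in for the Maya array types the node would actually use.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

struct IndexedTriangles {
    std::vector<int> uniqueVertIds;     // mesh vertex ids, each listed once
    std::vector<unsigned int> indices;  // 3 entries per triangle, pointing into uniqueVertIds
};

IndexedTriangles buildIndexedTriangles(const std::vector<int>& triangleVertIds) {
    assert(triangleVertIds.size() % 3 == 0);  // whole triangles only
    IndexedTriangles out;
    std::unordered_map<int, unsigned int> slotOf;  // mesh vertex id -> position in uniqueVertIds
    out.indices.reserve(triangleVertIds.size());
    for (int vertId : triangleVertIds) {
        auto it = slotOf.find(vertId);
        if (it == slotOf.end()) {
            // First time this vertex is seen: give it the next free slot.
            const unsigned int slot = static_cast<unsigned int>(out.uniqueVertIds.size());
            it = slotOf.emplace(vertId, slot).first;
            out.uniqueVertIds.push_back(vertId);
        }
        out.indices.push_back(it->second);
    }
    return out;
}
```

Only the positions for `uniqueVertIds` need to be queried each frame; `indices` changes only when the face set changes, so it can be computed once and passed along as the index array.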

jmreinhart · ‎06-27-2020

So I've made the switch to rawPoints, and I'm being more efficient with how many point queries I do. Surprisingly, I only got a small improvement in performance, and the draw is slightly slower because we need to get more data out of the MUserData. The draw still has two major flaws: I'm converting an MPointArray to an MVectorArray and back (to avoid a bug), and now I'm also converting an MIntArray to an MUintArray. Can you have an MUintArray type attribute? The documentation doesn't mention that as an option.

I also have a check in the prepareForDraw to see if the object is selected so I can highlight it

MSelectionList sel;
MGlobal::getActiveSelectionList(sel);

// get the locator's transform
MFnDagNode mFnDagNode(objPath);
MObject parent_mObj = mFnDagNode.parent(0);
MDagPath parent_dagPath;
MDagPath::getAPathTo(parent_mObj, parent_dagPath);

// check if the locator (or its transform) is selected
if (sel.hasItem(objPath) || sel.hasItem(parent_dagPath))
{
	...
}

This is the best method I could come up with but I'm hoping there's a better one.

Speed-wise it's usable for a simple face rig, but it's too slow to be used for a full body rig.

jmreinhart · ‎06-29-2020

https://www.jonah-reinhart.com/single-post/2020/06/27/Direct-Manipulation---MPxLocator

The code and a video talking about the end product can be found here. Sorry for sort of plugging my own work like this.
