Unit tests for FlowGraphs


Viewing 7 posts - 1 through 7 (of 7 total)
  • #1641
    kalms
    Participant

    We are beginning to see that we need unit tests for some of our FlowGraphs. Not C#-level tests; rather, the person who creates a FlowGraph in the visual editor should sometimes also create a few test cases using visual tools.

    I have written a bit about testing of visual programming languages[link]. The last section, “Tests for sequence graphs”, would apply almost directly to FlowCanvas.

    Is this something that is on the FlowCanvas roadmap at all? We can perhaps build something ugly that works ourselves – but in the long run, I think some feature, like special ‘test outputs’, and a special test runner for all test cases, would make this much easier.

    #1652
    Gavalakis
    Keymaster

    Hello and happy new year!

    Thanks a lot for the link. I will check it out more thoroughly in the coming days.
    Something like this was not really part of the roadmap, but I can see how it could be useful, so I will take a look and see what can be done and how it could be implemented in FlowCanvas, in one way or another!

    Thanks.

    Join us on Discord: https://discord.gg/97q2Rjh

    #2376
    derkork
    Participant

    I actually stumbled over this issue this morning myself, so I figured I’d necro this thread. I read @kalms’ document and it provided some insights on how you could approach testing such graphs in theory. I’d like to know if anyone could share some experience of how testing flow graphs in a real-world project could be done (ideally in an automated fashion). Some best practices on how to set up flow graphs to make them more testable would also be helpful.

    #2377
    kalms
    Participant

    We used FlowCanvas for scripting the execution of individual abilities in a turn-based strategy game. Toward the end we had ~200 unique abilities. In order to get test coverage, we built a generalized test rig, and ~1300 test cases in total. The test cases showed up in the Unity Test Runner, and our build system ran the entire suite on every commit. The plumbing took a lot of time to build but it was well worth it for us.

    Our initial approach was to allow non-C# programmers to create the test rigs from scratch as FlowGraphs, and then have a generalized lightweight “test rig runner” in C# wrapped around them. However, after a while we realized that in our situation, it scaled better if we created one single test rig, and a large number of data descriptions. A data description consisted of four sections:

    1. a reference to the FlowGraph to be run
    2. a description of the setup configuration of the rig (the list of things to do to the system before starting the FlowGraph)
    3. a list of events to feed into the FlowGraph at certain points in time
    4. a description of post-conditions (the list of things to check in the system after the FlowGraph has completed execution)

    Since we needed only one single test rig, and it became fairly complicated with all that parameterization, we wrote the test rig in C#.
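    The four sections above, plus the single shared rig, can be sketched roughly as follows. This is a minimal illustration in Python rather than the C# and ScriptableObject setup described in this post. Every name in it (`AbilityTestCase`, `run_case`, `fireball`) is hypothetical, a plain dict stands in for the game state, and a Python function stands in for the FlowGraph under test.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-ins: a dict plays the game state, and a callable
# plays the FlowGraph under test. Neither is a FlowCanvas or Unity API.
World = dict
Graph = Callable[[World, str], None]  # (world, event name) -> mutates world


@dataclass
class AbilityTestCase:
    name: str
    graph: Graph                                  # 1. the graph to run
    setup: dict = field(default_factory=dict)     # 2. initial rig configuration
    events: list = field(default_factory=list)    # 3. events to feed in, in order
    expected: dict = field(default_factory=dict)  # 4. post-conditions to check


def run_case(case: AbilityTestCase) -> list:
    """The single shared rig: set up, feed events, check post-conditions.

    Returns a list of failure messages; an empty list means the case passed.
    """
    world = dict(case.setup)
    for event in case.events:
        case.graph(world, event)
    return [
        f"{case.name}: {key} is {world.get(key)!r}, expected {want!r}"
        for key, want in case.expected.items()
        if world.get(key) != want
    ]


def fireball(world: World, event: str) -> None:
    """Stand-in ability: each 'cast' costs 3 mana and deals 5 damage."""
    if event == "cast" and world["mana"] >= 3:
        world["mana"] -= 3
        world["enemy_hp"] -= 5


cases = [
    AbilityTestCase("one cast", fireball, {"mana": 5, "enemy_hp": 20},
                    ["cast"], {"mana": 2, "enemy_hp": 15}),
    AbilityTestCase("second cast fizzles", fireball, {"mana": 5, "enemy_hp": 20},
                    ["cast", "cast"], {"mana": 2, "enemy_hp": 15}),
]

for case in cases:
    failures = run_case(case)
    print(case.name, "ok" if not failures else failures)
```

    Adding a test case in this shape means adding one more data entry, while the rig itself stays untouched.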

    The data descriptions were instances of a ScriptableObject containing the parameterization. Each such asset was thought of as one “ability test case” by the designers. A bit of glue logic made these show up in the Unity Test Runner: A subsystem used Asset Change Tracker to maintain a list of the test-case assets available at any given time; then a regular C# parameterized test case with the TestCaseSource attribute exposed the list of test-case assets to Unity’s Test Runner & ran them when instructed to do so by Unity. Note that this relied on all our tests being instant (not needing the Unity engine to tick anything); the TestCaseSource approach can only be used for [Test] style execution, not [UnityTest].
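    The idea of exposing a list of test-case assets to the runner as individual tests can be illustrated with a rough analogue. The sketch below uses Python's `unittest` instead of NUnit's TestCaseSource and the Unity Test Runner, and a hard-coded dict (`DESCRIPTORS`, a made-up name) instead of a tracked asset list, so it shows only the shape of the approach, not the actual glue code.

```python
import unittest

# Hypothetical stand-in for the tracked list of test-case assets:
# descriptor name -> (start value, events, expected end value).
DESCRIPTORS = {
    "no_events": (10, [], 10),
    "two_hits": (10, ["hit", "hit"], 6),
}


def run_graph(hp, events):
    """Stand-in for the rig plus graph under test: each 'hit' removes 2 hp."""
    for event in events:
        if event == "hit":
            hp -= 2
    return hp


class DescriptorTests(unittest.TestCase):
    """Test methods are attached below, one per descriptor."""


def _make_test(start, events, expected):
    def test(self):
        self.assertEqual(run_graph(start, events), expected)
    return test


# One generated method per descriptor, so each shows up as its own test.
for _name, _args in DESCRIPTORS.items():
    setattr(DescriptorTests, f"test_{_name}", _make_test(*_args))


def run_all():
    """Run every generated test and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DescriptorTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

    As in the setup described in this post, the tests are synchronous: each one runs to completion inside a single call, with nothing ticking in the background.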

     

    So, why data descriptors + 1 test rig, instead of multiple test rigs? Well, data descriptors are easier to diff/merge, there is less room for mistakes in the data descriptors themselves, data descriptors are quicker to set up, data descriptors give a limited but consistent language to all test cases. We were concerned about the viability of maintaining 1000+ test scripts; it’s difficult to do bulk changes across FlowGraphs.

    If our game had not been turn-based, but a continuous simulation, with lots of custom-built FlowGraphs for different entities that manipulated Unity objects directly, then we would either have built a half-dozen or so test rig FlowGraphs per FlowGraph-to-test, or we would have built one test rig FlowGraph + one data descriptor C# class + created a half-dozen or so data descriptor assets for each FlowGraph-to-test. I haven’t looked deeply into this myself, but if I were to, I would look at Unreal’s Functional Testing for inspiration.

    When it comes to making a FlowGraph testable:

    1. There needs to be a way for the test rig to affect the FlowGraph, either directly (poking parameters in the FlowGraph) or indirectly (modifying the world around it).
    2. There needs to be a way for the test rig to observe what the FlowGraph is doing, either directly (reading parameters within the FlowGraph) or indirectly (observing the world around it).

    When it comes to making a FlowGraph easy to test / not requiring so many test cases:

    1. Avoid infinite loops. If you cannot avoid infinite loops, split the graph into top-level loops that call out to functions, and invoke the functions separately to validate individual function behaviour. (This may require special glue logic within the FlowGraph to get the functions started – I haven’t investigated that myself).
    2. Design your FlowGraph logic so that you minimize the total number of permutations that you need to test.
    #2381
    Gavalakis
    Keymaster

    Fantastic read and very insightful! Thank you for taking the time to write this 🙂
    Is there any change/improvement that you’d deem important for FlowCanvas that would make what you did easier, or more convenient in any way? (Or anything else that you’ve encountered while making your game or working with the designers, for that matter.)

    Thanks!


    #2384
    kalms
    Participant

    (We are currently not working with FlowCanvas – we have for various reasons (not related to FlowCanvas) switched to working with Unreal – so I’m writing this from memory)

    Our #1 issue with FlowCanvas today is that the FlowGraphs do not diff/merge at all using the standard text-based merge tools. I have personally done the following dance a large number of times:

    1. Extract the “serializableObject” of the previous version of the .asset from source control
    2. Extract the “serializableObject” of the current version of the .asset
    3. Paste both into https://jsonformatter.curiousconcept.com
    4. Store both results into a pair of text files
    5. Use a diff tool to compare the text files
    6. Look for changes (is it just a single value changed? one or a few lines? a group of lines? large structural changes?)

    If the JSON data had been stored in some multi-line format, I would have been able to use the version control system’s own tool to perform the diff in just 1 step.
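    Steps 3 to 5 of the dance above (format both versions, then compare them) could be scripted along these lines. This is a minimal sketch using only Python's standard `json` and `difflib` modules. It assumes the two single-line JSON strings have already been extracted from the .asset files, which it does not attempt to do, and the payloads shown are made up.

```python
import difflib
import json


def normalize(raw: str) -> list:
    """Re-serialize JSON with one value per line and keys in sorted order."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True).splitlines()


def diff_graphs(old_raw: str, new_raw: str) -> list:
    """Unified diff of two single-line JSON blobs after normalizing both."""
    return list(difflib.unified_diff(
        normalize(old_raw), normalize(new_raw),
        fromfile="previous", tofile="current", lineterm=""))


# Made-up payloads; real graph data would be far larger.
old = '{"nodes": [{"id": 1, "wait": 0.5}], "name": "Fireball"}'
new = '{"nodes": [{"id": 1, "wait": 1.5}], "name": "Fireball"}'
print("\n".join(diff_graphs(old, new)))
```

    Sorting keys keeps the output stable when key order changes between saves, at the cost of no longer mirroring the order in the file.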

     

    We have run into problems a number of times that have to do with custom functions, and reentrancy. This usually shows up with helper functions that are of a reusable nature – or when there is a bit of ping-pong going on between multiple FlowGraphs. We have worked around those problems by not using functions as often as we’d like; the “real” solution would have been for FlowCanvas to pass parameters via a call stack.
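    The general failure mode behind reentrancy problems can be shown with a small sketch. It is a generic Python illustration and makes no claim about how FlowCanvas stores function parameters internally. It simply contrasts a function whose parameter lives in one shared slot with the same function using a per-call stack frame; both names are hypothetical.

```python
class SharedSlotFunction:
    """Parameter lives in one slot on the function itself (not reentrant)."""

    def __init__(self):
        self.n = None  # single shared parameter slot

    def countdown_sum(self, n):
        self.n = n
        if self.n == 0:
            return 0
        below = self.countdown_sum(self.n - 1)  # nested call overwrites self.n
        return self.n + below                   # self.n is now 0, not n


def countdown_sum_with_frames(n):
    """Same logic, but each call keeps its parameter in its own stack frame."""
    if n == 0:
        return 0
    return n + countdown_sum_with_frames(n - 1)


print(SharedSlotFunction().countdown_sum(3))  # 0: every level reads the clobbered slot
print(countdown_sum_with_frames(3))           # 6: 3 + 2 + 1
```

    The same clobbering would occur if two graphs called back and forth into one helper, since the second call overwrites the first call's parameter before the first call has finished using it.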

     

    We have found it problematic that the “Finish” node terminates all flows within an entire FlowGraph. This makes logic that ping-pongs between FlowGraphs not work as intended. We worked around that by limiting ping-pong, and carefully structuring logic so that early termination of certain flows would have no effect. A “proper” solution would be to provide an alternative to Finish that does not kill other currently-running threads (so something that is more of a Return operation), and/or provide an alternative to Finish that only kills threads that have been invoked during the currently-running event (which, in turn, requires tracking which Flows have been created during which Event).

     

    We have found the CPU cost for cloning FlowGraphs problematic.

     

    If we had been doing testing of FlowGraphs with other FlowGraphs as test rigs, we would probably have wanted some more tooling for inspecting the FlowGraph-under-test from the test rig. Tools for easily inspecting internal state, for example: can I easily get a list of the internal variables within the FlowGraph-under-test? can I replace a node in the FlowGraph-under-test with a spy that reports things back to the test rig? We did not need those – we were happy to limit ourselves to inspecting state outside of the FlowGraph itself.

     

    Those are the main things we’ve run into. For our use case (which may be more complex than the average user’s), we would have benefited from having state separated from logic. That would have made graph instantiation quicker. Cheaper graph instantiation would allow functions to be represented as separate graphs. That would make function-local variables viable. Passing function arguments on a stack would finally make it possible to create reentrant logic. All this would probably make writing custom nodes more complicated, and it may result in overall slower execution of the type of code that FlowCanvas users create today.

    #2389
    Gavalakis
    Keymaster

    Hello again,

    Thank you for your feedback. It is unfortunate to learn that you’ve moved on to Unreal, but I am at least glad that FlowCanvas was not the main reason for that. With that said,

    – I do understand that diff/merge ability is an issue. Recently, however, someone posted a hack around this in the NodeCanvas forums, and I want to see if I can implement it within the fsPrinter in a reasonable manner. Here is the link to that post: https://nodecanvas.paradoxnotion.com/forums/topic/diffability/

    – The CustomFunctions were refactored some time ago, and in my experience (and from reports) I haven’t encountered any problems with them. Are you specifically referring to CustomFunctions that have some “Latent” action within them (like Wait)? If possible, could you give me an example of such a problem?

    – Regarding “Finish”, this node exists precisely to cease the execution of the whole FlowGraph. If you just want to “do nothing” after a certain node, you don’t need to plug a “Finish” node into it; you just leave the outgoing ports empty. Can you please explain more specifically what you would want a “Finish” to do, per your suggestion? A “Flow” is actually created when an event is called, so tracking the invoker (node) would be very easy to do, by holding a reference in the Flow struct that is created in the Event Node. But I am still not exactly sure what the end goal of that alternative “Finish” node would be; what are the things that you need “finished”? I’d be more than glad to know 🙂

    Thank you!

