Performance of event subscribers. SingleInstance or not?

There are countless factors that can affect application performance: the structure of the code, the algorithm chosen for a specific task, the efficiency of database queries issued by the application, the infrastructure where the application runs, the number of concurrent user sessions, and a million other things. When we talk about Business Central extensions, we can highlight a few of the most significant performance factors, with database communication undoubtedly the most important. Although the topic of efficient interaction between AL and SQL is vast, there are general recommendations that help avoid the most basic mistakes and improve code performance.

On the other hand, information on the performance impact of event subscribers is rather scarce. In the following documentation article, the Microsoft team provides a set of generic performance recommendations, and some of these apply to event subscribers.

Performance Articles for Developers

The list of recommendations on event subscriber design in this article is very brief, and some of them leave open questions and doubts. While the warning against subscribers in table trigger events is clear and self-explanatory, the suggestion to keep subscribers in a SingleInstance codeunit seemed questionable. I found it controversial mainly because of the intended use of the SingleInstance property: to keep one instance of an object throughout the user session, as an aid in implementing the Singleton pattern for entities that cannot or should not exist in multiple instances.

A Google search on event subscriber performance in Business Central did not yield many results. One blog post that explores the topic, The performance impact of events, shared a link to a recording of a BC performance session presented at NAV TechDays 2018.

NAV TechDays 2018 - Performance: Business Central reloaded for the Cloud

Two slides from that session are the primary source of the available information on the performance of Business Central event subscribers. The first slide explains an event call in pseudocode.

And the second slide demonstrates the results of an internal test done by the Microsoft dev team.

After watching this video, still full of doubts, I decided to do my own tests.

Performance measurement

To reproduce the test and measure subscriber execution time, I declared an empty method subscribed to the event OnAfterInitRoundingPrecision in the Currency table.

codeunit 75101 "Test Subscribers 1"
{
    [EventSubscriber(ObjectType::Table, Database::Currency,
        'OnAfterInitRoundingPrecision', '', false, false)]
    local procedure OnAfterInitRoundingPrecision(
        var Currency: Record Currency;
        var xCurrency: Record Currency;
        var GeneralLedgerSetup: Record "General Ledger Setup")
    begin
    end;
}

The code that triggers the subscriber is called in a loop 100 000 times, and the total execution time of all iterations is captured. To minimize the impact of the first 'cold' run, I take it out of the measurement loop. Besides, the InitRoundingPrecision procedure contains a Get on the General Ledger Setup, so I let the BC server cache the record in the first call to avoid delays caused by database queries.

trigger OnAction()
var
    Currency: Record Currency;
    TempTimerBuf: Record "Line Number Buffer" temporary;
    StartTime: Time;
    I, J : Integer;
begin
    // First execution is excluded from the statistics
    Currency.InitRoundingPrecision();

    for I := 1 to 20 do begin
        StartTime := Time();

        for J := 1 to 100000 do
            Currency.InitRoundingPrecision();

        TempTimerBuf."Old Line Number" := I;
        TempTimerBuf."New Line Number" := Time() - StartTime;
        TempTimerBuf.Insert();
    end;

    Page.Run(Page::"Timer Buf. View", TempTimerBuf);
end;

The result of this job is a series of 20 numbers, each representing the total time of the 100 000 executions of the same function which invokes the test subscriber. To minimize random fluctuations, I take the average time of the 20 runs as the final measure.
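The averaging step can be sketched as a small helper that walks the timer buffer filled by the loop above. This is only an illustration: the codeunit ID, its name, and the procedure name AverageRunTime are assumptions and are not part of the original test code.

```al
// Hypothetical helper; object ID and names are assumptions for illustration.
codeunit 75190 "Timer Buf. Statistics"
{
    // Returns the average run time in milliseconds over all buffered runs.
    procedure AverageRunTime(var TempTimerBuf: Record "Line Number Buffer" temporary): Decimal
    var
        TotalMs: Integer;
        RunCount: Integer;
    begin
        // "New Line Number" holds the elapsed milliseconds of one run
        if TempTimerBuf.FindSet() then
            repeat
                TotalMs += TempTimerBuf."New Line Number";
                RunCount += 1;
            until TempTimerBuf.Next() = 0;

        if RunCount = 0 then
            exit(0);

        // Division in AL yields a Decimal, so the fraction is preserved
        exit(TotalMs / RunCount);
    end;
}
```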

For example, one of the data sets for the first test is as follows.

The value on the chart is the average of the 20 numbers: 1045.9 ms.

Before starting any measurements with subscribers, I ran the same process without any event subscribers, which returned an average execution time of 643.3 ms. This number is the baseline value, the foundation for comparison in the other tests.

Local variables in normal codeunit (SingleInstance = false)

The first hypothesis to test was whether the number of variables instantiated in each execution influences the execution time. In this test, I used a single subscriber and gradually increased the number of local variables in the subscriber function. The variables were only declared, without any method calls.

[EventSubscriber(ObjectType::Table, Database::Currency,
    'OnAfterInitRoundingPrecision', '', false, false)]
local procedure OnAfterInitRoundingPrecision(
    var Currency: Record Currency;
    var xCurrency: Record Currency;
    var GeneralLedgerSetup: Record "General Ledger Setup")
var
    Customer1: Record Customer;
    Customer2: Record Customer;
    Customer3: Record Customer;
    // etc...
begin
end;

The following chart shows the execution time depending on the number of variables, adding five more variables in each run.

As you can see, the result is almost a flat line, displaying only minor fluctuations, but no strict correlation between the number of variables and the execution time. Execution time increased from the baseline 643 ms to 1045 ms and remained around this mark. So far, we can conclude that memory allocation for a variable, without executing any code, is a very fast process, not showing any significant effect on the code run time.

Variables in a SingleInstance codeunit

In the following test the only subscriber codeunit was changed to SingleInstance. With this modification, the execution time drops slightly below 1000 milliseconds per 100 000 executions, but the difference is rather marginal.
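For reference, the change amounts to a single property on the subscriber codeunit. A minimal sketch, reusing the subscriber from the first test:

```al
codeunit 75101 "Test Subscribers 1"
{
    // One instance of the codeunit is kept for the whole session
    SingleInstance = true;

    [EventSubscriber(ObjectType::Table, Database::Currency,
        'OnAfterInitRoundingPrecision', '', false, false)]
    local procedure OnAfterInitRoundingPrecision(
        var Currency: Record Currency;
        var xCurrency: Record Currency;
        var GeneralLedgerSetup: Record "General Ledger Setup")
    begin
    end;
}
```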

I won't bore the reader with yet another graph, but I repeated the same test in two variations: declaring the variables locally in the subscriber method and moving them into the global scope of the subscriber codeunit. Both tests resulted in the same timing: execution time a little below one second, independently of the number of variables. So the interim result at this step is that variable instantiation and memory allocation are extremely cheap. What matters is the permission check that is triggered prior to code execution or data access.

Loading the subscriber metadata is probably responsible for the 30 to 50 ms difference between the two graphs, but this is mere speculation.

Event subscribers (SingleInstance = false)

In the next test, I removed all variables and started increasing the number of subscribers attached to the same event by simply duplicating the "Test Subscribers" codeunit. I ran 11 iterations, each adding one more subscriber. Now the characteristics of the graph changed: it switched to linear growth, with each new subscriber adding around 300 ms per 100 000 executions.
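Each duplicate is simply another codeunit with its own empty subscriber on the same event. A sketch of the second copy follows; the object ID 75102 and the name "Test Subscribers 2" are assumed from the numbering of the first codeunit and are not quoted from the original test project.

```al
// Assumed ID and name, following the pattern of "Test Subscribers 1"
codeunit 75102 "Test Subscribers 2"
{
    // Same event, same empty body: only the containing object differs
    [EventSubscriber(ObjectType::Table, Database::Currency,
        'OnAfterInitRoundingPrecision', '', false, false)]
    local procedure OnAfterInitRoundingPrecision(
        var Currency: Record Currency;
        var xCurrency: Record Currency;
        var GeneralLedgerSetup: Record "General Ledger Setup")
    begin
    end;
}
```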

Event subscribers (SingleInstance = true)

Now this is the same test, but with the subscriber codeunit switched to SingleInstance.

If we look at two rows of data and calculate the difference between respective instances, we can notice that the absolute value of the difference is growing.

This means that there is a noticeable overhead in reinitializing the codeunit compared to calling an existing instance of a SingleInstance codeunit. If I were a computer scientist, I would say that the second approach (SingleInstance) is definitely more efficient. Although both tests demonstrate linear complexity, the linear coefficient is lower in the second case. But as engineers, we must be practical. This test shows that if we have 10 subscribers bound to the same event, a loop with 100 000 iterations will be half a second slower. Is it something that is going to impact application design decisions?

Codeunit.Run on a local variable

The Microsoft 2018 tech session mentions three actions that make a static-automatic event subscriber slower compared to the same subscription bound to a SingleInstance codeunit:

Check permissions/license
Find metadata
Create instance

But all these actions equally apply to executing code in any application object. When we call a method on a codeunit, the server still must check the permissions, load object metadata and create an object instance. Therefore the following set of tests was executed on codeunit variables, running an empty OnRun trigger to compare performance of an event subscriber with the execution of a codeunit.

For this test, I declared a new codeunit DoNothing which indeed does nothing.

codeunit 75111 "Do Nothing 1"
{
    trigger OnRun()
    begin
    end;
}

Then I modified my loop to run DoNothing instead of triggering an event.

for I := 1 to 20 do begin
    StartTime := Time();

    for J := 1 to 100000 do
        RunDoNothing();
...
end;

RunDoNothing is the procedure that holds all codeunit variable declarations and runs them. All variables are declared in the local scope for this test and are reinitialized on every iteration - 100 000 times in each test iteration.

local procedure RunDoNothing()
var
    DoNothing1: Codeunit "Do Nothing 1";
    DoNothing2: Codeunit "Do Nothing 2";
    DoNothing3: Codeunit "Do Nothing 3";
    // Adding one variable in each test run
begin
    DoNothing1.Run();
    DoNothing2.Run();
    DoNothing3.Run();
    ...
end;

The graph begins at the same point, around one second for 1 codeunit. But as the number of variables grows, the graph starts looking unexpected. It is still linear, but with the line taking a steeper rise.

Each additional Codeunit.Run on a new codeunit instance increases the execution time of the loop by almost 900 ms (9 μs per call), against 300 ms (3 μs per call) for subscribers.

Codeunit.Run on a global variable

The purpose of the next test was to find out how much the instantiation of codeunit variables impacts performance. For this run, instead of keeping the DoNothing codeunit variables in the local scope of the procedure, I declared them all as global variables in the test page. This resulted in a noticeable improvement, and the maximum run time for 10 codeunits dropped from 9 to 7 seconds.
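A sketch of how this variation might look is shown below. The page ID and name are placeholders invented for the illustration, and only three of the codeunit variables are shown.

```al
// Hypothetical test page; ID and name are assumptions for illustration.
page 75100 "Subscriber Perf. Test"
{
    PageType = Card;

    var
        // Global scope: instantiated once with the page,
        // not on every call of RunDoNothing
        DoNothing1: Codeunit "Do Nothing 1";
        DoNothing2: Codeunit "Do Nothing 2";
        DoNothing3: Codeunit "Do Nothing 3";

    local procedure RunDoNothing()
    begin
        DoNothing1.Run();
        DoNothing2.Run();
        DoNothing3.Run();
    end;
}
```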

Codeunit.Run on a SingleInstance codeunit

And the final test was running the same loop of DoNothing codeunits, but all of them with SingleInstance = true. This test was not supposed to produce any unexpected result because variables inside the main test loop (100 000 calls of the RunDoNothing) are not reinstantiated. Indeed, the numbers are the same as in the previous run.
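For completeness, the only change in each DoNothing codeunit for this run is the added property, sketched here for the first one:

```al
codeunit 75111 "Do Nothing 1"
{
    // The instance survives between calls within the session
    SingleInstance = true;

    trigger OnRun()
    begin
    end;
}
```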

Conclusion

I covered a few of the key test scenarios, but many performance factors of event subscribers remain untested. I haven't tested manual subscribers. Microsoft documentation suggests that these are more efficient than static subscriptions. This would also be interesting to test, although I don't expect a significant difference in performance.
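A test of manual subscribers could be set up along the following lines. This is an untested sketch, not a measurement I ran: both codeunit IDs and names and the procedure RunTimedLoop are assumptions. EventSubscriberInstance, BindSubscription and UnbindSubscription are the standard AL constructs for manual binding.

```al
// Hypothetical objects; IDs and names are assumptions for illustration.
codeunit 75141 "Manual Test Subscriber"
{
    // The subscriber fires only while an instance is explicitly bound
    EventSubscriberInstance = Manual;

    [EventSubscriber(ObjectType::Table, Database::Currency,
        'OnAfterInitRoundingPrecision', '', false, false)]
    local procedure OnAfterInitRoundingPrecision(
        var Currency: Record Currency;
        var xCurrency: Record Currency;
        var GeneralLedgerSetup: Record "General Ledger Setup")
    begin
    end;
}

codeunit 75142 "Manual Subscriber Test"
{
    // Returns the elapsed milliseconds for 100 000 event invocations
    procedure RunTimedLoop(): Integer
    var
        Currency: Record Currency;
        ManualTestSubscriber: Codeunit "Manual Test Subscriber";
        StartTime: Time;
        ElapsedMs: Integer;
        J: Integer;
    begin
        BindSubscription(ManualTestSubscriber);

        // Warm-up call, excluded from the measurement
        Currency.InitRoundingPrecision();

        StartTime := Time();
        for J := 1 to 100000 do
            Currency.InitRoundingPrecision();
        ElapsedMs := Time() - StartTime;

        UnbindSubscription(ManualTestSubscriber);
        exit(ElapsedMs);
    end;
}
```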

Another attractive topic for testing is the impact of the subscriber codeunit's size on performance. But again, codeunit size will have a comparable impact on normal function invocation; it is not limited to subscribers. My rule of thumb here is to keep the codeunit size below 1000 lines. Monstrosities like codeunit 80 with nearly 11000 lines should be split into smaller objects.

A chain of calls, where the event subscriber calls a method from another codeunit, also needs testing from the performance perspective. The recommendation to always keep subscribers in a SingleInstance codeunit and move all other code out implies a sequence of at least two calls on different objects, which doubles the overhead, and this is not covered in my tests.
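The pattern in question looks roughly like the sketch below: the subscriber itself does nothing but delegate to a second object. All object IDs, names, and the HandleInitRoundingPrecision procedure are hypothetical and serve only to show the two-call chain.

```al
// Hypothetical objects illustrating the subscriber-plus-handler split.
codeunit 75131 "Currency Subscribers"
{
    SingleInstance = true;

    [EventSubscriber(ObjectType::Table, Database::Currency,
        'OnAfterInitRoundingPrecision', '', false, false)]
    local procedure OnAfterInitRoundingPrecision(
        var Currency: Record Currency;
        var xCurrency: Record Currency;
        var GeneralLedgerSetup: Record "General Ledger Setup")
    var
        CurrencyHandler: Codeunit "Currency Handler";
    begin
        // Second call in the chain: a regular codeunit,
        // instantiated on every event invocation
        CurrencyHandler.HandleInitRoundingPrecision(Currency);
    end;
}

codeunit 75132 "Currency Handler"
{
    procedure HandleInitRoundingPrecision(var Currency: Record Currency)
    begin
        // Business logic moved out of the subscriber would go here
    end;
}
```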

The recommendation to limit the size of subscriber codeunits and move all non-subscriber methods out into separate objects is very valid and adheres to the Separation of Concerns principle. But from the performance perspective, it can be even slightly slower than keeping all code in the subscriber. As the tests above showed, a call of a codeunit method has similar or higher overhead, since the license restrictions and access permissions are apparently verified on every call, not only once when the variable is instantiated.

Still, with all the open questions, I think that the collected information is sufficient to draw a conclusion. I don't recommend using SingleInstance codeunits as a means of performance tuning. Even the simplest scenarios demonstrate subscriber performance comparable to that of ordinary codeunit executions. The possible performance gain from keeping events in a SingleInstance codeunit does not exceed 10 μs per function call and doesn't have a significant impact on the overall application performance.