- Aug 25

if not IsEmpty then FindSet... Or not?

This post is a result of one very fruitful discussion with many knowledgeable Business Central enthusiasts in a long LinkedIn thread. Now I want to start the post by saying big thank you to everyone who contributed their expertise to this discussion - this was indeed a very valuable discourse.

The topic of the discussion was exactly the one I put in the header of the blog post: whether this language structure can ever do any good to improve code performance:

if not Record.IsEmpty() then
    Record.FindSet();

In the end, the community seems to be in consensus, agreeing that this is not the best practice and should not be used as any means of performance improvement, since it's rather likely to make things worse. Waldo already published his test results confirming this view. I want to add my own observations from a slightly different perspective and pay a little bit more attention to some details of the query execution to demonstrate why, contrary to common assumptions, IsEmpty does not perform so much better than FindSet when the query result is empty.

For the test I prepared a table with 1 000 000 records. The field, which I will apply filters on, is "Item No.", and values in this field are uniformly randomly distributed. There are 10 000 items, so a FindSet with a filter on any Item No. returns on average 100 records.

IsEmpty function really shines when filters are applied on indexed fields. If the query can use a good non-clustered index it will not have to look into the clustered index to retrieve any data. FindSet, on the contrary, usually has to perform a key lookup or scan the clustered index. Without an index, both IsEmpty and FindSet must do the same table scan, therefore I create an index on the field which I am going to filter on.

In every test, I run 5000 iterations and measure the total execution time of the loop.

SelectLatestVersion();

for I := 1 to 5000 do begin
    MySalesTable.SetRange("Item No.", GetRandomDimValue());
    if not MySalesTable.IsEmpty() then begin
        MySalesTable.FindSet();
        HitCount += 1;
    end
    else
        MissCount += 1;
end;

Before running each test, I reset all Business Central and SQL Server cache buffers. In BC, I call SelectLatestVersion, in SQL Server - dbcc dropcleanbuffers. This ensures that the first queries run on cold cache and actually hit the database, but as more and more pages are cached in SQL Server, more results are fetched from the buffer cache. On the BC side, on the other hand, cache stores exact query results, therefore cache hits may randomly occur when the same filter value is selected more than once.

And finally, GetRandomDimValue function in code snippet above gets a random code from a temporary table which I prepare for each set of tests. Records in this table are prepared such that a certain specific percentage of values is not found in the data table. I will be running my tests with 10%, 50%, and 90% of missing values. This means that in the tests described further, 10%, 50%, and 90% of time the filtered recordset is empty, and FindSet if not executed.

Business Central tests

In the first test, I prepare the filter buffer with 90% of the values present in the data table, and 10% missing. When I run 5000 iterations, randomly selecting filter values, I expect that around 500 queries will return an empty dataset. The table below shows the results of 10 test runs, with the bottom line showing the average value of all 10 executions.

Since all filter values are selected randomly, it is not exactly 10% misses, but close enough: 10.05% for FindSet and 10.2% for IsEmpty + FindSet. The difference in execution time is quite noticeable. The combination of IsEmpty + FindSet is around 36% slower than just FindSet alone.

If the second test, I prepared the filters the same way, but now half of the values are found in the data table, so 50% of the queries will not find any data and FindSet will not run.

This time, it's 49.83% of missing data for FindSet, and 49.87% for IsEmpty + FindSet. IsEmpty + FindSet pair is still slower, approximately by 18% now.

In the third test series, filters are prepared in a way that 90% of queries result in empty set.

The results are: 89.85% and 89.87% of empty datasets this time, and execution time is nearly the same in both cases: 1.38% difference in favour of FindSet without IsEmpty.

And finally, I ran the same test with FindSet and IsEmpty separately.

I expected nearly the same time for both functions, but IsEmpty is indeed slightly better than FindSet when the result is always empty - it performs around 13.5% faster.

T-SQL test

To find out where the difference comes from, I ran the same test in pure T-SQL. Here, I insert 5000 random strings (truncated GUIDs actually) into a temporary table and run the SELECT query applying each value as a search predicate.

"SELECT TOP 1 NULL" query produced by the IsEmpty function is consistently faster than a SELECT from FindSet, although the difference is just around 4%. It looks like client/server communication and Business Central runtime add some overhead, increasing this difference.

SQL Server perspective

Now a few words about both functions from the SQL Server point of view. It is often assumed that IsEmpty is much faster because the query it sends to the database does not select any table data, so it can use more efficient execution plan.

The query indeed does not request any table fields:

SELECT TOP 1 NULL
    FROM "CRONUS".dbo."CRONUS International Ltd_$My Sales Table$382b9cf1-7c89-48eb-bdc6-be609426126a" "50501"  
    WITH(READUNCOMMITTED) 
    WHERE ("50501"."Item No_"='ABCD')

And the execution plan is a simple index seek.

SELECT issued by FindSet, on the other hand, requires a key lookup, which is by far the most expensive operation in the execution plan.

But if we look at the actual execution plan and key lookup details, we can see that the actual number of executions of this operation is 0. Since the index seek operation did not find any keys, there is nothing to look up.

So in fact, this plan adds one execution of the nested loops operation on an empty input.

IO statistics will reflect this difference, showing that an operation on the Worktable is included in the plan for FindSet query. But the number of reads on the Worktable is 0. Otherwise, all the IO stats are exactly the same for both queries.

IsEmpty

Table 'CRONUS International Ltd_$My Sales Table$382b9cf1-7c89-48eb-bdc6-be609426126a'.

Scan count 1, logical reads 4, physical reads 4, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

FindSet

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

Table 'CRONUS International Ltd_$My Sales Table$382b9cf1-7c89-48eb-bdc6-be609426126a'.

Conclusion

To conclude - does it ever make sense to use the IsEmpty to check the recordset before calling FindSet? Certainly not as a general rule. It may be slightly faster when the dataset under the filters is expected to be empty in nearly 100% of the queries. But even in this case, performance gain is a few microseconds per call, so it may be useful only in a really performance-critical path with a high volume of queries.

There is one reason to use IsEmpty before FindSet, but it not directly related to the query execution time. A while ago, I did a post with an overview of a similar pattern: IsEmpty before DeleteAll and ModifyAll functions. The same reasoning can be applied to a locking FindSet statement. If an exclusive lock or update lock is escalated to the table level, it will block other FindSet's with update lock in concurrent sessions - even if there are no records to return from this FindSet. This happens because SQL Server places an intent lock on the table before executing table seek or scan. Existing incompatible lock on the table will force the query to wait, so in this case a non-locking IsEmpty before a locking FindSet may be useful to reduce contention between concurrent transactions.

// Depending on specific need, we can run IsEmpty with
// ReadUncommitted or ReadCommitted isolation
// But in any case, it must not be executed with UpdLock

Record.ReadIsolation := IsolationLevel::ReadUncommitted;
if not Record.IsEmpty() then
    Record.FindSet(true);

if not IsEmpty then FindSet... Or not?

Business Central tests

T-SQL test

SQL Server perspective

Conclusion

Recent Posts

Comments

SIGN UP AND STAY UPDATED!