The effects of SetLoadFields and AutoCalcFields on caching
This post continues the topic of query caching in Business Central which I started in the previous text. The first post on this topic gave a brief introduction into the caching capabilities of the BC server, and now I will focus on two special cases - caching the results of queries with partial records and calculated fields.
SetLoadFields and data cache
Enough time has passed since the introduction of partial recors in AL language for the majority of developers to get used to this concept and start using it almost subconsciously (I'm not one of those, I have to admit - I still regularly receive code review comments pointing out forgotten SetLoadFields). Today I want to experiment with SetLoadFields and its interaction with the internal BC cache - especially different situations when data access requests require different sets of fields to be loaded.
Business Central server caches data access requests with associated datasets, so it is possible that two requests on the same record, but with different sets of fields to be loaded could result in separate cache entries - so none of the two requests can use the dataset cached by the other. This is the hypothesis to be confirmed or refuted. To verify this, I will once again use the SQL Server Profiler with the Tuning template and capture the results of data access functions from AL actions (I described my setup in some more details in the previous post).
Test 1: Expanding the SetLoadFields list
In the first test, I am running the FindSet statement on the Item table loading three fields.
trigger OnAction()
var
Item: Record Item;
begin
Item.SetLoadFields("No.", Description, "Description 2");
Item.FindSet();
end;
This action generates the following SQL query.
SELECT
"27"."timestamp",
"27"."No_",
"27"."Description",
"27"."Description 2",
"27"."$systemId",
"27"."$systemCreatedAt",
"27"."$systemCreatedBy",
"27"."$systemModifiedAt",
"27"."$systemModifiedBy"
FROM "CRONUS".dbo."CRONUS International Ltd_$Item$437dbf0e-84ff-417a-965d-ed2bb9650972" "27"
WITH(READUNCOMMITTED) ORDER BY "No_" ASC OPTION(FAST 50)
I highlighted the three fields which I included in the SetLoadFields list (although primary key fields are always loaded, so "No." is not really necessary in the list, but I still prefer to keep it for clarity and readability). timestamp and the cohort of $system* fields are always added automatically to every query.
Continuing the test, I add the fourth field, "Search Description" to the list of loaded fields and run the action again.
trigger OnAction()
var
Item: Record Item;
begin
Item.SetLoadFields(
"No.", Description, "Description 2", "Search Description");
Item.FindSet();
end;
When another field is added to the SetLoadFields list, BC server has to execute the query again. Which makes perfect sense, because the field "Search Description" was not retrieved by the first query and is not stored in the cache.
SELECT
"27"."timestamp",
"27"."No_",
"27"."Description",
"27"."Search Description",
"27"."Description 2",
"27"."$systemId",
"27"."$systemCreatedAt",
"27"."$systemCreatedBy",
"27"."$systemModifiedAt",
"27"."$systemModifiedBy"
FROM "CRONUS".dbo."CRONUS International Ltd_$Item$437dbf0e-84ff-417a-965d-ed2bb9650972" "27"
WITH(READUNCOMMITTED) ORDER BY "No_" ASC OPTION(FAST 50)
The profiler shows two queries against the Item table.
Test 2: Narrowing down the SetLoadFields list
The first test was rather a demonstration of the concept: Business Central caches the dataset linked with the exact data request, therefore a new field in the query forces the server to run a new database request, because this field is not present in the cache. Now let's try to reverse the test and remove some fields from the initially more extensive list.
I restart my test BC service to make sure that the data cache is reset and the next test starts with the cold cache. In this case, I start from querying the complete list of table fields without limiting the set.
trigger OnAction()
var
Item: Record Item;
begin
Item.FindSet();
end;
Immediately after running this action, I execute the other two - loading three and four fields. When I switch to the profiler after running all three actions sequentially, all I see is one query against the Item table.
SQL statement captured in the profiler is the FindSet without SetLoadFields - it selects all available table fields. Other actions, which limit the dataset to a subset of three and four fields, are not logged - and not executed.
Same logic works if I skip the FindSet with the full set of fields (no SetLoadFields) and run the action with only four fields first, followed by the last one, which retrieves three fields. The first query leaves the trace in the SQL Server profiler, whilst the second one goes under the radar and pulls data from the cache.
With all this in mind, it may be a good idea to take a wider view on the use of SetLoadFields. If your app sequentially runs multiple queries against the same table, it may be worth not limiting the loaded fields to the precise set required for the next AL statement. If the next procedure retrieves same records, but with a different set of fields, including all fields in both queries may enable the server to use cached data and eliminate the latter database query.
This recommendation should be taken with a pinch of salt, though. Remember that queries must be exactly the same to be able to reuse the cache (with an exception of data requests operating on primary key values: Introduction to Business Central Caching). If those queries are triggered from logically disparate code blocks, this structure could create implicit code coupling difficult to maintain - so all these criteria should be carefully weighed.
FlowFields and data cache
Another interesting case for a test is the way data access requests in BC interact with the data cache when calculating flowfields. It becomes even more intriguing considering the different ways of calculating FlowField values.
Note. In the following tests, I run calculation of the Inventory field in the Item table, which can be calculated from the related indexed view, because its source table Item Ledger Entry has a suitable index with MaintainSiftIndex enabled. If the SIFT index for a FlowField is not available, BC will query the source table itself. Although I would not expect significant differences in caching behaviour, I can't confirm it because I did not test these scenarios.
With this disclaimer in mind, let's try to calculate a FlowField, engaging SetAutoCalcFields first.
Test 3: Enabling AutoCalcFields
For this set of tests, I prepared similar actions, but now I add a repeat..until loop to iterate on the Item records. This is important to demonstrate the nuts and bolts of the FlowField calculation, which will become more obvious as we go ahead.
trigger OnAction()
var
Item: Record Item;
begin
Item.SetAutoCalcFields(Inventory);
Item.SetLoadFields("No.", Description, "Description 2");
Item.FindSet();
repeat
until Item.Next() = 0;
end;
Here is the SQL query resulting from this code. It is important to notice that a single query retrieves values of the calculated field Inventory for all Item records. No other SQL statements are triggered from this code.
SELECT
"Item"."timestamp" AS "timestamp",
"Item"."No_" AS "No_",
"Item"."Description" AS "Description",
"Item"."Description 2" AS "Description 2",
"Item"."$systemId" AS "$systemId",
"Item"."$systemCreatedAt" AS "SystemCreatedAt",
"Item"."$systemCreatedBy" AS "SystemCreatedBy",
"Item"."$systemModifiedAt" AS "SystemModifiedAt",
"Item"."$systemModifiedBy" AS "SystemModifiedBy",
ISNULL("SUB$Inventory"."Inventory$Item Ledger Entry$SUM$Quantity",0.0) AS "Inventory"
FROM "CRONUS".dbo."CRONUS International Ltd_$Item$437dbf0e-84ff-417a-965d-ed2bb9650972" AS "Item"
WITH(READUNCOMMITTED)
OUTER APPLY (
SELECT TOP (1) SUM(
"Inventory$Item Ledger Entry"."SUM$Quantity")
AS "Inventory$Item Ledger Entry$SUM$Quantity"
FROM "CRONUS".dbo."CRONUS International Ltd_$Item Ledger Entry$437dbf0e-84ff-417a-965d-ed2bb9650972$VSIFT$Key2"
AS "Inventory$Item Ledger Entry" WITH(READUNCOMMITTED,NOEXPAND)
WHERE ("Inventory$Item Ledger Entry"."Item No_"="Item"."No_"))
AS "SUB$Inventory"
ORDER BY "No_" ASC OPTION(FAST 50)
If you take a closer look at the first few lines, you will immediately notice the difference with previous examples. This time, the BC query builder added an alias for each field. And the table is now referenced by the alias "Item" instead of the number "27" as in previous examples.
This is the effect of the SmartSQL feature which changes the way a SQL query is generated from an AL statement. This is a performance optimisation feature described in Microsoft documentation. A key DisableSmartSql in the CustomSettingsConfig configuration file controls this behaviour.
<!--
Specifies whether to disable the SmartSql feature.
The SmartSql feature converts find requests and calculation of flow-fields into a single SQL statement.
-->
<add key="DisableSmartSql" value="false" />
A more important addition to the data access query produced with SmartSQL is the correlated subquery joined with the Item table by the OUTER APPLY statement. This is the part of the query that collects all related item quantities from the SIFT indexed view. This correlated query is executed once for each Item record, but all processing and joining is confined to the SQL Server, so the result is still more efficient compared to sending a separate query for each item between the application server and the database server.
So far, I was running my tests with the standard server configuration, which enables the SmartSQL feature by default, as the above example demonstrates.
Now to the point of the test - without changing anything in the setup, I run the same action multiple times.
The resulting profiler trace shows that in this case the query is executed every time when the action button is hit.
This is one disadvantage of the SmartSQL calculation - queries produced with SmartSQL cannot use the data cache.
Test 4: AutoCalcField with SmartSQL disabled
In this test, I disabled the SmartSQL feature and ran the same action as previously in the Test 3.
When SmartSQL is disabled, FindSet generates separate queries for the fields stored in the table and FlowFields included in the query. The first one, which is querying the Item table, is exactly the same SQL statement as the first example in the beginning of this post. Even the familiar alias "27" for the Item table is back.
The other query retrieves the summarized quantity from the related "VSIFT" indexed view:
SELECT
SUM("SUM$Quantity")
FROM "CRONUS".dbo."CRONUS International Ltd_$Item Ledger Entry$437dbf0e-84ff-417a-965d-ed2bb9650972$VSIFT$Key2"
WITH(READUNCOMMITTED,NOEXPAND)
WHERE ("Item No_"=@0) OPTION(FAST 50)
And this query is repeated on the Next() statement for each record:
This is the reason I included the empty loop in the action - to highlight that it is nothing else but the Next statement that triggers this query. Although, we mainly think of Next as a function executed on the BC side and simply moving the pointer on the dataset, in some cases it can generate separate SQL queries - this is just one example.
Curiously, only the VSIFT key queries are cached when SmartSQL is disabled. If the same query is executed again, this long list of SELECT SUM queries disappears from the profiler trace, but the SELECT ... FROM Item is still raised every time.
Anyway, disabling SmartSQL in a production environment can hardly be justified - this is rather a test to satisfy my curiousity. What is really interesting is the way the data cache is used for flowfields when SetAutoCalcFields is enabled for FindSet vs calculation of flow fields in separate statements by the CalcFields function. So I move to the next test.
Test 5: CalcFields function
For the next test, I restore the server settings, reverting the value of the DisableSmartSql key back to its default value false, and run the calculation of flowfields in a separate query explicitly from AL code, calling CalcFields.
trigger OnAction()
var
Item: Record Item;
begin
Item.SetLoadFields("No.", Description, "Description 2");
Item.FindSet();
repeat
Item.CalcFields(Inventory);
until Item.Next() = 0;
end;
Once this action is executed, the SQL profiler trace is not different from the situation with disabled SmartSQL setting: a single query is triggered to select Item records without the quantities (since Inventory is completely detached from the FindSet request now), and a separate query is sent on each CalcFields invocation to retrieve the Inventory count.
Note that although the query selecting the item inventory count is still the same, now this query is triggered by the CalcFields method, whilst the previous test (with AutoCalcFields enabled and SmartSQL turned off) triggered the inventory calculation on the Next statement.
What is more important is that now all these queries are cached - running the same action again does not leave any trace in the profiler - all CalcFields requests return data from the cache.
So this can be another consideration for performance optimisation. Whilst multiple relatively small calculations on FlowFields benefit from correlated subqueries produced by SmartSQL, a rare, but massive calculation performed on a few records and repeated multiple times, may be more efficient when done by the explicit call of CalcFields, thanks to caching.
Anyway, all my tests lack exact numbers on large amounts of data when these considerations can be relevant - something to think about for future performance tests.