User-friendly feature

I am considering using some custom function calls in some of my queries, instead of using a bunch of built-in case statements. Inline statements are likely to work better, but the features make browsing easier and possibly support.

I just wanted to get an idea of ​​what is the typical best practice for UDF? I understand that using them in criteria (Where Clause) can have a significant impact on performance.

Especially in cases where you can have many statements in the case block or even nested case statements.

Thanks,

S

+4
source share
3 answers

My canning answer:

There is a common misconception that UDFs adversely affect performance. As a general statement, this is simply not true. In fact, the built-in table values ​​of UDF are actually macros - the optimizer is very well able to rewrite queries related to them and also optimize them. However, scalar UDFs are usually very slow. I will give a brief example.

Background

Here is the script to create and populate the tables:

CREATE TABLE States(Code CHAR(2), [Name] VARCHAR(40), CONSTRAINT PK_States PRIMARY KEY(Code)) GO INSERT States(Code, [Name]) VALUES('IL', 'Illinois') INSERT States(Code, [Name]) VALUES('WI', 'Wisconsin') INSERT States(Code, [Name]) VALUES('IA', 'Iowa') INSERT States(Code, [Name]) VALUES('IN', 'Indiana') INSERT States(Code, [Name]) VALUES('MI', 'Michigan') GO CREATE TABLE Observations(ID INT NOT NULL, StateCode CHAR(2), CONSTRAINT PK_Observations PRIMARY KEY(ID)) GO SET NOCOUNT ON DECLARE @i INT SET @i=0 WHILE @i<100000 BEGIN SET @i = @i + 1 INSERT Observations(ID, StateCode) SELECT @i, CASE WHEN @i % 5 = 0 THEN 'IL' WHEN @i % 5 = 1 THEN 'IA' WHEN @i % 5 = 2 THEN 'WI' WHEN @i % 5 = 3 THEN 'IA' WHEN @i % 5 = 4 THEN 'MI' END END GO 

When a request using UDF is overwritten as an external connection.

Consider the following query:

 SELECT o.ID, s.[name] AS StateName INTO dbo.ObservationsWithStateNames_Join FROM dbo.Observations o LEFT OUTER JOIN dbo.States s ON o.StateCode = s.Code /* SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 1 ms. Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Observations'. Scan count 1, logical reads 188, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'States'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 187 ms, elapsed time = 188 ms. */ 

And compare it to the query with the UDF built-in table value:

 CREATE FUNCTION dbo.GetStateName_Inline(@StateCode CHAR(2)) RETURNS TABLE AS RETURN(SELECT [Name] FROM dbo.States WHERE Code = @StateCode); GO SELECT ID, (SELECT [name] FROM dbo.GetStateName_Inline(StateCode)) AS StateName INTO dbo.ObservationsWithStateNames_Inline FROM dbo.Observations 

Both execution plans and the costs of its implementation are the same - the optimizer rewrote it as an external connection. Do not underestimate the power of the optimizer!

Querying using scalar UDF is much slower.

Here is the UDF scalar:

 CREATE FUNCTION dbo.GetStateName(@StateCode CHAR(2)) RETURNS VARCHAR(40) AS BEGIN DECLARE @ret VARCHAR(40) SET @ret = (SELECT [Name] FROM dbo.States WHERE Code = @StateCode) RETURN @ret END GO 

Obviously, a query using this UDF gives the same results, but has a different execution plan, and it is much slower:

 /* SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 3 ms. Table 'Worktable'. Scan count 1, logical reads 202930, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Observations'. Scan count 1, logical reads 188, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 11890 ms, elapsed time = 38585 ms. */ 

As you have seen, the optimizer can rewrite and optimize queries using the built-in table values ​​of UDF. On the other hand, queries related to scalar UDFs are not overwritten by the optimizer - the last query involves one function call on a line, which is very slow. Copied from here

+9
source

Do not use functions solely for aesthetics. This can be handled by consistent code formatting.

In the described situation, you create an external dependency - the function must exist and be visible to the user to run the request. So far, SQL Server does not support something identical to Oracle packages (assemblies are not native SQL) ...

There is also a risk of falling into the trap, believing that SQL functions are executed like methods / functions in procedural / OO programming that they do not have, so that queries can be degraded with the help of the function, and not without it.

+6
source

I try to avoid most functions because, although they are more beautiful than in string case statements, they also tend to make the query plan less accurate. If you hide a lot of complexities in the function, then complexity is also hidden from the query plan, so if you have problems and need to configure the request later, you will usually fix things that look like high cost but really trivial cost compared to UDF.

+1
source

All Articles