A feature of Azure SQL Data Warehouse is the ability to scale (and pay for) compute resources as needed. This is particularly useful when loading data to a data warehouse. ETL operations typically put heavy load on the data warehouse, and could benefit from increased computing power. When the load process completes, however, that extra power may no longer be necessary.
The simplest way to adjust the compute power setting (known as the Service Objective or DWU level) is using the slider in the Azure Portal, shown below. Sliding up increases compute power, sliding down decreases compute power.
You could use the slider GUI to manually increase the compute power prior to running the ETL, and decrease the power after the ETL finishes.
Compute power can also be changed with a PowerShell command, REST API call, or SQL command. Following is a sample of the SQL command to change the Service Objective.
ALTER DATABASE MyDataWarehouse MODIFY (SERVICE_OBJECTIVE = 'DW400')
I recently built an SSIS package to load an Azure SQL Data Warehouse. It included an Execute SQL task to increase the compute power at the beginning of processing, and a similar task at the end to scale back down. The tasks included the command above with the DWU levels I wanted to use.
This seemed simple enough!
I quickly ran into problems, however. Immediately after this task , my package included an Execute SQL task to truncate a series of staging tables on the Azure SQL Data Warehouse. Executing the package, the scaling task finished and quickly moved on to the truncate task. But the truncate task failed with the following error message:
“111202;Query QID50802 has been cancelled.”. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.
While the command to change the service objective finishes quickly, it takes time for the change to take effect. Any SQL command issued while the scaling operation is running on the warehouse will fail. The database is effectively offline.
In my experience, the SQL command to change DWU level normally finishes quickly, in about two seconds, but the change normally takes about one minute to take effect.
The SQL command is asynchronous.
To test this yourself, run the following query on the master database immediately after changing the DWU level. You will likely see the current Service Objective is not what you changed it to.
SELECT CurrentDWServiceObjective = DBSO.[service_objective] FROM sys.databases AS DB JOIN sys.database_service_objectives AS DBSO ON DB.database_id = DBSO.database_id WHERE DB.[name] = 'MyDataWarehouse'
To prevent this failure, the ETL needs to understand when the change has taken effect and the database is available before continuing processing.
I’ve replaced my simple SQL script with the following script to accomplish this. The idea is to make the ETL wait until the scaling change takes effect and the warehouse is available again before continuing with ETL processing.
/* Run on *master* database */ DECLARE @DWName NVARCHAR(100) = 'MyDataWarehouse' DECLARE @NewDWServiceObjective NVARCHAR(100) = 'DW400' DECLARE @CurrentDWServiceObjective NVARCHAR(100) = 'TBD' DECLARE @DelayInSeconds INT = 10 --must be between 0 and 59 DECLARE @CumulativeDelayInSeconds INT = 0 DECLARE @MaxDelayInSeconds INT = 300 --5 minutes DECLARE @DelayString CHAR(8) --WAITFOR uses a string (00:00:10) DECLARE @SQL NVARCHAR(1000) --Command to change Service Objective DECLARE @Message NVARCHAR(1000) --Used in error message /* Build command to change DW Service Objective */ SET @SQL = 'ALTER DATABASE ' + @DWName + ' MODIFY (SERVICE_OBJECTIVE = ' + CHAR(39) + @NewDWServiceObjective + CHAR(39) + ')' /* Execute command to change Service Objective */ EXEC sp_executesql @SQL /*Create delay string*/ SELECT @DelayString = '00:00:' + RIGHT('0' + CAST(@DelayInSeconds AS VARCHAR(2)), 2) WHILE (@CurrentDWServiceObjective <> @NewDWServiceObjective) AND (@CumulativeDelayInSeconds <= @MaxDelayInSeconds) BEGIN /*Determine Current Service Objective*/ SELECT @CurrentDWServiceObjective = DBSO.[service_objective] FROM sys.databases AS DB JOIN sys.database_service_objectives AS DBSO ON DB.database_id = DBSO.database_id WHERE DB.[name] = @DWName /*Initiate delay*/ WAITFOR DELAY @DelayString /*Increment cumulative delay*/ SELECT @CumulativeDelayInSeconds += @DelayInSeconds END /* Raise error if time delay reaches max */ IF @CumulativeDelayInSeconds > @MaxDelayInSeconds BEGIN SET @Message = 'Changing Service Objective took longer than ' + CAST(@MaxDelayInSeconds AS NVARCHAR(2)) + ' seconds.' RAISERROR(@Message, 16, 1) END
A few notes on the script:
- Must be executed on the master database, not the data warehouse
- Issues the command to change the Azure SQL DW Service Objective (DWU level)
- It then queries the current DW Service Objective (DWU level). If current Service Objective does not match the intended Service Objective, it waits 10 seconds.
- After 10 seconds, it queries the current DW Service Objective again
- Delays continue as long as Current Service Objective is NOT the new Service Objective
- The loop exits only after the Service Tier change has taken effect
- Includes a timeout on the loop, in case some larger problem prevents the scale change from completing
I implemented this script in an Execute SQL task, passing some values as parameters. The task takes longer to run than before, but only finishes after the DWU scaling operation is complete.
Microsoft’s James Rowland-Jones alludes to a similar technique in his Developers Guide to Azure SQL Data Warehouse video. I highly recommend this video if you are new to Azure SQL DW. Even if you have experience with the platform, JRJ does a nice job explaining topics you may already take for granted.
Have you experienced issues changing DWU level on the fly? How did you work around the problem?
If you try the scripts from this post, please let me know how they worked for you!