In this article we will be migrating our on-premises SQL Data Warehouse to Dedicated SQL pool in Azure Synapses Analytics. We will do this process in three stages -
- Migrating the metadata using Azure Synapse Pathway
- Making any changes to table design and code
- Migrating the data using integration capabilities of Azure Synapse
-
Download backup of AdventureWorksDW2019.bak from https://docs.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver15&tabs=ssms
- Restore the backup locally
- Download and install Azure Synapse Pathway from https://www.microsoft.com/en-us/download/details.aspx?id=103061
- Register Microsoft.Synapse provider for your subscription
- Create Azure Synapse Analytics service
- Select your subscription
- Create a new resource group for the demo
- Provide workspace name
- Select region
- Provide Data Lake Storage Gen2 account and check assign myself contributor role checkbox
- Create SQL Sever admin login and SQL password to login
- Leave rest of the configuration as default and hit Review + create
- Open and check your workspace
- First step is to generate scripts for all the database objects from on-premises AdevntureWorksDW2019 data warehouse
- Select only Tables for generting scripts
- Now open Azure Synapse Pathway tool installed locally and follow below steps
- Translation type - select Microsoft SQL Server
- Input directory - select the directory where generated scripts are stored
- Output directory - select folder where you want to store the output of translation
- Click Translate
- In this step, we can redesign the tables so as to achieve optimized performance on Dedicated SQL pools
- Microsoft has provided guidance on how to design distributed tables - https://docs.microsoft.com/en-gb/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute
- We also resolve any errors that we got from Azure Synapse Pathway as some code might need to be changed due to it being not compatible with Dedicated SQL pools
- A list of those incompatibilities is listed here - https://docs.microsoft.com/en-gb/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-develop-stored-procedures#limitations
We will use Azure Synapse Analytics Integrate hub to create integration pipeline for copying data from on-premises SQL Data Warehouse to the Dedicated SQL pool
- In your Azure Synapse Anaytics Workspace, go to Integrate hub and select Copy Data tool by clicking on + sign. It will open up Copy Data tool wizard
- Select Built in copy task and Run once now radio button to create the pipeline
- For Source type select SQL Server and create a new connection between on-premises environment and Azure using self-hosted integration runtime
- Click on + New under Connect via integration runtime to create a self-hosted integration runtime
- Provide a name for your integration runtime
- Use Option 1: Express Setup to download and install integration runtime on your Windows machine
- After creating the integration runtime, provide below setting to connect to your on-premises SQL Data Warehouse
- Server name: name of your local SQL instance
- Database name: AdventureWorksDW2019
- Authentication type: SQL authentication
- User name and Password: credentials with access rights to the DataWarehouse
Click on Test connection and Apply settings
- Now we will setup configuration
- Select the tables as the source data for the copy task. We will select dbo.DimAccount for our demo
- Preview the data
- Select Destination data store now
- Create a New connection and select Azure Synapse Analytics
- Select the Dedicated SQL pool created in Azure Synapse Analytics workspace
- Select your Azure subscription
- Server name: select your Synapse workspace
- Database name: select the Dedicated SQL pool
- Authentication type: select SQL authentication along with User name and Password you created while setting up your Synapse workspace
- Now select Configuration you want to apply
- Update any Column mappings for each table. We will uncheck Type conversion checkbox
- Provide final Settings for your pipeline
- Provide Task name
- Check Enable staging checkbox to first move your data to a staging area and then to Dedicated SQL pool - Setup Staging settings by providing a Storage Path to the Data Lake account
- Select Copy method as PolyBase
- Review the pipeline and deploy the pipeline
- Click on Monitor to check the Pipeline runs
- Once the pipeline has run successfully, you can go to Data tab and select your Dedicated SQL pool under Workspaces to check the copied data for dbo.DimAccount table
Here you saw how you can migrate your on-premises SQL Data Warehouse to Dedicated SQL pool in Azure Synapse Analytics.