Integration can be an expensive and time-consuming activity. It’s not uncommon for integration projects to take months and for that process to be completely opaque. Microsoft Azure Data Factories (ADF) is a toolset for data transformation in the cloud that does not require developers to configure. It can help to minimize the amount of code required for building and maintaining an integration project. It allows non-developers to build and maintain integration infrastructure, which gives transparency to people who are not developers.
What are Microsoft Azure Data Factories?
Like AWS Data Pipelines and SQL Server Integration Services (SSIS), data factories transform data and run on triggers or schedules. They are part of the Microsoft Azure offering and run in the cloud. ADF supports both the Extract, Transform, and Load (ETL), and the Extract, Load, and Transform (ELT) approaches to transforming data. You can use ADF to configure mapping from source to target with a visual interface instead of writing code.
ADF is an evolution of the same approach as SSIS. It brings the concepts to the cloud in the familiar Microsoft environment. SQL Server experts understand the approach, and ADF offers an approach to migrate existing SSIS packages to the cloud. Microsoft refers to ADF as “SSIS in the Cloud”.
Why Azure Data Factories?
The critical reason to use ADF is to decouple your integration projects from code. ADF does not exclude code solutions, but encourages configuration over code, and allows the integration team to drop back to code via tools such as Azure Functions when necessary. Data Factories support .NET and Python out of the box when code is required.
AWS Data Pipelines is a similar technology by Amazon. A comparison is outside the scope of this article. Still, your team should choose ADF over AWS Data Pipelines if your infrastructure is predominantly Azure-based, and your team is more familiar with the Microsoft tech stack. Here are some features.
Development teams commonly take a code-first approach to integration. This approach is not wrong, but code requires specialized staff to maintain, and diagnostic information is usually an afterthought. ADF has an interface simple enough that most engineers can understand without having previous exposure. When a team member leaves, black-box code often leaves the rest of the team stumped. ADF can reduce this scenario’s risk, as all data mapping is clear and transparent via the interface.
- It scales on demand. Running SSIS packages on local infrastructure means that you need to configure a server for the purpose. This usually means that the server will be over or under utilized.
- ADF comes with over 90 connectors with prebuilt functionality for connecting to systems such as Azure Blob Storage, PostgreSQL, SQL Server, Dynamics 365, Oracle, and many more
- Compatibility with SSIS
- Administer through Microsoft Azure Portal
- Graphical user interface for mapping
- Protect privacy and adhere to standards. According to Microsoft, “Data Factory is certified by HIPAA, HITECH, ISO/IEC 27001, ISO/IEC 27018 and CSA STAR.”
"ADF helped us building and automating data integration workflows to measure the customer engagement and delivery outcome, which inturn reduces the amount of futile deliveries"
Benkham Koshy - Lead Developer
Our client has hundreds of brick and mortar stores nation-wide and multiple legacy systems. They engaged Restive Tech for centralisation of its warehousing and supply chain systems.
Futile deliveries are an issue for retailers that deliver to their customers. Futile deliveries are deliveries where the customer cannot receive their goods at the time of delivery. This can sometimes occur because the customer is not at home or misjudgment about the ability to fit the item into the home. This becomes costly because the retailer needs to redeliver at a later date.
The project is a workflow that allows the client to contact the customers and gather information based on a questionnaire sent via a Url in a text message. A web form captures the customer’s answers in a secure fashion and stores the answers in Azure Cosmos DB. The form is decoupled from the existing warehouse system, and ADF allows collected data to be shaped and then imported into SQL Server in Azure using a Data Flow activity . The transformation processes the source JSON and stores the data in the target relational database using the ETL approach. A schedule triggers the data imports.
The form and Dynamics 365 data aggregate into the reporting database, where Power BI serves a BI dashboard. The dashboard answers questions like how many people responded? Was the item delivered? This allows the client to gather objective data on customer engagement and adjust the questionnaire to encourage more engagement with the long term view of reducing futile deliveries.
Conclusion
Consider ADF as part of your integration solution. Most solutions require some level of data transformation. Code should not be the go-to for each project. Platforms like ADF can increase your integration’s maintainability, allow you to integrate faster, and reduce your costs over the long run. Take some time to discuss ADF with Restive Tech before launching into a potentially costly integration exercise.