DP-203T00 – Data Engineering on Microsoft Azure
In this course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.
Microsoft Azure provides a comprehensive platform for data engineering, but what is data engineering? Complete this module to find out.
Data lakes are a core element of data analytics architectures. Azure Data Lake Storage Gen2 provides a scalable, secure, cloud-based solution for data lake storage.
Learn about the features and capabilities of Azure Synapse Analytics – a cloud-based platform for big data processing and analysis.
With Azure Synapse serverless SQL pool, you can leverage your SQL skills to explore and analyze data in files, without the need to load the data into a relational database.
Why choose between working with files in a data lake or a relational database schema? With lake databases in Azure Synapse Analytics, you can combine the benefits of both.
Apache Spark is a core technology for large-scale data analytics. Learn how to use Spark in Azure Synapse Analytics to analyze and visualize data in a data lake.
Data engineers commonly need to transform large volumes of data. Apache Spark pools in Azure Synapse Analytics provide a distributed processing platform that they can use to accomplish this goal.
Delta Lake is an open-source storage layer for Spark that adds relational capabilities to data lake files and that you can use to implement a data lakehouse architecture in Azure Synapse Analytics.
Relational data warehouses are a core element of most enterprise Business Intelligence (BI) solutions, and are used as the basis for data models, reports, and analysis.
A core responsibility for a data engineer is to implement a data ingestion solution that loads new data into a relational data warehouse.
Pipelines are the lifeblood of a data analytics solution. Learn how to use Azure Synapse Analytics pipelines to build integrated data solutions that extract, transform, and load data across diverse systems.
Apache Spark provides data engineers with a scalable, distributed data processing platform, which can be integrated into an Azure Synapse Analytics pipeline.
Learn how hybrid transactional/analytical processing (HTAP) can help you perform operational analytics with Azure Synapse Analytics.
Azure Synapse Link for Azure Cosmos DB enables HTAP integration between operational data in Azure Cosmos DB and Azure Synapse Analytics runtimes for Spark and SQL.
Azure Synapse Link for SQL enables low-latency synchronization of operational data in a relational database to Azure Synapse Analytics.
Azure Stream Analytics enables you to process real-time data streams and integrate the data they contain into applications and analytical solutions.
Azure Stream Analytics provides a real-time data processing engine that you can use to ingest streaming event data into Azure Synapse Analytics for further analysis and reporting.
By combining the stream processing capabilities of Azure Stream Analytics and the data visualization capabilities of Microsoft Power BI, you can create real-time data dashboards.
In this module, you’ll evaluate whether Microsoft Purview is the right choice for your data discovery and governance needs.
Learn how to integrate Microsoft Purview with Azure Synapse Analytics to improve data discoverability and lineage tracking.
Azure Databricks is a cloud service that provides a scalable platform for data analytics using Apache Spark.
Azure Databricks is built on Apache Spark and enables data engineers and analysts to run Spark jobs to transform, analyze, and visualize data at scale.
Using pipelines in Azure Data Factory to run notebooks in Azure Databricks enables you to automate data engineering processes at cloud scale.
The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure. The secondary audience for this course is data analysts and data scientists who work with analytical solutions built on Microsoft Azure.
Successful students start this course with knowledge of cloud computing and core data concepts, and professional experience with data solutions.
Specifically completing:
Skills Measured
To be added