
DP-203: Microsoft Azure Data Engineering Certification Exam Dumps

Question #11

You have an Azure Data Factory that contains 10 pipelines.
You need to label each pipeline with its main purpose of either ingest, transform, or load. The labels must be available for grouping and filtering when using the monitoring experience in Data Factory.
What should you add to each pipeline?

  • A. a resource tag
  • B. a correlation ID
  • C. a run group ID
  • D. an annotation

Correct Answer: D
Annotations are additional, informative tags that you can add to specific factory resources: pipelines, datasets, linked services, and triggers. By adding annotations, you can easily filter and search for specific factory resources.
Reference:
https://www.cathrinewilhelmsen.net/annotations-user-properties-azure-data-factory/
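To make the distinction concrete: annotations are a string array in a pipeline's JSON definition, and the monitoring view lets you group and filter runs by them. Below is an illustrative sketch in plain Python (the pipeline names and helper are hypothetical, not from the question) showing where annotations sit in the definition and how filtering by one works:

```python
# Hypothetical sketch: annotations appear as a string array under a
# pipeline's "properties" in its JSON definition and surface as
# filters in the Data Factory monitoring experience.
def build_pipeline(name, purpose):
    """Return a minimal pipeline definition labeled with its purpose."""
    return {
        "name": name,
        "properties": {
            "activities": [],          # activities omitted for brevity
            "annotations": [purpose],  # e.g. "Ingest", "Transform", "Load"
        },
    }

pipelines = [build_pipeline(f"pipeline{i}", p)
             for i, p in enumerate(["Ingest", "Transform", "Load"])]

# Grouping/filtering by annotation, as the monitoring view does:
ingest_only = [p["name"] for p in pipelines
               if "Ingest" in p["properties"]["annotations"]]
print(ingest_only)  # -> ['pipeline0']
```

Note that annotations are free-form labels on the resource itself, unlike a resource tag (which applies at the Azure resource level, not per pipeline) or a correlation ID (which relates individual runs).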

Community vote distribution: D (100%)

Question #12

HOTSPOT –
The following code segment is used to create an Azure Databricks cluster.

For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Hot Area:

Correct Answer: 
Box 1: Yes –
The code segment sets the cluster mode to ‘High Concurrency’ rather than the default ‘Standard’, and specifies a worker type of Standard_DS13_v2.

Box 2: No –
When you run a job on a new cluster, the job is treated as a data engineering (job) workload subject to the job workload pricing. When you run a job on an existing cluster, the job is treated as a data analytics (all-purpose) workload subject to all-purpose workload pricing.

Box 3: Yes –
Delta Lake on Databricks allows you to configure Delta Lake based on your workload patterns.
Reference:
Databricks – Cluster Sizing
https://docs.microsoft.com/en-us/azure/databricks/jobs
https://docs.databricks.com/administration-guide/capacity-planning/cmbp.html
https://docs.databricks.com/delta/index.html

Question #13

You are designing a statistical analysis solution that will use custom proprietary Python functions on near real-time data from Azure Event Hubs.
You need to recommend which Azure service to use to perform the statistical analysis. The solution must minimize latency.
What should you recommend?

  • A. Azure Synapse Analytics
  • B. Azure Databricks
  • C. Azure Stream Analytics
  • D. Azure SQL Database

Correct Answer: C
Reference:
https://docs.microsoft.com/en-us/azure/event-hubs/process-data-azure-stream-analytics
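The pivotal requirement here is running custom proprietary Python functions on the stream. As a minimal sketch (plain Python standing in for the PySpark code a Databricks streaming job would run; the z-score statistic and window size are invented for illustration, and Event Hubs wiring is omitted):

```python
from statistics import mean, stdev

# Hypothetical "proprietary" statistic: z-score of the latest reading
# against a sliding window of recent events.
def zscore_latest(window):
    mu, sigma = mean(window), stdev(window)
    return (window[-1] - mu) / sigma if sigma else 0.0

def process_stream(events, window_size=4):
    """Apply the custom statistic to each event over a sliding window."""
    scores, window = [], []
    for value in events:
        window.append(value)
        if len(window) > window_size:
            window.pop(0)
        if len(window) >= 2:  # stdev needs at least two samples
            scores.append(round(zscore_latest(window), 3))
    return scores

print(process_stream([10, 10, 10, 22]))  # -> [0.0, 0.0, 1.5]
```

Note the tension reflected in the community vote below: Azure Stream Analytics supports JavaScript and C# user-defined functions but not arbitrary Python, whereas Azure Databricks runs custom Python natively via Structured Streaming, which is why many voters favor B.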

Community vote distribution: B (74%), C (26%)

Leave a Comment