Verified DP-200 dumps Q&As - Pass Guarantee Exam Dumps Test Engine [2021]
DP-200 dumps and 242 unique questions
Microsoft DP-200 Exam Syllabus Topics:
| Topic | Details |
|---|---|
| Implement Data Storage Solutions (40-45%) | |
| Implement non-relational data stores | - implement a solution that uses Cosmos DB, Data Lake Storage Gen2, or Blob storage - implement data distribution and partitions - implement a consistency model in Cosmos DB - provision a non-relational data store - provide access to data to meet security requirements - implement for high availability, disaster recovery, and global distribution |
| Implement relational data stores | - provide access to data to meet security requirements - implement for high availability and disaster recovery - implement data distribution and partitions for Azure Synapse Analytics - implement PolyBase |
| Manage data security | - implement data masking - encrypt data at rest and in motion |
| Manage and Develop Data Processing (25-30%) | |
| Develop batch processing solutions | - develop batch processing solutions by using Data Factory and Azure Databricks - ingest data by using PolyBase - implement the integration runtime for Data Factory - create linked services and datasets - create pipelines and activities - create and schedule triggers - implement Azure Databricks clusters, notebooks, jobs, and autoscaling - ingest data into Azure Databricks |
| Develop streaming solutions | - configure input and output - select the appropriate built-in functions - implement event processing by using Stream Analytics |
| Monitor and Optimize Data Solutions (30-35%) | |
| Monitor data storage | - monitor relational and non-relational data stores - implement Blob storage monitoring - implement Data Lake Storage Gen2 monitoring - implement Azure Synapse Analytics monitoring - implement Cosmos DB monitoring - configure Azure Monitor alerts - implement auditing by using Azure Log Analytics |
| Monitor data processing | - monitor Data Factory pipelines - monitor Azure Databricks - monitor Stream Analytics - configure Azure Monitor alerts - implement auditing by using Azure Log Analytics |
| Optimize Azure data solutions | - troubleshoot data partitioning bottlenecks - optimize Data Lake Storage Gen2 - optimize Stream Analytics - optimize Azure Synapse Analytics - manage the data lifecycle |
Who should take the DP-200 exam
The Implementing an Azure Data Solution (beta) DP-200 Exam certification is an internationally recognized validation that identifies those who earn it as skilled Microsoft Certified Azure Data Engineer Associates. Candidates who want significant career growth need enhanced knowledge, skills, and talents, and this certification provides proof of that advanced knowledge and skill. A candidate who has knowledge of the associated technologies and the skills required to pass the Implementing an Azure Data Solution (beta) DP-200 Exam should take this exam.
NEW QUESTION 37
You need to mask tier 1 data. Which functions should you use? To answer, select the appropriate option in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
A: Default
Full masking according to the data types of the designated fields.
For string data types, use XXXX or fewer Xs if the size of the field is less than 4 characters (char, nchar, varchar, nvarchar, text, ntext).
B: email
C: Custom text
Custom String masking method that exposes the first and last letters and adds a custom padding string in the middle: prefix,[padding],suffix
Tier 1 Database must implement data masking using the following masking logic:
References:
https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking
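To make the three masking functions named above more concrete, here is a minimal Python sketch of what each one produces for string data. It is an illustration only, not SQL Server's implementation: the function names and sample values are hypothetical, the email pattern follows the form described in the referenced dynamic data masking documentation (first letter kept, constant `.com` suffix), and the handling of values too short for the custom mask is simplified.

```python
def mask_default_string(field_size: int) -> str:
    """Default mask for string columns: XXXX, or fewer Xs when the field is shorter than 4."""
    return "X" * min(field_size, 4)


def mask_email(address: str) -> str:
    """Email mask: keep the first letter, then the constant pattern XXX@XXXX.com."""
    return address[:1] + "XXX@XXXX.com"


def mask_partial(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Custom text mask: keep `prefix` leading and `suffix` trailing characters around `padding`."""
    head = value[:prefix]
    # A suffix of 0 must expose nothing, so avoid the value[-0:] slice.
    tail = value[len(value) - suffix:] if suffix > 0 else ""
    return head + padding + tail


print(mask_default_string(10))                    # XXXX
print(mask_email("alice@contoso.com"))            # aXXX@XXXX.com
print(mask_partial("Proseware", 2, "XXXX", 2))    # PrXXXXre
```

In the database itself these masks are declared on the column rather than computed in application code; the sketch only shows the shape of the masked output.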
NEW QUESTION 38
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You need to configure data encryption for external applications.
Solution:
1. Access the Always Encrypted Wizard in SQL Server Management Studio
2. Select the column to be encrypted
3. Set the encryption type to Deterministic
4. Configure the master key to use the Azure Key Vault
5. Validate configuration results and deploy the solution
Does the solution meet the goal?
- A. No
- B. Yes
Answer: B
Explanation:
We use the Azure Key Vault, not the Windows Certificate Store, to store the master key.
Note: The Master Key Configuration page is where you set up your CMK (Column Master Key) and select the key store provider where the CMK will be stored. Currently, you can store a CMK in the Windows certificate store, Azure Key Vault, or a hardware security module (HSM).
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-always-encrypted-azure-key-vault
Topic 2, Proseware Inc
Background
Proseware, Inc. develops and manages a product named Poll Taker. The product is used for delivering public opinion polling and analysis.
Polling data comes from a variety of sources, including online surveys, house-to-house interviews, and booths at public events.
Polling data
Polling data is stored in one of two locations:
* An on-premises Microsoft SQL Server 2019 database named PollingData
* Azure Data Lake Gen 2
Data in Data Lake is queried by using PolyBase.
Poll metadata
Each poll has associated metadata with information about the poll including the date and number of respondents. The data is stored as JSON.
Phone-based polling
Security
* Phone-based poll data must only be uploaded by authorized users from authorized devices
* Contractors must not have access to any polling data other than their own
* Access to polling data must be set on a per-Active Directory user basis
Data migration and loading
* All data migration processes must use Azure Data Factory
* All data migrations must run automatically during non-business hours
* Data migrations must be reliable and retry when needed
Performance
After six months, raw polling data should be moved to a lower-cost storage solution.
Deployments
* All deployments must be performed by using Azure DevOps. Deployments must use templates used in multiple environments
* No credentials or secrets should be used during deployments
Reliability
All services and processes must be resilient to a regional Azure outage.
Monitoring
All Azure services must be monitored by using Azure Monitor. On-premises SQL Server performance must be monitored.
NEW QUESTION 39
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
* A workload for data engineers who will use Python and SQL
* A workload for jobs that will run notebooks that use Python, Spark, Scala, and SQL
* A workload that data scientists will use to perform ad hoc analysis in Scala and R
The enterprise architecture team at your company identifies the following standards for Databricks environments:
* The data engineers must share a cluster.
* The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
* All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs.
Does this meet the goal?
- A. No
- B. Yes
Answer: A
Explanation:
We would need a High Concurrency cluster for the jobs.
Note:
Standard clusters are recommended for a single user. Standard can run workloads developed in any language: Python, R, Scala, and SQL.
A high concurrency cluster is a managed cloud resource. The key benefits of high concurrency clusters are that they provide Apache Spark-native fine-grained sharing for maximum resource utilization and minimum query latencies.
References:
https://docs.azuredatabricks.net/clusters/configure.html
NEW QUESTION 40
Your company uses several Azure HDInsight clusters.
The data engineering team reports several errors with some applications that use these clusters.
You need to recommend a solution to review the health of the clusters.
What should you include in your recommendation?
- A. Log Analytics
- B. Azure Automation
- C. Application Insights
Answer: C
NEW QUESTION 41 
Use the following login credentials as needed:
Azure Username: xxxxx
Azure Password: xxxxx
The following information is for technical support purposes only:
Lab Instance: 10277521
You need to create an Azure SQL database named db3 on an Azure SQL server named SQL10277521. Db3 must use the Sample (AdventureWorksLT) source.
To complete this task, sign in to the Azure portal.
Answer:
Explanation:
1. Click Create a resource in the upper left-hand corner of the Azure portal.
2. On the New page, select Databases in the Azure Marketplace section, and then click SQL Database in the Featured section.
3. Fill out the SQL Database form with the following information, as shown below:
Database name: Db3
Select source: Sample (AdventureWorksLT)
Server: SQL10277521
4. Click Select and finish the Wizard using default options.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-design-first-database
NEW QUESTION 42
You manage an enterprise data warehouse in Azure Synapse Analytics.
Users report slow performance when they run commonly used queries. Users do not report performance changes for infrequently used queries.
You need to monitor resource utilization to determine the source of the performance issues.
Which metric should you monitor?
- A. Data IO percentage
- B. Data Warehouse Units (DWU) used
- C. Cache hit percentage
- D. DWU limit
Answer: C
Explanation:
The Azure Synapse Analytics storage architecture automatically tiers your most frequently queried columnstore segments in a cache residing on NVMe based SSDs designed for Gen2 data warehouses. Greater performance is realized when your queries retrieve segments that are residing in the cache. You can monitor and troubleshoot slow query performance by determining whether your workload is optimally leveraging the Gen2 cache.
Note: As of November 2019, Azure SQL Data Warehouse is now Azure Synapse Analytics.
Reference:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-how-to-monitor-cache
https://docs.microsoft.com/bs-latn-ba/azure/sql-data-warehouse/sql-data-warehouse-concept-resource-utilization
NEW QUESTION 43
You need to mask tier 1 data. Which functions should you use? To answer, select the appropriate option in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
References:
https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking
NEW QUESTION 44
Which masking functions should you implement for each column to meet the data masking requirements? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Box 1: Default
Default uses a zero value for numeric data types (bigint, bit, decimal, int, money, numeric, smallint, smallmoney, tinyint, float, real).
Only show a zero value for the values in a column named ShockOilWeight.
Box 2: Credit Card
The Credit Card Masking method exposes the last four digits of the designated fields and adds a constant string as a prefix in the form of a credit card.
Example: XXXX-XXXX-XXXX-1234
Only show the last four digits of the values in a column named SuspensionSprings.
Scenario:
The company identifies the following data masking requirements for the Race Central data that will be stored in SQL Database:
Only show a zero value for the values in a column named ShockOilWeight.
Only show the last four digits of the values in a column named SuspensionSprings.
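The two masks chosen above can be pictured with a short Python sketch. This is a hedged illustration of the masked output only, not how SQL Database implements masking; the function names and sample values are hypothetical.

```python
def mask_default_numeric(value) -> int:
    """Default mask for numeric columns: every value is shown as zero."""
    return 0


def mask_credit_card(value: str) -> str:
    """Credit card mask: constant prefix plus the last four characters of the value."""
    return "XXXX-XXXX-XXXX-" + value[-4:]


print(mask_default_numeric(7.5))                  # 0
print(mask_credit_card("4111-1111-1111-1234"))    # XXXX-XXXX-XXXX-1234
```

A user without permission to see unmasked data would see zero for every ShockOilWeight value and only the last four characters of each SuspensionSprings value.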
Topic 4, ADatum Corporation
Case study
Overview
ADatum Corporation is a retailer that sells products through two sales channels: retail stores and a website.
Existing Environment
ADatum has one database server that has Microsoft SQL Server 2016 installed. The server hosts three mission-critical databases named SALESDB, DOCDB, and REPORTINGDB.
SALESDB collects data from the stores and the website.
DOCDB stores documents that connect to the sales data in SALESDB. The documents are stored in two different JSON formats based on the sales channel.
REPORTINGDB stores reporting data and contains several columnstore indexes. A daily process creates reporting data in REPORTINGDB from the data in SALESDB. The process is implemented as a SQL Server Integration Services (SSIS) package that runs a stored procedure from SALESDB.
Requirements
Planned Changes
ADatum plans to move the current data infrastructure to Azure. The new infrastructure has the following requirements:
* Migrate SALESDB and REPORTINGDB to an Azure SQL database.
* Migrate DOCDB to Azure Cosmos DB.
* The sales data, including the documents in JSON format, must be gathered as it arrives and analyzed online by using Azure Stream Analytics. The analytic process will perform aggregations that must be done continuously, without gaps, and without overlapping.
* As they arrive, all the sales documents in JSON format must be transformed into one consistent format.
* Azure Data Factory will replace the SSIS process of copying the data from SALESDB to REPORTINGDB.
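The requirement that aggregations run continuously, without gaps, and without overlapping describes fixed-size, back-to-back time windows, which Stream Analytics calls tumbling windows. The following Python sketch shows the idea outside Stream Analytics; the event tuples, the window size, and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_totals(events, window_seconds):
    """Sum amounts per fixed-size window; each event falls into exactly one window."""
    totals = defaultdict(float)
    for timestamp, amount in events:
        # Integer division maps every timestamp to the start of its window,
        # so windows are contiguous and never overlap.
        window_start = (timestamp // window_seconds) * window_seconds
        totals[window_start] += amount
    return dict(sorted(totals.items()))


# Hypothetical (timestamp_in_seconds, sale_amount) events.
sales = [(1, 10.0), (4, 5.0), (10, 2.5), (19, 7.5), (20, 1.0)]
totals = tumbling_window_totals(sales, 10)
print(totals)  # {0: 15.0, 10: 10.0, 20: 1.0}
```

Because each timestamp maps to exactly one window start, no event is counted twice and no interval is skipped.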
Technical Requirements
The new Azure data infrastructure must meet the following technical requirements:
* Data in SALESDB must be encrypted by using Transparent Data Encryption (TDE). The encryption must use your own key.
* SALESDB must be restorable to any given minute within the past three weeks.
* Real-time processing must be monitored to ensure that workloads are sized properly based on actual usage patterns.
* Missing indexes must be created automatically for REPORTINGDB.
* Disk IO, CPU, and memory usage must be monitored for SALESDB.
NEW QUESTION 45
Use the following login credentials as needed:
Azure Username: xxxxx
Azure Password: xxxxx
The following information is for technical support purposes only:
Lab Instance: 10543936
You need to ensure that users in the West US region can read data from a local copy of an Azure Cosmos DB database named cosmos10543936.
To complete this task, sign in to the Azure portal.
NOTE: This task might take several minutes to complete. You can perform other tasks while the task completes or end this section of the exam.
Answer:
Explanation:
You can enable Availability Zones by using the Azure portal when creating an Azure Cosmos account.
Step 1: Enable Geo-redundancy and Multi-region Writes
1. In the Azure portal, search for and select Azure Cosmos DB.
2. Locate the Cosmos DB database named cosmos10543936.
3. Access the properties for cosmos10543936.
4. Enable Geo-redundancy and Multi-region Writes.
Location: West US region
Step 2: Add a region to your database account
1. In the Azure portal, go to your Azure Cosmos account, and open the Replicate data globally menu.
2. To add regions, select the hexagons on the map with the + label that corresponds to your desired region(s).
Alternatively, to add a region, select the + Add region option and choose a region from the drop-down menu.
Add: West US region
3. To save your changes, select OK.
Reference:
https://docs.microsoft.com/en-us/azure/cosmos-db/high-availability
https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-manage-database-account
NEW QUESTION 46
You need to mask tier 1 data. Which functions should you use? To answer, select the appropriate option in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
A: Default
Full masking according to the data types of the designated fields.
For string data types, use XXXX or fewer Xs if the size of the field is less than 4 characters (char, nchar, varchar, nvarchar, text, ntext).
B: email
C: Custom text
Custom String masking method that exposes the first and last letters and adds a custom padding string in the middle: prefix,[padding],suffix
Tier 1 Database must implement data masking using the following masking logic:
References:
https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking
NEW QUESTION 47
You need to receive an alert when Azure SQL Data Warehouse consumes the maximum allotted resources.
Which resource type and signal should you use to create the alert in Azure Monitor? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Resource type: SQL data warehouse
DWU limit belongs to the SQL data warehouse resource type.
Signal: DWU USED
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-insights-alerts-portal
NEW QUESTION 48
A company is planning to use Microsoft Azure Cosmos DB as the data store for an application. You have the following Azure CLI command:
az cosmosdb create --name "cosmosdbdev1" --resource-group "rgdev"
You need to minimize latency and expose the SQL API. How should you complete the command? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Box 1: Eventual
With Azure Cosmos DB, developers can choose from five well-defined consistency models on the consistency spectrum. From strongest to more relaxed, the models include strong, bounded staleness, session, consistent prefix, and eventual consistency.
Box 2: GlobalDocumentDB
Select Core(SQL) to create a document database and query by using SQL syntax.
Note: The API determines the type of account to create. Azure Cosmos DB provides five APIs: Core(SQL) and MongoDB for document databases, Gremlin for graph databases, Azure Table, and Cassandra.
References:
https://docs.microsoft.com/en-us/azure/cosmos-db/consistency-levels
https://docs.microsoft.com/en-us/azure/cosmos-db/create-sql-api-dotnet
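The two selections can be read back into the command from the question. The Python sketch below only assembles and prints the completed Azure CLI command; it does not run it. The helper name is hypothetical, and the sketch assumes the `--kind` and `--default-consistency-level` options of `az cosmosdb create` are the blanks being filled.

```python
import shlex


def build_cosmosdb_create_command(name: str, resource_group: str) -> list:
    """Assemble the az cosmosdb create call with the two answers filled in."""
    return [
        "az", "cosmosdb", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--kind", "GlobalDocumentDB",                 # Box 2: exposes the Core (SQL) API
        "--default-consistency-level", "Eventual",    # Box 1: most relaxed level, lowest latency
    ]


command = build_cosmosdb_create_command("cosmosdbdev1", "rgdev")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it, which requires the Azure CLI to be installed and signed in.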
NEW QUESTION 49
Your company manages on-premises Microsoft SQL Server pipelines by using a custom solution.
The data engineering team must implement a process to pull data from SQL Server and migrate it to Azure Blob storage. The process must orchestrate and manage the data lifecycle.
You need to configure Azure Data Factory to connect to the on-premises SQL Server database.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
Step 1: Create a virtual private network (VPN) connection from on-premises to Microsoft Azure.
You can also use IPSec VPN or Azure ExpressRoute to further secure the communication channel between your on-premises network and Azure.
Azure Virtual Network is a logical representation of your network in the cloud. You can connect an on-premises network to your virtual network by setting up IPSec VPN (site-to-site) or ExpressRoute (private peering).
Step 2: Create an Azure Data Factory resource.
Step 3: Configure a self-hosted integration runtime.
You create a self-hosted integration runtime and associate it with an on-premises machine with the SQL Server database. The self-hosted integration runtime is the component that copies data from the SQL Server database on your machine to Azure Blob storage.
Note: A self-hosted integration runtime can run copy activities between a cloud data store and a data store in a private network, and it can dispatch transform activities against compute resources in an on-premises network or an Azure virtual network. A self-hosted integration runtime must be installed on an on-premises machine or a virtual machine (VM) inside a private network.
References:
https://docs.microsoft.com/en-us/azure/data-factory/tutorial-hybrid-copy-powershell
NEW QUESTION 50
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You are developing a solution that will use Azure Stream Analytics. The solution will accept an Azure Blob storage file named Customers. The file will contain both in-store and online customer details. The online customers will provide a mailing address.
You have a file in Blob storage named LocationIncomes that contains median incomes based on location. The file rarely changes.
You need to use an address to look up a median income based on location. You must output the data to Azure SQL Database for immediate use and to Azure Data Lake Storage Gen2 for long-term retention.
Solution: You implement a Stream Analytics job that has one streaming input, one reference input, one query, and two outputs.
Does this meet the goal?
- A. No
- B. Yes
Answer: A
Explanation:
We need one reference data input for LocationIncomes, which rarely changes.
We need two queries, one for in-store customers and one for online customers.
For each query, two outputs are needed.
Note: Stream Analytics also supports input known as reference data. Reference data is either completely static or changes slowly.
References:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-add-inputs#stream-and-reference-inputs
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-define-outputs
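The role of reference data can be sketched in plain Python: a slowly changing lookup table is joined to each arriving record, and the enriched result is written to more than one output. This is not Stream Analytics query syntax, and the field names, sample values, and output lists are hypothetical.

```python
def enrich_with_median_income(customers, location_incomes):
    """Attach a median income to each customer record by looking up its location."""
    for customer in customers:
        median_income = location_incomes.get(customer.get("location"))
        yield {**customer, "median_income": median_income}


# Reference data: rarely changes, so it is loaded once and reused.
location_incomes = {"98052": 72000, "10001": 65000}

# Streaming input: online customers carry a location, in-store customers may not.
customers = [
    {"name": "online-1", "location": "98052"},
    {"name": "in-store-1", "location": None},
]

enriched = list(enrich_with_median_income(customers, location_incomes))
sql_output = list(enriched)    # stands in for Azure SQL Database (immediate use)
lake_output = list(enriched)   # stands in for Data Lake Storage Gen2 (retention)
print(sql_output)
```

Records without a known location keep a median income of `None` rather than being dropped, which is one possible design choice among several.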
NEW QUESTION 51
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values.
75% of the rows contain description data that has an average length of 1.1 MB.
You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics.
You need to prepare the files to ensure that the data copies quickly.
Solution: You convert the files to compressed delimited text files.
Does this meet the goal?
- A. No
- B. Yes
Answer: B
Explanation:
All file formats have different performance characteristics. For the fastest load, use compressed delimited text files.
Reference:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/guidance-for-loading-data
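Converting rows to compressed delimited text can be done with the Python standard library. The sketch below is a minimal example under assumed inputs: the pipe delimiter, the sample rows, and the function name are illustrative choices, not values taken from the question.

```python
import csv
import gzip
import io


def to_compressed_delimited_text(rows, delimiter="|"):
    """Write rows as delimited text and return the gzip-compressed bytes."""
    buffer = io.BytesIO()
    # Text mode lets csv.writer write strings; newline="" leaves line endings to csv.
    with gzip.open(buffer, mode="wt", newline="", encoding="utf-8") as handle:
        csv.writer(handle, delimiter=delimiter).writerows(rows)
    return buffer.getvalue()


rows = [["id", "description"], ["1", "long text"]]
payload = to_compressed_delimited_text(rows)
print(len(payload), "compressed bytes")
```

The returned bytes could then be written to a `.gz` file in the storage account before the load.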
NEW QUESTION 52
You need to develop a pipeline for processing data. The pipeline must meet the following requirements.
* Scale up and down resources for cost reduction.
* Use an in-memory data processing engine to speed up ETL and machine learning operations.
* Use streaming capabilities.
* Provide the ability to code in SQL, Python, Scala, and R.
* Integrate workspace collaboration with Git.
What should you use?
- A. HDInsight Hadoop Cluster
- B. Azure Stream Analytics
- C. Azure SQL Data Warehouse
- D. HDInsight Spark Cluster
Answer: D
Explanation:
Apache Spark is an open-source, parallel-processing framework that supports in-memory processing to boost the performance of big-data analysis applications.
HDInsight is a managed Hadoop service. Use it to deploy and manage Hadoop clusters in Azure. For batch processing, you can use Spark, Hive, Hive LLAP, or MapReduce.
Languages: R, Python, Java, Scala, SQL
You can create an HDInsight Spark cluster using an Azure Resource Manager template. The template can be found in GitHub.
References:
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing
NEW QUESTION 53
You manage the Microsoft Azure Databricks environment for a company. You must be able to access a private Azure Blob Storage account. Data must be available to all Azure Databricks workspaces. You need to provide the data access.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
Step 1: Create a secret scope
Step 2: Add secrets to the scope
Note: dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>") gets the key that has been stored as a secret in a secret scope.
Step 3: Mount the Azure Blob Storage container
You can mount a Blob Storage container or a folder inside a container through Databricks File System - DBFS. The mount is a pointer to a Blob Storage container, so the data is never synced locally.
Note: To mount a Blob Storage container or a folder inside a container, use the following command:
Python
dbutils.fs.mount(
  source = "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net",
  mount_point = "/mnt/<mount-name>",
  extra_configs = {"<conf-key>": dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>")})
where:
dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>") gets the key that has been stored as a secret in a secret scope.
References:
https://docs.databricks.com/spark/latest/data-sources/azure/azure-storage.html
NEW QUESTION 54
You develop data engineering solutions for a company.
You need to ingest and visualize real-time Twitter data by using Microsoft Azure.
Which three technologies should you use? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
- A. Event Grid topic
- B. Logic App that sends Twitter posts which have target keywords to Azure
- C. Event Hub instance
- D. Azure Stream Analytics Job that queries Twitter data from an Event Hub
- E. Azure Stream Analytics Job that queries Twitter data from an Event Grid
- F. Event Grid subscription
Answer: B,C,D
Explanation:
You can use Azure Logic apps to send tweets to an event hub and then use a Stream Analytics job to read from event hub and send them to PowerBI.
References:
https://community.powerbi.com/t5/Integrations-with-Files-and/Twitter-streaming-analytics-step-by-step/td-p/9594
NEW QUESTION 55
You plan to create a new single database instance of Microsoft Azure SQL Database.
The database must only allow communication from the data engineer's workstation. You must connect directly to the instance by using Microsoft SQL Server Management Studio.
You need to create and configure the Database. Which three Azure PowerShell cmdlets should you use to develop the solution? To answer, move the appropriate cmdlets from the list of cmdlets to the answer area and arrange them in the correct order.
Answer:
Explanation:
Step 1: New-AzureSqlServer
Create a server.
Step 2: New-AzureRmSqlServerFirewallRule
New-AzureRmSqlServerFirewallRule creates a firewall rule for a SQL Database server.
Can be used to create a server firewall rule that allows access from the specified IP range.
Step 3: New-AzureRmSqlDatabase
Example: Create a database on a specified server
PS C:\>New-AzureRmSqlDatabase -ResourceGroupName "ResourceGroup01" -ServerName "Server01" -DatabaseName "Database01"
References:
https://docs.microsoft.com/en-us/azure/sql-database/scripts/sql-database-create-and-configure-database-powers