A simple way to unit test notebooks is to write the logic in a notebook that accepts parameterized inputs, and a separate test notebook that contains assertions. The sample project https://github.com/algattik/databricks-unit-tests/ contains two demonstration notebooks: The normalize_orders notebook processes a list of Orders and a list of OrderDetails into a joined list, taking into account […]
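The pattern described above can be sketched in plain Python as follows. This is a minimal illustration, not the code from the sample project: the `Order` and `OrderDetail` fields, the body of `normalize_orders`, and the test data are all assumptions made for the example, and the real notebooks may work quite differently.

```python
from dataclasses import dataclass


# Hypothetical record types; the field names are assumptions for illustration.
@dataclass(frozen=True)
class Order:
    order_id: int
    customer: str


@dataclass(frozen=True)
class OrderDetail:
    order_id: int
    product: str
    quantity: int


def normalize_orders(orders, details):
    """Logic under test: join each detail line to its parent order.

    Hypothetical stand-in for the logic notebook. It takes its inputs as
    parameters so that a test can supply small, known datasets.
    """
    by_id = {o.order_id: o for o in orders}
    return [
        (by_id[d.order_id].customer, d.product, d.quantity)
        for d in details
        if d.order_id in by_id  # drop details that have no matching order
    ]


def test_normalize_orders():
    """Stand-in for the test notebook: fixed inputs, then assertions."""
    orders = [Order(1, "alice"), Order(2, "bob")]
    details = [
        OrderDetail(1, "pen", 3),
        OrderDetail(2, "ink", 1),
        OrderDetail(9, "cap", 5),  # refers to an order that does not exist
    ]
    result = normalize_orders(orders, details)
    assert result == [("alice", "pen", 3), ("bob", "ink", 1)]


test_normalize_orders()
```

Keeping the logic in a function that receives its inputs as arguments is what lets the test run it against a handful of hand-written rows instead of production data.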
Tutorial
Tutorial: Power BI Transactional Applications with JavaScript
Writing data back from Power BI traditionally involves PowerApps integration, which comes with certain limitations, especially when fast feedback is desired in the UI layer. The Power BI JavaScript API, combined with DirectQuery, makes it easy to implement powerful MVC scenarios, such as creating a business transaction from an embedded Power BI report and instantly visualizing the […]
Tutorial: Monitoring Azure Databricks with Azure Log Analytics and Grafana
This is the second post in our series on Monitoring Azure Databricks. See Monitoring and Logging in Azure Databricks with Azure Log Analytics and Grafana for an introduction. Here is a walkthrough that deploys a sample end-to-end project using automation, which you can use to quickly get an overview of the logging and monitoring functionality. The provided […]
Tutorial: DevOps in Azure with Databricks and Data Factory
This is Part 2 of our series on Azure DevOps with Databricks. Read Part 1 first for an introduction and walkthrough of DevOps in Azure with Databricks and Data Factory. Setting up the environment: To get started, you will need a Pay-as-you-Go or Enterprise Azure subscription. A free trial subscription will not allow you to […]
Tutorial: Event-based ETL with Azure Databricks
This is part 2 of our series on event-based analytical processing. In the previous article, we covered the basics of event-based analytical data processing with Azure Databricks. This tutorial demonstrates how to set up a stream-oriented ETL job based on files in Azure Storage. We will configure a storage account to generate events in a […]
Real-time insights from Azure Databricks jobs with Stream Analytics and Power BI
The Azure Databricks Spark engine has capabilities to ingest, structure and process vast quantities of event data, and use analytical processing and machine learning to derive insights from the data at scale. Power BI can be used to visualize the data and deliver those insights in near-real time. Streaming data can be delivered from Azure […]
Tutorial: Idempotent ETL and API consumption with Azure Databricks
Scenario: A data producer service generates data as messages. As messages can contain very large payloads, the service writes the data content to blob files, and only sends metadata as events. The data producer service exposes an API allowing retrieval of the payload data. Rather than returning the payload data in the API response, the […]