cloudarchitected

Geospatial Analytics in Databricks with Python and GeoMesa

Starting out in the world of geospatial analytics can be confusing, with a profusion of libraries, data formats and complex concepts. Here are a few approaches to get started with the basics, such as importing data and running simple geometric operations. This walkthrough is demonstrated in the sample notebooks (read below to compile the GeoMesa […]

cloudarchitected
Architecture

Monitoring and Logging in Azure Databricks with Azure Log Analytics and Grafana

Connecting Azure Databricks with Log Analytics allows monitoring and tracing each layer within Spark workloads, including the performance and resource usage on the host and JVM, as well as Spark metrics and application-level logging. You can easily test this integration end-to-end by following the accompanying tutorial on Monitoring Azure Databricks with Azure Log Analytics and […]

cloudarchitected

DevOps in Azure with Databricks and Data Factory

Building simple deployment pipelines to synchronize Databricks notebooks across environments is easy, and such a pipeline could fit the needs of small teams working on simple projects. Yet, a more sophisticated application includes other types of resources that need to be provisioned in concert and securely connected, such as Data Factory pipeline, storage accounts and […]

cloudarchitected
Architecture

Event-based analytical data processing with Azure Databricks

Cloud-native streaming architecture Overview Modern data analytics architectures should embrace the high flexibility required for today’s business environment, where the only certainty for every enterprise is that the ability to harness explosive volumes of data in real time is emerging as a a key source of competitive advantage. Fortunately, cloud platforms allow high scalability and […]

cloudarchitected
Architecture

Network Isolation for Azure Databricks

For the highest level of security in an Azure Databricks deployment, clusters can be deployed in a custom Virtual Network. With the default setup, inbound traffic is locked down, but outbound traffic is unrestricted for ease of use. The network can be configured to restrict outbound traffic. For data science and exploratory environments, it is […]