Design a Partition Strategy for Efficiency and Performance – Data Sources and Ingestion

Both efficiency and performance were discussed in earlier sections pertaining to the design of a partition strategy. An efficient query is one in which the time required to execute it is well used. That means the query should not be waiting on data shuffling or querying irrelevant data. The most efficient query would be one […]

Design a Partition Strategy – Data Sources and Ingestion

A core objective of your data analytics solution is to have queries return results within an acceptable amount of time. If the dataset on which a query executes is huge, then you might experience unacceptable latency. In this context, “dataset” refers to a single database table or file. As the volume of data increases, you […]

Design for Efficient Querying – Data Sources and Ingestion

You can take numerous steps to optimize the performance and manageability of your files contained on ADLS. The following actions can improve query efficiency: Use this information as a basis for the design of your storage structure. File Size, Type, and Quantity The more data contained within a file, the larger it is and the […]

Create an Azure Data Lake Storage Container – Data Sources and Ingestion-2

The following options are available on the Advanced tab: Beginning with the selections you made during the provisioning of ADLS in Exercise 3.1, start with Enable Hierarchical Namespaces. If you do not select this, instead of getting an ADLS container, you get a general‐purpose v2‐based blob container. As discussed in Chapter 1, blob containers are […]

Create an Azure Data Lake Storage Container – Data Sources and Ingestion-1

FIGURE 3.2 An Azure storage account Overview blade

Exercise 3.1 walked you through provisioning an ADLS container. You encountered numerous options, beginning with the first items selected, the subscription and resource group. Remember that an Azure subscription is the location where billing happens. It is a grouping of all provisioned Azure products. You can have […]

Training and Enrichment – CREATE DATABASE dbName; GO

The training and enrichment of data typically happens by making improvements to the data quality or invoking Azure Machine Learning models, which can later be consumed by Azure Cognitive Services. The invocation can take place within a pipeline or be performed manually. Azure Machine Learning models can be used to predict future outcomes based on historical trends […]

Design a Data Storage Structure – Data Sources and Ingestion

In this chapter you will provision numerous Azure data analytics products. By doing so, you will begin to understand more about the products and their features, which can help you create and choose the best tool for your given solution requirements. Choosing a proper service for a scenario results in having a solid design. Table […]

CORR – CREATE DATABASE dbName; GO

This function returns the coefficient of correlation for the pairs of numbers passed to it. CORR determines whether a relationship exists between the paired values it receives. The result ranges from −1 to 1, where ±1 means there is a perfect correlation between the two sets of numbers, and 0 means there is no […]
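As a rough illustration of what a correlation coefficient in the range −1 to 1 looks like, the following Python sketch computes the Pearson coefficient for two equal-length lists. It assumes that CORR is a Pearson-style correlation, which the excerpt does not confirm, and the function name `pearson_corr` is hypothetical and is not part of any Azure product.

```python
import math


def pearson_corr(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences.

    Hypothetical helper for illustration only; it is not the CORR
    implementation used by any database engine.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")

    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Values that rise together give a coefficient close to 1.
    print(pearson_corr([1, 2, 3, 4], [2, 4, 6, 8]))
    # Values that move in opposite directions give a coefficient close to -1.
    print(pearson_corr([1, 2, 3, 4], [8, 6, 4, 2]))
```

A coefficient near 0, by contrast, indicates no linear relationship between the two sets of values.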

OPENJSON – CREATE DATABASE dbName; GO

This function is used in combination with OPENROWSET and instructs the runtime about the format of the content being retrieved. In addition to OPENJSON, we can use OPENXML and OPENQUERY, which are used in the same context but on different file formats or to run SQL queries directly against a database. The following query illustrates […]

Stored Procedures – CREATE DATABASE dbName; GO

When you write code or a query, you typically do so on a workstation on your desk or via a browser. If the code needs to make a connection to a database to retrieve and parse some data, it is important to decide where the parsing should take place. There are two places where the […]