Transform Data Using Apache Spark
Apache Spark can be used in a few products running on Azure: Azure Synapse Analytics Spark pools, Azure Databricks Spark clusters, Azure HDInsight Spark clusters, and Azure Data Factory. The...
Transform Data Using Azure Synapse Pipelines – Transform, Manage, and Prepare Data-3
One action you may have noticed in Exercise 5.1 is that you used the existing pipeline that you created in Exercise 4.13. That pipeline performed one activity, which was to copy data from the...
Transform Data Using Azure Synapse Pipelines – Transform, Manage, and Prepare Data
It helps to understand the structure of the data you must ingest, transform, and move through the other Big Data pipeline stages. It is helpful to know because as you make...
Storing Prepared, Trained, and Modeled Data – Data Sources and Ingestion
All data, regardless of the Big Data stage it is in, must be stored. The data not yet ingested into the pipeline is stored someplace too, but not yet on Azure. You can use...
Transform Data Using Apache Spark—Azure Databricks – Transform, Manage, and Prepare Data
The Azure Databricks workspace should resemble Figure 5.12.
FIGURE 5.12 Transforming data using an Apache Spark Azure Databricks workspace
The first important point for Exercise 5.4 has to do with the location of the...
Perform Exploratory Data Analysis – Transform, Manage, and Prepare Data
The previous queries are in the preliminaryEDA.sql file in the Chapter05/Ch05Ex11 folder, on GitHub at https://github.com/benperk/ADE. FIGURE 5.31...
Storing Data Using Azure HDInsight – Data Sources and Ingestion
As with most other Azure Big Data analytics products, an Azure Storage account is provisioned along with the compute nodes and platform. Azure HDInsight is no different in this respect. When you provision an Azure...
Ingest and Transform Data – Transform, Manage, and Prepare Data
Before diving into the process, which you have already seen in Figure 2.30, first you need a definition of transformation. A few examples of transformation that you have experienced so far were in Exercise...
Transform Data Using Azure Synapse Pipelines – Transform, Manage, and Prepare Data-2
FIGURE 5.3 Azure Synapse Analytics: transforming Brainjammer brain waves
FIGURE 5.4 Azure Synapse Analytics: monitoring Brainjammer brain wave transformations
SELECT COUNT(*) AS [COUNT] FROM [brainwaves].[FactREADING]...
Transform Data Using Azure Data Factory – Transform, Manage, and Prepare Data
The capabilities for achieving most activities in Azure Data Factory (ADF) are also available in Azure Synapse Analytics. Unless you have a need or requirement to use ADF, you...