It helps to understand the structure of the data you must ingest, transform, and move through the other Big Data pipeline stages. It is helpful to know because as you make...
Storing Prepared, Trained, and Modeled Data – Data Sources and Ingestion
All data, regardless of the Big Data stage it is in, must be stored. The data not yet ingested into the pipeline is stored someplace too, but not yet on Azure. You can use...
Transform Data Using Apache Spark—Azure Databricks – Transform, Manage, and Prepare Data
The Azure Databricks workspace should resemble Figure 5.12. FIGURE 5.12 Transforming data using an Apache Spark Azure Databricks workspace The first important point for Exercise 5.4 has to do with the location of the...
Perform Exploratory Data Analysis – Transform, Manage, and Prepare Data
The previous queries are in the preliminaryEDA.sql file in the Chapter05/Ch05Ex11 folder, on GitHub at https://github.com/benperk/ADE. FIGURE 5.31...
Storing Data Using Azure HDInsight – Data Sources and Ingestion
Like most other Azure Big Data analytics products, an Azure Storage account is provisioned along with the compute nodes and platform. Azure HDInsight is no different in this respect. When you provision an Azure...
Ingest and Transform Data – Transform, Manage, and Prepare Data
Before diving into the process, which you have already seen in Figure 2.30, first you need a definition of transformation. A few examples of transformation that you have experienced so far were in Exercise...
Transform Data Using Azure Synapse Pipelines – Transform, Manage, and Prepare Data-2
FIGURE 5.3 Azure Synapse Analytics—transforming Brainjammer brain waves FIGURE 5.4 Azure Synapse Analytics—monitoring Brainjammer brain wave transformations SELECT COUNT(*) AS [COUNT] FROM [brainwaves].[FactREADING]...
Transform Data Using Azure Data Factory – Transform, Manage, and Prepare Data
The capabilities for achieving most activities in Azure Data Factory (ADF) are also available in Azure Synapse Analytics. Unless you have a need or requirement to use ADF, you...
Transform Data Using Apache Spark—Azure Synapse Analytics – Transform, Manage, and Prepare Data-2
FIGURE 5.10 Transforming data using an Apache Spark Azure Synapse Spark pool That was a long and complicated exercise, so congratulations if you got it all going. Figure 5.11 illustrates how what you just...