Now that you have performed some data transformation exercises, it is a good time to read about some applicable transformation and data management concepts.
Transformation
As you progressed through the exercises that transformed the...
Normalize and Denormalize Values – Transform, Manage, and Prepare Data
Normalization and denormalization can be approached in two contexts. The first context has to do with the deduplication of data and query speed on database tables in a relational database. The other context has...
Configure Error Handling for the Transformation – Transform, Manage, and Prepare Data
As you transform data using Azure Synapse Analytics, there may be some failures when writing to the sink. The failures might happen due to data truncation, such as when the data type is defined...
Perform Exploratory Data Analysis—Transform – Transform, Manage, and Prepare Data
This exercise requires a Power BI Premium subscription, which can be acquired at https://powerbi.microsoft.com.
FIGURE 5.32 Performing exploratory data analysis—visualizing data in Power BI (2)
FIGURE 5.33 Performing exploratory data analysis—Power BI workspace
FIGURE...
Encode and Decode Data – Transform, Manage, and Prepare Data
There is a lot of history surrounding the encoding and decoding of data. Fundamentally, this concept revolves around how to store and render letter characters. As you know, all things that are computed must...
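As a minimal sketch of the idea in this excerpt, the following Python snippet encodes a string into bytes and decodes it back. The sample text and the choice of UTF-8, UTF-16, and Latin-1 are assumptions made for illustration; they are not taken from the original article.

```python
# Hypothetical sample text: "cafe" with an accented final letter (U+00E9).
text = "caf\u00e9"

# Encoding turns characters into bytes according to a character encoding.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# Decoding reverses the process, provided the same encoding is used.
round_trip = utf8_bytes.decode("utf-8")

# Decoding with a different encoding yields garbled text instead of an error.
garbled = utf8_bytes.decode("latin-1")

print(len(text), len(utf8_bytes), len(utf16_bytes))
print(round_trip == text)
print(garbled)
```

The same four characters occupy a different number of bytes under each encoding, which is why the encoding used to write data has to be known when the data is read back.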
Azure Cosmos DB—Shred JSON – Transform, Manage, and Prepare Data
FIGURE 5.23 Shredding JSON with Azure Cosmos DB
The query you executed in step 4 begins with a SELECT, which is followed by the OPENROWSET that contains information about the PROVIDER, CONNECTION, and OBJECT. SELECT...
Split Data – Transform, Manage, and Prepare Data
FIGURE 5.21 Splitting the data source—Projection tab
FIGURE 5.22 Splitting the data sink—Optimize tab
In Exercise 5.6 you created a data flow that contains a source to import a large CSV file from ADLS....
Jupyter Notebooks – Transform, Manage, and Prepare Data
Throughout the exercises in this book, you have created numerous notebooks. The notebooks are web‐based and consist of a series of ordered cells that can contain code. The code within these cells is what...
Cleanse Data – Transform, Manage, and Prepare Data
%%pyspark
df = spark.read \
    .load('abfss://*@*.dfs.core.windows.net/SessionCSV/BRAINWAVES_WITH_NULLS.csv', format='csv', header=True)
After cleansing the data, a final step might be to save it to a temporary table, using the saveAsTable(tableName) method, or to write it out in the Parquet file format....
Shred JSON – Transform, Manage, and Prepare Data
When you shred something, the object being shredded is torn into small pieces. In many respects, it means that the pieces that result from being torn are in the smallest possible size. In this...
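The idea of tearing a nested document into small pieces can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the document, its field names (session, readings, electrode, alpha), and the shred helper are all hypothetical and do not come from the original article.

```python
import json

# Hypothetical nested document; the field names are illustrative only.
raw = """
{
  "session": "S1",
  "readings": [
    {"electrode": "AF3", "alpha": 1.5},
    {"electrode": "T7", "alpha": 2.25}
  ]
}
"""


def shred(document):
    """Flatten one nested document into a list of flat rows."""
    rows = []
    for reading in document["readings"]:
        # Repeat the parent value on every row so each row stands alone.
        rows.append({
            "session": document["session"],
            "electrode": reading["electrode"],
            "alpha": reading["alpha"],
        })
    return rows


rows = shred(json.loads(raw))
for row in rows:
    print(row)
```

Each element of the nested array becomes its own flat row, which is the shape a relational table or a columnar file format expects.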