Shred JSON – Transform, Manage, and Prepare Data

When you shred something, you tear it into small pieces, ideally the smallest pieces possible. In this scenario, the object being shredded is a JSON document or file, and shredding it fulfills two primary objectives. The first objective is the flattening of arrays, which are very common JSON constructs: square brackets ([]) hold array elements separated by commas, and curly braces ({}) hold name‐value pairs. When you look through the brain waves JSON files, you will see a structure similar to the following. Also refer to Figure 2.7 to recall how this looks in Azure Cosmos DB.

{
  "Session": {
    "Scenario": "ClassicalMusic",
    "POWReading": [
      {
        "AF3": [ { "THETA": 44.254, "ALPHA": 5.479, "BETA_L": 1.911, "BETA_H": 1.688, "GAMMA": 0.259 } ],
        "T7": [ { "THETA": 1.664, "ALPHA": 1.763, "BETA_L": 3.806, "BETA_H": 1.829, "GAMMA": 0.867 } ],
        …
      }
    ]
  }
}

If you want to shred that JSON file, you would do so with the objective of capturing the scenario, the electrode, the frequency, and the value. You might remember the term “exploding arrays” from Chapter 2; it has the same meaning and purpose as shredding and is simply a different term. The second objective comes into play once the data you require is captured, i.e., pulled from the file into memory. Once the data is in memory, you can arrange it into a format that can be easily queried using SQL‐like syntax or DataFrame logic. In this state, you can run INSERT, UPDATE, UPSERT, or DELETE statements against the data. Additionally, this gives you the facility to store the extracted data in a traditional relational database structure, if desired. The following sections offer specific information about shredding JSON files as it pertains to different Azure products.
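To make the two objectives concrete, here is a minimal sketch in plain Python, not tied to any Azure product. It assumes a document shaped like the sample above with the elided electrodes left out. The function name shred, the table name reading, and its column names are hypothetical and chosen only for illustration. The first part flattens the nested arrays into one row per scenario, electrode, frequency, and value; the second part loads those rows into an in-memory SQLite table so they can be queried with SQL.

```python
import json
import sqlite3

# Sample document modeled on the excerpt above; the elided electrodes are omitted.
SAMPLE = """
{ "Session": { "Scenario": "ClassicalMusic", "POWReading": [
  { "AF3": [ {"THETA": 44.254, "ALPHA": 5.479, "BETA_L": 1.911, "BETA_H": 1.688, "GAMMA": 0.259} ],
    "T7":  [ {"THETA": 1.664, "ALPHA": 1.763, "BETA_L": 3.806, "BETA_H": 1.829, "GAMMA": 0.867} ] }
] } }
"""


def shred(document):
    """Flatten a session document into (scenario, electrode, frequency, value) rows."""
    session = document["Session"]
    scenario = session["Scenario"]
    rows = []
    for reading in session["POWReading"]:            # outer array of readings
        for electrode, samples in reading.items():   # for example "AF3" or "T7"
            for sample in samples:                   # inner array per electrode
                for frequency, value in sample.items():
                    rows.append((scenario, electrode, frequency, value))
    return rows


# Objective 1: explode the arrays into flat rows held in memory.
rows = shred(json.loads(SAMPLE))

# Objective 2: arrange the rows so they can be queried with SQL.
# The table and column names here are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reading (scenario TEXT, electrode TEXT, frequency TEXT, value REAL)"
)
conn.executemany("INSERT INTO reading VALUES (?, ?, ?, ?)", rows)

# Example query: the strongest frequency value recorded per electrode.
result = conn.execute(
    "SELECT electrode, MAX(value) FROM reading GROUP BY electrode ORDER BY electrode"
).fetchall()
print(result)
```

Once the rows are in a table like this, ordinary INSERT, UPDATE, and DELETE statements work against them, which is the point of shredding the document first.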

Azure Cosmos DB/SQL Pool

Azure Cosmos DB was formerly named Azure DocumentDB, at least the SQL API part of it. So, from its name alone, you would expect it to be a good place to store documents like JSON files. In addition, Azure Cosmos DB can be scaled globally in a matter of minutes, with data synchronization between the instances happening, by default, behind the scenes. You can find more about Azure Cosmos DB in Chapter 1, “Gaining the Azure Data Engineer Associate Certification.” Complete Exercise 5.7, where you will configure an Azure Cosmos DB linked service and shred some JSON.

Raymond Gallardo
