Developer Blog

Tips and tricks for developers and IT enthusiasts

Azure | Working with Widgets

TL;DR

Don’t want to read the post? Then explore this Azure Notebook.

Requirements

Define the required modules and functions

from datetime import datetime

import pyspark.sql.functions as F

Create DataFrame for this post:

df = spark.sql("select * from diamonds")
df.show()

Working with Widgets

Default Widgets

dbutils.widgets.removeAll()

dbutils.widgets.text("W1", "1", "Text")
dbutils.widgets.combobox("W2", "3", [str(x) for x in range(1, 10)], "Combobox")
dbutils.widgets.dropdown("W3", "4", [str(x) for x in range(1, 10)], "Dropdown")

Multiselect Widgets

# Avoid the name "list", which would shadow the Python built-in.
choices = [f"Square of {x} is {x*x}" for x in range(1, 10)]
dbutils.widgets.multiselect("W4", choices[0], choices, "Multi-Select")

Monitor the changes when selecting values

print("Selection: ", dbutils.widgets.get("W4"))
# The format string is cut off in the source; "%H:%M:%S" is an assumption.
print("Current Time =", datetime.now().strftime("%H:%M:%S"))
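A multiselect widget returns all selected entries as a single comma-separated string. The following is a minimal sketch of how such a string could be split back into a list. The helper name parse_multiselect and the sample value are assumptions for illustration, and the approach only works as long as the choices themselves contain no commas.

```python
def parse_multiselect(raw: str) -> list:
    """Split the comma-separated value of a multiselect widget into a list."""
    # An empty selection yields an empty string, which maps to an empty list.
    if not raw:
        return []
    return [item.strip() for item in raw.split(",")]


# Hypothetical value, standing in for dbutils.widgets.get("W4")
# after two entries have been selected.
raw_value = "Square of 1 is 1,Square of 3 is 9"
selected = parse_multiselect(raw_value)
print(selected)
```

In a notebook, raw_value would come from dbutils.widgets.get("W4") instead of the hard-coded string.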



Filter Query by widgets

Prepare widgets

dbutils.widgets.removeAll()

df = spark.sql("select * from diamonds")

# Apply distinct() before orderBy(), so the sort order is kept in the result.
vals = [str(x[0]) for x in df.select("cut").distinct().orderBy("cut").collect()]
dbutils.widgets.dropdown("Cuts", vals[0], vals)

vals = [str(x[0]) for x in df.select("carat").distinct().orderBy("carat").collect()]
dbutils.widgets.dropdown("Carat", vals[0], vals)

Now, change some values

filter_cut = dbutils.widgets.get("Cuts")
# show() returns None, so keep the DataFrame and display it in a second step.
df = spark.sql(f"select * from diamonds where cut='{filter_cut}'")
df.show()
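Because the widget value is interpolated directly into the SQL text, it can help to check it against the known values first. This is a minimal sketch, assuming the allowed values are the same list that was used to fill the dropdown. The helper build_cut_query and the sample list are hypothetical, and no Spark session is needed to try it.

```python
def build_cut_query(filter_cut, allowed_cuts):
    """Return the SQL text for one cut, rejecting values that were never offered."""
    if filter_cut not in allowed_cuts:
        raise ValueError(f"Unknown cut: {filter_cut!r}")
    return f"select * from diamonds where cut = '{filter_cut}'"


# Hypothetical list of values, standing in for the distinct cuts read from the table.
allowed = ["Fair", "Good", "Ideal", "Premium", "Very Good"]
query = build_cut_query("Ideal", allowed)
print(query)
```

In the notebook, the resulting text could then be passed to spark.sql(query).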

Power Query | Cookbook

Working with the header

Changing the case

Upper case / lower case / proper case

= Table.TransformColumnNames(RenameColumns, Text.Upper)
= Table.TransformColumnNames(RenameColumns, Text.Lower)
= Table.TransformColumnNames(RenameColumns, Text.Proper)

Removing specific characters (e.g. _)

= Table.TransformColumnNames(Source,each Text.Proper(Replacer.ReplaceText( _ , "_", " ")))

Splitting into words

= Table.TransformColumnNames(Source, each Text.Combine(
                    Splitter.SplitTextByCharacterTransition({"a".."z"},{"A".."Z"})(_), " "))

As a function

(columnNames as text) =>
let 
    splitColumn = Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"})(columnNames)
in
    Text.Combine(splitColumn, " ")
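To make the splitting rule easier to follow, here is the same idea sketched in Python rather than M: a space is inserted wherever a lower-case letter is directly followed by an upper-case letter. The function name split_column_name and the sample column names are assumptions for illustration.

```python
import re


def split_column_name(column_name):
    """Insert a space at every transition from a-z to A-Z."""
    # Zero-width match between a lower-case and an upper-case ASCII letter.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", column_name)


first = split_column_name("OrderDate")
second = split_column_name("customerIDNumber")
print(first)
print(second)
```

Consecutive capitals stay together, because only the transition from lower case to upper case counts as a split position.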

Transforming data

Pivoting rows by group

Task

When data is delivered with the grouping attribute stored in the rows, so that each record spans several rows, a more compact layout is usually preferable.

For the record with the value “Daten 1”, four rows with different values in GRUPPE and WERT are delivered.

Problem

What is wanted, however, is a more compact layout with the existing groups as columns:

The task is therefore to convert the delivered data:

A sample file is available here. The final result is available here. Save both files in the folder C:\TMP so that the reference in Query.xlsx to the data in Daten.xlsx is correct.

Step 1: Prepare the data

In the first step, we create a new Excel file and access the prepared data via Power Query.

To do this, go to the Data tab, choose Get Data / From File / From Workbook and select the desired file:

A sample file is available here.

Clicking Import takes you to the Navigator.

The Navigator shows three different elements:

  • DATEN: the Excel table (formatted table) in the worksheet. It contains exactly the desired data
  • ERGEBNIS: the Excel table that contains the expected result
  • Beispieldaten: the worksheet with the two Excel tables

Select the element DATEN and click Transform Data.

Step 2: Pivot the column

We want the values of the column GRUPPE to become new columns.

Click the column GRUPPE and choose Pivot Column on the Transform tab under Any Column:

The values for the new columns (Gruppe 1, Gruppe 2, ..) come from the column WERT (Wert 11, Wert 12, ..):

We want to take over the values themselves and not use an aggregation function (Sum, Max, Count, ..), as is usually done with pivot tables.

To do so, click Advanced options and select Don't Aggregate:

Then click OK:

Finally, we close the Power Query Editor:
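The effect of pivoting without an aggregation can be sketched in plain Python. This is only an illustration of the idea, not what Power Query runs internally. The function name, the key column name NAME and the sample rows are assumptions; only GRUPPE and WERT come from the example above.

```python
def pivot_without_aggregation(rows, key, group, value):
    """Turn one row per (key, group) into one row per key with a column per group."""
    result = {}
    for row in rows:
        # One output record per key value, created the first time the key is seen.
        record = result.setdefault(row[key], {key: row[key]})
        if row[group] in record:
            # Without an aggregation there is no way to merge two values.
            raise ValueError(f"Duplicate group {row[group]!r} for {row[key]!r}")
        record[row[group]] = row[value]
    return list(result.values())


# Hypothetical sample rows; the key column name NAME is an assumption.
rows = [
    {"NAME": "Daten 1", "GRUPPE": "Gruppe 1", "WERT": "Wert 11"},
    {"NAME": "Daten 1", "GRUPPE": "Gruppe 2", "WERT": "Wert 12"},
    {"NAME": "Daten 2", "GRUPPE": "Gruppe 1", "WERT": "Wert 21"},
    {"NAME": "Daten 2", "GRUPPE": "Gruppe 2", "WERT": "Wert 22"},
]
pivoted = pivot_without_aggregation(rows, "NAME", "GRUPPE", "WERT")
for record in pivoted:
    print(record)
```

The sketch raises an error when a record has two values for the same group, since there is no rule for combining them once aggregation is switched off.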

Power BI | Importing multiple files

Getting Started

To import multiple files from a folder, two steps are required:

  • create a list of all files in the folder
  • for each file: read the file and add it to the result table
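The two steps can be sketched in Python to show the general pattern. This is a simplified illustration under stated assumptions, not what Power BI does internally: it only handles CSV files, the function names are hypothetical, and the column name Source.Name is borrowed from the column Power Query adds when combining files.

```python
import csv
import tempfile
from pathlib import Path


def list_files(folder):
    """Step 1: create a sorted list of all files in the folder."""
    return sorted(p for p in Path(folder).iterdir() if p.is_file())


def combine_files(files):
    """Step 2: read each file and append its rows to one result table."""
    result = []
    for path in files:
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Keep the file name so the origin of each row stays visible.
                result.append({"Source.Name": path.name, **row})
    return result


# Build a throwaway folder with two small CSV files to try it out.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "a.csv").write_text("id,value\n1,x\n", encoding="utf-8")
    Path(folder, "b.csv").write_text("id,value\n2,y\n", encoding="utf-8")
    table = combine_files(list_files(folder))

print(table)
```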

When importing files with Power BI, you can do both tasks together or each task separately.

You decide which way to go after selecting the folder:

You can choose between four possibilities. Strictly speaking, there are only two, both leading to the same two final steps.

  1. Load or Combine files
    • Load means that the list of files is loaded as a table.
      Technically, two things are done:
      • a connection is created in the model
      • the data (list of files) is loaded into the model
  2. Just Load or Transform data
    • Transform means that you end up in the Power Query Editor, where you can add further modifications

To better understand the process, we show the two steps separately, one after the other.

Load the list of files from folder

Start Power BI and close the start screen, if it is still visible.

Then, click on the Get Data Button in the Home Ribbon

If you click on the small down arrow on the Get Data Button, you have to select the option More

Now, select Folder and click on Connect

Enter the folder (or Browse…) with the files to be loaded and click OK

After this, Power Query will create a table with all files in the folder.

Now, here is the point to decide, which way to go:

  • Combine
    • Read the list of files and combine all files into one table
  • Load
    • Just keep the list of files and return to Power BI
  • Transform
    • Keep the list of files and open the Power Query Editor

We choose Load to keep the list of files, because we will perform each step separately later.

In Power BI Desktop, click on the Data Icon to show the resulting table.

Combine all files into one table

To add additional steps, we need the Power Query Editor.

So click on the 3 dots at the right side of the Query name Samples and choose Edit Query

Now, you are in the Power Query Editor

To combine all files, just click on the small icon beneath the header of the content column:

In the following dialog, you will see all files and a preview of the content of each file. For Excel files, you will see the sheet names and the names of the Excel tables (formatted tables) in the sheets.

Click on OK to start the import.

When Power Query is done with this step, you will see the result:

The previous query Samples is still there, but now with the content of all files.

Additionally, you will see four other elements:

How combining the files is done

Each query consists of a list of steps, which are processed one after another. Normally, each step takes the result (data) of the previous step, performs some modifications and produces a result (data) for the next step.

So each step modifies the whole data of the previous step. A modification is either

  • one single operation, e.g. adding an additional column

or

  • an operation on each row in the data
    This means we need some sort of loop, like “do xyz for each row in the data”

Let's see how Power Query solves this task.

In the query Samples, examine the step Invoke Custom Function1

The step calls the M function Table.AddColumn

This function takes three parameters:

  • table: normally the name of the previous step
    In our example #"Filtered Hidden Files1"
  • newColumnName: the name of the column to be added
    "Transform File"
  • columnGenerator: a function which is called for each row in the input table and creates the new column content
    each #"Transform File"([Content])

This results in the following procedure:

  • for each row of the list of files (output of the step #"Filtered Hidden Files1")
  • get the content of the column Content (this will be the parameter for the function call)
  • call the function #"Transform File"([Content]) with this one parameter (the value of the column [Content]) to create the content of the new column
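The procedure above can be sketched in Python to show the "function per row" idea behind Table.AddColumn. This is an illustration only: add_column and transform_file are hypothetical stand-ins, and the sample rows hold plain text instead of real binary file content.

```python
def add_column(table, new_column_name, column_generator):
    """Return a new table in which every row has one additional column."""
    # The generator is called once per row, like the "each ..." function in M.
    return [
        {**row, new_column_name: column_generator(row)}
        for row in table
    ]


def transform_file(content):
    # Hypothetical stand-in for the generated function: parse comma-separated text.
    return [line.split(",") for line in content.splitlines()]


# Hypothetical list of files with their content.
files = [
    {"Name": "a.csv", "Content": "1,x\n2,y"},
    {"Name": "b.csv", "Content": "3,z"},
]
result = add_column(files, "Transform File", lambda row: transform_file(row["Content"]))
for row in result:
    print(row["Name"], row["Transform File"])
```

The input table is left unchanged and a new table is returned, which matches the way each query step builds on the result of the previous one.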

Helper Queries (Required)

This is the required function to create the column content for each file

Helper queries (Optional)

For the resulting query Samples to work, only the function definition is required.

But Power Query adds some additional elements to test the function and show the result

Create a parameter used in the query Transform Sample File and define the current value Sample File

Define a value for the parameter. Here, the first row in the list of files is used.

Create a query and use an Excel workbook as input. The name of the Excel file is specified as a parameter

In this query, the previously created parameter Parameter1 is used as the parameter (too much of the word parameter, I know :))

Importing multiple files with different formats

If the selected folder contains files with different formats, the result is not what you might expect:

The list of files contains all files, both csv files and xls files

When combining the files, you can switch between the files. So first take a look at a csv file:

The csv file looks as expected:

But the xls file looks strange:

Let's try anyway. Click OK to combine all files.

Looking at the resulting query, the data from the xls files still looks strange:

To understand this, take a look at the created transform function:

The crucial instruction is line 2:

Source = Csv.Document(Parameter3,[Delimiter=",", Columns=10, Encoding=1252, QuoteStyle=QuoteStyle.None]),

The source document (each file in the list of files) is interpreted as csv file.

So, the xls files are also read in as csv files. This leads to the strange result.

You can fix this by adding an additional filter step in the query to select only csv files:
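The idea behind such a filter step can be sketched in Python. This is a minimal illustration, not the M code of the query: keep_csv_files and the sample rows are hypothetical, and the sketch assumes the list of files has an Extension column that includes the leading dot.

```python
def keep_csv_files(files):
    """Keep only rows whose Extension is .csv, ignoring upper and lower case."""
    return [row for row in files if row["Extension"].lower() == ".csv"]


# Hypothetical list of files, standing in for the folder listing.
files = [
    {"Name": "sales.csv", "Extension": ".csv"},
    {"Name": "sales.xls", "Extension": ".xls"},
    {"Name": "STOCK.CSV", "Extension": ".CSV"},
]
csv_only = keep_csv_files(files)
print([row["Name"] for row in csv_only])
```

Comparing the lower-cased extension avoids losing files whose extension is written in capitals.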