[New Hack] 072-DataScienceInFabric #883

Open
wants to merge 243 commits into
base: master
243 commits
d2169c7
Created WhatTheHack template stub
Mar 27, 2024
35f30d5
Update README.md
juanlldc Mar 27, 2024
af7b883
Merge pull request #1 from juanlldc/patch-3
pradeepsingla87 Mar 27, 2024
77f280b
Update Solution-00.md
juanlldc Apr 22, 2024
bac5e5c
Update README.md
juanlldc Apr 22, 2024
b6c3523
Update README.md
juanlldc Apr 22, 2024
50dcb9d
Merge pull request #4 from juanlldc/patch-4
pradeepsingla87 Apr 25, 2024
7c6ffa4
Merge pull request #3 from juanlldc/patch-5
pradeepsingla87 Apr 25, 2024
afad360
Merge pull request #2 from juanlldc/patch-6
pradeepsingla87 Apr 25, 2024
f653bfc
Merge branch 'microsoft:master' into xxx-DataScience_In_MicrosoftFabric
lesantana May 14, 2024
da8908d
Creating directory xxx-DataScience_In_MicrosoftFabric/Coach/Solutions…
pradeepsingla87 Jun 4, 2024
29badf4
Committing 7 items from workspace 9aafe34c-0112-488c-ac8a-787e353a55e1
pradeepsingla87 Jun 4, 2024
e836f40
Committing 6 items from workspace 9aafe34c-0112-488c-ac8a-787e353a55e1
pradeepsingla87 Jun 4, 2024
116eb0e
Committing 4 items from workspace 9aafe34c-0112-488c-ac8a-787e353a55e1
pradeepsingla87 Jun 4, 2024
0effa71
Merge branch 'microsoft:master' into xxx-DataScience_In_MicrosoftFabric
pradeepsingla87 Jun 4, 2024
ed66544
Add files via upload
pradeepsingla87 Jun 5, 2024
2d12f06
Rename 03-Train-Register-HeartFailurePredictionModel (1).ipynb to 03…
pradeepsingla87 Jun 5, 2024
6262729
Rename 02-data-analysis-preprocess (4).ipynb to 02-data-analysis-prep…
pradeepsingla87 Jun 5, 2024
536c723
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
c79783a
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
09ad5c0
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
5e4bae3
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
8b72184
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
c9a69d4
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
7f371f9
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
4aac647
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
03db381
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
cc2fb00
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
96dac3f
Update Challenge-00.md
pradeepsingla87 Jun 22, 2024
513c465
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
cea02ae
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
4cdc187
Update Challenge-04.md
pradeepsingla87 Jun 22, 2024
00fb96b
Update Solution-04.md
pradeepsingla87 Jun 25, 2024
0216124
Update Solution-04.md
pradeepsingla87 Jun 25, 2024
fe9f6df
Add files via upload
pradeepsingla87 Jun 25, 2024
5fc2420
Update Solution-04.md
pradeepsingla87 Jun 25, 2024
f6117f9
Update Solution-04.md
pradeepsingla87 Jun 25, 2024
13a1091
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
9da8e40
Add files via upload
pradeepsingla87 Jun 26, 2024
c2b88f2
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
5f4af6e
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
9bfa0de
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
5054975
Add files via upload
pradeepsingla87 Jun 26, 2024
c76930f
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
5532e35
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
392ae8f
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
3561ce7
Add files via upload
pradeepsingla87 Jun 26, 2024
f451759
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
437f8f3
Add files via upload
pradeepsingla87 Jun 26, 2024
ef74d75
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
777e7a4
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
685bcdc
Add files via upload
pradeepsingla87 Jun 26, 2024
adec07a
Delete xxx-DataScience_In_MicrosoftFabric/Screenshot_26-6-2024_03016_…
pradeepsingla87 Jun 26, 2024
68961c0
Add files via upload
pradeepsingla87 Jun 26, 2024
cc258e7
Update Solution-04.md
pradeepsingla87 Jun 26, 2024
577feef
Update README.md
juanlldc Jun 26, 2024
5fb84b4
Add files via upload
juanlldc Jun 26, 2024
58193cd
Rename 01-Ingest-Heart-Failure-Dataset-to-Lakehouse (1).ipynb to 01-I…
juanlldc Jun 26, 2024
41e1066
Rename 03-Train-Register-HeartFailurePredictionModel (1).ipynb to 03-…
juanlldc Jun 26, 2024
9372227
Add files via upload
juanlldc Jun 26, 2024
a08f76a
Create a
juanlldc Jun 27, 2024
93c5ee1
Rename xxx-DataScience_In_MicrosoftFabric/Coach/Screenshot_26-6-2024_…
juanlldc Jun 27, 2024
eaf1864
Rename xxx-DataScience_In_MicrosoftFabric/Coach/Screenshot_26-6-2024_…
juanlldc Jun 27, 2024
2833080
Rename xxx-DataScience_In_MicrosoftFabric/Coach/image-10.png to xxx-D…
juanlldc Jun 27, 2024
e9275fa
Rename xxx-DataScience_In_MicrosoftFabric/Coach/image-11.png to xxx-D…
juanlldc Jun 27, 2024
d4d1a64
Rename xxx-DataScience_In_MicrosoftFabric/Coach/image-12.png to xxx-D…
juanlldc Jun 27, 2024
79d8e8c
Rename xxx-DataScience_In_MicrosoftFabric/Coach/image-13.png to xxx-D…
juanlldc Jun 27, 2024
52bacc9
Rename xxx-DataScience_In_MicrosoftFabric/Coach/image-14.png to xxx-D…
juanlldc Jun 27, 2024
6d5a380
Rename xxx-DataScience_In_MicrosoftFabric/Coach/image-15.png to xxx-D…
juanlldc Jun 27, 2024
0ff5d32
Update Solution-04.md
juanlldc Jun 27, 2024
c00676f
Update Solution-01.md
juanlldc Jun 28, 2024
69dece5
Update Solution-01.md
juanlldc Jun 28, 2024
407466a
Update Solution-00.md
juanlldc Jun 28, 2024
16250a7
Update Solution-00.md
juanlldc Jun 28, 2024
a2f7bf5
Update Solution-00.md
juanlldc Jul 1, 2024
0d8e902
Update Challenge-00.md
juanlldc Jul 1, 2024
e24bf96
Update Challenge-00.md
juanlldc Jul 1, 2024
f47eb33
Update Solution-00.md
juanlldc Jul 1, 2024
3f76345
Update Solution-01.md
juanlldc Jul 1, 2024
e3f05ac
Update Solution-01.md
juanlldc Jul 1, 2024
ec32268
Update Challenge-01.md
juanlldc Jul 1, 2024
980a458
Update Challenge-00.md
juanlldc Jul 1, 2024
a29adc6
Update Challenge-00.md
juanlldc Jul 1, 2024
57e8a52
Update Challenge-05.md
juanlldc Jul 1, 2024
6dbb9c4
Update Challenge-06.md
juanlldc Jul 1, 2024
2181d27
Update Challenge-06.md
juanlldc Jul 1, 2024
c330e20
Update Challenge-06.md
juanlldc Jul 1, 2024
feac683
Update Solution-06.md
juanlldc Jul 1, 2024
b7cbab7
Update Solution-06.md
juanlldc Jul 1, 2024
435d5b7
Update Solution-00.md
juanlldc Jul 1, 2024
811ec3b
Update Challenge-00.md
juanlldc Jul 1, 2024
150a49c
Update Solution-00.md
juanlldc Jul 1, 2024
f119ec9
Update Solution-00.md
juanlldc Jul 1, 2024
d2a33f1
Update Challenge-00.md
juanlldc Jul 1, 2024
e9ccded
Update Solution-00.md
juanlldc Jul 1, 2024
9a983fe
Update Solution-01.md
juanlldc Jul 1, 2024
e00bebe
Update Challenge-01.md
juanlldc Jul 1, 2024
098eaa7
Update Challenge-01.md
juanlldc Jul 1, 2024
dc9dc81
Update Challenge-01.md
juanlldc Jul 1, 2024
6919854
Update Challenge-01.md
juanlldc Jul 1, 2024
3efb9f8
Update Challenge-01.md
juanlldc Jul 1, 2024
1c6037e
Update Challenge-01.md
juanlldc Jul 1, 2024
bb0fec8
Update Challenge-01.md
juanlldc Jul 1, 2024
5dc1619
Update Challenge-01.md
juanlldc Jul 1, 2024
f43bd7e
Update Solution-03.md
juanlldc Jul 1, 2024
24cc825
Update Solution-03.md
juanlldc Jul 1, 2024
f253ce1
Update Solution-03.md
juanlldc Jul 1, 2024
67f5061
Update Solution-03.md
juanlldc Jul 1, 2024
1e1b2a8
Update Solution-03.md
juanlldc Jul 1, 2024
a774ecc
Update Solution-01.md
juanlldc Jul 1, 2024
a2ac122
Update Solution-01.md
juanlldc Jul 1, 2024
5824e15
Update Solution-03.md
juanlldc Jul 1, 2024
59d1961
Update Solution-01.md
juanlldc Jul 1, 2024
57e30bf
Update Challenge-03.md
juanlldc Jul 1, 2024
8972c8d
Update Solution-04.md
juanlldc Jul 1, 2024
af168a6
Update Challenge-04.md
juanlldc Jul 1, 2024
fccca04
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
a17d310
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
606f5a0
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
2c188f2
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
28674a4
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
8bae485
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
cdad534
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
503ceb5
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
dfb0615
Update Challenge-05.md
pradeepsingla87 Jul 8, 2024
ba06ede
Update Solution-05.md
pradeepsingla87 Jul 8, 2024
04bc87f
Update Solution-05.md
pradeepsingla87 Jul 8, 2024
a5a9cd6
Update Solution-05.md
pradeepsingla87 Jul 8, 2024
d4d5e1a
Update Solution-05.md
pradeepsingla87 Jul 8, 2024
bfb9f16
Update Solution-05.md
pradeepsingla87 Jul 15, 2024
eaa5a01
Update Solution-05.md
pradeepsingla87 Jul 15, 2024
95cf645
Update Solution-05.md
pradeepsingla87 Jul 15, 2024
372bc41
Update Challenge-02.md
lesantana Jul 15, 2024
961a1e9
Update Solution-01.md
juanlldc Jul 15, 2024
a62b733
Update Solution-01.md
juanlldc Jul 15, 2024
1319473
Update Challenge-02.md
lesantana Jul 15, 2024
5003f42
Update Solution-01.md
juanlldc Jul 15, 2024
887aaa6
Update Challenge-02.md
lesantana Jul 15, 2024
d8bbf59
Update Challenge-02.md
lesantana Jul 15, 2024
af0497a
Update Challenge-02.md
lesantana Jul 15, 2024
a65ab21
Update Solution-03.md
juanlldc Jul 15, 2024
487bd9c
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Solutions/Notebooks/0…
juanlldc Jul 15, 2024
2d1a545
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Solutions/Notebooks/0…
juanlldc Jul 15, 2024
bcbf8b6
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Solutions/Notebooks/0…
juanlldc Jul 15, 2024
940a20f
Add files via upload
juanlldc Jul 15, 2024
bcfe236
Update Solution-02.md
lesantana Jul 16, 2024
485645b
Update Solution-02.md
lesantana Jul 16, 2024
f3e3fc1
Update Solution-02.md
lesantana Jul 16, 2024
d162d42
Update Solution-05.md
juanlldc Jul 17, 2024
ad2677c
Update Challenge-05.md
juanlldc Jul 17, 2024
f613d1a
Update Challenge-05.md
juanlldc Jul 17, 2024
a31e339
Update Solution-06.md
juanlldc Jul 17, 2024
a950f42
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Photos/a
juanlldc Jul 17, 2024
03cf467
Add files via upload
juanlldc Jul 17, 2024
41d6c76
Rename Screenshot 2024-07-17 153448.png to postman-body.png
juanlldc Jul 17, 2024
b82209e
Rename Screenshot 2024-07-17 153526.png to postman-token.png
juanlldc Jul 17, 2024
e1ac6aa
Rename Screenshot 2024-07-17 153501.png to postman-header.png
juanlldc Jul 17, 2024
95ba384
Update Solution-06.md
juanlldc Jul 17, 2024
5abc2da
Update Challenge-06.md
juanlldc Jul 17, 2024
dcdd087
Update Challenge-06.md
juanlldc Jul 17, 2024
27d1138
Update Challenge-06.md
juanlldc Jul 17, 2024
9bba28a
Update Challenge-06.md
juanlldc Jul 17, 2024
2ade4bc
Add files via upload
juanlldc Jul 18, 2024
13433f7
Create a
juanlldc Jul 18, 2024
3089bcb
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Notebooks/a
juanlldc Jul 18, 2024
cbdddd7
Create a
juanlldc Jul 18, 2024
6488513
Add files via upload
juanlldc Jul 18, 2024
01eddb3
Add files via upload
juanlldc Jul 18, 2024
bded490
Delete xxx-DataScience_In_MicrosoftFabric/Student/Resources/01-Ingest…
juanlldc Jul 18, 2024
fc8155e
Delete xxx-DataScience_In_MicrosoftFabric/Student/Resources/03-Train-…
juanlldc Jul 18, 2024
fba50cb
Delete xxx-DataScience_In_MicrosoftFabric/Student/Resources/04-Perfor…
juanlldc Jul 18, 2024
cb3d487
Delete xxx-DataScience_In_MicrosoftFabric/Coach/CoachResources.zip
juanlldc Jul 18, 2024
e1df1e7
Add files via upload
juanlldc Jul 18, 2024
796ca9e
Add files via upload
juanlldc Jul 18, 2024
a77529a
Create a
juanlldc Jul 18, 2024
4c242c3
Add files via upload
juanlldc Jul 18, 2024
173bb75
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Notebooks/a
juanlldc Jul 18, 2024
1758bd4
Delete xxx-DataScience_In_MicrosoftFabric/Student/Notebooks/a
juanlldc Jul 18, 2024
e785e1b
Update README.md
juanlldc Jul 18, 2024
fda64a5
Update README.md
juanlldc Jul 18, 2024
da1c24f
Update README.md
juanlldc Jul 18, 2024
4df385e
Update README.md
juanlldc Jul 18, 2024
02d8c5e
Add files via upload
juanlldc Jul 18, 2024
4666429
Delete xxx-DataScience_In_MicrosoftFabric/Student/Resources/heart_fai…
juanlldc Jul 18, 2024
8538bc1
Delete xxx-DataScience_In_MicrosoftFabric/Student/StudentResources.zip
juanlldc Jul 18, 2024
b93a780
Add files via upload
juanlldc Jul 18, 2024
5855603
Delete xxx-DataScience_In_MicrosoftFabric/Coach/CoachResources.zip
juanlldc Jul 18, 2024
16f0fe7
Add files via upload
juanlldc Jul 18, 2024
5efecf7
Delete xxx-DataScience_In_MicrosoftFabric/Student/StudentResources.zip
juanlldc Jul 18, 2024
19a4732
Add files via upload
juanlldc Jul 18, 2024
c78a781
Update README.md
juanlldc Jul 18, 2024
4cc27ce
Update README.md
juanlldc Jul 18, 2024
976c607
Delete xxx-DataScience_In_MicrosoftFabric/Student/Challenge-07.md
juanlldc Jul 18, 2024
57c270e
Delete xxx-DataScience_In_MicrosoftFabric/Student/Challenge-08.md
juanlldc Jul 18, 2024
484c1be
Update Challenge-00.md
juanlldc Jul 18, 2024
ef0306b
Update Challenge-01.md
juanlldc Jul 18, 2024
dbe431b
Update Challenge-02.md
juanlldc Jul 18, 2024
f5cb3f1
Update Challenge-02.md
juanlldc Jul 18, 2024
6f36d1f
Update Challenge-02.md
juanlldc Jul 18, 2024
e597e3c
Update Challenge-02.md
juanlldc Jul 18, 2024
f666f9d
Update Challenge-02.md
juanlldc Jul 18, 2024
f6d3ea2
Update Challenge-02.md
juanlldc Jul 18, 2024
d4f010e
Update Challenge-05.md
juanlldc Jul 18, 2024
b152fbf
Update Challenge-05.md
juanlldc Jul 18, 2024
1682ec0
Update Challenge-05.md
juanlldc Jul 18, 2024
fac3170
Update Challenge-05.md
juanlldc Jul 18, 2024
b8c130e
Update Challenge-05.md
juanlldc Jul 18, 2024
7def066
Update Challenge-05.md
juanlldc Jul 18, 2024
6e4850e
Update Challenge-05.md
juanlldc Jul 18, 2024
00d75bd
Update Challenge-05.md
juanlldc Jul 18, 2024
8627b18
Update Challenge-05.md
juanlldc Jul 18, 2024
0ae5e23
Update Challenge-05.md
juanlldc Jul 18, 2024
30f4636
Update Challenge-05.md
juanlldc Jul 18, 2024
3226843
Update Challenge-06.md
juanlldc Jul 18, 2024
5549de7
Update Challenge-06.md
juanlldc Jul 18, 2024
0f8ad39
Update Challenge-06.md
juanlldc Jul 18, 2024
93ba24f
Update Challenge-06.md
juanlldc Jul 18, 2024
c3d9db4
Update Challenge-06.md
juanlldc Jul 22, 2024
b700dac
Update Challenge-06.md
juanlldc Jul 22, 2024
880d72a
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Lectures.pptx
juanlldc Jul 23, 2024
6375c91
Update README.md
juanlldc Jul 23, 2024
76164cf
Update README.md
juanlldc Jul 23, 2024
5b87f74
Update README.md
juanlldc Jul 23, 2024
d718187
Update README.md
juanlldc Jul 23, 2024
b62907e
Update Solution-05.md
juanlldc Jul 23, 2024
4121415
Update Solution-05.md
juanlldc Jul 23, 2024
43c0db2
Update Solution-05.md
juanlldc Jul 23, 2024
ac5a036
Update Solution-06.md
juanlldc Jul 23, 2024
a4d841f
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Solution-07.md
juanlldc Jul 23, 2024
52db5cd
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Solution-08.md
juanlldc Jul 23, 2024
375d649
Delete xxx-DataScience_In_MicrosoftFabric/Coach/Solutions directory
juanlldc Jul 23, 2024
166358c
Renaming top level folder from xxx to 072
juanlldc Jul 26, 2024
2a5f543
Merge branch 'microsoft:master' into xxx-DataScience_In_MicrosoftFabric
juanlldc Jul 26, 2024
4a551ef
Moving student resources
juanlldc Jul 29, 2024
db24fe1
Merge branch 'xxx-DataScience_In_MicrosoftFabric' of https://github.c…
juanlldc Jul 29, 2024
6abe351
Moved coach solutions
juanlldc Jul 29, 2024
b287136
Update student setup instructions
juanlldc Jul 29, 2024
9b44fb8
Updating coach setup instructions
juanlldc Jul 29, 2024
a72e529
Moved Images folder to root, updated references
juanlldc Jul 29, 2024
41593f0
Update licensing requirements across both guides
juanlldc Jul 31, 2024
2bcfd0b
Add wordlist and fix spellcheck errors
juanlldc Aug 5, 2024
0d3ce76
Second spellchecker fix
juanlldc Aug 5, 2024
b0f78a0
Adding changes from Cameron's test run
juanlldc Aug 29, 2024
8 changes: 8 additions & 0 deletions 072-DataScienceInFabric/.wordlist.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
dataframe
DataFrame
DataFrame's
MLFlow
Leandro
interpretability
auc
repurpose
85 changes: 85 additions & 0 deletions 072-DataScienceInFabric/Coach/README.md
@@ -0,0 +1,85 @@
# What The Hack - Data Science In Microsoft Fabric

## Introduction

Welcome to the coach's guide for the Data Science In Microsoft Fabric What The Hack. Here you will find links to specific guidance for coaches for each of the challenges.

**NOTE:** If you are a Hackathon participant, this is the answer guide. Don't cheat yourself by looking at these during the hack! Go learn something. :)

## Coach's Guides

- Challenge 00: **[Prerequisites - Ready, Set, GO!](./Solution-00.md)**
- Configure your Fabric workspace and gather your data
- Challenge 01: **[Bring your data to the OneLake](./Solution-01.md)**
- Creating a shortcut to the available data
- Challenge 02: **[Prepare your data for ML](./Solution-02.md)**
- Clean and transform the data into a useful format while leveraging Data Wrangler
- Challenge 03: **[Train and register the model](./Solution-03.md)**
- Train a machine learning model with MLflow, with the help of Copilot
- Challenge 04: **[Generate batch predictions](./Solution-04.md)**
- Score a static dataset with the model
- Challenge 05: **[Visualize predictions with a PowerBI report](./Solution-05.md)**
- Visualize generated predictions by using a PowerBI report
- Challenge 06: **[Deploy the model to an AzureML real-time endpoint](./Solution-06.md)**
- Deploy the trained model to an AzureML endpoint for inference

## Coach Prerequisites

This hack has pre-reqs that a coach is responsible for understanding and/or setting up BEFORE hosting an event. Please review the [What The Hack Hosting Guide](https://aka.ms/wthhost) for information on how to host a hack event.

The guide covers the common preparation steps a coach needs to do before any What The Hack event, including how to properly configure Microsoft Teams.

### Student Resources

Always refer students to the [What The Hack website](https://aka.ms/wth) for the student guide: [https://aka.ms/wth](https://aka.ms/wth)

**NOTE:** Students should **not** be given a link to the What The Hack repo before or during a hack. The student guide does **NOT** have any links to the Coach's guide or the What The Hack repo on GitHub.


## Azure and Fabric Requirements

This hack requires students to have access to Azure and Fabric. These requirements should be shared with a stakeholder in the organization that will be providing the licenses that will be used by the students.

### Fabric and PowerBI licensing requirements

Each student will need access to Microsoft Fabric and be licensed to create PowerBI reports for this hack. The following are the options to complete these licensing requirements:

1. **Recommended if available**: Individual [Fabric free trials](https://learn.microsoft.com/en-us/fabric/get-started/fabric-trial#start-the-fabric-capacity-trial). A trial grants users access to create the required Fabric items as well as the PowerBI report. **If previously used, the Fabric free trial may be unavailable.**
2. Fabric Capacity and a PowerBI Pro/Premium Per User license. Each user needs their own PowerBI license, but capacities can be shared and scaled up as needed. If running the hack on an individual basis, an F4 capacity is adequate, and an F8 capacity provides a generous compute margin. **Alternatively, users can activate a [PowerBI Free Trial](https://learn.microsoft.com/en-us/power-bi/fundamentals/service-self-service-signup-for-power-bi) if available.** The PowerBI trial may be available even if the Fabric one is not.


### Azure licensing requirements

There are 2 challenges that require access to Azure:

- Challenge 1: Students are required to navigate an Azure Data Lake Storage (ADLS) Gen2 account through the Azure Portal to learn how to set up a Fabric shortcut to an existing file. This challenge requires each student to have contributor permissions on the resource, but a single storage account/directory/file can be shared among all students, given that they will only access and connect to it rather than modify it.

- Challenge 6: Students are required to have Azure AI Developer access to an Azure Machine Learning resource. Each student will need to register their own model and create their own real-time endpoint, which is why it is **recommended to individually deploy an Azure ML workspace per student**.

Given these requirements, each student could have their own Azure subscription or they could share access to a single subscription.

These Azure resources can be deployed on an individual per-student basis using the `deployhack.sh` script included in the student resources folder.

## Suggested Hack Agenda

You may schedule this hack in any format, as long as the challenges are completed sequentially.

Time estimate for each challenge:
- Challenge 00: 15 minutes
- Challenge 01: 30 minutes
- Challenge 02: 30 minutes
- Challenge 03: 45 minutes
- Challenge 04: 30 minutes
- Challenge 05: 30 minutes
- Challenge 06: 45 minutes
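
When planning the day, it can help to add up the estimates and lay the challenges out back to back. The following is a minimal sketch that does this from the numbers listed above; the 09:00 start time is a hypothetical value, and breaks or lectures are not accounted for.

```python
from datetime import datetime, timedelta

# Time estimates from the list above, in minutes.
ESTIMATES = {
    "Challenge 00": 15,
    "Challenge 01": 30,
    "Challenge 02": 30,
    "Challenge 03": 45,
    "Challenge 04": 30,
    "Challenge 05": 30,
    "Challenge 06": 45,
}


def build_agenda(start, estimates=ESTIMATES):
    """Return (challenge, start, end) tuples for challenges run back to back."""
    agenda = []
    current = start
    for name, minutes in estimates.items():
        end = current + timedelta(minutes=minutes)
        agenda.append((name, current, end))
        current = end
    return agenda


# Total hacking time: 225 minutes, i.e. 3 hours 45 minutes.
total = sum(ESTIMATES.values())

# Hypothetical 09:00 start; adjust to your own event.
for name, begin, end in build_agenda(datetime(2024, 1, 1, 9, 0)):
    print(f"{begin:%H:%M}-{end:%H:%M}  {name}")
```

Because the challenges must be completed sequentially, any break simply shifts the start of every later challenge by the same amount.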

## Repository Contents

- `./Coach`
- Coach's Guide and related files
- `./Coach/Solutions`
- Solution files with completed example answers to challenges
- `./Student`
- Student's Challenge Guide
- `./Student/Resources`
- Student resource files, also available as a download link on Student Challenge 0
62 changes: 62 additions & 0 deletions 072-DataScienceInFabric/Coach/Solution-00.md
@@ -0,0 +1,62 @@
# Challenge 00 - Prerequisites - Ready, Set, GO! - Coach's Guide

**[Home](./README.md)** - [Next Solution >](./Solution-01.md)

## Introduction

Thank you for participating in the Data Science in Microsoft Fabric What The Hack. Before you can hack, you will need to set up some prerequisites.

## Common Prerequisites

We have compiled a list of common tools and software that will come in handy to complete most What The Hack Azure-based hacks!

You might not need all of them for the hack you are participating in. However, if you work with Azure on a regular basis, these are all things you should consider having in your toolbox.

<!-- If you are editing this template manually, be aware that these links are only designed to work if this Markdown file is in the /xxx-HackName/Student/ folder of your hack. -->

- [Azure Subscription](../Student/000-HowToHack/WTH-Common-Prerequisites.md#azure-subscription)
- [Postman](https://www.postman.com/downloads/)
- [Managing Cloud Resources](../Student/000-HowToHack/WTH-Common-Prerequisites.md#managing-cloud-resources)
- [Azure Portal](../Student/000-HowToHack/WTH-Common-Prerequisites.md#azure-portal)
- [Azure Cloud Shell](../Student/000-HowToHack/WTH-Common-Prerequisites.md#azure-cloud-shell)
- [Azure CLI (optional)](../Student/000-HowToHack/WTH-Common-Prerequisites.md#azure-cli)
- [Note for Windows Users](../Student/000-HowToHack/WTH-Common-Prerequisites.md#note-for-windows-users)
- [Azure PowerShell CmdLets](../Student/000-HowToHack/WTH-Common-Prerequisites.md#azure-powershell-cmdlets)

- [Azure Storage Explorer (optional)](../Student/000-HowToHack/WTH-Common-Prerequisites.md#azure-storage-explorer)

Additionally, please refer to the [Coach Hack introduction](./README.md) for more information about licensing requirements and options.

## Description

Now that you have the common prerequisites installed on your workstation, there are prerequisites specific to this hack.

In Challenge 0 on the student guide, students are instructed to download the Resources folder [here](https://aka.ms/FabricdsWTHResources). This folder contains the notebooks students will be working with, as well as a shell script that they will use to deploy some needed Azure resources.

The [coach solution notebooks](./Solutions/) are the completed versions of the student notebooks. The solutions can be used as a guide or uploaded to Fabric to complete each Challenge.


**NOTE:** The resources.zip folder also includes the heart.csv file. You can upload this data directly to the Fabric Lakehouse if you decide you want to go through this hack without needing an Azure subscription. However, this will skip half of Challenge 1 and the important concept of using shortcuts in Fabric. If you are going to be setting up the Azure resources and using the shortcut, ignore the heart.csv file.

To begin setting up your Azure subscription for this hack, you will run a bash script that will deploy and configure a list of resources. You can find this script as the `HackSetup.sh` file in the resources folder.
- Download the setup file to your computer
- Go to the Azure portal and click on the cloud shell button on the top navigation bar, to the right of the Copilot button.
- **NOTE**: This script has been designed to run with the Azure CLI in the Azure Cloud Shell. It might fail to deploy if you attempt to run it from a local terminal.
- Once the cloud shell connects, make sure you are using a Bash shell. If you are not, click on the button on the top-right corner of the cloud shell to switch to Bash.
- Click on the Manage Files button on the shell's navigation bar and select upload. Select the setup file from your computer.
- Run the `sh HackSetup.sh` command in your cloud shell.
- Follow the prompts in the shell.

After setting up your Azure resources, head to [Microsoft Fabric](https://fabric.microsoft.com/).
- Create a new workspace by clicking on 'Workspaces' in the vertical menu on the left side of the screen. Use the 'New Workspace' button at the bottom of the list.
- Once you are inside your new workspace, select the Data Science experience using the button on the bottom left corner of the screen.
- At the top of the Data Science experience menu, check that you are still in the new workspace and select 'Import Notebook' from the top row of options.
- Follow the prompts to upload the 4 notebook (`.ipynb`) files contained within the resources folder.


## Success Criteria

To complete this challenge successfully, you should be able to:

- Verify that you have a storage account with the heart.csv data in a container
- Verify that you have a Fabric workspace where your 4 notebooks are available
- (Optional) Verify that your Azure ML workspace has deployed correctly (if completing Challenge 6)
45 changes: 45 additions & 0 deletions 072-DataScienceInFabric/Coach/Solution-01.md
@@ -0,0 +1,45 @@
# Challenge 01 - Bring your data to the OneLake - Coach's Guide

[< Previous Solution](./Solution-00.md) - **[Home](./README.md)** - [Next Solution >](./Solution-02.md)

## Notes & Guidance

In this challenge, hack participants must create a shortcut to the folder deployed in their Azure subscription in Challenge 0. This will allow them to use the data in Fabric without the need for replication. Once the shortcut is created, participants will open Notebook 1 to load the CSV file into a delta table for further modification in Notebook 2.

### Sections

1. Create a Lakehouse (non-notebook)
2. Create a Shortcut (non-notebook)
3. Read the .csv file into a dataframe in the notebook (Notebook 1)
4. Write the dataframe to the lakehouse as a delta table (Notebook 1)
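
As a rough illustration of sections 3 and 4, the sketch below reads a CSV file with Spark and saves it as a Delta table. It is not the content of the coach solution notebook. The file path and table name are assumptions, and `spark` stands for the session object that a Fabric notebook provides once it is attached to a lakehouse.

```python
def csv_to_delta(spark, csv_path, table_name):
    """Read a CSV file with Spark and save it as a Delta table.

    `spark` is the SparkSession available in the notebook; it is passed in
    explicitly so the function can also be exercised outside Fabric.
    """
    df = (
        spark.read.format("csv")
        .option("header", "true")       # first row holds the column names
        .option("inferSchema", "true")  # let Spark infer the column types
        .load(csv_path)
    )
    # Overwrite so that re-running the cell does not fail on an existing table.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)
    return df


# In a notebook attached to the lakehouse (hypothetical path and table name):
# df = csv_to_delta(spark, "Files/files/heart.csv", "heart")
```

Using `overwrite` mode keeps the cell safe to re-run, which is convenient when participants repeat the step while experimenting.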

### Student step-by-step instructions (creating a shortcut)
- Creating a Lakehouse:
- Participants must create a lakehouse on the Fabric workspace they previously set up. In Fabric, navigate to the workspace.
- On the top left of the screen, select **New**, then **More options**.
- On the data engineering section, select Lakehouse. Give the lakehouse a unique name and click on create.

- Creating a Shortcut:
- On the Lakehouse navigator, use the left-hand side menu and click on the 3 dots (...) next to **Files**. Click on "New shortcut".
- On the shortcut wizard, click on "Azure Data Lake Storage Gen2".
- Go to your Azure portal. The URL can be found in the **Settings>Endpoints** side menu of the Storage Account. In this menu, you will see a variety of endpoint Resource IDs and URLs. Find and copy the **data lake storage URL** from the list. Enter it into the wizard in Fabric.
- Create a new connection, give it a name and select "Account Key" as the authentication kind.
- Go back to your Azure portal. The Account Key can be found in the **Security + Networking>Access keys** side menu of the Storage Account. Show and copy one of the keys. Enter it into the wizard in Fabric.
- Click on next to access the file explorer. Wait for the screen to load.
- On the side menu, expand the file-system folder. Select the check mark next to the "files" folder.
- Click next to move to the next screen, then click on create to create the shortcut.
- Verify that your shortcut is showing under the **Files** folder of the lakehouse navigator. You might need to click on the 3 dots and on refresh if your shortcut is not present initially.
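
Because the endpoints menu lists several URLs, participants sometimes paste one that is not the Data Lake Storage one (for example, the Blob service URL). The helper below is a small sketch that builds and checks the expected URL shape. It assumes the public Azure cloud, where the Data Lake Storage endpoint has the form `https://<account>.dfs.core.windows.net`. The account name `wthfabricdemo` is hypothetical.

```python
import re

# Storage account names are 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")
_DFS_URL = re.compile(r"https://[a-z0-9]{3,24}\.dfs\.core\.windows\.net(/.*)?")


def dfs_endpoint(account_name):
    """Build the Data Lake Storage (DFS) endpoint URL for a storage account."""
    if not _ACCOUNT_NAME.fullmatch(account_name):
        raise ValueError(f"not a valid storage account name: {account_name!r}")
    return f"https://{account_name}.dfs.core.windows.net"


def looks_like_dfs_url(url):
    """Return True if `url` has the shape of a DFS endpoint URL."""
    return _DFS_URL.fullmatch(url) is not None


# Hypothetical account name, for illustration only.
print(dfs_endpoint("wthfabricdemo"))
print(looks_like_dfs_url("https://wthfabricdemo.blob.core.windows.net/"))
```

The check is purely syntactic: it does not confirm that the account exists or that the key grants access.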

### Overview of student directions (running Notebook 1)
- This section of the challenge is notebook based. All the instructions and links required for participants to successfully complete this section can be found on Notebook 1 in the `student/resources.zip/notebooks` folder.
- To run the notebook, go to your Fabric workspace and select Notebook 1. Ensure that it is correctly attached to the lakehouse. You might need to connect to the lakehouse you previously created on the left-hand side file explorer menu.
- The students must follow the instructions, leverage the documentation and complete the code cells sequentially.

### Coaches' guidance
- This challenge has 2 main sections, creating a shortcut and loading the files into delta tables. The first section must be completed before working on Notebook 1.
- The full version of Notebook 1, with all code cells filled in, can be found for reference in the `coach/solutions.zip` folder of this GitHub repository.
- The aim of this challenge, as noted in the student guide, is to understand lakehouses, shortcuts and the delta format.
- To assist students, coaches can clear up doubts regarding the Python syntax or how to get started with notebooks, but students should focus on learning how to set up shortcuts, navigate the Fabric UI and read/write to the delta lake.

## Success criteria
- The heart.csv data is now saved as a delta table on the lakehouse
62 changes: 62 additions & 0 deletions 072-DataScienceInFabric/Coach/Solution-02.md
@@ -0,0 +1,62 @@
# Challenge 02 - Data preparation with Data Wrangler - Coach's Guide

[< Previous Solution](./Solution-01.md) - **[Home](./README.md)** - [Next Solution >](./Solution-03.md)

## Notes & Guidance

In this challenge, hack participants must use Data Wrangler to prepare the heart dataset for model training. The purpose is to focus on transforming and preparing the data for the next challenges. They will have the flexibility to either write code in a notebook or leverage Data Wrangler’s intuitive interface to streamline the pre-processing tasks.

### Sections

1. Read the .csv file into a pandas dataframe in the notebook (Notebook 2)
2. Launch Data Wrangler and explore the data cleaning operations (Notebook 2)
3. Apply the operations using Python code (Notebook 2)
4. Perform feature engineering using Spark (Notebook 2)
5. Write the dataframe to the lakehouse as a delta table (Notebook 2)

### Student step-by-step instructions
- Launching Data Wrangler:
- Participants must create a pandas DataFrame in the Fabric notebook. To do so, they need to complete the first cell in Notebook 2.
- Once the cell has executed, select Launch Data Wrangler under the Home tab of the notebook ribbon. You'll see a list of the active pandas DataFrames available for editing.
- From the pandas DataFrame list, select `df`, the DataFrame you just created in the previous cell, to open it in Data Wrangler.


- Data Cleaning Operations – (Data Wrangler)
- *Removing Unnecessary Columns*
- On the **Operations** panel, expand **Schema** and select **Drop columns**.
- Select `RowNumber`. This column will appear in red in the preview, to show that it is changed by the code (in this case, dropped).
- Select **Apply**. A new step is created in the **Cleaning steps** panel on the bottom left.

- *Dropping Missing Values*
- On the **Operations** panel, select **Find and replace**, and then select **Drop missing values**.
- Select the `RestingBP`, `Cholesterol` and `FastingBS` columns. These are the columns that Data Wrangler flags as containing missing values.
- Select **Apply**. A new step is created in the **Cleaning steps** panel on the bottom left.

- *Dropping Duplicate Rows*
- On the **Operations** panel, select **Find and replace**, and then select **Drop duplicate rows**.
- Select **Apply**. A new step is created in the **Cleaning steps** panel on the bottom left.

- Feature Engineering - (Notebook)
- This part is notebook based. Participants will work in cells 09, 10, and 11 to transform categorical values into numerical labels.
- You can also explore how to one-hot encode the categorical columns with Data Wrangler. However, this will not create labels in your existing columns, but rather a new column for each category with True and False values. Using this alternative format might need some modification to the code in the model training process. Please discuss this possibility with hack attendees to raise awareness of this Data Wrangler feature.
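
To make the difference between the two encodings concrete when discussing it with attendees, here is a small library-free sketch. It is not the code from Notebook 2. The column name `ChestPainType` and its sample values are assumptions used only for illustration.

```python
def label_encode(values):
    """Map each distinct category to an integer label (sorted for determinism)."""
    mapping = {category: index for index, category in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


def one_hot_encode(values, prefix):
    """Return one True/False column per distinct category."""
    categories = sorted(set(values))
    return {f"{prefix}_{c}": [value == c for value in values] for c in categories}


# Hypothetical sample of a categorical column.
chest_pain = ["ATA", "NAP", "ASY", "NAP"]

# Label encoding keeps a single column and replaces categories with integers.
labels, mapping = label_encode(chest_pain)

# One-hot encoding produces a new True/False column for each category.
columns = one_hot_encode(chest_pain, "ChestPainType")

print(labels, mapping)
print(sorted(columns))
```

Label encoding leaves the number of columns unchanged, while one-hot encoding changes the table schema, which is why the model training code may need adjusting if attendees choose the latter.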

### Overview of student directions (running Notebook 2)
- This section of the challenge is notebook based. All the instructions and links required for participants to successfully complete this section can be found on Notebook 2 in the `student/resources.zip/notebooks` folder.
- To run the notebook, go to your Fabric workspace and select Notebook 2. Ensure that it is correctly attached to the lakehouse. You might need to connect to the lakehouse you previously created on the left-hand side file explorer menu.
- The students must follow the instructions, leverage the documentation and complete the code cells sequentially.

### Coaches' guidance

- This challenge has 3 main sections, Data Wrangler operations, feature engineering and saving processed data to a delta table.
- The full version of Notebook 2, with all code cells filled in, can be found for reference in the `coach/solutions.zip` folder of this GitHub repository.
- The aim of this challenge, as noted in the student guide, is to understand data preparation using Data Wrangler and Fabric notebooks.
- To assist students, coaches can clear up doubts regarding the Python syntax or how to get started with notebooks, but students should focus on learning how to operate Data Wrangler, navigate the Fabric UI, code in notebooks and read/write to the delta lake.
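
When explaining what the three Data Wrangler cleaning steps do, a plain-Python sketch of their effect on the rows can be useful. The version below is a dependency-free illustration, not the code Data Wrangler generates. In pandas, the steps correspond roughly to `DataFrame.drop(columns=...)`, `DataFrame.dropna(subset=...)` and `DataFrame.drop_duplicates()`. The four-row sample is made up.

```python
import csv
import io


def clean_rows(rows, drop_columns, required_columns):
    """Apply the three cleaning steps to a list of row dictionaries."""
    cleaned = []
    seen = set()
    for row in rows:
        # Step 1: drop unnecessary columns.
        row = {k: v for k, v in row.items() if k not in drop_columns}
        # Step 2: drop rows with a missing value in any required column.
        if any(row.get(col) in (None, "") for col in required_columns):
            continue
        # Step 3: drop exact duplicate rows, keeping the first occurrence.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical sample; the real dataset has more columns and rows.
SAMPLE = """RowNumber,Age,RestingBP,Cholesterol,FastingBS
1,40,140,289,0
2,49,160,,0
3,40,140,289,0
4,37,130,283,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
cleaned = clean_rows(rows, {"RowNumber"}, ["RestingBP", "Cholesterol", "FastingBS"])
print(cleaned)
```

Note that the third row only becomes a duplicate of the first once `RowNumber` has been dropped, so the order of the steps matters.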


## Success criteria
- The heart dataset is fully shaped, cleaned and prepared for model training.
- No duplicate rows or unnecessary columns.
- No missing values.
- No categorical values.
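
Coaches who want a quick way to confirm these criteria could run a check along the following lines. This is a sketch under assumptions: it expects the prepared data as a list of row dictionaries, treats any non-numeric value as a leftover categorical value, and the function name `check_prepared` is hypothetical.

```python
import math


def check_prepared(rows, unwanted_columns=()):
    """Return a list of problems found in prepared rows (empty means OK)."""
    problems = []
    seen = set()
    for index, row in enumerate(rows):
        for column in unwanted_columns:
            if column in row:
                problems.append(f"row {index}: unnecessary column {column!r}")
        for column, value in row.items():
            if value is None or (isinstance(value, float) and math.isnan(value)):
                problems.append(f"row {index}: missing value in {column!r}")
            elif not isinstance(value, (int, float)):
                problems.append(f"row {index}: non-numeric value in {column!r}")
        # Sort by column name only, so mixed value types are never compared.
        key = tuple(sorted(row.items(), key=lambda item: item[0]))
        if key in seen:
            problems.append(f"row {index}: duplicate row")
        seen.add(key)
    return problems


# Made-up rows for illustration.
good = [{"Age": 40, "Sex": 1}, {"Age": 37, "Sex": 0}]
bad = [{"Age": 40, "Sex": "M"}, {"Age": None, "Sex": 1}, {"Age": 40, "Sex": "M"}]

print(check_prepared(good, ["RowNumber"]))
print(check_prepared(bad))
```

In a notebook, the rows could be obtained by converting the prepared DataFrame to a list of records before calling the function.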

