[Stp 409] impact of deleting user report (#409)
* ingest insights datamodels

* add workflow to generate report

* rename directory and add readme

* fix formatting

* add report image sample
wchen4 authored Aug 30, 2024
1 parent 13ba01e commit f1c75ae
Showing 9 changed files with 226 additions and 0 deletions.
52 changes: 52 additions & 0 deletions scenarios/monitoring/app/impact_of_deleting_user/README.md
@@ -0,0 +1,52 @@
# Workflow: Scenario (Generate Impact of Deleting A User Report)

## Scenario

The purpose of this scenario is to generate a report listing the impact on resources owned by a user who will be deleted.

`Deleted Users and the Impact on Existing Resources`: https://docs.treasuredata.com/articles/#!pd/Deleted-Users-and-the-Impact-on-Existing-Resources

### Limitations
Due to limitations in the API, this report cannot provide resource ownership information for the following resources:
- API keys
- Sources
- Treasure Insights Dashboards

### Prerequisites
This report is dependent on the following monitoring workflows in this scenario. These workflows must run first.
https://github.com/treasure-data/treasure-boxes/tree/master/scenarios/monitoring
- basic_monitoring
- cdp_monitoring
- insights_monitoring
- workflow_monitoring


### Steps
#### 1. Push the workflow to Treasure Data
```
$ cd impact_of_deleting_user
$ td wf push impact_of_deleting_user
```

#### 2. Configure settings in `common/settings.yaml`
- `td.database` - the database to write the report to
- `td.tables.report_table` - the table to write the report to
- `td.email_to_check` - the user to check for resource ownership and impact upon deletion
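
The dotted names above are paths into the nested mapping in `common/settings.yaml`. As a minimal sketch, assuming the file has already been parsed into a Python dict, the paths resolve as shown below. The `lookup` helper and the `SETTINGS` constant are hypothetical and not part of this workflow; the values mirror the sample settings in this commit.

```python
from functools import reduce

# Hypothetical in-memory copy of common/settings.yaml (sample values only).
SETTINGS = {
    "td": {
        "database": "monitoring",
        "tables": {"report_table": "impact_of_deleting_user"},
        "email_to_check": "user@example.com",
    }
}


def lookup(settings, dotted_key):
    """Resolve a dotted key by walking the nested dict one level per segment.

    Raises KeyError if any level of the path is missing.
    """
    return reduce(lambda node, key: node[key], dotted_key.split("."), settings)


for key in ("td.database", "td.tables.report_table", "td.email_to_check"):
    print(key, "=", lookup(SETTINGS, key))
```

A missing or misspelled level raises `KeyError`, which is one way to catch a misconfigured settings file before pushing the workflow.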

#### 3. Uncomment data preparation task (if required)
This workflow depends on the monitoring workflows listed in the Prerequisites to download resource data prior to generating the report. If the monitoring workflows have not been run yet, uncomment the `+prepare_data` task to run those workflows first.

#### 4. Register `td.apikey` as a workflow secret
```
$ td wf secrets --project impact_of_deleting_user --set td.apikey=<master_api_key>
```

#### 5. Trigger a new session attempt of the workflow
```
$ td wf start impact_of_deleting_user impact_of_deleting_user --session now
```

#### 6. View the report
When the workflow session is completed, the report will be available in the database and table specified in Step 2.

![](images/report.png)
@@ -0,0 +1,5 @@
td:
  database: monitoring
  tables:
    report_table: impact_of_deleting_user
  email_to_check: user@example.com
@@ -0,0 +1,25 @@
_export:
  !include : common/settings.yaml

# +prepare_data:
#   _parallel: true

#   +run_basic_monitoring:
#     require>: initial_ingest
#     project_name: basic_monitoring

#   +run_cdp_monitoring:
#     require>: initial_ingest
#     project_name: cdp_monitoring

#   +run_workflow_monitoring:
#     require>: initial_ingest_session_attempt
#     project_name: workflow_monitoring

#   +run_insights_monitoring:
#     require>: ingest
#     project_name: insights_monitoring

+execute_report:
  td>: queries/generate_report.sql
  create_table: ${td.tables.report_table}
@@ -0,0 +1,53 @@
WITH users AS (
  SELECT id, name
  FROM basic_monitoring.users
  WHERE email = '${td.email_to_check}'
)
SELECT
  'SAVED QUERY' AS resource_type,
  name AS resource_name,
  CASE
    WHEN cron IS NOT NULL THEN 'The schedule will be deactivated. Re-enable it manually, as necessary.'
    ELSE 'No action required. The query will be re-assigned to Account Owner.'
  END AS action_item,
  CASE
    WHEN cron IS NOT NULL THEN 'Schedule: ' || cron
    ELSE 'No schedule is set for this query'
  END AS notes
FROM basic_monitoring.schedules
WHERE user_name IN (SELECT name FROM users)

UNION

SELECT
  'AUDIENCE',
  name,
  'Change owner (TD Support Request) or associated CDP workflows will fail',
  NULL
FROM cdp_monitoring.parent_segments_configuration
WHERE json_extract_scalar(json_parse(createdby), '$.td_user_id') IN (SELECT CAST(id AS VARCHAR) FROM users)

UNION

SELECT DISTINCT
  'WORKFLOW',
  'Project: ' || json_extract_scalar(json_parse(project), '$.name') || ', Workflow: ' || name,
  'Save workflow as a different user to change the workflow owner',
  'Limitation: The Treasure Data related operators in the workflow (such as td>, td_run>) fail if the td.apikey secret is not set for projects that the deleted user created.'
FROM workflow_monitoring.workflows w
JOIN workflow_monitoring.revisions r ON r.revision = w.revision
WHERE json_extract_scalar(json_parse(userinfo), '$.td.user.id') IN (SELECT CAST(id AS VARCHAR) FROM users)
  AND json_extract_scalar(json_parse(project), '$.name') NOT LIKE 'cdp_%'

UNION

SELECT DISTINCT
  'INSIGHTS DATAMODEL',
  name,
  'Contact TD Support to re-assign ownership',
  NULL
FROM insights_monitoring.datamodels
WHERE created_by IN (SELECT id FROM users)

ORDER BY 1,2,3
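
The `WORKFLOW` branch of the query above can be read as a filter over joined workflow/revision rows. The following is a rough Python sketch of that filter, not the workflow's actual implementation. It assumes each row is a dict that already combines the `name`, `project` and `userinfo` columns, and the function name `owned_workflow_rows` is hypothetical. Note one deliberate difference: in SQL, `_` inside `LIKE 'cdp_%'` matches any single character, whereas the sketch treats `cdp_` as a literal prefix.

```python
import json


def owned_workflow_rows(workflows, user_ids):
    """Return sorted (project, workflow) pairs pushed by one of user_ids.

    `workflows` is assumed to be an iterable of dicts with JSON strings in
    the "project" and "userinfo" fields, as in the monitoring tables.
    """
    # The SQL compares IDs as VARCHAR, so normalize everything to strings.
    wanted = {str(uid) for uid in user_ids}
    rows = set()  # a set mirrors SELECT DISTINCT

    for wf in workflows:
        project_name = json.loads(wf["project"])["name"]
        info = json.loads(wf["userinfo"])
        # Equivalent of the JSON path $.td.user.id
        owner = info.get("td", {}).get("user", {}).get("id")

        if owner is None or str(owner) not in wanted:
            continue
        # Literal-prefix approximation of: NOT LIKE 'cdp_%'
        if project_name.startswith("cdp_"):
            continue
        rows.add((project_name, wf["name"]))

    return sorted(rows)
```

Rows whose `userinfo` has no `td.user.id` are skipped, which matches the SQL behaviour where a `NULL` extraction never satisfies `IN`.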

32 changes: 32 additions & 0 deletions scenarios/monitoring/insights_monitoring/README.md
@@ -0,0 +1,32 @@
# Workflow: Scenario (Import TD Insights object from REST API)

## Scenario

The purpose of this scenario is to import `/reporting/datamodels` metadata from the REST API.

*Steps*
1. Import `/reporting/datamodels` metadata from the REST API (ingest.dig)

# How to Run in Server Mode

First, upload the workflow.

## Upload
    $ td wf push insights_monitoring

Second, register td.apikey as a secret. (The owner of the API key must be an admin and have all permissions for TD functions.)

## Register
    $ td wf secrets --project insights_monitoring --set td.apikey=1234/abcdefg...

## Run
    $ td wf start insights_monitoring ingest --session now

# Relationships of Table and REST API

| table | REST API|
| ----- | --------|
| datamodels | [/api/reporting/datamodels](https://docs.treasuredata.com/articles/#!pd/reference-insights-model-endpoints/a/h1_125997694) |

# Next Step
If you have any questions, please contact support@treasuredata.com.
5 changes: 5 additions & 0 deletions scenarios/monitoring/insights_monitoring/common/settings.yaml
@@ -0,0 +1,5 @@
td:
  database: insights_monitoring
  tables:
    datamodels: datamodels
  api_endpoint: api.treasuredata.com
20 changes: 20 additions & 0 deletions scenarios/monitoring/insights_monitoring/ingest.dig
@@ -0,0 +1,20 @@
_export:
  !include : common/settings.yaml

+initial_database_and_tables:
  +create_db:
    td_ddl>:
    create_databases: [ "${td.database}" ]
  +create_tables:
    td_ddl>:
    create_tables: ${Object.keys(td.tables)}

+initial_ingest_datamodels:
  py>: scripts.ingest_datamodels.run
  dest_db: ${td.database}
  dest_table: ${td.tables.datamodels}
  api_endpoint: ${td.api_endpoint}
  docker:
    image: "digdag/digdag-python:3.9"
  _env:
    TD_API_KEY: ${secret:td.apikey}
@@ -0,0 +1,34 @@
import requests
import pandas as pd
import pytd
import os
import json

def get_all_datamodels(url, headers):
    print('Retrieving datamodels: ' + url)
    res = requests.get(url=url, headers=headers)

    if res.status_code == requests.codes.ok:
        return res.json()

    if res.status_code == requests.codes.forbidden:
        print('ERROR: API key user does not have access to Insights API')

    res.raise_for_status()

def run(dest_db, dest_table, api_endpoint='api.treasuredata.com'):
    apikey = os.environ['TD_API_KEY']

    url = 'https://%s/reporting/datamodels' % api_endpoint
    headers = {'Authorization': 'TD1 %s' % apikey}

    datamodel_list = get_all_datamodels(url, headers)

    if len(datamodel_list) == 0:
        print('no import record')
        return

    df = pd.DataFrame(datamodel_list)

    client = pytd.Client(apikey=apikey, endpoint='https://%s' % api_endpoint, database=dest_db)
    client.load_table_from_dataframe(df, dest_table, if_exists='overwrite', fmt='msgpack')
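
The status handling in `get_all_datamodels` has three outcomes: return the parsed body on 200, print a hint and raise on 403, and raise on any other error status. The sketch below reproduces that control flow against a stub so it can be exercised without network access. `FakeResponse` and `handle_response` are hypothetical names introduced for illustration only; the stub raises `RuntimeError` where the real `requests` response would raise its own HTTP error.

```python
class FakeResponse:
    """Hypothetical stand-in for a requests response (illustration only)."""

    def __init__(self, status_code, payload=None):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload

    def raise_for_status(self):
        # Like requests, raise only for 4xx and 5xx status codes.
        if self.status_code >= 400:
            raise RuntimeError("HTTP %d" % self.status_code)


def handle_response(res):
    """Mirror the branch structure of get_all_datamodels after the GET."""
    if res.status_code == 200:
        return res.json()

    if res.status_code == 403:
        print('ERROR: API key user does not have access to Insights API')

    res.raise_for_status()
    # Reached only for non-200 statuses below 400: the caller gets None.
    return None
```

One consequence worth noting: for a non-200 status below 400 the function returns `None`, so a caller that immediately takes `len()` of the result, as `run` does, would raise a `TypeError` in that case.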
