21 changes: 15 additions & 6 deletions .github/workflows/ci_cd.yml
@@ -50,7 +50,13 @@ jobs:
  npx playwright install
  fi

- - name: Run tests on CI
+ - name: Run CLI smoke tests
+ run: npm run test:cli
+
+ - name: Validate Malloy models
+ run: npm run test:models
+
+ - name: Run E2E tests
  run: npm test

  - uses: actions/upload-artifact@v6
@@ -72,11 +78,13 @@ jobs:
  node-version: lts/*
  - name: Install dependencies
  run: npm ci
- - name: Build website
- run: npm run build
- # For gh pages deployment, base path is repository name
+ - name: Build CLI
+ run: npm run build:cli
+
+ - name: Build all example sites with landing page
+ run: npm run build:all
  env:
- BASE_PUBLIC_PATH: /${{ github.event.repository.name }}/
+ BASE_PATH: /${{ github.event.repository.name }}/

  - name: Upload Build Artifact
  # only run on pushes to master or workflow_dispatch
@@ -123,9 +131,10 @@ jobs:
  - name: Install Playwright Browsers
  run: npx playwright install --with-deps
  - name: Run tests on deployed site
+ # Tests expect the sample-data example which contains the invoices model
  run: npm test
  env:
- URL: ${{ needs.deploy.outputs.page_url }}
+ URL: ${{ needs.deploy.outputs.page_url }}sample-data/
  - uses: actions/upload-artifact@v6
  if: ${{ !cancelled() }}
  with:
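Note on the `URL` change above: the new value appends `sample-data/` directly to `needs.deploy.outputs.page_url`, so it only yields a well-formed address if `page_url` ends in a slash. The Python sketch below shows the same join written to tolerate either form. `example_url` is a hypothetical helper for illustration, not part of this PR, and the host name is made up.

```python
def example_url(page_url: str, example: str) -> str:
    """Join a deployed Pages base URL with an example sub-site directory."""
    # Strip slashes on both sides so the result is identical whether or not
    # page_url ends in "/" (assumption: the example lives directly under it).
    return f"{page_url.rstrip('/')}/{example.strip('/')}/"


if __name__ == "__main__":
    print(example_url("https://example.github.io/repo/", "sample-data"))
```

If `page_url` is guaranteed to end in a slash, the plain concatenation in the workflow is equivalent.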
1 change: 1 addition & 0 deletions .gitignore
@@ -26,3 +26,4 @@ dist-ssr
  /playwright-report/
  /blob-report/
  /playwright/.cache/
+ dist-cli/
File renamed without changes.
File renamed without changes.
41 changes: 41 additions & 0 deletions examples/huggingface/GitHub Events.malloynb
@@ -0,0 +1,41 @@
>>>markdown
# GitHub Events Analysis

Explore GitHub activity data from a Hugging Face dataset. This model analyzes push events, pull requests, issues, and other GitHub activity.

**Data Source:** [alvarobartt/github-events](https://huggingface.co/datasets/alvarobartt/github-events) on Hugging Face

>>>malloy
import "github_events.malloy"

>>>markdown
## Overview Dashboard

Let's start with an overview of all GitHub events in the dataset:

>>>malloy
run: github_events -> overview

>>>markdown
## Event Type Distribution

Breaking down events by their type:

>>>malloy
run: github_events -> by_event_type

>>>markdown
## Top Repositories

The most active repositories in the dataset:

>>>malloy
run: github_events -> by_repo

>>>markdown
## Top Contributors

Who are the most active contributors?

>>>malloy
run: github_events -> top_contributors
49 changes: 49 additions & 0 deletions examples/huggingface/IMDB Movies.malloynb
@@ -0,0 +1,49 @@
>>>markdown
# IMDB Movies Analysis

Explore movie ratings, genres, and trends from the IMDB dataset. This data is sourced from Hugging Face and includes movie titles, ratings, vote counts, and more.

**Data Source:** [Pablinho/imdb-data](https://huggingface.co/datasets/Pablinho/imdb-data) on Hugging Face

>>>malloy
import "imdb_movies.malloy"

>>>markdown
## Overview Dashboard

A comprehensive look at the movie dataset:

>>>malloy
run: movies -> overview

>>>markdown
## Top Rated Movies

The highest-rated movies with significant vote counts (>10,000 votes):

>>>malloy
run: movies -> top_rated

>>>markdown
## Most Popular Movies

Movies sorted by number of votes:

>>>malloy
run: movies -> most_popular

>>>markdown
## Genre Analysis

Deep dive into movie genres with rating trends over time:

>>>malloy
run: movies -> genre_analysis

>>>markdown
## Movies by Decade

How has movie production and quality changed over the decades?

>>>malloy
run: movies -> by_decade
57 changes: 57 additions & 0 deletions examples/huggingface/NYC Taxi Trips.malloynb
@@ -0,0 +1,57 @@
>>>markdown
# NYC Taxi Trips Analysis

Analyze New York City yellow taxi trip data. This dataset contains trip records including pickup/dropoff times, distances, fares, tips, and payment methods.

**Data Source:** [codelion/nyctaxi](https://huggingface.co/datasets/codelion/nyctaxi) on Hugging Face

>>>malloy
import "nyc_taxi.malloy"

>>>markdown
## Overview Dashboard

A comprehensive look at taxi trip patterns:

>>>malloy
run: taxi_trips -> overview

>>>markdown
## Hourly Patterns

When do New Yorkers take taxi rides? Let's look at trips by hour of day:

>>>malloy
run: taxi_trips -> by_hour

>>>markdown
## Daily Patterns

Trip distribution across days of the week:

>>>malloy
run: taxi_trips -> by_day

>>>markdown
## Payment Analysis

How do people pay for their rides, and how does tipping vary by payment method?

>>>malloy
run: taxi_trips -> by_payment

>>>markdown
## Fare Analysis by Hour

Detailed breakdown of fares throughout the day:

>>>malloy
run: taxi_trips -> fare_analysis

>>>markdown
## Long Distance Trips

The longest trips in the dataset:

>>>malloy
run: taxi_trips -> long_trips
104 changes: 104 additions & 0 deletions examples/huggingface/github_events.malloy
@@ -0,0 +1,104 @@
-- GitHub Events Analysis Model
-- Analyzes GitHub activity data from Hugging Face datasets
-- Uses: hf://datasets/alvarobartt/github-events/data/train-00000-of-00001.parquet

source: github_events is duckdb.table('hf://datasets/alvarobartt/github-events/data/train-00000-of-00001.parquet') extend {
-- Core measures
measure: event_count is count()
measure: unique_repos is count(distinct repo.name)
measure: unique_actors is count(distinct actor.login)

-- Event type measures
measure: push_events is count() { where: `type` = 'PushEvent' }
measure: pr_events is count() { where: `type` = 'PullRequestEvent' }
measure: issue_events is count() { where: `type` = 'IssuesEvent' }
measure: watch_events is count() { where: `type` = 'WatchEvent' }
measure: fork_events is count() { where: `type` = 'ForkEvent' }
measure: create_events is count() { where: `type` = 'CreateEvent' }

-- Dimensions
dimension: event_type is `type`
dimension: repo_name is repo.name
dimension: actor_login is actor.login
dimension: event_date is created_at::date

-- Views
view: by_event_type is {
group_by: event_type
aggregate:
event_count
unique_repos
unique_actors
order_by: event_count desc
}

view: by_repo is {
group_by: repo_name
aggregate:
event_count
unique_actors
order_by: event_count desc
limit: 20
}

view: by_actor is {
group_by: actor_login
aggregate:
event_count
unique_repos
order_by: event_count desc
limit: 20
}

view: activity_timeline is {
group_by: event_date
aggregate:
event_count
unique_repos
unique_actors
order_by: event_date
}

# dashboard
view: overview is {
aggregate:
event_count
unique_repos
unique_actors
push_events
pr_events
issue_events
watch_events
fork_events
# bar_chart
nest: by_event_type
# bar_chart
nest: top_repos is by_repo
# line_chart
nest: activity_timeline
}

-- Top contributors view
view: top_contributors is {
group_by: actor_login
aggregate:
event_count
push_events
pr_events
issue_events
# bar_chart
nest: activity_by_type is {
group_by: event_type
aggregate: event_count
}
order_by: event_count desc
limit: 15
}
}

-- Named queries
query: github_overview is github_events -> overview
query: event_breakdown is github_events -> by_event_type
query: top_repos is github_events -> by_repo
query: contributors is github_events -> top_contributors
query: timeline is github_events -> activity_timeline
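
Reading aid for the model above: a filtered measure such as `push_events` counts only the rows that match its `where:` clause, while `by_repo` groups by repository, orders by count descending, and truncates to a limit. The Python sketch below mirrors that logic over a few made-up rows. The sample data and the Python function names are illustrative assumptions; Malloy compiles these definitions to SQL and does not run anything like this.

```python
from collections import Counter

# Hypothetical rows shaped like the dataset's nested repo/actor fields.
EVENTS = [
    {"type": "PushEvent", "repo": {"name": "org/a"}, "actor": {"login": "ann"}},
    {"type": "PushEvent", "repo": {"name": "org/a"}, "actor": {"login": "bob"}},
    {"type": "IssuesEvent", "repo": {"name": "org/b"}, "actor": {"login": "ann"}},
    {"type": "WatchEvent", "repo": {"name": "org/a"}, "actor": {"login": "cy"}},
]


def count_where(rows, event_type):
    """Analogue of `count() { where: type = '...' }`."""
    return sum(1 for r in rows if r["type"] == event_type)


def overview(rows):
    """Analogue of the aggregate part of the `overview` view (subset of measures)."""
    return {
        "event_count": len(rows),
        "unique_repos": len({r["repo"]["name"] for r in rows}),
        "unique_actors": len({r["actor"]["login"] for r in rows}),
        "push_events": count_where(rows, "PushEvent"),
        "pr_events": count_where(rows, "PullRequestEvent"),
        "issue_events": count_where(rows, "IssuesEvent"),
    }


def by_repo(rows, limit=20):
    """Analogue of `by_repo`: group, order by event_count desc, limit."""
    counts = Counter(r["repo"]["name"] for r in rows)
    return counts.most_common(limit)
```

Calling `overview(EVENTS)` and `by_repo(EVENTS)` on the sample rows gives a quick way to check one's reading of the measures before running the notebook against the real Parquet file.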