aws-glue-crawler

Here are 47 public repositories matching this topic...

aws-samples / aws-glue-crawler-utilities

This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS CDK applications.

aws-glue aws-glue-crawler

Updated Dec 21, 2021
Python

aws-samples / amazon-rds-export-to-s3-automation

Star

This repository contains source code for the AWS Database Blog Post Reduce data archiving costs for compliance by automating RDS snapshot exports to Amazon S3

aws-lambda aws-kms aws-cloudformation amazon-rds amazon-sns amazon-s3 amazon-athena aws-backup aws-glue amazon-eventbridge aws-glue-crawler

Updated Apr 26, 2023

aws-samples / automated-datastore-discovery-with-aws-glue

Star

Automation framework to catalog AWS data sources using Glue

aws typescript aws-s3 dynamodb glue python3 data-catalog rds gdpr pii data-governance aws-cdk aws-glue-workflow aws-glue-crawler aws-glue-data-catalog

Updated Apr 10, 2025
Python

fermat01 / ETL-Data-Pipeline-using-AWS-EMR-Spark-Glue-Athena

Star

ETL Data pipeline using aws services

aws aws-s3 aws-athena aws-emr-clusters aws-glue-crawler

Updated Aug 23, 2024
Python

DivineSamOfficial / SmartCityProject

Star

Smart City Realtime Data Engineering Project

python aws kafka aws-s3 pyspark spark-streaming aws-ec2 aws-athena aws-redshift aws-glue aws-quicksight aws-glue-crawler aws-glue-data-catalog

Updated May 24, 2024
Python

GabrielDan92 / AWS_Terraform_PySpark-ETL_Job

Star

Terraform configuration that creates several AWS services, uploads data in S3 and starts the Glue Crawler and Glue Job.

aws terraform s3-bucket pyspark glue-job glue-catalog aws-glue-crawler

Updated Feb 10, 2022
Python

ShubhamMohanty680 / Spotify_end_to_end_data_engineering

Star

It is a project build using ETL(Extract, Transform, Load) pipeline using Spotify API on AWS.

python aws aws-lambda aws-s3 spotify-api data-engineering aws-athena data-engineering-pipeline spotipy-library aws-glue-crawler awscloudwatch aws-glue-data-catalog aws-trigger

Updated Jan 22, 2025
Jupyter Notebook

masood2iq / AWS-Athena-Glue-S3-Bucket-Deployment-Through-AWSConsole

Star

AWS Athena, Glue Database, Glue Crawler and S3 buckets deployment through AWS GUI console.

aws-athena aws-glue simple-query aws-glue-crawler aws-s3-bucket

Updated Dec 13, 2022

analyticshamza / Retail-Data-Management-with-AWS-Glue

Star

regex data-engineering join aggregation aws-glue aws-glue-crawler

Updated Jul 7, 2025

Saurabhkhandebharad / BigData-SK

Star

Analyzed a multicategory e-commerce store using big data techniques on a Kaggle dataset with the help of AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda and Power BI!

aws big-data aws-lambda power-bi pyspark aws-ec2 aws-cloudformation aws-athena kaggle-dataset aws-services end-to-end-pipeline end-to-end-project aws-glue-crawler aws-s3-bucket

Updated Sep 7, 2024
Python

subhamay-cloudworks / 0090-deutzia-cft

Sponsor

Star

Creating an audit table for a DynamoDB table using CloudTrail, Kinesis Data Stream, Lambda, S3, Glue and Athena and CloudFormation

aws-python-lambda aws-iam aws-cloudformation aws-cloudtrail aws-cloudwatch aws-athena aws-cloudwatch-logs aws-kinesis-stream aws-glue-crawler aws-iam-roles aws-iam-policies aws-s3-bucket aws-glue-data-catalog

Updated Jul 6, 2023
Python

sarah-zhan / data_pipeline_amazon_products

Star

An end-to-end data pipeline built with AWS S3, Glue, Crawler, Athena, Tableau visulization

aws s3-bucket tableau aws-athena aws-glue-crawler

Updated Mar 27, 2024
Jupyter Notebook

Akanksha-tetwar / YouTube-Trending-video-analysis-ETL-using-AWS-Services

Star

In this project I have used the Trending YouTube Video Statistics data from Kaggle to analyze and prepare it for usage.

python aws aws-s3 aws-athena awslambda quicksight aws-glue-crawler awsglue

Updated Nov 7, 2022

Danitilahun / Reddit-Data-Engineering

Star

This project automates the extraction, transformation, and loading (ETL) of Reddit data into a Redshift data warehouse using Airflow. Key technologies include Celery, PostgreSQL, S3, Glue, Athena, and Redshift, providing a complete data pipeline solution.

python aws airflow aws-s3 data-engineering aws-athena aws-glue postgreesql aws-glue-crawler

Updated Apr 16, 2025

masood2iq / AWS-Athena-Glue-S3-CloudFormation-Deployment-AWSConsole

Star

AWS Athena, Glue Database, Glue Crawler and S3 buckets deployment through CloudFormation stack on AWS console.

aws-cloudformation aws-athena aws-glue simple-query aws-glue-crawler aws-s3-bucket

Updated Dec 14, 2022

dhvani-k / YouTrend_Insights_Analyzing_YouTube_Video_Landscape

Star

An end-to-end solution for managing and analyzing YouTube video data from Kaggle, leveraging AWS services and visualized through Quicksight and Tableau