Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
Updated Nov 6, 2024 - Python
PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.
Algorithms for Massive Datasets (AMD) -- Market-basket analysis project
This repository contains a LaTeX file that generates a PDF document comprising comprehensive notes for the course "Algorithms for Massive Datasets".
The project analyzes the "IBM Transactions for Anti Money Laundering" dataset published on Kaggle. The task is to implement a model that predicts whether a transaction is illicit, using the "Is Laundering" attribute as the label.
Stream, parse, manipulate, and transform extremely large data (from 1 GB to 1 TB) in NodeJS without blocking the process, overflowing memory, or creating bottlenecks, and display it in the UI with the help of webStreams.
📺 Content Recommendation System for the Netflix Prize Challenge with Collaborative Filtering.
Training on the MASSIVE dataset by Amazon (English-US, German-DE, and Swahili-KE)
TF-Package: Multiple-Input Multiple-Output Keras Data-Generator for massive and complex datasets
Command line tool to quickly generate a lot of files in a lot of directories
Opens and manipulates massive text/data files that would otherwise be impossible to open on a computer, for example a text file larger than 20 GB, by reading only the lines that are needed so the computer does not freeze from running out of memory.
Building the node2vec algorithm
Building a Bloom filter on English dictionary words
Building the PageRank algorithm on the web graph around Stanford.edu using the NetworkX Python library
Word count in Spark
University lab exercises with processing big data.
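Several entries above name classic massive-dataset techniques without showing them. As an illustration only, here is a minimal sketch of a Bloom filter of the kind mentioned for English dictionary words. It is not taken from any listed repository: the class name `BloomFilter`, the default sizes, the SHA-256 double-hashing scheme, and the three-word sample list are all assumptions chosen for brevity.

```python
import hashlib


class BloomFilter:
    """Hypothetical minimal Bloom filter: k bit positions per item in a fixed bit array."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing (an assumed scheme): derive k positions from
        # two 64-bit slices of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Stand-in for a real dictionary word list (assumption for the example).
words = ["apple", "banana", "cherry"]
bf = BloomFilter()
for w in words:
    bf.add(w)
```

A membership check is then `"apple" in bf`. Added words are always reported as present; words that were never added are usually reported as absent, with a false-positive rate that depends on the bit-array size, the number of hash functions, and how many items were inserted.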