An open source library for deep learning end-to-end dialog systems and chatbots.
Updated Nov 26, 2024 - Python
Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of machine intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.
An Open-Source Framework for Prompt-Learning.
Data processing with ML, LLM and Vision LLM
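MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even "Martian script" internet slang. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
ChatGPT has made chatbots popular and the mainstream trend has shifted to GPT-style models; this project is keeping pace and will release a GPT-style version soon. With this project and your own corpus, you can train the chatbot you want, for scenarios such as intelligent customer service, online Q&A, and casual chat.
A curated collection of learning materials for artificial intelligence fields such as machine learning, NLP, image recognition, and deep learning, plus compiled technical material on the architecture and algorithms of search, recommendation, and advertising systems. A compilation of notes from top algorithm experts.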
Datasets, tools, and benchmarks for representation learning of code.
Text Classification Algorithms: A Survey
Tika-Python is a Python binding to the Apache Tika™ REST services, allowing Tika to be called natively from Python.
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
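Papers in the field of natural language processing (with reading notes), reproduced models, data processing, and more (code provided in both TensorFlow and PyTorch versions).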
End-to-end neural table-text understanding models.
A deep dive into embeddings starting from fundamentals
Rasa UI is a frontend for the Rasa Framework
Python AI assistant 🧠
skweak: A software toolkit for weak supervision applied to NLP tasks
The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
We introduce a new model designed for the code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024) and GPT-4o.
Toolkit for fine-tuning, ablating and unit-testing open-source LLMs.
Created by Alan Turing