Stars

nlp

31 repositories

Large Scale Chinese Corpus for NLP

9,599 1,556 Updated May 23, 2024

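KnowledgeGraphSlides: a collection of knowledge graph lectures from the CCKS Chinese knowledge graph conference series, covering 2013 to 2018 (48 talks in total). They show how mainstream thinking on knowledge graphs shifted in Chinese academia and industry over the six years after Google introduced the Knowledge Graph in 2012.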

344 85 Updated Dec 15, 2018

100 Must-Read NLP Papers

3,790 566 Updated Jul 9, 2021

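Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA Research Institute, intended as infrastructure for Chinese AIGC and cognitive intelligence.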

Python 4,082 379 Updated Aug 13, 2024

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)

Python 9,840 1,394 Updated Jul 31, 2023

EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

Python 2,097 255 Updated Nov 27, 2024

A record of datasets compiled by the repository author

1,029 131 Updated Jun 16, 2022

BDCI 2019 financial negative-information detection: first place on the online leaderboard

Python 155 29 Updated Dec 8, 2022

Mengzi Pretrained Models

534 64 Updated Nov 29, 2022

An Open-source Neural Hierarchical Multi-label Text Classification Toolkit

Python 1,872 410 Updated Sep 6, 2023

Toutiao Chinese news text (multi-level) classification dataset

Python 396 124 Updated May 6, 2021

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Python 3,112 450 Updated Jun 5, 2023

Baidu AI Studio event extraction competition

Python 224 39 Updated Mar 24, 2023

UDA (Unsupervised Data Augmentation) implemented in PyTorch

Python 276 60 Updated Dec 13, 2019

Extracts the main conclusions (key ideas) from research reports in finance-related fields

Python 59 25 Updated Jun 6, 2018

CCKS financial event subject extraction

Python 72 23 Updated Oct 21, 2020

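Chinese and English sensitive words, language detection, Chinese and international mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character-decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, continuous English text segmentation, various Chinese word vectors, comprehensive company name list, classical poetry database, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …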

Python 71,319 14,709 Updated May 10, 2024

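The most comprehensive database of Chinese poetry 🧶: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci (lyric poems) by 1,564 ci poets of the Song period.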

JavaScript 48,751 9,778 Updated Aug 10, 2024

Inference code for Llama models

Python 57,771 9,715 Updated Jan 26, 2025

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Python 2,036 303 Updated Mar 19, 2024

A Chinese NLP Preprocessing & Parsing Package: accurate, efficient, and easy to use. www.jionlp.com

Python 3,507 422 Updated Mar 3, 2025

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 140,479 28,166 Updated Mar 2, 2025

Browser extension and cross-platform desktop application for select-to-translate translation based on the ChatGPT API.

TypeScript 24,232 1,766 Updated Nov 16, 2024

A universal local knowledge base solution based on a vector database and GPT3.5

Python 3,669 321 Updated May 12, 2023

Toolkit for creating, sharing and using natural language prompts.

Python 2,784 364 Updated Oct 23, 2023

🐙 Guides, papers, lecture, notebooks and resources for prompt engineering

MDX 53,780 5,247 Updated Jan 21, 2025

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,855 4,061 Updated Jul 17, 2024

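A collection of concepts and explicit expression patterns for Chinese compound event extraction, later developed into ComplexEventGraph. The project proposes the concept and explicit patterns of Chinese compound events, covers extraction of conditional, causal, sequential, and reversal events, and builds an event logic graph from them.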

Python 1,198 286 Updated Dec 15, 2018

An implementation of the BERT model and its related downstream tasks based on the PyTorch framework

Python 580 110 Updated Mar 1, 2025