Repositories

yongzhuo repositories

38 supported repositories

ChatGLM2-6B fine-tuning: SFT/LoRA, instruction fine-tuning

Last commit Jul 19, 2023

 (109 stars) (11 forks) (0 indexed issues) (0 open good first issues)

chatglm3-6b: fine-tuning, LoRA, inference, single-node multi-GPU training, DeepSpeed, multi-turn dialogue support

Last commit Nov 30, 2023

 (17 stars) (3 forks) (0 indexed issues) (0 open good first issues)

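A collection of concepts and explicit expression patterns for Chinese compound event extraction, later developed into ComplexEventGraph. The project proposes the concept and explicit patterns of Chinese compound events, covers extraction of conditional, causal, sequential, and reversal events, and builds an event logic graph from them.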

Last commit Dec 15, 2018

 (0 stars) (0 forks) (0 indexed issues) (0 open good first issues)

InternLM-7B fine-tuning: SFT/LoRA, instruction fine-tuning

Last commit May 17, 2024

 (13 stars) (0 forks) (0 indexed issues) (0 open good first issues)

Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification, including Word2Vec, BERT, and GPT2 language embeddings.

Last commit Jul 5, 2020

 (0 stars) (0 forks) (0 indexed issues) (0 open good first issues)

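Chinese long-text classification, short-sentence classification, multi-label classification, and sentence-pair similarity (Chinese text classification with Keras NLP: multi-label or sentence classification, long or short text); base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, CapsuleNet, Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN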

Last commit Dec 5, 2023

 (1,813 stars) (398 forks) (0 indexed issues) (0 open good first issues)

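Chinese large language model fine-tuning (LLM-SFT) with the MWP-Instruct math instruction dataset. Supported models: ChatGLM-6B, LLaMA, Bloom-7B, baichuan-7B. Supported tooling: LoRA, QLoRA, DeepSpeed, UI, TensorboardX. Supported tasks: fine-tuning, inference, evaluation, API serving, and more.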

Last commit May 17, 2024

 (217 stars) (17 forks) (0 indexed issues) (0 open good first issues)

LlaMA3-SFT: Meta-Llama-3-8B/Meta-Llama-3-8B-Instruct fine-tuning (transformers), LoRA (peft), inference; supports Chinese (chinese, zh)

Last commit May 17, 2024

 (34 stars) (6 forks) (0 indexed issues) (0 open good first issues)

Llama2-SFT: Llama-2-7B fine-tuning (transformers), LoRA (peft), inference

Last commit Jul 26, 2023

 (27 stars) (0 forks) (0 indexed issues) (0 open good first issues)

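Macadam is a natural language processing toolkit built on Tensorflow (Keras) and bert4keras, focused on text classification, sequence labeling, and relation extraction. Supported embeddings: RANDOM, WORD2VEC, FASTTEXT, BERT, ALBERT, ROBERTA, NEZHA, XLNET, ELECTRA, GPT-2, and others. Supported text classification algorithms: FineTune, FastText, TextCNN, CharCNN, BiRNN, RCNN, DCNN, CRNN, DeepMoji, SelfAttention, HAN, Capsule, and others. Supported sequence labeling algorithms: CRF, Bi-LSTM-CRF, CNN-LSTM, DGCNN, Bi-LSTM-LAN, Lattice-LSTM-Batch, MRC, and others.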

Last commit Jul 15, 2020

 (325 stars) (38 forks) (0 indexed issues) (0 open good first issues)

macrogpt: full pre-training of a large language model (1b3, 32 layers), multi-GPU DeepSpeed / single-GPU Adafactor

Last commit Nov 30, 2023

 (15 stars) (3 forks) (0 indexed issues) (0 open good first issues)

Macropodus, a natural language processing toolkit based on an Albert+BiLSTM+CRF deep learning architecture: Chinese word segmentation (CWS), part-of-speech tagging (POS), named entity recognition (NER), new word discovery (Find), keyword extraction (Keyword), text summarization (Summarize), text similarity (Sim), scientific calculator (Calculate), conversion between Chinese numerals and Arabic or Roman numerals (Chi2num), Traditional/Simplified Chinese conversion, and pinyin conversion.

Last commit Dec 25, 2020

 (660 stars) (92 forks) (0 indexed issues) (0 open good first issues)

Chinese open information extraction system (open-information-extraction-system): builds an open knowledge graph (SPO, subject-predicate-object) with pyltp (version==3.4.0)

Last commit Aug 2, 2021

 (8 stars) (2 forks) (0 indexed issues) (0 open good first issues)

Chinese text classification and sequence labeling toolkit (PyTorch). Supports multi-class and multi-label classification of long and short Chinese texts, text similarity, and sequence labeling tasks such as Chinese named entity recognition (NER), part-of-speech tagging, word segmentation, and extractive text summarization.

Last commit Jul 18, 2024

 (355 stars) (52 forks) (0 indexed issues) (0 open good first issues)

Alibaba Tongyi Qianwen (Qwen-7B-Chat/Qwen-7B): fine-tuning, LoRA, inference

Last commit May 17, 2024

 (142 stars) (10 forks) (0 indexed issues) (0 open good first issues)

Alibaba Tongyi Qianwen (Qwen3.5-9B/4B/2B/0.8B/27B/35B/122B/397B): LoRA, SFT

Last commit Apr 11, 2026

 (0 stars) (0 forks) (0 indexed issues) (0 open good first issues)

Text data analysis (Text-Analysis)

Last commit Nov 1, 2021

 (7 stars) (5 forks) (0 indexed issues) (0 open good first issues)

tensorflow-transformer (tft) for pre-processing and post-processing in text classification

Last commit Mar 16, 2020

 (1 star) (1 fork) (0 indexed issues) (0 open good first issues)

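Tookit-Sihui, a toolkit of common algorithms: AI scientific calculator for mixed text input (calculator-sihui), sentence-level term frequency-inverse document frequency (TF-IDF), BM25 search, trie-based keyword search (trietree), template matching with recursive functions (func_recursive), Chinese numeral to Arabic numeral conversion (chinese to number), Arabic numeral to Chinese numeral conversion, HMM, CRF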

Last commit Apr 9, 2021

 (24 stars) (15 forks) (0 indexed issues) (0 open good first issues)

Build a Chinese word-frequency dictionary using search-engine-style word segmentation

Last commit Dec 17, 2019

 (9 stars) (0 forks) (0 indexed issues) (0 open good first issues)