应用场景
模型选用
- 文本,选用了通义实验室fine-tune的structBERT 模型,基于大众点评的评论数据进行训练,使用预训练模型进行推理,CPU 能跑,支持模型微调,基本上不用微调了,因为他是基于电商领域的数据集进行训练的,基本够用,training dataset 使用了大众点评等平台数据,可本地部署;
参考论文:
(图片来源网络,侵删)title: Incorporating language structures into pre-training for deep language understanding
author:Wang, Wei and Bi, Bin and Yan, Ming and Wu, Chen and Bao, Zuyi and Xia, Jiangnan and Peng, Liwei and Si, Luo
journal:arXiv preprint arXiv:1908.04577,
year:2019版本依赖:
(图片来源网络,侵删)modelscope-lib 最新版本
推理代码:
semantic_cls = pipeline(Tasks.text_classification, 'damo/nlp_structbert_sentiment-classification_chinese-base') comment0 = '非常厚实的一包大米,来自遥远的东北,盘锦大米,应该不错的,密封性很好。卖家的服务真是贴心周到!他们提供了专业的建议,帮助我选择了合适的商品。物流速度也很快,让我顺利收到了商品。' result0 = semantic_cls(input=comment0) if result0['scores'][0] > result0['scores'][1]: print("'" + comment0 + "',属于" + result0["labels"][0] + "评价") else: print("'" + comment0 + "',属于" + result0["labels"][1] + "评价") comment1 = '食物的口感还不错,不过店员的服务态度可以进一步改善一下。' result1 = semantic_cls(input=comment1) if result1['scores'][0] > result1['scores'][1]: print("'" + comment1 + "',属于" + result1["labels"][0] + "评价") else: print("'" + comment1 + "',属于" + result1["labels"][1] + "评价") comment2 = '衣服尺码合适,色彩可以再鲜艳一些,客服响应速度一般。' result2 = semantic_cls(input=comment2) if result2['scores'][0] > result2['scores'][1]: print("'" + comment2 + "',属于" + result2["labels"][0] + "评价") else: print("'" + comment2 + "',属于" + result2["labels"][1] + "评价") comment3 = '物流慢,售后不好,货品质量差。' result3 = semantic_cls(input=comment3) if result3['scores'][0] > result3['scores'][1]: print("'" + comment3 + "',属于" + result3["labels"][0] + "评价") else: print("'" + comment3 + "',属于" + result3["labels"][1] + "评价") comment4 = '物流包装顺坏,不过客服处理速度比较快,也给了比较满意的赔偿。' result4 = semantic_cls(input=comment4) if result4['scores'][0] > result4['scores'][1]: print("'" + comment4 + "',属于" + result4["labels"][0] + "评价") else: print("'" + comment4 + "',属于" + result4["labels"][1] + "评价") comment5 = '冰箱制冷噪声较大,制冷慢。' result5 = semantic_cls(input=comment5) if result5['scores'][0] > result5['scores'][1]: print("'" + comment5 + "',属于" + result5["labels"][0] + "评价") else: print("'" + comment5 + "',属于" + result5["labels"][1] + "评价") comment6 = '买了一件刘德华同款鞋,穿在自己脚上不像刘德华,像扫大街的。' result6 = semantic_cls(input=comment6) if result6['scores'][0] > result6['scores'][1]: print("'" + comment6 + "',属于" + result6["labels"][0] + "评价") else: print("'" + comment6 + "',属于" + result6["labels"][1] + "评价")
运行结果:
'非常厚实的一包大米,来自遥远的东北,盘锦大米,应该不错的,密封性很好。卖家的服务真是贴心周到!他们提供了专业的建议,帮助我选择了合适的商品。物流速度也很快,让我顺利收到了商品。',属于正面评价
'食物的口感还不错,不过店员的服务态度可以进一步改善一下。',属于正面评价
'衣服尺码合适,色彩可以再鲜艳一些,客服响应速度一般。',属于正面评价
'物流慢,售后不好,货品质量差。',属于负面评价
'物流包装顺坏,不过客服处理速度比较快,也给了比较满意的赔偿。',属于正面评价
'冰箱制冷噪声较大,制冷慢。',属于负面评价
'买了一件刘德华同款鞋,穿在自己脚上不像刘德华,像扫大街的。',属于负面评价
- 音频,选用了通义实验室 fine-tune的emotion2vec微调模型,CPU 能跑,可本地部署;
参考论文:
title: Self-Supervised Pre-Training for Speech Emotion Representation
author:Ma, Ziyang and Zheng, Zhisheng and Ye, Jiaxin and Li, Jinchao and Gao, Zhifu and Zhang, Shiliang and Chen, Xie
journal:arXiv preprint arXiv:2312.15185
year:2023开源地址:
Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
版本依赖:
modelscope >= 1.11.1
funasr>=1.0.5
推理代码:
from funasr import AutoModel model = AutoModel(model="iic/emotion2vec_base_finetuned", model_revision="v2.0.4") wav_file = f"{model.model_path}/example/test.wav" res = model.generate(wav_file, output_dir="./outputs", granularity="utterance", extract_embedding=False) print(res) scores = res[0]["scores"] max_score = 0 max_index = 0 i = 0 for score in scores: if score > max_score: max_score = score max_index = i i += 1 print("音频分析后,情感基调为:" + res[0]["labels"][max_index])
运行结果
rtf_avg: 0.263: 100%|██████████| 1/1 [00:02
- 音频,选用了通义实验室 fine-tune的emotion2vec微调模型,CPU 能跑,可本地部署;