基于开源模型对文本和音频进行情感分析-慈云数据

应用场景

从商品详情页爬取商品评论，对其做舆情分析；
电话客服，对音频进行分析，做舆情分析；
通过对商品的评论分析，作为对供应商打分/商品个性化排序等依据；

模型选用

文本，选用了通义实验室fine-tune的structBERT 模型，基于大众点评的评论数据进行训练，使用预训练模型进行推理，CPU 能跑，支持模型微调，基本上不用微调了，因为他是基于电商领域的数据集进行训练的，基本够用，training dataset 使用了大众点评等平台数据，可本地部署；

参考论文：

（图片来源网络，侵删）

title: Incorporating language structures into pre-training for deep language understanding
author：Wang, Wei and Bi, Bin and Yan, Ming and Wu, Chen and Bao, Zuyi and Xia, Jiangnan and Peng, Liwei and Si, Luo
journal：arXiv preprint arXiv:1908.04577，
year：2019

版本依赖：

（图片来源网络，侵删）

modelscope-lib 最新版本

推理代码：

semantic_cls = pipeline(Tasks.text_classification, 'damo/nlp_structbert_sentiment-classification_chinese-base')
comment0 = '非常厚实的一包大米，来自遥远的东北，盘锦大米，应该不错的，密封性很好。卖家的服务真是贴心周到！他们提供了专业的建议，帮助我选择了合适的商品。物流速度也很快，让我顺利收到了商品。'
result0 = semantic_cls(input=comment0)
if result0['scores'][0] > result0['scores'][1]:
    print("'" + comment0 + "'，属于" + result0["labels"][0] + "评价")
else:
    print("'" + comment0 + "'，属于" + result0["labels"][1] + "评价")
comment1 = '食物的口感还不错，不过店员的服务态度可以进一步改善一下。'
result1 = semantic_cls(input=comment1)
if result1['scores'][0] > result1['scores'][1]:
    print("'" + comment1 + "'，属于" + result1["labels"][0] + "评价")
else:
    print("'" + comment1 + "'，属于" + result1["labels"][1] + "评价")
comment2 = '衣服尺码合适，色彩可以再鲜艳一些，客服响应速度一般。'
result2 = semantic_cls(input=comment2)
if result2['scores'][0] > result2['scores'][1]:
    print("'" + comment2 + "'，属于" + result2["labels"][0] + "评价")
else:
    print("'" + comment2 + "'，属于" + result2["labels"][1] + "评价")
comment3 = '物流慢，售后不好，货品质量差。'
result3 = semantic_cls(input=comment3)
if result3['scores'][0] > result3['scores'][1]:
    print("'" + comment3 + "'，属于" + result3["labels"][0] + "评价")
else:
    print("'" + comment3 + "'，属于" + result3["labels"][1] + "评价")
comment4 = '物流包装顺坏，不过客服处理速度比较快，也给了比较满意的赔偿。'
result4 = semantic_cls(input=comment4)
if result4['scores'][0] > result4['scores'][1]:
    print("'" + comment4 + "'，属于" + result4["labels"][0] + "评价")
else:
    print("'" + comment4 + "'，属于" + result4["labels"][1] + "评价")
comment5 = '冰箱制冷噪声较大，制冷慢。'
result5 = semantic_cls(input=comment5)
if result5['scores'][0] > result5['scores'][1]:
    print("'" + comment5 + "'，属于" + result5["labels"][0] + "评价")
else:
    print("'" + comment5 + "'，属于" + result5["labels"][1] + "评价")
comment6 = '买了一件刘德华同款鞋，穿在自己脚上不像刘德华，像扫大街的。'
result6 = semantic_cls(input=comment6)
if result6['scores'][0] > result6['scores'][1]:
    print("'" + comment6 + "'，属于" + result6["labels"][0] + "评价")
else:
    print("'" + comment6 + "'，属于" + result6["labels"][1] + "评价")

运行结果：

'非常厚实的一包大米，来自遥远的东北，盘锦大米，应该不错的，密封性很好。卖家的服务真是贴心周到！他们提供了专业的建议，帮助我选择了合适的商品。物流速度也很快，让我顺利收到了商品。'，属于正面评价

'食物的口感还不错，不过店员的服务态度可以进一步改善一下。'，属于正面评价

'衣服尺码合适，色彩可以再鲜艳一些，客服响应速度一般。'，属于正面评价

'物流慢，售后不好，货品质量差。'，属于负面评价

'物流包装顺坏，不过客服处理速度比较快，也给了比较满意的赔偿。'，属于正面评价

'冰箱制冷噪声较大，制冷慢。'，属于负面评价

'买了一件刘德华同款鞋，穿在自己脚上不像刘德华，像扫大街的。'，属于负面评价

音频，选用了通义实验室 fine-tune的emotion2vec微调模型，CPU 能跑，可本地部署；
参考论文：

title: Self-Supervised Pre-Training for Speech Emotion Representation
author：Ma, Ziyang and Zheng, Zhisheng and Ye, Jiaxin and Li, Jinchao and Gao, Zhifu and Zhang, Shiliang and Chen, Xie
journal：arXiv preprint arXiv:2312.15185
year：2023

开源地址：

Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

版本依赖：

modelscope >= 1.11.1

funasr>=1.0.5

推理代码：
```
from funasr import AutoModel
model = AutoModel(model="iic/emotion2vec_base_finetuned", model_revision="v2.0.4")
wav_file = f"{model.model_path}/example/test.wav"
res = model.generate(wav_file, output_dir="./outputs", granularity="utterance", extract_embedding=False)
print(res)
scores = res[0]["scores"]
max_score = 0
max_index = 0
i = 0
for score in scores:
    if score > max_score:
        max_score = score
        max_index = i
    i += 1
print("音频分析后，情感基调为：" + res[0]["labels"][max_index])
```
运行结果

rtf_avg: 0.263: 100%|██████████| 1/1 [00:02

基于开源模型对文本和音频进行情感分析

应用场景

模型选用

php redis分布式锁

linux内存缓存占用过高分析和优化

stm32编写Modbus步骤

如何保证数据库和缓存的一致性

Mongodb聚合操作中的$unset

私域引流宝PHP源码以及搭建教程

应用场景

模型选用

猜你喜欢

php redis分布式锁

linux内存缓存占用过高分析和优化

stm32编写Modbus步骤

如何保证数据库和缓存的一致性

Mongodb聚合操作中的$unset

私域引流宝PHP源码 以及搭建教程

私域引流宝PHP源码以及搭建教程