Today's Topic
Optimum Ascend
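Good news for HuggingFace Transformers users: Ascend has released Optimum Ascend, a high-level migration tool. With only a few lines of code, Transformers users can train, fine-tune, and evaluate models directly at an Ascend AI Computing Center.
The steps are as follows: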
01
Install Optimum Ascend:
Run the following two commands to install Optimum Ascend.
git clone https://gitee.com/ascend/transformers.git -b optimum && cd transformers
pip install -e .
02
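Complete the automatic migration:
Use the automatic migration built into optimum.ascend to run Transformers on the NPU. Add the following imports to the top of the training script, then run the script as usual.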
import torch_npu
from torch_npu.contrib import transfer_to_npu
from optimum.ascend import transfor_to_npu
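As the steps above show, the migration takes only five lines (two install commands and three imports), after which the model can be trained and evaluated.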
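Because the imports above fail on a machine without the Ascend stack, a script that must also run elsewhere may want to check for the packages first. The following is a minimal sketch, not part of Optimum Ascend: the helper names `npu_stack_available` and `pick_device` are hypothetical, and finding the packages only shows they are installed, not that an NPU is actually usable.

```python
import importlib.util


def npu_stack_available() -> bool:
    """Return True when both torch and torch_npu can be located (without importing them)."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("torch", "torch_npu")
    )


def pick_device() -> str:
    """Choose a device string: the first NPU if the stack is installed, otherwise the CPU."""
    return "npu:0" if npu_stack_available() else "cpu"


if __name__ == "__main__":
    # On an Ascend machine this is the point where the three migration
    # imports from the article would be executed.
    print("selected device:", pick_device())
```

The check is deliberately done with `find_spec` so that nothing is imported as a side effect; the migration imports themselves can then be placed behind the same condition.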
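Recently, a user trained and fine-tuned the T5 and CodeT5 models at an Ascend AI Computing Center, with training speed 10% higher than on mainstream GPUs. Compared with scattered compute resources, the computing center provides a multi-node, multi-card training environment, which improves training stability and shortens the model iteration cycle. The center also offers affordable, broadly accessible compute, which reduces model development costs to some extent.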
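Case Study
Ascend series processors are NPUs based on Huawei's Da Vinci architecture. Ascend training processors deliver very high compute, with peak performance of up to 320 TFLOPS at FP16, and they support the DeepSpeed framework.
1) Prepare the environment
Open a computing center account and create a notebook instance in the development environment, selecting the PyTorch 1.8 image. Once the instance is open, go to the terminal. (Alternatively, prepare an Ascend server with the training-card driver, torch, torch_npu, and CANN 6.0.1 or later already installed.) Import the following packages: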
import torch
import torch_npu
from torch_npu.contrib import transfer_to_npu
2) Install the native Transformers framework (requires Transformers 4.25.1 and PyTorch 1.8.1).
git clone https://gitee.com/ascend/transformers.git -b optimum && cd transformers
pip install -e .
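Since this step depends on specific versions, it can help to confirm what is actually installed before training. The sketch below uses only the Python standard library; the helper name `check_versions` is hypothetical, the version pins are the ones stated in this article, and the handling of a local build suffix (for example a hypothetical `1.8.1+something`) is an assumption about how the packages may be tagged.

```python
from importlib import metadata
from typing import Dict

# Versions stated in the article; adjust if your environment differs.
REQUIRED = {"transformers": "4.25.1", "torch": "1.8.1"}


def check_versions(required: Dict[str, str]) -> Dict[str, str]:
    """Return a status string per package: 'ok', 'missing', or a mismatch note."""
    report = {}
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Drop any local build suffix after '+' before comparing.
        if found.split("+")[0] == wanted:
            report[name] = "ok"
        else:
            report[name] = f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_versions(REQUIRED).items():
        print(f"{name}: {status}")
```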
3) Choose supervised training
from optimum.ascend import transfor_to_npu
from transformers import T5ForConditionalGeneration, T5Tokenizer
model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5Tokenizer.from_pretrained("t5-small")
input_ids = tokenizer('translate English to German: The house is wonderful.', return_tensors='pt').input_ids
labels = tokenizer('Das Haus ist wunderbar.', return_tensors='pt').input_ids
# the forward function automatically creates the correct decoder_input_ids
loss = model(input_ids=input_ids, labels=labels).loss
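The comment above says that the forward pass derives `decoder_input_ids` from `labels`. For T5 this amounts to shifting the labels one position to the right and prepending a start token. The following is a plain-Python sketch of that idea, not the library's code: the function name `shift_right` and the sample token ids are illustrative, and the defaults assume T5's convention of using pad id 0 as the decoder start token and -100 as the ignored label value.

```python
from typing import List

IGNORE_INDEX = -100  # label value that the loss function skips


def shift_right(
    labels: List[List[int]],
    decoder_start_token_id: int = 0,
    pad_token_id: int = 0,
) -> List[List[int]]:
    """Build decoder inputs by shifting each label row one step to the right."""
    shifted = []
    for row in labels:
        # Prepend the start token and drop the final label token.
        row = [decoder_start_token_id] + row[:-1]
        # Ignored positions cannot be fed to the decoder, so map them to padding.
        shifted.append([pad_token_id if t == IGNORE_INDEX else t for t in row])
    return shifted


labels = [[644, 4598, 229, 1]]  # hypothetical token ids ending in EOS (id 1)
print(shift_right(labels))  # [[0, 644, 4598, 229]]
```

At each position the decoder therefore sees the tokens before the one it has to predict, which is why passing `labels` alone is enough for supervised training.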
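4) Build the model
Use T5ForConditionalGeneration to build a simple model that translates natural-language text into SQL statements. The PyTorch implementation is as follows: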
class T5ForTextToSQL(torch.nn.Module):
    '''
    A basic T5 model for Text-to-SQL task.
    '''
    def __init__(self):
        super(T5ForTextToSQL, self).__init__()
        self.t5 = T5ForConditionalGeneration.from_pretrained('t5-small')

    def forward(self, input_ids, labels):
        out = self.t5(input_ids=input_ids, labels=labels)
        return out

    def generate(self, input_ids):
        result = self.t5.generate(input_ids=input_ids)
        return result
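The `generate` method above delegates to the library's decoding loop. As a rough, library-independent illustration, the sketch below shows greedy decoding, one of the strategies such a loop can use: at each step it appends the highest-scoring next token until an end-of-sequence token or a length limit is reached. Everything here is hypothetical (the `greedy_decode` helper, the toy scorer, and the token ids); it is not how Transformers implements `generate`.

```python
from typing import Callable, Dict, List


def greedy_decode(
    step: Callable[[List[int]], Dict[int, float]],
    start_id: int,
    eos_id: int,
    max_length: int = 20,
) -> List[int]:
    """Repeatedly append the highest-scoring next token until EOS or max_length."""
    tokens = [start_id]
    while len(tokens) < max_length:
        scores = step(tokens)                    # token id -> score for the next position
        next_id = max(scores, key=scores.get)    # greedy choice
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def toy_step(prefix: List[int]) -> Dict[int, float]:
    """Hypothetical scorer that prefers 7, then 8, then EOS (id 1)."""
    script = {1: 7, 2: 8}
    target = script.get(len(prefix), 1)
    return {t: (1.0 if t == target else 0.0) for t in (1, 7, 8)}


print(greedy_decode(toy_step, start_id=0, eos_id=1))  # [0, 7, 8, 1]
```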
5) Prepare the dataset
A small custom dataset is used to get started quickly. It is defined as follows:
class TextToSQL_Dataset(torch.utils.data.Dataset):
    '''
    A simple text-to-sql dataset example.
    '''
    def __init__(self, text_l, schema_l, sql_l, tokenizer, block_size=1):
        self.tokenizer = tokenizer
        self.max_len = block_size  # stored, but not used in this example
        self.text = text_l
        self.scheme = schema_l
        self.sql = sql_l

    def _text_to_encoding(self, item):
        # Tokenize one string into input_ids and attention_mask
        return self.tokenizer(item)

    def _text_to_item(self, text):
        # Return None instead of raising when the text is missing or cannot be tokenized
        try:
            if text is not None:
                return self._text_to_encoding(text)
            else:
                return None
        except Exception:
            return None

    def __len__(self):
        return len(self.sql)

    def __getitem__(self, _id):
        text = self.text[_id]
        sql = self.sql[_id]
        schema = self.scheme[_id]
        # Prepend the task prefix so T5 knows which task to perform
        text_encodings = self._text_to_item("translate Text to SQL: " + text)
        sql_encodings = self._text_to_item(sql)
        schema_encodings = self._text_to_item(schema)
        item = dict()
        item['text_encodings'] = {key: torch.tensor(value) for key, value in text_encodings.items()}
        item['sql_encodings'] = {key: torch.tensor(value) for key, value in sql_encodings.items()}
        item['schema_encodings'] = {key: torch.tensor(value) for key, value in schema_encodings.items()}
        return item
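The dataset above returns token sequences of different lengths, so batching more than one example requires padding them to a common length. The following plain-Python sketch shows one common way to do that; the helper names `pad_batch` and `pad_labels` are hypothetical, and the defaults assume T5's pad id of 0 and the usual -100 value for label positions the loss should ignore.

```python
from typing import List, Sequence, Tuple


def pad_batch(
    sequences: Sequence[Sequence[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad token id lists to the longest one and build the matching attention mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids = [list(seq) + [pad_id] * (max_len - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return input_ids, attention_mask


def pad_labels(
    sequences: Sequence[Sequence[int]], ignore_index: int = -100
) -> List[List[int]]:
    """Pad label lists with ignore_index so padded positions do not affect the loss."""
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [ignore_index] * (max_len - len(seq)) for seq in sequences]


ids, mask = pad_batch([[1, 2, 3], [4]])
print(ids)   # [[1, 2, 3], [4, 0, 0]]
print(mask)  # [[1, 1, 1], [1, 0, 0]]
print(pad_labels([[9, 1], [7, 8, 1]]))  # [[9, 1, -100], [7, 8, 1]]
```

A function like this could be wrapped in a `collate_fn` for a `DataLoader`, converting the padded lists to tensors at the end.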
The training and test sets are as follows:
# Training set
text_l = [
"Find all student names in student database.",
"Count student's number for class 1. ",
"Given the max student age in class 1.",
"Please find the minimum student age in class 1.",
"Tell me the number of classes.",
"Who is the student that older than 15."
]
schema_l = [
'Table: student$$header: name%%age%%class%%',
]*len(text_l)
sql_l = [
"SELECT name FROM student",
"SELECT COUNT(*) FROM student WHERE class=1",
"SELECT MAX(age) FROM student WHERE class=1",
"SELECT MIN(age) FROM student WHERE class=1",
"SELECT COUNT(class) FROM student",
"SELECT name FROM student WHERE age>15",
]
# Test set
test_text_l = [
"Find all student ages in student database.",
"Count student's number for class 3. ",
"Given the min student age in class 2.",
"Please find the maximum student age in class 2.",
"Who is the student that younger than 14."
]
test_schema_l = [
'Table: student$$header: name%%age%%class%%',
]*len(test_text_l)
test_sql_l = [
"SELECT age FROM student",
"SELECT COUNT(*) FROM student WHERE class=3",
"SELECT MIN(age) FROM student WHERE class=2",
"SELECT MAX(age) FROM student WHERE class=2",
"SELECT name FROM student WHERE age