Computing a model's parameter count

Using openai-community/gpt2 as an example:

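from transformers import pipeline

# Load GPT-2 through a text-generation pipeline; the underlying model is exposed as .model
generation_gpt2 = pipeline("text-generation", model="openai-community/gpt2")

def model_size(model):
    # Total number of elements across all parameter tensors
    return sum(t.numel() for t in model.parameters())

# Divide by 1000**2 to report the count in millions
print(f"GPT2 size: {model_size(generation_gpt2.model)/1000**2:.1f}M parameters")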

Output:

GPT2 size: 124.4M parameters

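When describing the parameter count of a deep learning model:

K stands for thousand (kilo), i.e. 1,000
Example: 10K = 10,000
M stands for million (mega), i.e. 1,000,000
Example: 124M = 124,000,000
B stands for billion, i.e. 1,000,000,000
Example: 1.5B = 1,500,000,000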

These units are commonly used to state a model's parameter scale. For example:

GPT-2 small has 124M parameters (124,000,000)
GPT-3 has 175B parameters (175,000,000,000)

Summary:

K = thousand
M = million
B = billion

They give a compact way to write a model's parameter count.
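The K/M/B convention above can be sketched as a small formatting helper. This is only an illustration: format_param_count is a hypothetical name, not part of transformers or PyTorch, and the choice of one decimal place is an assumption made to match the "124.4M" output shown earlier.

```python
def format_param_count(n: int) -> str:
    """Format a raw parameter count with a K/M/B suffix (decimal units)."""
    # Check the largest unit first so that 1_500_000_000 becomes "1.5B", not "1500.0M"
    for threshold, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if n >= threshold:
            return f"{n / threshold:.1f}{suffix}"
    # Counts below 1,000 are printed as-is
    return str(n)

print(format_param_count(10_000))           # 10.0K
print(format_param_count(124_400_000))      # 124.4M
print(format_param_count(175_000_000_000))  # 175.0B
```

The units are decimal (powers of 1,000), consistent with the division by 1000**2 in the GPT-2 example.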

numel()

In PyTorch, numel() is a method of tensors (torch.Tensor) that returns the total number of elements in the tensor. The name is short for "number of elements".

tensor.numel() -> int

It returns the tensor's total element count, i.e. the product of the sizes of all its dimensions.

import torch

x = torch.randn(3, 4)  # a matrix with 3 rows and 4 columns
print(x.numel())       # prints: 12 (3 * 4)
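To tie numel() back to the model_size function from the GPT-2 example, here is a minimal sketch on a tiny layer whose parameter count is easy to check by hand. The nn.Linear(4, 3) layer is an arbitrary choice for illustration; it is assumed to hold a (3, 4) weight matrix and a length-3 bias vector, as PyTorch's Linear does by default.

```python
import math

import torch
from torch import nn

def model_size(model: nn.Module) -> int:
    # Same idea as before: sum numel() over every parameter tensor
    return sum(p.numel() for p in model.parameters())

layer = nn.Linear(4, 3)  # weight: (3, 4), bias: (3,)

for name, p in layer.named_parameters():
    # numel() equals the product of the tensor's dimensions
    assert p.numel() == math.prod(p.shape)
    print(name, tuple(p.shape), p.numel())

print(model_size(layer))  # 15 = 3 * 4 + 3
```

The same sum over a full model such as GPT-2 is what yields the figure of roughly 124M.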
