GitHub project with 7.5k stars: a collection of PyTorch implementations of vision Transformers (Part 2)



Paper: https://arxiv.org/pdf/2012.12877.pdf
The following code distills from ResNet50 (or any other teacher network) into a vision transformer:
import torch
from torchvision.models import resnet50
from vit_pytorch.distill import DistillableViT, DistillWrapper

# teacher network: a pretrained ResNet50
teacher = resnet50(pretrained = True)

# student network: a ViT that can be distilled into
v = DistillableViT(
    image_size = 256,
    patch_size = 32,
    num_classes = 1000,
    dim = 1024,
    depth = 6,
    heads = 8,
    mlp_dim = 2048,
    dropout = 0.1,
    emb_dropout = 0.1
)

distiller = DistillWrapper(
    student = v,
    teacher = teacher,
    temperature = 3,  # temperature of distillation
    alpha = 0.5,      # trade between main loss and distillation loss
    hard = False      # whether to use soft or hard distillation
)

img = torch.randn(2, 3, 256, 256)
labels = torch.randint(0, 1000, (2,))

loss = distiller(img, labels)
loss.backward()

# after lots of training above ...
pred = v(img) # (2, 1000)
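To make the roles of temperature, alpha, and hard more concrete, here is a minimal sketch of how a distillation loss of this kind is commonly computed for a single sample. It is written in plain Python so that the arithmetic is visible, and it is not the library's code: the function names (softmax, cross_entropy, distillation_loss) are hypothetical, and the formulation (cross-entropy on the true label, mixed with a temperature-scaled KL term or a cross-entropy on the teacher's predicted label) is the standard one, assumed here for illustration. As a simplification, one set of student logits feeds both terms, whereas the paper uses a separate distillation token for the distillation term.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 flattens the distribution; subtract the max for stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-probability of the target class.
    return -math.log(softmax(logits)[label])

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=3.0, alpha=0.5, hard=False):
    # Main loss: the student against the ground-truth label.
    main = cross_entropy(student_logits, label)
    if hard:
        # Hard distillation: treat the teacher's argmax as a second label.
        teacher_label = max(range(len(teacher_logits)),
                            key=teacher_logits.__getitem__)
        distill = cross_entropy(student_logits, teacher_label)
    else:
        # Soft distillation: KL(teacher || student) on softened distributions,
        # rescaled by temperature^2 (an assumed, commonly used convention).
        p_t = softmax(teacher_logits, temperature)
        p_s = softmax(student_logits, temperature)
        kl = sum(t * (math.log(t) - math.log(s))
                 for t, s in zip(p_t, p_s) if t > 0)
        distill = kl * temperature ** 2
    # alpha trades off the main loss against the distillation loss.
    return (1 - alpha) * main + alpha * distill
```

With alpha = 0 the result reduces to the ordinary cross-entropy, and when the student and teacher logits are identical the soft distillation term vanishes, so only the main loss weighted by (1 - alpha) remains.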
Besides Vision Transformer, the project also provides PyTorch implementations of other ViT variants, including Deep ViT, CaiT, Token-to-Token ViT, and PiT.

Readers interested in PyTorch implementations of ViT models can refer to the original project.
