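Background / what I want to achieve
I have recently started using PyTorch.
Following the official tutorial, I ran a classification task on CIFAR-10, but it is abnormally slow (it took 42 min).
I wrote equivalent code in TensorFlow as a test, and it finishes in about 11 min, which is far shorter.
I would appreciate any advice on why the two differ.
The conditions I tried are as follows: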
environment: Colab Pro+
dataset: CIFAR-10
classifier: VGG16
optimizer: Adam
loss: cross-entropy
batch size: 32
PyTorch
Python
import copy
import time

import torch
import torchvision
from torch import nn
from torchvision import models, transforms
from tqdm import tqdm

# Upscale the 32x32 CIFAR-10 images to 224x224 and convert them to tensors
trans = transforms.Compose([transforms.Resize((224, 224)),
                            transforms.ToTensor(),])

data = {phase: torchvision.datasets.CIFAR10('./', train=(phase == 'train'), transform=trans, download=True)
        for phase in ['train', 'test']}
dataloaders = {phase: torch.utils.data.DataLoader(data[phase], batch_size=32, shuffle=True)
               for phase in ['train', 'test']}


def train_model(model, criterion, optimizer, dataloaders, device, num_epochs=5):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'test']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in tqdm(iter(dataloaders[phase])):
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            # NOTE: len(dataloaders[phase]) is the number of batches, not the
            # number of samples, so these are per-batch totals rather than
            # per-sample averages (len(dataloaders[phase].dataset) gives those).
            epoch_loss = running_loss / len(dataloaders[phase])
            epoch_acc = running_corrects.double() / len(dataloaders[phase])

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'test' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = models.vgg16(pretrained=False)
model = model.to(device)

model = train_model(model=model,
                    criterion=nn.CrossEntropyLoss(),
                    optimizer=torch.optim.Adam(model.parameters(), lr=0.001),
                    dataloaders=dataloaders,
                    device=device,
                    )
Epoch 0/4
----------
0%| | 0/1563 [00:00<?, ?it/s]/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
100%|██████████| 1563/1563 [07:50<00:00, 3.32it/s]
train Loss: 75.5199 Acc: 3.2809
100%|██████████| 313/313 [00:38<00:00, 8.11it/s]
test Loss: 73.7274 Acc: 3.1949

Epoch 1/4
----------
100%|██████████| 1563/1563 [07:50<00:00, 3.33it/s]
train Loss: 73.8162 Acc: 3.2514
100%|██████████| 313/313 [00:38<00:00, 8.13it/s]
test Loss: 73.6114 Acc: 3.1949

Epoch 2/4
----------
100%|██████████| 1563/1563 [07:49<00:00, 3.33it/s]
train Loss: 73.7741 Acc: 3.1369
100%|██████████| 313/313 [00:38<00:00, 8.11it/s]
test Loss: 73.5873 Acc: 3.1949

Epoch 3/4
----------
100%|██████████| 1563/1563 [07:49<00:00, 3.33it/s]
train Loss: 73.7493 Acc: 3.1331
100%|██████████| 313/313 [00:38<00:00, 8.12it/s]
test Loss: 73.6191 Acc: 3.1949

Epoch 4/4
----------
100%|██████████| 1563/1563 [07:49<00:00, 3.33it/s]
train Loss: 73.7289 Acc: 3.1939
100%|██████████| 313/313 [00:38<00:00, 8.13it/s]
test Loss: 73.5955 Acc: 3.1949

Training complete in 42m 22s
Best val Acc: 3.194888
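The Loss and Acc values in this log are larger than a per-sample average can be, because the code divides the running totals by `len(dataloaders[phase])`, which is the number of batches (1563 and 313), not the number of images. A minimal sketch, assuming the standard CIFAR-10 split sizes of 50,000 training and 10,000 test images, converts the logged values back to per-sample figures. The helper name `per_sample` is made up for illustration.

```python
# Convert the per-batch totals printed by train_model into per-sample values.
# Assumption: CIFAR-10 has 50,000 training and 10,000 test images.
NUM_BATCHES = {"train": 1563, "test": 313}       # from the tqdm bars in the log
NUM_SAMPLES = {"train": 50_000, "test": 10_000}  # standard CIFAR-10 split sizes


def per_sample(logged_value: float, phase: str) -> float:
    """Undo the division by batch count and divide by sample count instead."""
    total = logged_value * NUM_BATCHES[phase]
    return total / NUM_SAMPLES[phase]


test_acc = per_sample(3.194888, "test")   # "Best val Acc" from the log
test_loss = per_sample(73.5955, "test")   # last logged test loss
print(f"test accuracy per sample: {test_acc:.4f}")
print(f"test loss per sample:     {test_loss:.4f}")
```

Under that assumption the test accuracy comes out at about 0.10 and the loss at about 2.30, which is chance level for 10 classes.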
TensorFlow
Python
import time

import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras import applications, models

ds_test, ds_train = tfds.load('cifar10', split=['test', 'train'])


def resize(ip):
    image = ip['image']
    label = ip['label']
    # Upscale the 32x32 image to the 224x224 input size of VGG16
    image = tf.image.resize(image, (224, 224))
    # Add a leading dimension of size 1: (224, 224, 3) -> (1, 224, 224, 3)
    image = tf.expand_dims(image, 0)
    label = tf.one_hot(label, 10)
    # Same for the label: (10,) -> (1, 10)
    label = tf.expand_dims(label, 0)
    return (image, label)


ds_train_ = ds_train.map(resize)
ds_test_ = ds_test.map(resize)

model = applications.vgg16.VGG16(input_shape=(224, 224, 3), weights=None, classes=10)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

batch_size = 32
since = time.time()
history = model.fit(ds_train_,
                    batch_size=batch_size,
                    steps_per_epoch=len(ds_train) // batch_size,
                    epochs=5,
                    validation_steps=len(ds_test),
                    validation_data=ds_test_,
                    shuffle=True,)
time_elapsed = time.time() - since
print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
Epoch 1/5
1562/1562 [==============================] - 125s 69ms/step - loss: 36.9022 - accuracy: 0.1069 - val_loss: 2.3031 - val_accuracy: 0.1000
Epoch 2/5
1562/1562 [==============================] - 129s 83ms/step - loss: 2.3031 - accuracy: 0.1005 - val_loss: 2.3033 - val_accuracy: 0.1000
Epoch 3/5
1562/1562 [==============================] - 129s 83ms/step - loss: 2.3035 - accuracy: 0.1069 - val_loss: 2.3031 - val_accuracy: 0.1000
Epoch 4/5
1562/1562 [==============================] - 129s 83ms/step - loss: 2.3038 - accuracy: 0.1024 - val_loss: 2.3030 - val_accuracy: 0.1000
Epoch 5/5
1562/1562 [==============================] - 129s 83ms/step - loss: 2.3028 - accuracy: 0.1024 - val_loss: 2.3033 - val_accuracy: 0.1000
Training complete in 11m 23s
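The two logs may not describe the same amount of work per epoch. In the TensorFlow code, `resize` adds a leading dimension of size 1 with `tf.expand_dims`, and the dataset is never batched with `.batch(...)`. Assuming Keras treats each element of a `tf.data.Dataset` as one batch, each of the 1562 training steps would see a single image. The sketch below only does that arithmetic with the numbers from the code and logs; it is an estimate under that assumption, not a measurement.

```python
# Rough count of training images per epoch implied by each log.
# Assumption: with a tf.data.Dataset input, Keras treats every dataset
# element as one batch, so the TensorFlow batch size is effectively 1
# (the leading dimension added by tf.expand_dims), not 32.

pytorch_steps, pytorch_batch = 1563, 32   # tqdm bar; DataLoader batch_size
tf_steps, tf_batch = 1562, 1              # Keras progress bar; assumed batch

# The last PyTorch batch is partial, so cap at the 50,000 training images.
pytorch_images = min(pytorch_steps * pytorch_batch, 50_000)
tf_images = tf_steps * tf_batch
ratio = pytorch_images / tf_images

print(f"PyTorch images per epoch:    {pytorch_images}")
print(f"TensorFlow images per epoch: {tf_images}")
print(f"ratio: {ratio:.1f}x")
```

If the assumption holds, the PyTorch run trains on roughly 32 times more images per epoch, so the 42 min and 11 min totals would not be directly comparable. Printing the shape of one element of `ds_train_` would confirm or refute it.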