I want to open a file in Python and display the images

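Background
I want to build an image classification model using a large amount of image data.

To do that, I want to use Python to open the cifar-10-python.tar.gz file linked as "CIFAR-10 python version" at https://www.cs.toronto.edu/~kriz/cifar.html and display the data and labels.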

How I loaded the images
I downloaded cifar-10-python.tar.gz from "CIFAR-10 python version", extracted it, and moved it to cifar-10.
Then, following the site, I tried loading the images with the code below.

Python
file="cifar-10"
def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

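I am stuck because I do not know how to display the images and labels after this.
For example, I would like Data[0] to show the first image and
Label[0] to show the first label.
The image files are not .jpeg, so I looked into various approaches, but I got stuck and decided to ask here.

I realize this is a rather open-ended question, but I would appreciate your guidance.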


1 answer

Best answer

When you download and extract the CIFAR-10 python version, you get the following folder.

bash
tree cifar-10-batches-py

cifar-10-batches-py
├── batches.meta
├── data_batch_1
├── data_batch_2
├── data_batch_3
├── data_batch_4
├── data_batch_5
├── readme.html
└── test_batch

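The data is in pickle format, so loading a batch file gives you a dict. In it, the key b'labels' holds the list of labels (0 to 9) and the key b'data' holds the list of images. The dataset page describes the pixel layout as follows (this passage comes from its description of the binary version, but the 3072 pixel values of each image are ordered the same way in the Python version).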

In other words, the first byte is the label of the first image, which is a number in the range 0-9. The next 3072 bytes are the values of the pixels of the image. The first 1024 bytes are the red channel values, the next 1024 the green, and the final 1024 the blue. The values are stored in row-major order, so the first 32 bytes are the red channel values of the first row of the image.
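To make the dict structure concrete, here is a minimal sketch that writes a tiny stand-in batch to a temporary file and reads it back the same way a real batch would be read. The stand-in contents (three all-zero images with labels 6, 9, 9) and the helper name `describe_batch` are assumptions for illustration only; a real batch file holds 10000 images.

```python
import os
import pickle
import tempfile

import numpy as np


def unpickle(path):
    """Load a pickled batch; with encoding='bytes' the keys are bytes."""
    with open(path, 'rb') as f:
        return pickle.load(f, encoding='bytes')


def describe_batch(batch):
    """Return (number of images, values per image, distinct labels)."""
    data = np.asarray(batch[b'data'])
    return data.shape[0], data.shape[1], sorted(set(batch[b'labels']))


# Stand-in batch (hypothetical contents) with the same layout as a real one:
# b'data' is an (N, 3072) uint8 array, b'labels' is a list of N ints in 0-9.
fake_batch = {
    b'data': np.zeros((3, 3072), dtype=np.uint8),
    b'labels': [6, 9, 9],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'fake_batch')
    with open(path, 'wb') as f:
        pickle.dump(fake_batch, f)
    loaded = unpickle(path)

print(sorted(loaded.keys()))
print(describe_batch(loaded))
```

Each row of b'data' is one image flattened to 3072 values (3 channels x 32 x 32), and the label at the same index in b'labels' belongs to that row.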

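Each image is stored as a flat 1-D array in (Channels, Height, Width) order.
So reshape it with numpy.reshape(-1, 3, 32, 32). Since it is more convenient to have the channel as the last axis, then use numpy.moveaxis(data, 1, -1) to get an array of shape (number of samples, 32, 32, 3).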
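The following small sketch shows why the reshape has to go through (N, 3, 32, 32) first. It uses one synthetic row whose channel values (10, 20, 30) are made up for illustration, and compares the result with a direct reshape to (N, 32, 32, 3).

```python
import numpy as np

# One synthetic flat row in the CIFAR-10 layout:
# 1024 red values, then 1024 green values, then 1024 blue values.
row = np.concatenate([
    np.full(1024, 10, dtype=np.uint8),  # red plane
    np.full(1024, 20, dtype=np.uint8),  # green plane
    np.full(1024, 30, dtype=np.uint8),  # blue plane
])
batch = row[np.newaxis, :]              # shape (1, 3072)

chw = batch.reshape(-1, 3, 32, 32)      # (N, C, H, W)
hwc = np.moveaxis(chw, 1, -1)           # (N, H, W, C)

# Reshaping straight to (N, H, W, C) misreads the planes as interleaved.
wrong = batch.reshape(-1, 32, 32, 3)

print(chw.shape, hwc.shape)
print(hwc[0, 0, 0])    # top-left pixel after reshape + moveaxis
print(wrong[0, 0, 0])  # top-left pixel after the direct reshape
```

After reshape plus moveaxis, the top-left pixel holds one value from each plane (10, 20, 30). The direct reshape instead takes the first three red values as if they were an RGB triple.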

python
import os
import pickle

import numpy as np


def unpickle(path):
    # encoding='bytes' makes the dict keys bytes objects, e.g. b'data'.
    with open(path, 'rb') as f:
        data = pickle.load(f, encoding='bytes')

    return data


dataset_dir = 'cifar-10-batches-py'
train_paths = ['data_batch_1', 'data_batch_2',
               'data_batch_3', 'data_batch_4', 'data_batch_5']
test_path = 'test_batch'

data = []
labels = []

# Collect the images and labels from all five training batches.
for p in train_paths:
    unpacked = unpickle(os.path.join(dataset_dir, p))
    data.extend(unpacked[b'data'])
    labels.extend(unpacked[b'labels'])

data = np.array(data).reshape(-1, 3, 32, 32)
data = np.moveaxis(data, 1, -1)  # (N, C, H, W) -> (N, H, W, C)
labels = np.array(labels)

print(data.shape, labels.shape)  # (50000, 32, 32, 3) (50000,)
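With these arrays, data[0] is the first image as a (32, 32, 3) array and labels[0] is its numeric label, which is what the question asked for. To turn the number into a class name, the batches.meta file in the same folder stores a list of names under the key b'label_names'. The sketch below reads such a file. The helper name `load_label_names`, the temporary stand-in file, and the three sample labels are assumptions so that the example runs without the dataset.

```python
import os
import pickle
import tempfile


def load_label_names(meta_path):
    """Read class names from a CIFAR-10 batches.meta file as a list of str."""
    with open(meta_path, 'rb') as f:
        meta = pickle.load(f, encoding='bytes')
    # With encoding='bytes', both the key and the names are bytes objects.
    return [name.decode('utf-8') for name in meta[b'label_names']]


# Stand-in for batches.meta so the sketch runs without the dataset.
# These are the ten CIFAR-10 class names in label order.
cifar10_names = [b'airplane', b'automobile', b'bird', b'cat', b'deer',
                 b'dog', b'frog', b'horse', b'ship', b'truck']

with tempfile.TemporaryDirectory() as tmp_dir:
    meta_path = os.path.join(tmp_dir, 'batches.meta')
    with open(meta_path, 'wb') as f:
        pickle.dump({b'label_names': cifar10_names}, f)
    label_names = load_label_names(meta_path)

labels = [6, 9, 9]  # hypothetical labels for three images
print(labels[0], label_names[labels[0]])
```

With the real dataset, `load_label_names(os.path.join(dataset_dir, 'batches.meta'))` should give the same list. To look at a single image, matplotlib's `plt.imshow(data[0])` is one option, since the array is already in RGB order.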

Let's check the first 100 images in Jupyter Notebook.

python
import cv2
import numpy as np
from IPython.display import Image, display


def imshow(img):
    # data is in RGB order, but OpenCV expects BGR, so convert before encoding.
    bgr = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
    encoded = cv2.imencode('.png', bgr)[1]
    display(Image(data=encoded.tobytes()))


# Merge the first 100 images into a single image and display it.
first100_imgs = data[:100].reshape(10, 10, 32, 32, 3)
merged_img = np.vstack([np.hstack(h_imgs) for h_imgs in first100_imgs])
imshow(merged_img)

# That is hard to see, so enlarge it a little and look again.
merged_img = cv2.resize(merged_img, dsize=None, fx=2.0, fy=2.0)
imshow(merged_img)
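The tiling step can be pulled out into a small reusable function. This is only a sketch of the same hstack/vstack idea using NumPy alone. The function name `make_grid` and the synthetic input (image i filled with the value i) are assumptions for illustration.

```python
import numpy as np


def make_grid(images, rows, cols):
    """Tile the first rows*cols images of an (N, H, W, C) array into one image."""
    n, h, w, c = images.shape
    if n < rows * cols:
        raise ValueError('not enough images for the requested grid')
    tiles = images[:rows * cols].reshape(rows, cols, h, w, c)
    # hstack joins one row of tiles side by side; vstack stacks the rows.
    return np.vstack([np.hstack(list(row_tiles)) for row_tiles in tiles])


# Synthetic input: image i is filled with the value i (made up for the demo).
images = np.stack([np.full((32, 32, 3), i, dtype=np.uint8) for i in range(6)])
grid = make_grid(images, rows=2, cols=3)

print(grid.shape)
print(grid[0, 0, 0], grid[0, 32, 0], grid[32, 0, 0])
```

The grid is filled row by row, so with 3 columns the tile that starts at pixel row 32 is image 3. On the real data, `make_grid(data, 10, 10)` would build the same 320 x 320 image as the merged image above.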

Posted 2019/03/21 05:13

Edited 2019/03/21 05:59

tiitoi

Total score 21956

trey_0329

2019/03/21 05:51

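tiitoi, thank you very much as always. I will carefully review the code you showed me. You also advised me earlier that studying machine learning from books is more efficient, so I have since bought several and studied with them. Unlike the web, books (especially the O'Reilly ones) let me learn systematically, and I now have a solid grasp of the overall picture. Thank you.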