Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Comments
Transformations in torchvision.transforms work on images, tensors (representing images) and possibly on numpy arrays (representing images). However, a transformation (e.g., ToTensor) might behave differently on different input types, so be clear about what exactly a transformation does to the input you give it. A good practice is to always convert your non-tensor input data to tensors using the transformation ToTensor and then apply other transformations (which then consume tensors and produce tensors).
It is always a good idea to normalize your input tensors to be within a small range (e.g., [0, 1]).
import torch
import torchvision
import numpy as np
from PIL import Image
img = Image.open("../../home/media/poker/4h.png")
img
arr = np.array(img)
arr
array([[[ 37, 62, 59],
[149, 174, 171],
[225, 238, 239],
...,
[232, 250, 249],
[217, 235, 234],
[122, 156, 154]],
[[127, 133, 134],
[240, 246, 247],
[244, 243, 246],
...,
[239, 240, 243],
[243, 244, 247],
[218, 233, 236]],
[[152, 158, 159],
[245, 251, 252],
[237, 236, 239],
...,
[235, 236, 239],
[235, 236, 239],
[227, 242, 245]],
...,
[[ 45, 66, 64],
[153, 174, 172],
[233, 237, 239],
...,
[235, 239, 241],
[226, 230, 232],
[132, 157, 152]],
[[ 24, 45, 43],
[ 38, 59, 57],
[105, 109, 111],
...,
[119, 123, 125],
[ 92, 96, 98],
[ 37, 62, 57]],
[[ 25, 55, 50],
[ 17, 47, 42],
[ 15, 38, 33],
...,
[ 16, 40, 40],
[ 16, 40, 40],
[ 19, 53, 49]]], dtype=uint8)
arr.shape
(54, 37, 3)
torchvision.transforms.ToTensor
Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255]
to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1)
or if the numpy.ndarray has dtype = np.uint8.
This is the transformation that you always need when preparing a dataset for a computer vision task.
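The description above can be reproduced by hand for the uint8 ndarray case. The sketch below is not torchvision's implementation, just an equivalent computation under the assumption of a random H x W x C uint8 array with the same shape as the poker card image (the name `manual` is made up for illustration): move the channel axis to the front, convert to float, and divide by 255.

```python
import numpy as np
import torch

# Random stand-in for an image: H x W x C, dtype uint8, values in [0, 255].
rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=(54, 37, 3), dtype=np.uint8)

# (H, W, C) -> (C, H, W), then uint8 -> float32 scaled into [0.0, 1.0].
manual = torch.from_numpy(arr).permute(2, 0, 1).contiguous().float().div(255)

print(manual.shape, manual.dtype)
print(manual.min().item(), manual.max().item())
```

Pixel `arr[h, w, c]` ends up at `manual[c, h, w]`, which is why the shape changes from (54, 37, 3) to (3, 54, 37).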
trans = torchvision.transforms.ToTensor()
t1 = trans(img)
t1
tensor([[[0.1451, 0.5843, 0.8824, ..., 0.9098, 0.8510, 0.4784],
[0.4980, 0.9412, 0.9569, ..., 0.9373, 0.9529, 0.8549],
[0.5961, 0.9608, 0.9294, ..., 0.9216, 0.9216, 0.8902],
...,
[0.1765, 0.6000, 0.9137, ..., 0.9216, 0.8863, 0.5176],
[0.0941, 0.1490, 0.4118, ..., 0.4667, 0.3608, 0.1451],
[0.0980, 0.0667, 0.0588, ..., 0.0627, 0.0627, 0.0745]],
[[0.2431, 0.6824, 0.9333, ..., 0.9804, 0.9216, 0.6118],
[0.5216, 0.9647, 0.9529, ..., 0.9412, 0.9569, 0.9137],
[0.6196, 0.9843, 0.9255, ..., 0.9255, 0.9255, 0.9490],
...,
[0.2588, 0.6824, 0.9294, ..., 0.9373, 0.9020, 0.6157],
[0.1765, 0.2314, 0.4275, ..., 0.4824, 0.3765, 0.2431],
[0.2157, 0.1843, 0.1490, ..., 0.1569, 0.1569, 0.2078]],
[[0.2314, 0.6706, 0.9373, ..., 0.9765, 0.9176, 0.6039],
[0.5255, 0.9686, 0.9647, ..., 0.9529, 0.9686, 0.9255],
[0.6235, 0.9882, 0.9373, ..., 0.9373, 0.9373, 0.9608],
...,
[0.2510, 0.6745, 0.9373, ..., 0.9451, 0.9098, 0.5961],
[0.1686, 0.2235, 0.4353, ..., 0.4902, 0.3843, 0.2235],
[0.1961, 0.1647, 0.1294, ..., 0.1569, 0.1569, 0.1922]]])
t1.shape
torch.Size([3, 54, 37])
t2 = trans(arr)
t2
tensor([[[0.1451, 0.5843, 0.8824, ..., 0.9098, 0.8510, 0.4784],
[0.4980, 0.9412, 0.9569, ..., 0.9373, 0.9529, 0.8549],
[0.5961, 0.9608, 0.9294, ..., 0.9216, 0.9216, 0.8902],
...,
[0.1765, 0.6000, 0.9137, ..., 0.9216, 0.8863, 0.5176],
[0.0941, 0.1490, 0.4118, ..., 0.4667, 0.3608, 0.1451],
[0.0980, 0.0667, 0.0588, ..., 0.0627, 0.0627, 0.0745]],
[[0.2431, 0.6824, 0.9333, ..., 0.9804, 0.9216, 0.6118],
[0.5216, 0.9647, 0.9529, ..., 0.9412, 0.9569, 0.9137],
[0.6196, 0.9843, 0.9255, ..., 0.9255, 0.9255, 0.9490],
...,
[0.2588, 0.6824, 0.9294, ..., 0.9373, 0.9020, 0.6157],
[0.1765, 0.2314, 0.4275, ..., 0.4824, 0.3765, 0.2431],
[0.2157, 0.1843, 0.1490, ..., 0.1569, 0.1569, 0.2078]],
[[0.2314, 0.6706, 0.9373, ..., 0.9765, 0.9176, 0.6039],
[0.5255, 0.9686, 0.9647, ..., 0.9529, 0.9686, 0.9255],
[0.6235, 0.9882, 0.9373, ..., 0.9373, 0.9373, 0.9608],
...,
[0.2510, 0.6745, 0.9373, ..., 0.9451, 0.9098, 0.5961],
[0.1686, 0.2235, 0.4353, ..., 0.4902, 0.3843, 0.2235],
[0.1961, 0.1647, 0.1294, ..., 0.1569, 0.1569, 0.1922]]])
t2.shape
torch.Size([3, 54, 37])