
Deep Learning (10) - AlexNet

SolartheNomad 2023. 3. 24. 00:48

👩‍💻 AlexNet

 

📌📌 A Closer Look at CNN Structure

 

- 3D structure

- A CNN has depth as well as width and height.

- An image has three components (R/G/B), so the depth starts at 3. As the image passes through convolutions, feature maps are produced, and the depth of the intermediate representations changes accordingly (it can grow beyond 3).
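As a rough illustration of how the depth changes, the sketch below passes an RGB-shaped tensor through a single convolution. The 64 output channels, the 3x3 kernel, and the 32x32 input are arbitrary values chosen for this example, not values from the post.

```python
import torch
import torch.nn as nn

# One RGB image: (batch, depth, height, width) = (1, 3, 32, 32)
image = torch.randn(1, 3, 32, 32)

# Hypothetical layer: 64 kernels, each spanning all 3 input channels
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1)

feature_map = conv(image)
print(tuple(image.shape))        # depth 3 going in
print(tuple(feature_map.shape))  # depth 64 coming out, one channel per kernel
```

The output depth equals the number of kernels, which is why the depth of the intermediate feature maps is a design choice rather than a property of the image.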

 

 

AlexNet Architecture

- Five convolutional layers + three fully connected layers in total

- The last fully connected layer uses the softmax activation function to classify 1,000 categories.

- It is a parallel architecture built on two GPUs.

- Network input:
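To make the role of softmax in the last layer concrete, here is a minimal pure-Python sketch. It uses three made-up scores instead of 1,000, and the `softmax` helper is written for this example only.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # made-up outputs of the last fully connected layer
probs = softmax(scores)
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

The class with the largest raw score also gets the largest probability, so the prediction is simply the index of the maximum.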

 

 

Network design for the target code

 

- On GPU-1, the learned kernels mostly extract color-independent information, while on GPU-2 the learned kernels mostly extract color-related information.

 

 

✍ Importing the libraries

import torch
import torchvision
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
from torch.autograd import Variable
from torch import optim
import torch.nn as nn
import torch.nn.functional as F
import os
import cv2
import random
from PIL import Image
from tqdm.auto import tqdm  # tqdm_notebook is deprecated; tqdm.auto picks the notebook or console progress bar
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

✍ Preprocessing the data

 - This was already explained earlier, so I won't repeat the explanation here.

Code 6-23 Data preprocessing

class ImageTransform():
    def __init__(self, resize, mean, std):
        self.data_transform = {
            'train': transforms.Compose([
                transforms.RandomResizedCrop(resize, scale=(0.5,1.0)),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize(mean, std)
            ]),
            'val': transforms.Compose([
                transforms.Resize(256),
                transforms.CenterCrop(resize),
                transforms.ToTensor(),
                transforms.Normalize(mean, std)
            ])
        }

    def __call__(self, img, phase):
        return self.data_transform[phase](img)
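For readers who skipped the earlier explanation, the sketch below shows only the arithmetic behind `transforms.Normalize`: each channel value (already scaled to [0, 1] by `ToTensor`) has the channel mean subtracted and is divided by the channel standard deviation. The `normalize_pixel` helper is hypothetical and works on a single pixel; the mean and std values are the ones this post defines further down.

```python
mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)

def normalize_pixel(rgb, mean, std):
    """Apply (value - mean) / std to each channel of one pixel."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))

# A pixel equal to the channel means maps to zero in every channel
normalized = normalize_pixel((0.485, 0.456, 0.406), mean, std)

# A pure white pixel ends up well above zero in every channel
white = normalize_pixel((1.0, 1.0, 1.0), mean, std)
print(normalized, white)
```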

 

✍ Loading the data from the image directories and splitting it into training, validation, and test sets

cat_directory = '/cat'
dog_directory = '/dog'


cat_images_filepaths = sorted([os.path.join(cat_directory, f) for f in os.listdir(cat_directory)])
dog_images_filepaths = sorted([os.path.join(dog_directory, f) for f in os.listdir(dog_directory)])
images_filepaths = [*cat_images_filepaths, *dog_images_filepaths]
correct_images_filepaths = [i for i in images_filepaths if cv2.imread(i) is not None]

random.seed(42)
random.shuffle(correct_images_filepaths)
train_images_filepaths = correct_images_filepaths[:400]
val_images_filepaths = correct_images_filepaths[400:-10]
test_images_filepaths = correct_images_filepaths[-10:]
print(len(train_images_filepaths), len(val_images_filepaths), len(test_images_filepaths))

Number of images in the datasets used for training, validation, and testing
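The slicing logic can be checked without any image files. The sketch below assumes a hypothetical list of 500 dummy paths (the real number of images in the post is not stated) and applies the same seed, shuffle, and slices.

```python
import random

# Hypothetical stand-in for correct_images_filepaths: 500 dummy paths
paths = [f"/cat/cat.{i}.jpg" for i in range(250)]
paths += [f"/dog/dog.{i}.jpg" for i in range(250)]

random.seed(42)
random.shuffle(paths)

train = paths[:400]    # first 400 images
val = paths[400:-10]   # everything between the train and test slices
test = paths[-10:]     # last 10 images
print(len(train), len(val), len(test))  # 400 90 10
```

Because the three slices do not overlap, no image appears in more than one set, and the validation size depends on however many images remain after the first 400 and the last 10.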

 

✍ Defining the custom dataset

class DogvsCatDataset(Dataset):
    def __init__(self, file_list, transform=None, phase='train'):
        self.file_list = file_list  # file paths of the image data
        self.transform = transform  # image preprocessing
        self.phase = phase  # 'train' or 'val', as defined in ImageTransform()

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        img_path = self.file_list[idx]  # path of the image at this index
        img = Image.open(img_path)
        img_transformed = self.transform(img, self.phase)

        label = img_path.split('/')[-1].split('.')[0]  # get the label from the file name
        if label == 'dog':
            label = 1
        elif label == 'cat':
            label = 0

        return img_transformed, label  # return the preprocessed image and its label
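The label extraction relies on the file name starting with the class name. The sketch below isolates that step, assuming file names of the form `dog.12.jpg` and `cat.7.jpg` (an assumption, since the post does not show the actual names). The `label_from_path` helper is hypothetical, and unlike the class above it raises an error for unknown prefixes.

```python
def label_from_path(img_path):
    """Map '/some/dir/dog.12.jpg' to 1 and '/some/dir/cat.7.jpg' to 0."""
    name = img_path.split('/')[-1].split('.')[0]  # 'dog' or 'cat'
    if name == 'dog':
        return 1
    if name == 'cat':
        return 0
    raise ValueError(f"unexpected file name prefix: {name!r}")

print(label_from_path('/dog/dog.12.jpg'))
print(label_from_path('/cat/cat.7.jpg'))
```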

 

 

✍ Defining the variables (mean, standard deviation, batch size, image size)

size = 256  # AlexNet downsamples heavily (a stride-4 convolution plus three pooling layers), so an input that is too small shrinks to nothing in the pooling layers and raises an error; 256 is used here
mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)
batch_size = 32

 

 

 

 

 

✍ Defining the AlexNet model network

class AlexNet(nn.Module):
    def __init__(self) -> None:
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True), 
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6)) 
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256*6*6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x
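To see how the spatial size shrinks inside `self.features`, the sketch below traces only the height/width through the convolution and pooling layers using the usual output-size formula with floor rounding. The helpers `out_size` and `trace_features` are written for this example; the kernel, stride, and padding values are copied from the model above.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor rounding)."""
    return (n + 2 * padding - kernel) // stride + 1

def trace_features(n):
    """Return (layer name, spatial size) after each layer of self.features."""
    layers = [
        ("conv1", 11, 4, 2), ("pool1", 3, 2, 0),
        ("conv2", 5, 1, 2), ("pool2", 3, 2, 0),
        ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
        ("pool3", 3, 2, 0),
    ]
    sizes = []
    for name, kernel, stride, padding in layers:
        n = out_size(n, kernel, stride, padding)
        sizes.append((name, n))
    return sizes

print(trace_features(256))  # ends at 7x7 before the adaptive pooling
print(trace_features(224))  # ends at 6x6
```

With a 256x256 input the feature extractor ends at 7x7, and `nn.AdaptiveAvgPool2d((6, 6))` then brings it to the 6x6 that `nn.Linear(256*6*6, 4096)` expects.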

 

 

 

inplace=True in nn.ReLU(inplace=True)

 

- It means the result of the operation overwrites the existing data instead of being stored in a new variable.

Because the existing values are replaced by the result of the operation, the original values are discarded.
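The difference is easy to observe on a small tensor. This is a minimal sketch with made-up values: the default ReLU leaves its input alone, while the in-place version overwrites it.

```python
import torch
import torch.nn as nn

# Default behaviour: the input tensor is left untouched
x = torch.tensor([-1.0, 0.0, 2.0])
out = nn.ReLU()(x)

# inplace=True: the input tensor itself is overwritten with the result
y = torch.tensor([-1.0, 0.0, 2.0])
out_inplace = nn.ReLU(inplace=True)(y)

# The result and the input share the same memory in the in-place case
same_memory = out_inplace.data_ptr() == y.data_ptr()
print(x.tolist(), y.tolist(), same_memory)
```

The in-place version saves memory, at the cost of losing the pre-activation values, so it is only safe when nothing else needs them.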

 

 

Pooling layers

 

nn.AvgPool2d

 

- Takes an input of size (N, C, H_in, W_in) and produces an output of size (N, C, H_out, W_out).

 

- Computing H_out and W_out
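For reference, the output size of nn.AvgPool2d with its default floor rounding (ceil_mode=False) can be written as follows. Here padding, kernel_size, and stride are the layer's parameters, with index 0 for the height and index 1 for the width.

```latex
H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{kernel\_size}[0]}{\text{stride}[0]} + 1 \right\rfloor
\qquad
W_{out} = \left\lfloor \frac{W_{in} + 2 \times \text{padding}[1] - \text{kernel\_size}[1]}{\text{stride}[1]} + 1 \right\rfloor

% Worked example (values chosen for illustration):
% H_{in} = 6, kernel_size = 2, stride = 2, padding = 0
% H_{out} = \lfloor (6 + 0 - 2) / 2 + 1 \rfloor = 3
```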

 

AdaptiveAvgPool2d

 

- Instead of a kernel size, you specify the output size you need once the pooling operation finishes.
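A short check shows what this means in practice. The sketch below feeds two feature maps of different spatial sizes (7x7 and 13x13, chosen for illustration) through the same `nn.AdaptiveAvgPool2d((6, 6))` layer used in the model above.

```python
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((6, 6))  # only the target output size is given

shapes = []
for side in (7, 13):
    feature_map = torch.randn(1, 256, side, side)
    shapes.append(tuple(pool(feature_map).shape))

print(shapes)  # both outputs are 6x6 regardless of the input size
```

This is why the classifier can always assume a 256*6*6 input, as long as the feature extractor produces a valid feature map.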