(The code in this project is based on the Fast Campus deep learning course.)

<Source of the celeb_a dataset>
https://www.tensorflow.org/datasets/catalog/celeb_a

 


The CelebA dataset is a large-scale face attributes dataset containing more than 200,000 celebrity images, 10,177 identities, and 40 attribute annotations per image.
Uses - face attribute recognition, face detection, facial landmark localization

Source: https://www.tensorflow.org/datasets/catalog/celeb_a


celeba์˜ ์†์„ฑ๋“ค์€ ๋‹ค์Œ๊ณผ ๊ฐ™์ด dictionary๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค.

FeaturesDict({
    'attributes': FeaturesDict({
        '5_o_Clock_Shadow': tf.bool,
        'Arched_Eyebrows': tf.bool,
        'Attractive': tf.bool,
        'Bags_Under_Eyes': tf.bool,
        'Bald': tf.bool,
        'Bangs': tf.bool,
        'Big_Lips': tf.bool,
        'Big_Nose': tf.bool,
        'Black_Hair': tf.bool,
        'Blond_Hair': tf.bool,
        'Blurry': tf.bool,
        'Brown_Hair': tf.bool,
        'Bushy_Eyebrows': tf.bool,
        'Chubby': tf.bool,
        'Double_Chin': tf.bool,
        'Eyeglasses': tf.bool,
        'Goatee': tf.bool,
        'Gray_Hair': tf.bool,
        'Heavy_Makeup': tf.bool,
        'High_Cheekbones': tf.bool,
        'Male': tf.bool,
        'Mouth_Slightly_Open': tf.bool,
        'Mustache': tf.bool,
        'Narrow_Eyes': tf.bool,
        'No_Beard': tf.bool,
        'Oval_Face': tf.bool,
        'Pale_Skin': tf.bool,
        'Pointy_Nose': tf.bool,
        'Receding_Hairline': tf.bool,
        'Rosy_Cheeks': tf.bool,
        'Sideburns': tf.bool,
        'Smiling': tf.bool,
        'Straight_Hair': tf.bool,
        'Wavy_Hair': tf.bool,
        'Wearing_Earrings': tf.bool,
        'Wearing_Hat': tf.bool,
        'Wearing_Lipstick': tf.bool,
        'Wearing_Necklace': tf.bool,
        'Wearing_Necktie': tf.bool,
        'Young': tf.bool,
    }),
    'image': Image(shape=(218, 178, 3), dtype=tf.uint8),
    'landmarks': FeaturesDict({
        'lefteye_x': tf.int64,
        'lefteye_y': tf.int64,
        'leftmouth_x': tf.int64,
        'leftmouth_y': tf.int64,
        'nose_x': tf.int64,
        'nose_y': tf.int64,
        'righteye_x': tf.int64,
        'righteye_y': tf.int64,
        'rightmouth_x': tf.int64,
        'rightmouth_y': tf.int64,
    }),
})

There are many attributes: gender, smiling, young, wearing eyeglasses, wearing a hat, wavy hair, brown hair, and so on. Of these, this project builds a model that classifies gender (Male) and whether the person is smiling (Smiling).

To pull out an attribute, index into the example like this: ['attributes']['Male'].




Downloading the full dataset

import tensorflow_datasets as tfds
# tfds.list_builders() -> show the full list of available datasets
celeb_a = tfds.load('celeb_a')  # load the celeb_a dataset (a dict of splits)




This project uses a reduced version of the celeba dataset. I'm leaving out the reduction code, but the process goes like this:
1. Assign celeb_a['validation'] and celeb_a['test'] to train and test, respectively.
2. Load only the Male and Smiling attributes and build train_images, train_labels, test_images, and test_labels.
---> From here on, only test_images and test_labels are used for the reduction.
3. From test_images and test_labels, separate out smiling men, non-smiling men, smiling women, and non-smiling women, take 550 of each so the reduced set is balanced, and combine them ---> that makes 2200 in total!
4. Shuffle the 2200 samples, then assign the first 2000 to train and the remaining 200 to test.
5. Split these back into train_images, train_labels, test_images, and test_labels and you're done! The reduction is a hassle to repeat, so save the result as an npz file.
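Since the actual reduction code is omitted, here is a minimal sketch of steps 3 to 5 under some assumptions: labels are an (N, 2) array with columns [Male, Smiling] (the column order is an assumption), and the function name reduce_balanced, its parameters, and the synthetic demo data are all hypothetical. It is not the course's code, just one way the steps could look.

```python
import numpy as np


def reduce_balanced(images, labels, per_group=550, n_train=2000, seed=0):
    """Keep per_group samples from each (Male, Smiling) group, shuffle, and split."""
    rng = np.random.default_rng(seed)
    keep = []
    for male in (0, 1):
        for smiling in (0, 1):
            # Indices of the samples that belong to this group
            idx = np.flatnonzero((labels[:, 0] == male) & (labels[:, 1] == smiling))
            if len(idx) < per_group:
                raise ValueError(f"group (Male={male}, Smiling={smiling}) has only {len(idx)} samples")
            keep.append(idx[:per_group])
    keep = np.concatenate(keep)
    rng.shuffle(keep)  # mix the four groups together
    train_idx, test_idx = keep[:n_train], keep[n_train:]
    return images[train_idx], labels[train_idx], images[test_idx], labels[test_idx]


# Tiny synthetic stand-in for the real data: 100 samples per group
demo_labels = np.tile(np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.int8), (100, 1))
demo_images = np.zeros((400, 4, 4, 3))

tr_x, tr_y, te_x, te_y = reduce_balanced(demo_images, demo_labels, per_group=50, n_train=180)
print(tr_x.shape, te_x.shape)  # (180, 4, 4, 3) (20, 4, 4, 3)

# Saving for later reuse could then look like this:
# np.savez('celeba_small.npz', train_images=tr_x, train_labels=tr_y,
#          test_images=te_x, test_labels=te_y)
```

With per_group=550 and n_train=2000 this would give the 2000 / 200 split described above.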





Exploring the celeba_small data
(1) Load the data and split it into train and test

import numpy as np

celeba_small = np.load('./celeba_small.npz')

# Pull the four arrays out of the npz archive
train_images = celeba_small['train_images']
train_labels = celeba_small['train_labels']
test_images = celeba_small['test_images']
test_labels = celeba_small['test_labels']





(2) Pull out one image and visualize it

import matplotlib.pyplot as plt

plt.imshow(train_images[777])  # show the image at index 777
plt.colorbar()
plt.show()
print(train_labels[777])  # its label

I pulled out the image at index 777.

Here is the result. She kind of looks like she's smiling, but the image is labeled as a non-smiling woman!
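To make a printed label easier to read, a small helper along these lines can turn it into words. This is only a sketch: it assumes each label is stored as [Male, Smiling], which the post does not state explicitly, and describe_label is a hypothetical name.

```python
import numpy as np


def describe_label(label):
    """Turn a two-element label into words, assuming the order [Male, Smiling]."""
    male, smiling = int(label[0]), int(label[1])
    gender = "man" if male == 1 else "woman"
    mood = "smiling" if smiling == 1 else "not smiling"
    return f"{mood} {gender}"


# Made-up label for illustration
print(describe_label(np.array([0, 0], dtype=np.int8)))  # not smiling woman
```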





(3) Check the range, shape, and data type of the data
- Range

# Print just 50 of the nonzero values
print(train_images[train_images != 0][:50])
print(test_images[test_images != 0][:50])
# Minimum / maximum of the data
print(train_images.min(), train_images.max())
print(train_labels.min(), train_labels.max())
print(test_images.min(), test_images.max())
print(test_labels.min(), test_labels.max())

The 50 printed values should all fall between 0 and 1. If the image range comes out as 0.0-1.0 and the label range as 0-1, everything is fine!

- Shape

print(train_images.shape, test_images.shape) 
print(train_labels.shape, test_labels.shape) 

(2000, 72, 59, 3) (200, 72, 59, 3)
(2000, 2) (200, 2)
If you get this output, everything is fine! The images were shrunk during the data reduction, so they are smaller than the 218 x 178 originals. Unlike MNIST in the previous project, there is an extra channel dimension of size 3, which tells us these are color images!

- ๋ฐ์ดํ„ฐ ํƒ€์ž…

print(train_images.dtype, test_images.dtype) 
print(train_labels.dtype, test_labels.dtype) 

float64 float64
int8 int8
If you get the output above, everything is fine! What this tells us is that the images are float64 with a range of 0.0 - 1.0, so they are already scaled and we don't need to normalize them.
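For contrast, here is a sketch of what the check could look like if the images had arrived as raw 0-255 data instead. The helper name to_unit_range and the rule "uint8 or max above 1.0 means rescale" are my own assumptions, not part of the original project.

```python
import numpy as np


def to_unit_range(images):
    """Return float64 images in [0, 1]; rescale only if they look like 0-255 data."""
    images = np.asarray(images)
    if images.dtype == np.uint8 or images.max() > 1.0:
        return images.astype(np.float64) / 255.0
    return images.astype(np.float64)


raw = np.array([0, 128, 255], dtype=np.uint8)  # looks like raw pixel values
scaled = np.array([0.0, 0.5, 1.0])             # already in 0.0 - 1.0

print(to_unit_range(raw).max())     # 1.0
print(to_unit_range(scaled).max())  # 1.0, values left unchanged
```

For celeba_small the second branch applies, which is why no normalization step is needed here.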

Let's make a habit of checking the range, shape, and data type often!
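One way to make that habit cheap is to wrap the three checks in a single helper. The sketch below is just a convenience wrapper around NumPy's min, max, shape, and dtype; the name summarize and the demo array are hypothetical.

```python
import numpy as np


def summarize(name, arr):
    """Print and return the range, shape, and dtype of an array."""
    info = {
        "name": name,
        "min": arr.min().item(),
        "max": arr.max().item(),
        "shape": arr.shape,
        "dtype": str(arr.dtype),
    }
    print(info)
    return info


# Made-up array with the same layout as two celeba_small images
demo = np.zeros((2, 72, 59, 3))
demo[0, 0, 0, 0] = 1.0
info = summarize("demo_images", demo)
```

Calling summarize('train_images', train_images) and so on after each processing step would then show all three properties at once.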

๋‹ค์Œ์—๋Š” ์ „์ฒ˜๋ฆฌ์™€ ์‹œ๊ฐํ™”์— ๋Œ€ํ•œ ํฌ์ŠคํŒ…์„ ํ•  ์˜ˆ์ •์ด๋‹ค.
