(The code in this project is based on the FastCampus deep learning course.)

 

 

This time we will use another well-known dataset, Fashion MNIST, to try multi-label classification.

Before going further, let's look at what multi-label means.

 

 

Multiclass vs multi-label

Binary classification is the case with exactly two classes. As in the figure, (spam, not spam) is one example; others are the gender (male, female) and smile (smiling, not smiling) tasks from the previous project.

Multiclass classification is the case with several classes. As in the figure above, the image contains a single dog and the model predicts which one of the classes it belongs to. If we built a classifier on Fashion MNIST as-is, without turning it into a multi-label problem, it would be a multiclass classification model.

Multi-label classification is the case where there are several classes and each example can also carry several labels. In the figure above the image contains both a cat and a bird, so it is labeled with two of the classes.
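The difference between the two settings is easiest to see in the target vectors. Below is a minimal sketch, assuming the 10 Fashion MNIST classes; the helper names `one_hot` and `multi_hot` are made up for illustration and are not part of the course code.

```python
import numpy as np

NUM_CLASSES = 10  # Fashion MNIST has 10 classes


def one_hot(label, num_classes=NUM_CLASSES):
    """Multiclass target: exactly one position is set to 1."""
    target = np.zeros(num_classes, dtype=np.float32)
    target[label] = 1.0
    return target


def multi_hot(labels, num_classes=NUM_CLASSES):
    """Multi-label target: every class present in the image is set to 1."""
    target = np.zeros(num_classes, dtype=np.float32)
    target[np.asarray(list(labels), dtype=np.intp)] = 1.0
    return target


print(one_hot(3))         # a single Dress (index 3)
print(multi_hot([3, 8]))  # a Dress (3) and a Bag (8) in the same image
```

A multiclass target always sums to 1, while a multi-label target can have any number of positions set.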

 

This project builds a multi-label classification model, so we transform the data by randomly pasting clothing items into a single image, so that one image can contain up to four items.
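One possible way to build such an image is sketched below. This is only a rough sketch, not the course's actual transformation: the 2x2 grid layout, the 56x56 canvas, and the function name `make_multilabel_sample` are all assumptions, and random stand-in arrays are used instead of the real dataset so the snippet runs on its own.

```python
import numpy as np


def make_multilabel_sample(images, labels, rng, max_items=4, num_classes=10):
    """Paste 1..max_items randomly chosen images into a 2x2 grid.

    Returns the canvas and a multi-hot label vector.
    The 2x2 layout is an assumption made for this sketch.
    """
    n_items = rng.integers(1, min(max_items, 4) + 1)     # how many items to paste
    picks = rng.choice(len(images), size=n_items, replace=False)
    slots = rng.choice(4, size=n_items, replace=False)   # which grid cells to fill
    h, w = images.shape[1:3]
    canvas = np.zeros((2 * h, 2 * w), dtype=images.dtype)
    target = np.zeros(num_classes, dtype=np.float32)
    for img_idx, slot in zip(picks, slots):
        row, col = divmod(int(slot), 2)
        canvas[row * h:(row + 1) * h, col * w:(col + 1) * w] = images[img_idx]
        target[labels[img_idx]] = 1.0                     # mark this class as present
    return canvas, target


# Stand-in data with the same shape and dtype as Fashion MNIST
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=100, dtype=np.uint8)

canvas, target = make_multilabel_sample(fake_images, fake_labels, rng)
print(canvas.shape, target)
```

If the same class is picked twice, the target still only records it once, so the number of active labels can be smaller than the number of pasted items.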

 

 

<multi-label image source>

https://www.kaggle.com/c/lish-moa/discussion/180500

 


 

 

 

 

 

Now let's take a look at Fashion MNIST!

As before, we load it from the datasets that keras provides. To download it manually, use the link below.

 

<Fashion MNIST source and download>

https://www.kaggle.com/zalando-research/fashionmnist

 


 

 

The images are 28x28, the same size as MNIST. The split is also the same: 60,000 training images and 10,000 test images.

 

 

Labels

Each training and test example is assigned to one of the following labels:

  • 0 T-shirt/top
  • 1 Trouser
  • 2 Pullover
  • 3 Dress
  • 4 Coat
  • 5 Sandal
  • 6 Shirt
  • 7 Sneaker
  • 8 Bag
  • 9 Ankle boot

There are 10 classes in total, covering clothing types such as T-shirts, dresses, shirts, sandals, and bags. The goal is to build a model that classifies which type of item an image shows.

 

 

 

 

 

Exploring the Fashion MNIST dataset

By now the procedure for exploring a dataset should be familiar. We load the data, check its shape, value range, and dtype, and visualize a few samples to see what they look like.

 

(1) Loading the data

from tensorflow import keras  # assumes the Keras bundled with TensorFlow
# Load the Fashion MNIST train/test split
fashion_mnist = keras.datasets.fashion_mnist
((train_images, train_labels), (test_images, test_labels)) = fashion_mnist.load_data()

This loads Fashion MNIST from keras's datasets.

 

labels = ["T-shirt/top",  # index 0
        "Trouser",      # index 1
        "Pullover",     # index 2 
        "Dress",        # index 3 
        "Coat",         # index 4
        "Sandal",       # index 5
        "Shirt",        # index 6 
        "Sneaker",      # index 7 
        "Bag",          # index 8 
        "Ankle boot"]   # index 9

def idx2label(idx):
  return labels[idx]

The label names are stored in a list, so a name can be looked up by its index.

The idx2label function takes a label index and returns its name. We will use it for visualization.

 

 

 

 

(2) Checking the data shape

print(f"train_images: {train_images.shape}")
print(f"train_labels: {train_labels.shape}")
print(f"test_images: {test_images.shape}")
print(f"test_labels: {test_labels.shape}")

train_images: (60000, 28, 28)
train_labels: (60000,)
test_images: (10000, 28, 28)
test_labels: (10000,)

 

The shapes are the same as the original MNIST.

 

 

 

 

(3) Checking the value range

- Printing the nonzero values in the images

train_images[train_images!=0][:50]
# The mask must come from test_images itself; a train_images mask has a different shape
test_images[test_images!=0][:50]

There are too many values, so we print only the first 50. If they are all integers from 1 to 255, everything is fine!

 

- Finding the minimum and maximum image values

print(train_images.min(), train_images.max())
print(test_images.min(), test_images.max())

 

If both print 0 255, everything is fine!

 

 

***Summing each image's pixel values, finding the indices of the largest and smallest sums, and visualizing them

If the sum of all pixel values in an image is large, we can expect the item to be large and bright; if the sum is small, we can expect a small, dark item. Let's check whether that is really the case.

print(train_images.reshape((60000, -1)).sum(axis=1).argmax())
print(train_images.reshape((60000, -1)).sum(axis=1).argmin())

Summing along axis=1 gives the total pixel value of each image. From those totals we take the index of the maximum and the index of the minimum.

The indices came out as 55023 and 9230. Let's display those two images.

As expected, the image with the large sum is mostly bright, and the image with the small sum is mostly dark.
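The reshape-and-sum step can be wrapped in a small helper to make it easier to follow. This is a minimal sketch on a tiny hand-made array rather than the real dataset, and the name `brightest_and_darkest` is made up for illustration.

```python
import numpy as np


def brightest_and_darkest(images):
    """Return (index of the largest pixel sum, index of the smallest pixel sum)."""
    # Flatten each image into one row, then sum along axis=1: one total per image.
    totals = images.reshape(len(images), -1).sum(axis=1)
    return int(totals.argmax()), int(totals.argmin())


demo = np.zeros((3, 28, 28), dtype=np.uint8)
demo[1] = 255          # image 1: completely white
demo[2, :5, :5] = 10   # image 2: a small dim patch
print(brightest_and_darkest(demo))  # (1, 0)
```

Calling the same helper on train_images should give the two indices reported above, which can then be passed to a plotting function to compare the images.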

 

 

 

(4) Checking the data types

print(train_images.dtype)
print(train_labels.dtype)
print(test_images.dtype)
print(test_labels.dtype)

If all of them print uint8, everything is fine!

What this tells us is that, during preprocessing, the images need to be converted to floats between 0 and 1.
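That conversion usually comes down to a cast and a division by 255. A minimal sketch follows, using a small stand-in array; the helper name `to_unit_float` is hypothetical, and the actual preprocessing in the follow-up post may differ.

```python
import numpy as np


def to_unit_float(images):
    """Convert uint8 pixels in [0, 255] to float32 values in [0, 1]."""
    return images.astype(np.float32) / 255.0


# Stand-in batch with the same dtype as Fashion MNIST
sample = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = to_unit_float(sample)
print(scaled.dtype, scaled.min(), scaled.max())
```

Casting before dividing keeps the result in float32 instead of NumPy's default float64, which halves the memory needed for the 60,000 training images.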

 

 

(5) Visualizing images one at a time

import matplotlib.pyplot as plt

def show(idx):
  # Draw the training image at idx in grayscale, titled with its label name
  plt.imshow(train_images[idx], cmap='gray')
  plt.title(idx2label(train_labels[idx]))
  plt.show()

With a visualization function in place, we can check any image conveniently.

 

show(777)

The training image at index 777 is a Sandal.

show(77)

The training image at index 77 turns out to be a Shirt.

 

 

 

 

 

The next post will cover Fashion MNIST preprocessing and how to visualize several images at once.

 

 

 

 

 
