Learning Spoons class notes

 

 

< Previous post >

https://silvercoding.tistory.com/48

[python pandas] pandas basic usage (1)

 

 


 

1. Importing pandas
import pandas as pd

2. Loading & inspecting data
fpath = './data/exam.xlsx'
data = pd.read_excel(fpath, index_col = '번호')

The Excel file is loaded with index_col='번호', so the '번호' (number) column becomes the index.

* Make a habit of inspecting data with head(), info(), and describe()

data.head()

data.info()

data.describe()


 

 

3. Adding data

df['column_name'] = data (the form df.column_name = data cannot be used to create a new column)

- Adding a single value: every row gets the same value

- Adding a group of values: pass a list or a pandas Series

data['수학']
data.수학

When selecting data, either of the two forms above works. (The column names in this file are Korean: 국어 = Korean, 영어 = English, 수학 = math, 음악 = music, 체육 = PE.)

- Adding a single value

data['음악'] = 90
data.head()

A new column cannot be added in the data.음악 form. When a single value is assigned, every row receives that same value.

- Adding multiple values

data['체육'] = [100, 80, 60]
data.head()

Multiple values can also be added as a list. Note that the number of list elements must equal the number of rows.

data['국영수'] = (data['국어'] + data['영어'] + data['수학']) / 3
data.head()

A new column can also be created from an operation between columns, as here.
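A minimal sketch of the three ways of adding a column described above. The table is built inline instead of being read from exam.xlsx, so the scores and the extra column name '미술' are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for exam.xlsx (values are illustrative only).
data = pd.DataFrame(
    {"국어": [70, 85, 90], "영어": [80, 65, 100], "수학": [75, 55, 95]},
    index=pd.Index(["1번", "2번", "3번"], name="번호"),
)

data["음악"] = 90                 # scalar: broadcast to every row
data["체육"] = [100, 80, 60]      # list: length must equal the number of rows
data["국영수"] = (data["국어"] + data["영어"] + data["수학"]) / 3  # column arithmetic

# A list of the wrong length is rejected with a ValueError.
try:
    data["미술"] = [1, 2]
    length_error = False
except ValueError:
    length_error = True

print(data)
print("length mismatch rejected:", length_error)
```

The failed assignment leaves the table unchanged, which is why checking the row count before assigning a list is a cheap safeguard.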

 

 

 

4. Merging data tables
fpath = './data/exam.xlsx'
A = pd.read_excel(fpath, index_col = '번호')
A.head()

The file is loaded again and stored in the variable A.

fpath2 = './data/exam_extra.xlsx'
B = pd.read_excel(fpath2, index_col = '번호')
B.head()

The Excel file to be added is loaded and stored in the variable B.

- merge()

Source: Learning Spoons

The merge key is set through arguments. Use exactly one of left_on / left_index and exactly one of right_on / right_index; the two options in each pair cannot be combined.

total = pd.merge(A, B, how = 'left', left_index = True, right_index = True)
total.head()

With how='left', the merge follows A. Rows 4번 and 5번 do not appear, and B's columns for 3번 are filled with NaN.

pd.merge(A, B, how = 'right', left_index = True, right_index = True)

Written this way, the merge follows B, so 3번 is absent.

pd.merge(A, B, how = 'inner', left_index = True, right_index = True)

With inner, only the index values that exist in both A and B are merged.

pd.merge(A, B, how = 'outer', left_index= True, right_index=True)

With outer, all rows from both tables are merged.
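The four join types can be compared side by side with a small sketch. The two tables below are hypothetical stand-ins for exam.xlsx and exam_extra.xlsx: the index labels follow the behaviour described above (A has 1번 to 3번, B has 1번, 2번, 4번, 5번), while the column '과학' and all scores are made up.

```python
import pandas as pd

# Hypothetical stand-ins for exam.xlsx (A) and exam_extra.xlsx (B).
A = pd.DataFrame({"국어": [70, 85, 90]},
                 index=pd.Index(["1번", "2번", "3번"], name="번호"))
B = pd.DataFrame({"과학": [60, 70, 80, 90]},
                 index=pd.Index(["1번", "2번", "4번", "5번"], name="번호"))

# Merge on the index with each join type and collect the resulting index labels.
rows = {
    how: sorted(pd.merge(A, B, how=how, left_index=True, right_index=True).index)
    for how in ["left", "right", "inner", "outer"]
}
for how, labels in rows.items():
    print(f"{how:>5}: {labels}")
```

Only the set of surviving index labels differs between the join types; cells without a match on the other side are filled with NaN.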

 

 

 

5. Saving
total = pd.merge(A, B, how = 'left', left_index = True, right_index = True)
total

For the final result, the left merge (based on A) is assigned to total and then saved.

total.to_excel('./data/exam_total.xlsx')

total.to_excel('./data/exam_total_withoutindex.xlsx', index = False)

Passing index = False saves the file without the '번호' index column.

 


 

 

 

Learning Spoons class notes

* Basic pandas functions

Reading data files: read_excel(), read_csv()

Selecting data: df.loc[], df.iloc[]

Changing the index / columns: columns / index, reset_index()

 

 

 


 

pandas vs excel

pandas: lightweight and fast, so large files can be handled freely.

excel: all of the data is visible on screen. (With a lot of data, it becomes hard to inspect directly.)

pandas structure
  • DataFrame: table form

- index: like a key in a database; in Excel, usually the data placed in the first column (used with vlookup, etc.)

- columns: a set of data sharing one attribute -> can be viewed as index + a single column

  • Series: a set of data with one attribute (one column of a DataFrame)

 

1. Importing pandas

- Installing pandas

!pip install pandas

- Importing pandas

import pandas as pd

2. Loading & inspecting data

* File paths

- Absolute path: "c:/folder1/folder2/.../filename.ext"

- Relative path: "./folder3/.../filename.ext", "../folder4/.../filename.ext" (relative to the location of the Jupyter notebook file)

- ./ : current location, ../ : parent folder

* After loading data, make a habit of inspecting it with head(), info(), and describe()

temp = pd.read_excel('./data/exam.xlsx')
temp

temp.head(2)

The number of rows to show can be passed to head().

temp.tail()

- info(): check the DataFrame's index and each column's non-null count and dtype

temp.info()

- describe(): check basic statistics (count, mean, standard deviation, quartiles, etc.) of columns holding numeric data (int, float)

temp.describe()

 

 

 

2-1 Setting the index

- set_index(): set a column as the index (column -> index)

data = temp.set_index('번호')
data.head()

set_index was used to make the '번호' column the index.

- index_col: set the index while reading the Excel file

temp2 = pd.read_excel('./data/exam.xlsx', index_col = 0) # or index_col = '번호' (by column name)
temp2.head()

 

 

 

3. Selecting data

- Selecting a cell (single)

df.iloc[row, column] : by integer position

df.loc[row, column] : by label (name)

data

data.iloc[1, 2]

55

data.loc['1번','수학']

75

print(data.loc['3번','영어'])
print(data.iloc[2, 1])

100

100

print(data.loc['1번', '국어'])
print(data.iloc[0, 0])

70

70
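The equivalence between label-based and position-based selection can be made explicit by converting labels to positions. This is a sketch on a hypothetical copy of the exam table: the cells printed above keep their values, and the remaining cells are made up.

```python
import pandas as pd

# Hypothetical copy of the exam table; cells not shown in the post are made up.
data = pd.DataFrame(
    {"국어": [70, 85, 90], "영어": [80, 65, 100], "수학": [75, 55, 95]},
    index=pd.Index(["1번", "2번", "3번"], name="번호"),
)

# Translate labels to integer positions so the loc/iloc equivalence is explicit.
row_pos = data.index.get_loc("3번")
col_pos = data.columns.get_loc("영어")

by_label = data.loc["3번", "영어"]
by_position = data.iloc[row_pos, col_pos]
print(row_pos, col_pos, by_label, by_position)
```

loc keeps working if rows are reordered, whereas iloc always refers to whatever currently sits at that position.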

 

 

- Selecting cells (multiple)

: specify the selection with a list ( [label1, label2, ... labelN] ) or as a start:end range

data.loc['1번', ['국어', '영어']]

국어 70

영어 80

Name: 1번, dtype: int64

data.loc[ ['1번','2번'] , '수학']

번호

1번 75

2번 55

Name: 수학, dtype: int64

data.loc['1번', '영어': ]

영어 80

수학 75

Name: 1번, dtype: int64

 

 

- Selecting a column (single)

: data.column_name or data['column_name']

data.loc[ : , '수학']

data['수학']

data['영어']

- Selecting columns (multiple)

: columns can be selected in any desired order

data[ ['수학','영어'] ]

data[  ['수학','영어','국어']  ]

The original order was 국어, 영어, 수학, but the columns can be output in a different order as above.

 

 

- Selecting data by condition (single condition)

df[condition] : returns only the rows where the condition is True

-> condition : a list or Series made up of True / False

data

cond = data['수학'] < 80
cond

Building a condition on the 수학 column like this returns a Series of bool values.

data[ cond ]

Applying the condition to the DataFrame returns only the rows where it is True.

cond = [True, False, True]    # data['영어'] >  80
data[cond]

A list of bool values can also be passed directly. The length of the list must equal the number of rows.

 

 

- Selecting data by condition (multiple conditions)

& : and, True only when every condition is met

| : or, True when at least one condition is met

cond3 = (data['영어'] > 80)
cond4 = (data['수학'] > 80)

data[ cond3 | cond4]

cond3 = (data['영어'] > 80)
cond4 = (data['수학'] > 80)

cond = cond3 & cond4
data[ cond ]

cond = (data['영어'] >= 70)  & (data['수학'] >= 70)  & (data['수학'] < 90) 
data[ cond ]

cond = (data['영어'] >= 70) \
    & (data['수학'] >= 70) \
     & (data['수학'] < 90) 

data[ cond ]

A \ (backslash) continues the expression on the next line, which improves readability.

cond_first  =  ( data['국어']  > 80)
cond_second =  ( data['영어']  > 80)

cond = cond_first   &   cond_second
data[cond]

cond_first  =  ( data['국어'] > 80 )
cond_second = ( data['영어'] > 80 )

cond = cond_first     |   cond_second

data[cond]
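The combined conditions above can be tried on a small table. This is a sketch on a hypothetical copy of the exam data (cells not shown in the post are made up); it also uses enclosing parentheses as an alternative to the backslash for line continuation.

```python
import pandas as pd

# Hypothetical copy of the exam table; cells not shown in the post are made up.
data = pd.DataFrame(
    {"국어": [70, 85, 90], "영어": [80, 65, 100], "수학": [75, 55, 95]},
    index=pd.Index(["1번", "2번", "3번"], name="번호"),
)

# Wrapping the whole expression in parentheses allows line breaks
# without trailing backslashes.
cond_and = (
    (data["영어"] >= 70)
    & (data["수학"] >= 70)
    & (data["수학"] < 90)
)
cond_or = (data["국어"] > 80) | (data["영어"] > 80)

print(data[cond_and])   # rows meeting every condition
print(data[cond_or])    # rows meeting at least one condition
```

Each comparison needs its own parentheses because & and | bind more tightly than comparison operators such as > and <.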

 

 

 

index & column
data.index

Index(['1번', '2번', '3번'], dtype='object', name='번호')

data.index = ['가반', '나반', '다반']

The index can be set with a list.

data

The index has changed as set.

data.columns

Index(['국어', '영어', '수학'], dtype='object')

data.columns = ['Korean','English', 'Math']

The columns can be changed in the same way.

data

data.reset_index()

* reset_index : drop=False is the default (moves the current index into a column and resets the index),

drop = True (resets the index without turning the current index values into a column)
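The difference between the two drop settings shows up in the resulting columns. A sketch on a hypothetical table that uses the renamed index and columns from above, with made-up scores:

```python
import pandas as pd

# Hypothetical table using the renamed index/columns from above.
data = pd.DataFrame(
    {"Korean": [70, 85, 90], "English": [80, 65, 100], "Math": [75, 55, 95]},
    index=["가반", "나반", "다반"],
)

kept = data.reset_index()              # drop=False: old index becomes a column
dropped = data.reset_index(drop=True)  # drop=True: old index is discarded

print(kept.columns.tolist())
print(dropped.columns.tolist())
print(data.index.tolist())             # the original frame is left untouched
```

reset_index returns a new DataFrame, so the result has to be assigned back (data = data.reset_index()) for the change to persist.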

 


 

 

 

 

(The code in this project follows the Fast Campus deep learning course.)

<Previous post>

https://silvercoding.tistory.com/9

[fashion MNIST project] 1. Multi-label classification, exploring the fashion MNIST data

The previous post looked at fashion MNIST, which has the same format as MNIST. This post preprocesses the fashion MNIST dataset.

 

 

 

Starting preprocessing

(1) Changing the data type (integer -> float)

The image values were of type uint8 in the range 0-255. So the data type is converted to float, and the range is then rescaled to 0-1.

If you are curious about the uint8 type, see the explanation here.

https://silvercoding.tistory.com/3

[MNIST project] 1. Exploring the MNIST data

train_images = train_images.astype(np.float64)
test_images = test_images.astype(np.float64)

Calling astype like this changes the data type. Printing dtype now shows float64 instead of uint8.

 

 

 

(2) normalize

Normalize the data so that the values fall in the range 0~1.
The formula is as follows.
normalize(x) = (x - min) / (max - min)
normalize(x) = x / max (when min is 0, as in this dataset)
The current MNIST data consists of numbers in 0-255, so the second formula applies.

(train_images / 255.0).min(), (train_images / 255.0).max()

Dividing by 255.0 brings the minimum to 0.0 and the maximum to 1.0.

*** Generalizing normalize as a function

def norm(data):
  min_v = data.min()
  max_v = data.max()

  return (data - min_v) / (max_v - min_v)

With a normalize function like this, data whose minimum is not 0 can also be normalized easily.
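A quick check that the general formula and the x / max shortcut agree when the minimum is 0, and that the general formula also handles a shifted range. The tiny array below is a hypothetical stand-in for train_images.

```python
import numpy as np

def norm(data):
    """Min-max normalization: maps data.min() to 0.0 and data.max() to 1.0."""
    min_v = data.min()
    max_v = data.max()
    return (data - min_v) / (max_v - min_v)

# Hypothetical uint8 "image" standing in for train_images.
img = np.array([[0, 51], [102, 255]], dtype=np.uint8).astype(np.float64)
shifted = img + 10.0   # same data with a non-zero minimum

scaled = norm(img)
scaled_shifted = norm(shifted)
print(scaled)
print(np.allclose(scaled, img / 255.0), np.allclose(scaled, scaled_shifted))
```

One caveat of this formula: if every value in the array is identical, max - min is 0 and the division is undefined, so constant inputs need separate handling.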

 

 

 

Before moving on to visualization, check the shape, dtype, and range (max, min) of the data again. (Omitted in this post.)

Visualizing several images (e.g., 5)

This task comes up repeatedly, so it is covered only briefly here.

- Change the shape (5, 28, 28) ---> (28, 28 * 5) (possible with hstack or transpose)

hstack is skipped again this time; only the transpose approach is used.

hstack was used in this earlier post.

https://silvercoding.tistory.com/4

[MNIST project] 2. MNIST dataset preprocessing, visualization

train_images[:5].transpose((1, 0, 2)).reshape(28, -1).shape

If (28, 140) comes out, it worked!
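Why the transpose is needed before the reshape can be seen on synthetic data. In this sketch, five constant-valued 28x28 arrays stand in for train_images[:5] (an assumption for illustration), and the result is compared with np.hstack and with a reshape that skips the transpose.

```python
import numpy as np

# Hypothetical stand-in for train_images[:5]: five 28x28 images,
# where image i is filled with the value i so the tiles are easy to tell apart.
imgs = np.stack([np.full((28, 28), i, dtype=np.float64) for i in range(5)])

# (5, 28, 28) -> (28, 5, 28) -> (28, 140): rows stay rows, images sit side by side.
tiled = imgs.transpose((1, 0, 2)).reshape(28, -1)

# np.hstack on the list of images gives the same layout.
tiled_hstack = np.hstack(list(imgs))

# Reshaping without the transpose keeps the shape but scrambles the rows.
wrong = imgs.reshape(28, -1)

print(tiled.shape, np.array_equal(tiled, tiled_hstack), np.array_equal(tiled, wrong))
```

The shape check alone would not catch the scrambled version, since it is also (28, 140); plotting or comparing against hstack is the more reliable test.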

 

Plotting this with plt (code omitted)

displays the 5 images at once without errors.

- Visualizing by class

def filter(label, count=5) :
  imgs = train_images[np.argwhere(train_labels == label)[:count, ..., 0]].transpose((1, 0, 2)).reshape(28, -1)
  plt.imshow(imgs, cmap='gray')
  plt.title(idx2label(label))
  plt.show()

The filter function takes the label of the clothing class to view and visualizes images of that class together. The idx2label function written earlier is used to set the title. Note that this name shadows Python's built-in filter.

Call it as filter(label, number of images to show).

filter(9, 6)

Six images of label 9, ankle boots, are drawn.

The next post will cover data augmentation and modeling.

 

 

 

 

 

 

 

 

 

(The code in this project follows the Fast Campus deep learning course.)

This time, multi-label classification is done with another well-known dataset, fashion MNIST.

First, let's look at what multi-label means.

Multiclass vs multi-label

Binary classification is the case with 2 classes: as in the picture, (spam, not spam), or from the previous project, gender (male, female) and smiling (smiling, not smiling).

Multiclass classification is the case with several classes. As in the picture above, the image contains one dog, and the model predicts one kind among several classes. If fashion MNIST were classified as-is, without making it multi-label, the result would be a multiclass classification model.

Multi-label classification is the case with several classes where each sample can also carry several labels. In the picture above, the image contains a cat and a bird, so two of the classes are labeled.

Since this project builds a multi-label classification model, clothing items are placed at random into a single image, transforming the data so that one image can contain up to 4 items.
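The difference between the two settings is easiest to see in the target vectors. The sketch below is an assumption about how such targets are commonly encoded, not the encoding used later in this project, and the helper names one_hot and multi_hot are hypothetical.

```python
import numpy as np

NUM_CLASSES = 10  # fashion MNIST has 10 clothing classes

def one_hot(label, num_classes=NUM_CLASSES):
    """Multiclass target: exactly one class is active."""
    target = np.zeros(num_classes, dtype=np.int64)
    target[label] = 1
    return target

def multi_hot(labels, num_classes=NUM_CLASSES):
    """Multi-label target: every class present in the image is active."""
    target = np.zeros(num_classes, dtype=np.int64)
    target[list(labels)] = 1
    return target

single = one_hot(9)              # one image containing an ankle boot
combined = multi_hot([0, 8, 9])  # one image containing a T-shirt, a bag and an ankle boot

print(single, single.sum())
print(combined, combined.sum())
```

A multiclass target always sums to 1, while a multi-label target can sum to anything from 0 up to the number of classes.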

 

 

<Source of the multi-label picture>

https://www.kaggle.com/c/lish-moa/discussion/180500

Now let's look at fashion MNIST!

As before, it is loaded from the datasets that keras provides. To download it manually, use the link below.

<fashion MNIST source and download>

https://www.kaggle.com/zalando-research/fashionmnist

The images are 28x28, the same size as MNIST. The train dataset has 60,000 images and the test dataset 10,000, also the same as MNIST.

 

 

Labels

Each training and test example is assigned to one of the following labels:

  • 0 T-shirt/top
  • 1 Trouser
  • 2 Pullover
  • 3 Dress
  • 4 Coat
  • 5 Sandal
  • 6 Shirt
  • 7 Sneaker
  • 8 Bag
  • 9 Ankle boot

There are 10 classes in total, covering clothing categories such as T-shirts, dresses, shirts, sandals, and bags. The goal is to build a model that classifies which category an item belongs to.

Exploring the fashion MNIST dataset

By now the procedure for exploring a dataset should be familiar: load the data, check its size, range, and type, and visualize what it looks like.

(1) Loading the data

from tensorflow import keras  # assumed import; the post does not show it

fashion_mnist = keras.datasets.fashion_mnist 
((train_images, train_labels), (test_images, test_labels)) = fashion_mnist.load_data()

fashion MNIST is loaded from the keras datasets.

labels = ["T-shirt/top",  # index 0
        "Trouser",      # index 1
        "Pullover",     # index 2 
        "Dress",        # index 3 
        "Coat",         # index 4
        "Sandal",       # index 5
        "Shirt",        # index 6 
        "Sneaker",      # index 7 
        "Bag",          # index 8 
        "Ankle boot"]   # index 9

def idx2label(idx):
  return labels[idx]

The label texts are stored in a list so that a text can be looked up by index.

The idx2label function returns the label text for a given label. It will be used for visualization.

 

 

 

 

(2) Checking the data size

print(f"train_images: {train_images.shape}")
print(f"train_labels: {train_labels.shape}")
print(f"test_images: {test_images.shape}")
print(f"test_labels: {test_labels.shape}")

train_images: (60000, 28, 28)

train_labels: (60000,)

test_images: (10000, 28, 28)

test_labels: (10000,)

The data has the same shape as the original MNIST.

(3) Checking the data range

- Printing the non-zero values in the images

train_images[train_images!=0][:50]
test_images[test_images!=0][:50]

There are too many values, so only the first 50 are printed. If they are integers up to 255, excluding 0, all is well!

- Finding the minimum and maximum of the images

print(train_images.min(), train_images.max())
print(test_images.min(), test_images.max())

If both print 0 255, all is well!

*** Summing the pixel values of each image, finding the indices of the largest and smallest sums, and visualizing them

If the sum of all pixel values in an image is large, the garment is probably large and bright; if the sum is small, the garment is probably small and dark. Let's check whether that is really the case.

print(train_images.reshape((60000, -1)).sum(axis=1).argmax())
print(train_images.reshape((60000, -1)).sum(axis=1).argmin())

Summing along axis=1 gives the sum of the values of each image. argmax and argmin then return the indices of the largest and smallest sums.

The indices came out as 55023 and 9230. Displaying those images,

as expected, the image with the large sum is mostly bright, and the image with the small sum is mostly dark.
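The per-image sum can be tried on a few synthetic images where the expected answer is obvious. The three arrays below are hypothetical stand-ins for train_images; the patch sizes and brightness values are made up.

```python
import numpy as np

# Hypothetical stand-in for train_images: three 28x28 uint8 images.
images = np.zeros((3, 28, 28), dtype=np.uint8)
images[0, 10:18, 10:18] = 40   # small, dark patch
images[1, 2:26, 2:26] = 250    # large, bright patch
images[2, 8:20, 8:20] = 120    # in between

# Flatten each image to one row, then sum across the pixels of each row.
sums = images.reshape((len(images), -1)).sum(axis=1)

brightest = int(sums.argmax())
darkest = int(sums.argmin())

# Summing over both image axes directly gives the same per-image totals.
same = np.array_equal(sums, images.sum(axis=(1, 2)))
print(sums, brightest, darkest, same)
```

Using len(images) instead of a hard-coded 60000 lets the same line work for the test set as well.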

 

 

 

(4) Checking the data type

print(train_images.dtype)
print(train_labels.dtype)
print(test_images.dtype)
print(test_labels.dtype)

If all of them print uint8, all is well!

This tells us that preprocessing must convert the images to float values between 0 and 1.

(5) Visualizing the images one at a time

def show(idx):
  plt.imshow(train_images[idx], cmap='gray')
  plt.title(idx2label(train_labels[idx]))
  plt.show()

Implementing a visualization function makes it convenient to check images.

show(777)

The image at index 777 of the train images is a sandal.

show(77)

The image at index 77 of the train images is a shirt.

The next post will cover fashion MNIST preprocessing and how to visualize several images at once.

 

 

 

 

 

(The code in this project follows the Fast Campus deep learning course.)

<Previous post>
https://silvercoding.tistory.com/7

[celeba project] 2. celeba dataset preprocessing, visualization

The previous post finished all preprocessing of the celeba dataset. Only modeling remains.
The model in the preceding MNIST project, however, was a classifier with a single output that predicts which digit an image shows.

This time two things must be predicted: gender and whether the person is smiling. So first a separate model is built for each, and second a model is built that shares weights and differs only in its outputs.





Modeling each task separately
(1) Implementing the simple model

from keras.models import Model 
from keras.layers import Conv2D, MaxPool2D, Input, Dense, Flatten 
def simple_model(): 
    inputs = Input((72, 59, 3)) 
    x = Conv2D(32, 3, activation='relu')(inputs) 
    x = MaxPool2D(2)(x) 
    x = Conv2D(64, 3, activation='relu')(x) 
    x = MaxPool2D(2)(x) 
    x = Conv2D(64, 3, activation='relu')(x) 
    x = MaxPool2D(2)(x) 
    x = Flatten()(x) 
    x = Dense(64, activation='relu')(x) 
    
    outputs = Dense(2, activation='softmax')(x) 
    model = Model(inputs, outputs) 
    
    return model

Both tasks use the same architecture, so it is wrapped in a function for reuse.








(2) Creating the models and printing summaries
- Creating the models

gender_model = simple_model() 
smile_model = simple_model()

A model is created for each task.

- Printing model summaries

gender_model.summary() 
smile_model.summary()

This prints two models with the same structure.








(3) Setting loss, optimizer, metrics

gender_model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy']) 
smile_model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

These are also written identically. With compile, the loss function is set to categorical crossentropy, the optimizer to adam, and the metric to accuracy.

-- Checking the weights of the two models

gender_model.get_weights()[0][0][0][0] 
smile_model.get_weights()[0][0][0][0]

Fetching the weights of the two models shows that they differ. The models share only their structure and are trained independently of each other.







(4) Training

gender_hist = gender_model.fit(train_images, train_male_labels, validation_data=(test_images, test_male_labels), epochs=15, verbose=1) 
smile_hist = smile_model.fit(train_images, train_smile_labels, validation_data=(test_images, test_smile_labels), epochs=15, verbose=1)

Training is also done separately for each model. Apart from the labels, everything, including the image data, is identical.
Fetching the weights at this point shows that they have changed from before. Naturally, the weights of the two models still differ from each other.

For an explanation of verbose, see
https://silvercoding.tistory.com/5

[MNIST project] 3. Adding noise, RNN modeling









(5) Checking training results
- Visualizing training results

plt.plot(gender_hist.history['accuracy'], label = 'gender_accuracy') 
plt.plot(gender_hist.history['loss'], label = 'gender_loss') 
plt.plot(gender_hist.history['val_accuracy'], label = 'gender_val_accuracy') 
plt.plot(gender_hist.history['val_loss'], label = 'gender_val_loss') 

plt.plot(smile_hist.history['accuracy'], label = 'smile_accuracy') 
plt.plot(smile_hist.history['loss'], label = 'smile_loss') 
plt.plot(smile_hist.history['val_accuracy'], label = 'smile_val_accuracy') 
plt.plot(smile_hist.history['val_loss'], label = 'smile_val_loss') 

plt.legend(loc='upper left') 
plt.show()

Plotting the accuracy and loss stored in the history

gives reasonably good-looking curves!

--- Checking the result on a single test image

gender_res = gender_model.predict(test_images[77:78]) 
smile_res = smile_model.predict(test_images[77:78]) 

Let's check the result with the image at index 77.

The image at index 77 is shown above. It appears to be a smiling woman.
The results also have to be checked separately for each model.

- gender

plt.bar(range(2), gender_res[0], color='red') 
plt.bar(np.array(range(2)) + 0.3, test_male_labels[77]) 
plt.xticks(range(2), ['female', 'male']) 
plt.show()

print(gender_res)

Red is the prediction and blue is the ground truth. The prediction is correct!

- smile

(Code omitted) Just replace gender_res ---> smile_res and test_male_labels ---> test_smile_labels.
A small share of the prediction goes to unsmiling, but it is almost entirely correct!








(6) Evaluating the models
The two models also have to be evaluated separately.

gender_model.evaluate(test_images, test_male_labels, verbose=2) 
smile_model.evaluate(test_images, test_smile_labels, verbose=2)

Evaluation finishes with accuracies of 94% and 89% respectively.

This is quite cumbersome, though. It was manageable with two attributes, but classifying more attributes would have been really hard: model creation, training, result checking, and evaluation all have to be done separately. So next, training is done with a multi-output model.



Multi-output modeling
The idea is that everything before the output is the same, so it is written identically, and only the outputs are written differently.

- Method 1

from keras.models import Model 
from keras.layers import Conv2D, MaxPool2D, Input, Dense, Flatten, Concatenate 
def multi_model(): 
    inputs = Input((72, 59, 3)) 
    l1 = Conv2D(32, 3, activation='relu')(inputs) 
    l2 = MaxPool2D(2)(l1) 
    l3 = Conv2D(64, 3, activation='relu')(l2) 
    l4 = MaxPool2D(2)(l3) 
    l5 = Conv2D(64, 3, activation='relu')(l4) 
    l6 = MaxPool2D(2)(l5) 
    l7 = Flatten()(l6) 
    latent_vector = Dense(64, activation='relu')(l7) 
    
    gender_outputs = Dense(2, activation='softmax')(latent_vector) 
    smile_outputs = Dense(2, activation='softmax')(latent_vector) 
    
    outputs = Concatenate(axis=1)([gender_outputs, smile_outputs]) 
    model = Model(inputs, outputs) 
    
    return model

(1) Creating the model and printing the summary

# create the model
model = multi_model() 
# model summary
model.summary()

The final output is a single tensor of shape (None, 4).

(2) Setting loss, optimizer, metrics

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Same as before.

(3) Training the model

hist1 = model.fit(train_images, train_labels2, validation_data=(test_images, test_labels2), epochs=15, verbose=1)

(4) Result graph

The graph looks very strange: the loss is higher than the accuracy.

(5) Checking predictions (one image from the test dataset)

res = model.predict(test_images[77:78])
print(res.shape)

The shape of res is (1, 4).

The image at index 77 is

this person; let's visualize whether the prediction is correct.

plt.bar(range(4), res[0]) 
plt.bar(np.array(range(4)) + 0.3, test_labels2[77]) 
plt.show()

Contrary to expectations, the image at index 77 seems to be predicted well. The result graph looked very strange, but the model apparently gets a fair amount right.

(6) Evaluating the model

model.evaluate(test_images, test_labels2, verbose=2)

Evaluation of Method 1 finishes with a very large loss and 60% accuracy. Modeling both outputs as one concatenated vector seems to need further measures.
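One possible explanation for the odd numbers, sketched in plain NumPy rather than Keras. It rests on assumptions this post does not confirm: that train_labels2 is the concatenation of two one-hot vectors, that the accuracy metric compares the argmax of the 4 values, and that the loss rescales the prediction to sum to 1 before taking the cross-entropy. The logits are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical logits for one image: both heads are confident and correct.
gender_probs = softmax(np.array([3.0, -3.0]))   # class 0 wins
smile_probs = softmax(np.array([-4.0, 4.0]))    # class 1 wins

# Method 1 output: the two distributions concatenated into 4 values.
pred = np.concatenate([gender_probs, smile_probs])
# Assumed target layout: two one-hot vectors concatenated.
target = np.array([1.0, 0.0, 0.0, 1.0])

total_mass = pred.sum()   # about 2.0, not 1.0
# An argmax-based accuracy sees a single "winning" index out of 4.
counted_correct = pred.argmax() == target.argmax()

# Cross-entropy after rescaling pred to sum to 1 (assumption, see lead-in):
# each correct class can reach at most 0.5, so the loss stays above 2*ln(2).
rescaled = pred / pred.sum()
loss = -np.sum(target * np.log(rescaled))

print(total_mass, counted_correct, loss, 2 * np.log(2))
```

Under these assumptions, a prediction where both heads are right can still be counted as wrong and still carries a loss above 1.38, which would be consistent with the large loss and low accuracy reported above.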






- Method 2

from keras.models import Model 
from keras.layers import Conv2D, MaxPool2D, Input, Dense, Flatten, Concatenate 
def multi_model(): 
    inputs = Input((72, 59, 3)) 
    l1 = Conv2D(32, 3, activation='relu')(inputs) 
    l2 = MaxPool2D(2)(l1) 
    l3 = Conv2D(64, 3, activation='relu')(l2) 
    l4 = MaxPool2D(2)(l3) 
    l5 = Conv2D(64, 3, activation='relu')(l4) 
    l6 = MaxPool2D(2)(l5) 
    l7 = Flatten()(l6) 
    latent_vector = Dense(64, activation='relu')(l7) 
    
    gender_outputs = Dense(2, activation='softmax')(latent_vector) 
    smile_outputs = Dense(2, activation='softmax')(latent_vector) 
    
    model = Model(inputs, [gender_outputs, smile_outputs]) 
    
    return model

Here gender_outputs and smile_outputs are not concatenated; they are passed to Model directly as a list. This halves the number of parameters compared with the two separate models while producing results in the same form as the separate models.


(1) Creating the model and printing the summary

# create the model
model2 = multi_model() 
# model summary
model2.summary()

There are two final outputs, (None, 2) and (None, 2).

(2) Setting loss, optimizer, metrics

model2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Same settings!

(3) Training the model

hist2 = model2.fit(train_images, [train_male_labels, train_smile_labels], validation_data=(test_images, [test_male_labels, test_smile_labels]), epochs=15, verbose=1)

(4) Result graph

The curves come out similar to those of the separately trained models.

(5) Checking predictions (one image from the test dataset)
The same image at index 77 is predicted.

res2 = model2.predict(test_images[77:78])
print(res2[0].shape, res2[1].shape)

There are two outputs here, so printing each shape gives (1, 2) (1, 2).
Printing res2 gives
[array([[9.999993e-01, 6.897806e-07]], dtype=float32),
array([[0.01424873, 0.9857512 ]], dtype=float32)]

The plotting code only needs slight changes from above, so it is omitted. Plotting each output shows

that the model correctly identifies a woman and, though not with 100% confidence, a smiling person.

(6) Evaluating the model

model2.evaluate(test_images, [test_male_labels, test_smile_labels], verbose=2)

The output was too long and got cut off. Both the gender and smile outputs reached an accuracy of about 91%. This was the highest result!

(7) Splitting the model

gender_model2 = Model(inputs = model2.input, outputs = model2.get_layer('dense_5').output) 
gender_model2.summary()
smile_model2 = Model(inputs = model2.input, outputs = model2.get_layer('dense_6').output) 
smile_model2.summary()

The input of model2 and the layers that correspond to gender_outputs and smile_outputs are fetched with get_layer and used to build the new models. The names 'dense_5' and 'dense_6' are auto-generated and depend on how many layers were created in the session, so check model2.summary() for the actual names.
The summaries have the same form as the separately built models at the very beginning.


- Checking the weights

smile_model2.get_weights()[0][0][0][0] 
gender_model2.get_weights()[0][0][0][0]

In the separate models, the weights differed from each other. The models split out of model2, which share weights, have identical weights.

(8) Saving and loading the model

import tensorflow as tf

# save the model
model2.save("./multimodel.h5")
# load the model
model3 = tf.keras.models.load_model('./multimodel.h5')

Set a name, save as h5, and load!

--- Code for downloading the model to your computer when using Colab

from google.colab import files 
files.download('./multimodel.h5')

In conclusion
Number of parameters => separate models (about 400,000 in total) > multi-output model (about 200,000)
Since the count is halved, multi-output modeling seems more efficient when models of the same form differ only in their outputs.
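The rough parameter counts can be reproduced by hand from the layer definitions in this post. The sketch assumes the Keras defaults for those layers (Conv2D with 'valid' padding and stride 1, MaxPool2D(2) halving each side and rounding down); the helper names conv_out and count_params are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output length of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def count_params(num_heads):
    """Trainable parameters of the CNN above with `num_heads` Dense(2) outputs."""
    h, w, channels = 72, 59, 3
    total = 0
    for filters in (32, 64, 64):
        total += 3 * 3 * channels * filters + filters   # Conv2D weights + biases
        h, w = conv_out(h) // 2, conv_out(w) // 2       # Conv2D then MaxPool2D(2)
        channels = filters
    flat = h * w * channels
    total += flat * 64 + 64                             # Dense(64)
    total += num_heads * (64 * 2 + 2)                   # Dense(2) per output head
    return total

separate = 2 * count_params(num_heads=1)   # gender_model + smile_model
shared = count_params(num_heads=2)         # one multi-output model
print(separate, shared)
```

Almost all parameters sit in the shared Dense(64) layer after Flatten, so each extra output head adds only 130 parameters.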


Method 1 vs Method 2
Method 1 has 1 output (None, 4); Method 2 has 2 outputs (None, 2) (None, 2) (same as the separate models).
Method 1 seems to classify the single image well, but its loss and accuracy look odd, so it needs further measures or more investigation.

(The code in this project is based on the Fast Campus deep learning course.)

<Previous post>
https://silvercoding.tistory.com/6

[celeba project] 1. Exploring the celeba data


Preprocessing requires a good understanding of the data. To avoid mistakes, keep checking the range, shape, and data type of the data as you preprocess. (Omitted in this post.)

According to the previous post, the value range is 0.0-1.0, the image shapes are (2000, 72, 59, 3) and (200, 72, 59, 3), the label shapes are (2000, 2) and (200, 2), and the data types are float64 for images and int8 for labels.


As mentioned before, the images in this project do not need to be normalized, so only the labels need preprocessing this time.

Starting preprocessing
(1) Changing the label shape
(batch, 2) --> (batch, 2) (batch, 2)
(batch, (gender, smile)) --> (batch, (female, male)) (batch, (not smiling, smiling))

# (batch, 2) ---> (batch, 1) (batch, 1)
train_male_labels, train_smile_labels = np.split(train_labels, 2, axis=1)
test_male_labels, test_smile_labels = np.split(test_labels, 2, axis=1)
# Check that the split worked
print(train_male_labels.shape, train_smile_labels.shape)
print(train_male_labels[777], train_smile_labels[777], train_labels[777])

Each shape should come out as (2000, 1). Printing the test shapes should give (200, 1).
[0] [0] [0 0] This line compares the split labels with the label before splitting. Image 777 was labeled earlier as a non-smiling woman, so the output is correct.
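The split can be tried on a tiny array first. This is a minimal sketch in which a made-up four-row array stands in for train_labels; the values are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for train_labels: columns are (Male, Smiling)
labels = np.array([[0, 0],
                   [1, 0],
                   [0, 1],
                   [1, 1]], dtype=np.int8)

# Split the (batch, 2) array into two (batch, 1) arrays along the column axis
male_labels, smile_labels = np.split(labels, 2, axis=1)

print(male_labels.shape, smile_labels.shape)       # (4, 1) (4, 1)
print(male_labels[1], smile_labels[1], labels[1])  # [1] [0] [1 0]
```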

from tensorflow.keras.utils import to_categorical
train_male_labels = to_categorical(train_male_labels)
train_smile_labels = to_categorical(train_smile_labels)
test_male_labels = to_categorical(test_male_labels)
test_smile_labels = to_categorical(test_smile_labels)

Next, one-hot encode each set of labels. The resulting shapes are:
(2000, 2) (2000, 2)
(200, 2) (200, 2)
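to_categorical needs TensorFlow, but its effect on binary labels can be sketched in plain NumPy. The helper one_hot below is a hypothetical stand-in written for this illustration, not the Keras function.

```python
import numpy as np

def one_hot(labels, num_classes=2):
    """Hypothetical NumPy stand-in for to_categorical on integer labels."""
    flat = np.asarray(labels).reshape(-1)                # (batch, 1) -> (batch,)
    return np.eye(num_classes, dtype=np.float32)[flat]   # row i of eye is the one-hot of class i

male_labels = np.array([[0], [1], [1]], dtype=np.int8)
encoded = one_hot(male_labels)

print(encoded.shape)           # (3, 2)
print(encoded[0], encoded[1])  # [1. 0.] [0. 1.]
```

Class 0 lands in column 0, so a woman (Male = 0) becomes [1, 0] and a man (Male = 1) becomes [0, 1].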


In the modeling step, gender and smiling will be modeled separately and also with a multi-output model.
So a combined label, (2000, 2) (2000, 2) ---> (2000, 4), is also needed. Let's create it.

train_labels2 = np.concatenate([train_male_labels, train_smile_labels], axis = 1)
test_labels2 = np.concatenate([test_male_labels, test_smile_labels], axis = 1)
print(train_labels2.shape, test_labels2.shape)

(2000, 4) (200, 4) The combined labels are created like this! For example, a man who is not smiling is encoded as [0 1 1 0].
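A small sketch of the concatenation, using two hand-written one-hot arrays as hypothetical samples rather than the real labels:

```python
import numpy as np

# One-hot labels for two hypothetical samples:
# sample 0 = man, not smiling; sample 1 = woman, smiling
male_onehot = np.array([[0, 1], [1, 0]])
smile_onehot = np.array([[1, 0], [0, 1]])

# Join the two (batch, 2) arrays side by side into one (batch, 4) array
combined = np.concatenate([male_onehot, smile_onehot], axis=1)

print(combined.shape)  # (2, 4)
print(combined[0])     # [0 1 1 0]
```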

That completes the preprocessing for this post.
---> In summary, preprocessing turned the (batch, 2) labels into two kinds of labels: (batch, 2) (batch, 2) and (batch, 4)!



Visualizing several images
(1) Changing the image shape
This is the same as in the previous project, so hstack is skipped here and the transpose function is used to change the image shape.

train_images[:5].transpose((1, 0, 2, 3)).reshape((72, -1, 3)).shape

The shape must change from (5, 72, 59, 3) to (72, 5*59, 3). transpose reorders the axes to (72, 5, 59, 3), and reshape then merges the image axis and the width axis.
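To see that transpose plus reshape really places the images side by side, the result can be compared with np.hstack. This sketch uses random values in place of train_images[:5].

```python
import numpy as np

# Random stand-in for train_images[:5]
images = np.random.random((5, 72, 59, 3))

# (5, 72, 59, 3) -> (72, 5, 59, 3) -> (72, 5*59, 3)
strip = images.transpose((1, 0, 2, 3)).reshape((72, -1, 3))
print(strip.shape)  # (72, 295, 3)

# Same result as stacking the five images horizontally
same = np.array_equal(strip, np.hstack(list(images)))
print(same)  # True
```

Reshaping without the transpose would also give (72, 295, 3), but the pixels would be scrambled, because reshape only regroups values in their existing memory order.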


Visualizing this with plt gives:

Five images are shown side by side like this.




This post covered a very simple round of preprocessing and visualization. The next post will cover several ways of modeling.

(The code in this project is based on the Fast Campus deep learning course.)

<Source of the celeb_a dataset>
https://www.tensorflow.org/datasets/catalog/celeb_a

celeb_a | TensorFlow Datasets

The celeba dataset is a large-scale face attributes dataset containing 40 attributes, 10,177 identities, and more than 200,000 celebrity images.
Uses: face attribute recognition, face detection, face localization

Source: https://www.tensorflow.org/datasets/catalog/celeb_a


The celeba features are organized as a dictionary, as follows.

FeaturesDict({
    'attributes': FeaturesDict({
        '5_o_Clock_Shadow': tf.bool,
        'Arched_Eyebrows': tf.bool,
        'Attractive': tf.bool,
        'Bags_Under_Eyes': tf.bool,
        'Bald': tf.bool,
        'Bangs': tf.bool,
        'Big_Lips': tf.bool,
        'Big_Nose': tf.bool,
        'Black_Hair': tf.bool,
        'Blond_Hair': tf.bool,
        'Blurry': tf.bool,
        'Brown_Hair': tf.bool,
        'Bushy_Eyebrows': tf.bool,
        'Chubby': tf.bool,
        'Double_Chin': tf.bool,
        'Eyeglasses': tf.bool,
        'Goatee': tf.bool,
        'Gray_Hair': tf.bool,
        'Heavy_Makeup': tf.bool,
        'High_Cheekbones': tf.bool,
        'Male': tf.bool,
        'Mouth_Slightly_Open': tf.bool,
        'Mustache': tf.bool,
        'Narrow_Eyes': tf.bool,
        'No_Beard': tf.bool,
        'Oval_Face': tf.bool,
        'Pale_Skin': tf.bool,
        'Pointy_Nose': tf.bool,
        'Receding_Hairline': tf.bool,
        'Rosy_Cheeks': tf.bool,
        'Sideburns': tf.bool,
        'Smiling': tf.bool,
        'Straight_Hair': tf.bool,
        'Wavy_Hair': tf.bool,
        'Wearing_Earrings': tf.bool,
        'Wearing_Hat': tf.bool,
        'Wearing_Lipstick': tf.bool,
        'Wearing_Necklace': tf.bool,
        'Wearing_Necktie': tf.bool,
        'Young': tf.bool,
    }),
    'image': Image(shape=(218, 178, 3), dtype=tf.uint8),
    'landmarks': FeaturesDict({
        'lefteye_x': tf.int64,
        'lefteye_y': tf.int64,
        'leftmouth_x': tf.int64,
        'leftmouth_y': tf.int64,
        'nose_x': tf.int64,
        'nose_y': tf.int64,
        'righteye_x': tf.int64,
        'righteye_y': tf.int64,
        'rightmouth_x': tf.int64,
        'rightmouth_y': tf.int64,
    }),
})

There are many attributes, such as gender, smiling, young, eyeglasses, hat, wavy hair, and brown hair. Of these, this project builds a model that classifies gender (Male) and smiling (Smiling).

An attribute can be read with an index such as ['attributes']['Male'].




Downloading the full dataset

import tensorflow_datasets as tfds
# tfds.list_builders() -> list every available dataset
celeb_a = tfds.load('celeb_a') # load the celeb_a dataset




This project uses a reduced celeba dataset. The reduction code is omitted, but the process is as follows:
1. Assign celeb_a['validation'] and celeb_a['test'] to train and test, respectively.
2. Load only the Male and Smiling attributes to create train_images, train_labels, test_images, and test_labels.
---> From here on, only test_images and test_labels are used to build the reduced data.
3. From test_images and test_labels, separate smiling men, non-smiling men, smiling women, and non-smiling women, take 550 of each so the groups are balanced, and combine them ---> 2,200 samples in total!
4. Shuffle the 2,200 samples, then assign the first 2,000 to train and the remaining 200 to test.
5. Split these again into train_images, train_labels, test_images, and test_labels, and that's it! Since the reduction is tedious, save the result as an npz file.
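Steps 3 to 5 can be sketched as follows. This is not the omitted reduction code: it runs on synthetic arrays (small random "images" and random labels), and every name in it is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 4000 tiny "images" with (Male, Smiling) labels
images = rng.random((4000, 8, 8, 3))
labels = rng.integers(0, 2, size=(4000, 2)).astype(np.int8)

per_group = 550
picked = []
for male in (0, 1):
    for smile in (0, 1):
        # Indices belonging to one of the four (gender, smile) groups
        idx = np.where((labels[:, 0] == male) & (labels[:, 1] == smile))[0]
        picked.append(idx[:per_group])

picked = np.concatenate(picked)  # 4 groups * 550 = 2200 indices
rng.shuffle(picked)              # shuffle before splitting

train_idx, test_idx = picked[:2000], picked[2000:]
train_images, train_labels = images[train_idx], labels[train_idx]
test_images, test_labels = images[test_idx], labels[test_idx]
print(train_images.shape, test_images.shape)
```

The four arrays could then be written to disk with np.savez so the reduction does not have to be repeated.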





Exploring the celeba_small data
(1) Loading the data and splitting it into train and test

import numpy as np
import matplotlib.pyplot as plt

celeba_small = np.load('./celeba_small.npz')

# Pull out each array
train_images = celeba_small['train_images']
train_labels = celeba_small['train_labels']
test_images = celeba_small['test_images']
test_labels = celeba_small['test_labels']





(2) Visualizing a single image

plt.imshow(train_images[777]) 
plt.colorbar() 
plt.show() 
print(train_labels[777])

This pulls out image 777.

Here is the result. She looks like she might be smiling, but the label says non-smiling woman!





(3) Checking the range, shape, and data type of the data
- Range

# Look at just 50 nonzero values
train_images[train_images != 0][:50]
test_images[test_images != 0][:50]
# Minimum / maximum of the data
print(train_images.min(), train_images.max())
print(train_labels.min(), train_labels.max())
print(test_images.min(), test_images.max())
print(test_labels.min(), test_labels.max())

The 50 printed values should all lie between 0 and 1. The image range should come out as 0.0-1.0 and the label range as 0-1.

- Shape

print(train_images.shape, test_images.shape) 
print(train_labels.shape, test_labels.shape) 

(2000, 72, 59, 3) (200, 72, 59, 3)
(2000, 2) (200, 2)
If the output looks like this, all is well! The images were downscaled during the reduction, so they are smaller than the originals. Unlike MNIST in the previous project, there is an extra axis of 3 channels, which shows that the images are in color!

- Data type

print(train_images.dtype, test_images.dtype) 
print(train_labels.dtype, test_labels.dtype) 

float64 float64
int8 int8
If the output looks like this, all is well! What this tells us is that the dtype is float64 and the range is 0.0-1.0, so no normalization is needed.

Make a habit of checking the range, shape, and data type often!
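One way to make the habit cheap is to wrap the three checks in a small helper. The function describe below is a hypothetical convenience written for this sketch, and the arrays are random stand-ins with the same shapes and dtypes as the ones above.

```python
import numpy as np

def describe(name, arr):
    """Print the value range, shape, and dtype of an array on one line."""
    print(f"{name}: range=({arr.min()}, {arr.max()}) shape={arr.shape} dtype={arr.dtype}")
    return arr.min(), arr.max(), arr.shape, arr.dtype

# Random stand-ins for test_images and test_labels
images = np.random.random((200, 72, 59, 3))
labels = np.random.randint(0, 2, size=(200, 2)).astype(np.int8)

image_info = describe("images", images)
label_info = describe("labels", labels)
```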

The next post will cover preprocessing and visualization.

(The code in this project is based on the Fast Campus deep learning course.)


<Previous post>
https://silvercoding.tistory.com/4

[MNIST project] 2. Preprocessing and visualizing the MNIST dataset




Adding noise

https://www.tensorflow.org/tutorials/images/data_augmentation

Data augmentation | TensorFlow Core

First, data augmentation is a technique that increases the diversity of the training set by applying random transformations.

Source: https://www.tensorflow.org/tutorials/images/data_augmentation

As in this picture, a person can tell it is the same flower whether it is rotated or zoomed, but to a computer each version is a different input image. Random transformations like these are therefore applied to diversify the training set.

This post uses MNIST data with this kind of noise applied.

(1) Generating random noise of shape (28, 28)
- np.random.random

print(np.random.random((2, 2))) 

The np.random.random() function returns real numbers between 0 and 1. Passing a shape in the parentheses

returns random values in that shape, here (2, 2).

np.random.random((28,28)).shape

So this produces random noise of shape (28, 28).
Passing it to plt.imshow() shows a noise image like the one seen above.

It is darker than the picture above, though. This is too strong to use as noise.

- np.random.normal

print(np.random.normal(0.0, 0.1, (28, 28))) 

So np.random.normal is used instead, which lets us set the mean and standard deviation. Here the mean is 0 and the standard deviation is 0.1.
Plotting this gives:

A reasonable amount of noise!




(2) Applying it to a single image
Let's add noise to image 777.

noisy_image = train_images[777] + np.random.normal(0.5, 0.1, (28, 28))

The mean is set to 0.5 to make the difference easier to see.

Plotting it shows the noise, but some values now exceed 1.

noisy_image[noisy_image > 1.0] = 1.0 

Replacing every value above 1.0 with 1.0

completes a noisy image whose values lie between 0 and 1.
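The same idea can be written with np.clip, which bounds both ends in one call. This is a sketch on a random stand-in image, not the MNIST data, and using np.clip instead of the boolean-mask assignment is a substitution made here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-in for one normalized 28x28 MNIST image
image = rng.random((28, 28))

# Add Gaussian noise with mean 0.5 and standard deviation 0.1
noisy = image + rng.normal(0.5, 0.1, image.shape)
print(noisy.min(), noisy.max())  # the maximum goes above 1 before clipping

# np.clip limits both ends, so a rare value below 0 is handled as well
clipped = np.clip(noisy, 0.0, 1.0)
print(clipped.min(), clipped.max())
```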




(3) Applying noise to every image

train_noisy_images = train_images + np.random.normal(0.5, 0.1, train_images.shape) 
train_noisy_images[train_noisy_images > 1.0] = 1.0 

test_noisy_images = test_images + np.random.normal(0.5, 0.1, test_images.shape) 
test_noisy_images[test_noisy_images > 1.0] = 1.0

This is the final code, which applies noise to both the train and test images.
Using the method from last time for visualizing several images at once, printing the first five images shows that the noise was applied correctly.





Finally,
modeling

(1) Preparing for modeling - one-hot encoding the labels: (batch size,) -> (batch size, number of classes)
The labels, which had shapes (60000,) and (10000,), will be one-hot encoded to (60000, 10) and (10000, 10).

from keras.utils import to_categorical 
train_labels = to_categorical( train_labels, 10) 
test_labels = to_categorical( test_labels, 10) 

Import to_categorical from keras.utils and call it as to_categorical(labels to encode, number of classes).




(2) Building a SimpleRNN classification model

from keras.layers import SimpleRNN
from keras.layers import Dense, Input
from keras.models import Model

inputs = Input(shape=(28, 28))
x1 = SimpleRNN(64, activation="tanh")(inputs)
x2 = Dense(10, activation="softmax")(x1)

model = Model(inputs, x2)

The model is built with SimpleRNN from keras.layers, which reads each 28x28 image as 28 time steps of 28 features. The activation functions are tanh for the RNN layer and softmax for the output layer.

model.summary()

The summary function prints a summary of the model, including the number of parameters and the output shape of each layer.
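The parameter counts that summary reports for this architecture can be checked by hand. The sketch below uses the standard formulas for a simple recurrent layer and a dense layer; the two helper functions are written for this illustration.

```python
def simple_rnn_params(input_dim, units):
    # input kernel + recurrent kernel + bias
    return input_dim * units + units * units + units

def dense_params(input_dim, units):
    # kernel + bias
    return input_dim * units + units

rnn = simple_rnn_params(28, 64)  # 28 features per time step, 64 units
dense = dense_params(64, 10)     # 64 RNN outputs feeding 10 classes
print(rnn, dense, rnn + dense)   # 5952 650 6602
```

The number of time steps (28 rows) does not appear in the count, because the same weights are reused at every step.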




(3) Setting the loss, optimizer, and metrics

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics = ["accuracy"])


The compile function sets the loss function to categorical crossentropy, the optimizer to adam, and the metric to accuracy.
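To get a feel for what categorical crossentropy measures, here is a minimal NumPy sketch of the formula. It is an illustration only; the Keras implementation differs in details, and the function below is not the library's code.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[0.0, 1.0, 0.0]])        # one-hot target: class 1
confident = np.array([[0.05, 0.90, 0.05]])  # most probability on the right class
unsure = np.array([[0.40, 0.30, 0.30]])     # probability spread out

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident)  # about 0.105
print(loss_unsure)     # about 1.204
```

With a one-hot target, only the probability assigned to the correct class contributes, so the loss shrinks toward 0 as that probability approaches 1.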




(4) Training

hist = model.fit(train_noisy_images, train_labels, validation_data=(test_noisy_images, test_labels), epochs=5, verbose=2)

The other arguments are self-explanatory, but I wasn't sure what verbose does, so I looked it up.

verbose: 'auto', 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. 'auto' defaults to 1 for most cases, but 2 when used with ParameterServerStrategy. Note that the progress bar is not particularly useful when logged to a file, so verbose=2 is recommended

<Source>
https://keras.io/api/models/model_training_apis/

Keras documentation: Model training APIs



** Comparison
- verbose = 1

Shows a progress bar plus the metric values

- verbose = 2

Shows only the metric values


(5) Checking the training results

plt.plot(hist.history['accuracy'], label='accuracy')
plt.plot(hist.history['loss'], label='loss')
plt.plot(hist.history['val_accuracy'], label='val_accuracy')
plt.plot(hist.history['val_loss'], label='val_loss')
plt.legend(loc='upper left')
plt.show()

Plotting the training history shows that accuracy is very high and loss is very low. Even a simple RNN model performs well!


--- Checking the finished model's output on a single test image

res = model.predict( test_noisy_images[777:778] ) 

Let's check image 777.

plt.bar(range(10), res[0], color='red')
plt.bar(np.array(range(10)) + 0.35, test_labels[777])
plt.show()

The red bars are the predicted probabilities and the blue bars are the correct answer. The model correctly predicts 1, with only tiny probabilities assigned to 7 and 8. The performance looks decent.




(6) Evaluating on the test dataset

loss, acc = model.evaluate(test_noisy_images, test_labels, verbose=2)
print(loss, acc)

Pass the test dataset to evaluate.

Model evaluation finishes with an accuracy of 95%.




(7) Saving and loading the model

# Save the model
model.save("./mnist_rnn.h5")
# Load the model
new_model = tf.keras.models.load_model('./mnist_rnn.h5')

Just save it in h5 format.


** If you used Colab, this code downloads the model saved on Colab to your computer

from google.colab import files 
files.download('./mnist_rnn.h5')

 
