
sklearn.cross_validation.train_test_split(*arrays, **options)

by Jang HyunWoong 2014. 12. 19.

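The function sklearn.cross_validation.train_test_split(*arrays, **options) splits arrays or matrices into random train and test subsets. In scikit-learn 0.18 and later the same function is imported from sklearn.model_selection; the sklearn.cross_validation module was removed in 0.20.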

 

Usage: split the data into a training set and a test set, for example 80/20 (test_size=0.2 keeps 20% of the samples for testing).

from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

 

Examples

>>>
>>> import numpy as np
>>> from sklearn.cross_validation import train_test_split
>>> a, b = np.arange(10).reshape((5, 2)), range(5)
>>> a
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
>>> list(b)
[0, 1, 2, 3, 4]
>>>
>>> a_train, a_test, b_train, b_test = train_test_split(
...     a, b, test_size=0.33, random_state=42)
...
>>> a_train
array([[4, 5],
       [0, 1],
       [6, 7]])
>>> b_train
[2, 0, 3]
>>> a_test
array([[2, 3],
       [8, 9]])
>>> b_test
[1, 4]
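
To make the idea behind the split concrete, here is a minimal standard-library sketch of the same kind of operation. It is not scikit-learn's implementation: simple_train_test_split and its seed parameter are hypothetical names invented for this illustration, and the shuffled order will differ from what scikit-learn produces with random_state=42. The sketch assumes the test count is rounded up, which is consistent with the 3/2 split of 5 samples at test_size=0.33 shown above.

```python
import math
import random


def simple_train_test_split(*arrays, test_size=0.25, seed=None):
    """Hypothetical helper: split equal-length sequences into train/test parts."""
    n_samples = len(arrays[0])
    if any(len(a) != n_samples for a in arrays):
        raise ValueError("all inputs must have the same length")

    # Assumption: round the test count up; everything else goes to training.
    n_test = math.ceil(test_size * n_samples)

    # Shuffle the row indices once so every input is split the same way.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    # Return train/test pairs in the same order as the inputs.
    result = []
    for a in arrays:
        result.append([a[i] for i in train_idx])
        result.append([a[i] for i in test_idx])
    return result


a = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
b = list(range(5))
a_train, a_test, b_train, b_test = simple_train_test_split(
    a, b, test_size=0.33, seed=42)
print(len(a_train), len(a_test))  # prints: 3 2
```

Because one shuffled index list is applied to every input, each row of a_train still lines up with its label in b_train, which is the property that makes it safe to pass X and y to the function together.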

 
