Problem using nltk.pos_tag

I downloaded nltk-3.0.0.win32.exe from www.nltk.org, installed it, and ran the following:

>>> import nltk

>>> text = nltk.word_tokenize("And now for something completely different")

>>> text

['And', 'now', 'for', 'something', 'completely', 'different']

Tokenizing worked as expected.

However, calling nltk.pos_tag(text) raised the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte in position 0: ordinal not in range(128)
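
For context, Python raises this kind of error whenever it tries to decode bytes outside the 7-bit ASCII range with the 'ascii' codec. The sketch below reproduces the same class of error without NLTK. The helper name try_ascii_decode and the sample bytes are made up for illustration, and the sketch does not show where inside NLTK the error was actually raised.

```python
# Two bytes that are valid UTF-8 (the letter "é") but not valid ASCII.
raw = b"\xc3\xa9"


def try_ascii_decode(data):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)


message = try_ascii_decode(raw)
print(message)
```

The first byte is already above 127, so decoding fails at position 0, the same position reported in the error above.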

Easy fix: go to http://www.nltk.org/nltk3-alpha/ and download the package from there.

After reinstalling:

>>> nltk.pos_tag(text)

[('And', 'CC'), ('now', 'RB'), ('for', 'IN'), ('something', 'NN'), ('completely', 'RB'), ('different', 'JJ')]

The tags come out correctly!
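
If the error persists after reinstalling, it may be worth confirming which NLTK build Python is actually importing. This is a minimal sketch: the function name installed_nltk_version is hypothetical, and it only assumes that the nltk package exposes a __version__ attribute.

```python
def installed_nltk_version():
    """Return the version string of the importable NLTK, or None if unavailable."""
    try:
        import nltk
    except ImportError:
        # NLTK is not installed for this interpreter.
        return None
    # Fall back to None if the attribute is missing in this build.
    return getattr(nltk, "__version__", None)


version = installed_nltk_version()
print("NLTK version:", version)
```

Running this with the same interpreter that produced the error shows whether the old install is still the one being picked up.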
