Windows - NLTK v3.2: Unable to nltk.pos_tag()
Hi text mining champions,
I'm using Anaconda with NLTK v3.2 on Windows 10 (the client's environment).
When I try to POS tag, I keep getting a urllib2 error:
URLError: <urlopen error unknown url type: c>
It seems urllib2 is unable to recognize Windows paths? How can I work around this?
The command is as simple as:
nltk.pos_tag(nltk.word_tokenize("hello world"))
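To illustrate why the drive letter ends up reported as an "unknown url type", here is a minimal sketch. It assumes Python 3's urllib.parse splits the scheme the same way urllib2 does in this case, and the path is a hypothetical stand-in for wherever the tagger pickle lives.

```python
from urllib.parse import urlparse

# Hypothetical Windows path, standing in for the tagger pickle location.
windows_path = r"C:\nltk_data\taggers\averaged_perceptron_tagger\averaged_perceptron_tagger.pickle"

# Everything before the first colon is read as the URL scheme,
# so the drive letter is mistaken for one (and lower-cased).
scheme = urlparse(windows_path).scheme
print(scheme)
```

Since no URL handler is registered for a scheme named after a drive letter, the open call has nothing to dispatch to.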
Edit: There is a duplicate question, but I think the answers obtained here from Manan and alvas are a better fix.
Edited:
This issue has been resolved in NLTK v3.2.1. Upgrading your NLTK version should resolve the issue, e.g. pip install -U nltk.
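To check whether an installed version already includes the fix, something like the following could be used. This is only a sketch: has_pos_tag_fix is a hypothetical helper name, and it looks only at the leading numeric parts of the version string.

```python
import re

def has_pos_tag_fix(version):
    """Return True if the version string is 3.2.1 or later."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    # A missing patch component (e.g. "3.2") is treated as 0.
    major, minor, patch = (int(part or 0) for part in match.groups())
    return (major, minor, patch) >= (3, 2, 1)

print(has_pos_tag_fix("3.2"))
print(has_pos_tag_fix("3.2.1"))
```

In a live session the string would typically come from nltk.__version__.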
I faced the same issue, and the error I encountered was as follows:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py", line 141, in __init__
    self.load(AP_MODEL_LOC)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py", line 209, in load
    self.model.weights, self.tagdict, self.classes = load(loc)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\data.py", line 801, in load
    opened_resource = _open(resource_url)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\data.py", line 924, in _open
    return urlopen(resource_url)
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 391, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 414, in _open
    'unknown_open', req)
  File "C:\Python27\lib\urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1206, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib2.URLError: <urlopen error unknown url type: c>

The URLError mentioned above is due to a bug in the perceptron.py file within the NLTK library on Windows. On my machine, the file is at this location:
C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py
(Basically, look at the equivalent location on your machine, wherever you have the Python27 folder.)
The bug was in the code that finds the location of the averaged_perceptron_tagger on your machine. One can have a look at lines 801 and 924 of the data.py file, mentioned in the traceback, regarding this.
I think the NLTK developer community has fixed this bug in the code. Have a look at the commit made to the code a few days back.
The snippet where the change was made is as follows:
        self.tagdict = {}
        self.classes = set()
        if load:
            AP_MODEL_LOC = 'file:'+str(find('taggers/averaged_perceptron_tagger/'+PICKLE))
            self.load(AP_MODEL_LOC)
            # Was: AP_MODEL_LOC = str(find('taggers/averaged_perceptron_tagger/'+PICKLE))

    def tag(self, tokens):

Updating the file to the most recent commit worked for me, and I was able to use the nltk.pos_tag command. I believe this will resolve your problem as well (assuming you have everything else set up).
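The effect of that one-line change can be sketched outside NLTK as follows. This is an illustration only: as_file_url is a hypothetical helper, the path is made up, and Python 3's urllib.parse is used in place of urllib2.

```python
from urllib.parse import urlparse

def as_file_url(local_path):
    """Prefix a local path with 'file:' so the drive letter is not read as a scheme."""
    return "file:" + str(local_path)

# Hypothetical location of the tagger pickle on a Windows machine.
model_path = r"C:\nltk_data\taggers\averaged_perceptron_tagger\averaged_perceptron_tagger.pickle"

parsed = urlparse(as_file_url(model_path))
print(parsed.scheme)  # the scheme is now "file" rather than the drive letter
print(parsed.path)    # the original path, drive letter included
```

With the explicit prefix, the drive letter stays part of the path instead of being treated as the URL type.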