windows - NLTK v3.2: Unable to nltk.pos_tag()


Hi text mining champions,

I'm using Anaconda with NLTK v3.2 on Windows 10 (the client's environment).

When I try to POS tag, I keep getting a urllib2 error:

URLError: <urlopen error unknown url type: c>

It seems urllib2 is unable to recognize Windows paths. How can I work around this?

The command is as simple as:

nltk.pos_tag(nltk.word_tokenize("hello world"))

Edit: There is a duplicate question, but I think the answers obtained here from Manan and alvas are a better fix.

Edited

This issue has been resolved as of NLTK v3.2.1. Upgrading your NLTK version should resolve it, e.g.

pip install -U nltk
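If it is unclear whether an installed NLTK already includes that fix, comparing its version string against 3.2.1 is one way to check. The sketch below is only illustrative: parse_version and has_pos_tag_fix are hypothetical helper names (not part of NLTK), and it assumes plain dotted version strings.

```python
import re


def parse_version(version):
    """Turn a dotted version string such as '3.2.1' into a tuple of ints.

    Only the leading digits of each dotted piece are used, so a suffix
    like 'rc1' is ignored (an assumption made for this sketch).
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def has_pos_tag_fix(version):
    """Return True if the version is 3.2.1 or later."""
    return parse_version(version) >= (3, 2, 1)


print(has_pos_tag_fix("3.2"))    # False: upgrade needed
print(has_pos_tag_fix("3.2.1"))  # True
```

With NLTK installed, the same check could be run against the real version string via has_pos_tag_fix(nltk.__version__).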


I faced the same issue, and the error I encountered was as follows:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py", line 141, in __init__
    self.load(AP_MODEL_LOC)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py", line 209, in load
    self.model.weights, self.tagdict, self.classes = load(loc)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\data.py", line 801, in load
    opened_resource = _open(resource_url)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\data.py", line 924, in _open
    return urlopen(resource_url)
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 391, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 414, in _open
    'unknown_open', req)
  File "C:\Python27\lib\urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1206, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib2.URLError: <urlopen error unknown url type: c>

The URLError mentioned is due to a bug in the perceptron.py file within the NLTK library on Windows. On my machine, the file is at this location:

C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py

(Basically, look at the equivalent location on yours, wherever you have the Python27 folder.)

The bug was in the code that finds the location of the averaged_perceptron_tagger on your machine. One can have a look at lines 801 and 924 of data.py, mentioned in the traceback, regarding this.

I think the NLTK developer community has since fixed this bug in the code. Have a look at this commit, made to the code a few days back:

https://github.com/nltk/nltk/commit/d3de14e58215beebdccc7b76c044109f6197d1d9#diff-26b258372e0d13c2543de8dbb1841252

The snippet where the change was made is as follows:

    self.tagdict = {}
    self.classes = set()
    if load:
        AP_MODEL_LOC = 'file:'+str(find('taggers/averaged_perceptron_tagger/'+PICKLE))
        self.load(AP_MODEL_LOC)
        # Initially it was: AP_MODEL_LOC = str(find('taggers/averaged_perceptron_tagger/'+PICKLE))

def tag(self, tokens):

Updating the file to the most recent commit worked for me, and I was able to use the nltk.pos_tag command. I believe this will resolve your problem as well (assuming you have everything else set up).
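To see why the 'file:' prefix in that change matters, it may help to look at how Python's URL parser splits a bare Windows path: everything before the first colon is read as the URL scheme, so the drive letter becomes the "unknown url type: c" from the error. The following is a minimal sketch using only the standard library. The path is a hypothetical location chosen for illustration, and url_scheme is a helper name introduced here, not part of NLTK.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, as used in the question

# Hypothetical model location, for illustration only.
bare_path = r"C:\nltk_data\taggers\averaged_perceptron_tagger\averaged_perceptron_tagger.pickle"
prefixed_path = "file:" + bare_path


def url_scheme(location):
    """Return the scheme the URL parser detects for the given location."""
    return urlparse(location).scheme


print(url_scheme(bare_path))      # c    (the drive letter is taken as the scheme)
print(url_scheme(prefixed_path))  # file
```

With the prefix in place, the location is treated as a file URL rather than as an unrecognised scheme named after the drive letter.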

