python - importing user defined library in redshift UDF -


I am trying to import a library inside a user-defined Python function (UDF) in Redshift.

I have created a library called nltk as follows:

create or replace library nltk language plpythonu from 's3://nltk.zip' credentials 'aws_access_key_id=*****;aws_secret_access_key=****';

Once it was created, I tried to import it in a function as:

create or replace function f_function (sentence varchar)
returns varchar
stable
as $$
    from nltk import tokenize
    token = nltk.word_tokenize(sentence)
    return token
$$ language plpythonu;

tokenize is a subdirectory inside the nltk library.

But when I try to run the function by calling it on a table as:

select f_function(text) from table_txt;

I am getting an error like this:

amazon invalid operation: importerror: no module named nltk. please look at svl_udf_log for more information
details:
-----------------------------------------------
error: importerror: no module named nltk. please look at svl_udf_log for more information
code: 10000
context: udf
query: 69145
location: udf_client.cpp:298
process: query0_21 [pid=3165]

Can anyone tell me what I am doing wrong?

First, there is an obvious problem in the Python code: you never import nltk, yet you call nltk.word_tokenize.
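
A minimal corrected sketch of the function, assuming the zipped nltk package imports cleanly inside the UDF; note that word_tokenize returns a list, so it is joined back into a single varchar here:

create or replace function f_function (sentence varchar)
returns varchar
stable
as $$
    import nltk
    # word_tokenize returns a list of tokens; join them so the UDF
    # can return a single varchar value
    return ' '.join(nltk.word_tokenize(sentence))
$$ language plpythonu;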

Second, after downloading the nltk package, you need to zip the module folder inside the package and upload that zip to S3 for Redshift to load.

nltk-x.y.zip
├─ setup.py
├─ requirements.txt
├─ nltk            <- this folder should be zipped and uploaded to S3
│  ├─ __init__.py
│  ├─ tokenize.py
│  └─ ...
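
One way to produce such a zip is sketched below; it assumes the package was extracted to ./nltk-x.y and that my-bucket/nltk.zip is a placeholder bucket and key you control.

import os
import zipfile
import boto3

package_root = 'nltk-x.y'   # extracted nltk source distribution (placeholder path)
zip_path = 'nltk.zip'

# zip only the inner "nltk" module folder, so the archive's top-level entry is "nltk/"
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
    module_dir = os.path.join(package_root, 'nltk')
    for root, dirs, files in os.walk(module_dir):
        for name in files:
            full_path = os.path.join(root, name)
            # store paths relative to the package root, e.g. "nltk/tokenize.py"
            zf.write(full_path, os.path.relpath(full_path, package_root))

# upload to S3 (bucket and key are placeholders)
boto3.client('s3').upload_file(zip_path, 'my-bucket', 'nltk.zip')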

Redshift can load such modules, but the root folder of the zip should contain an __init__.py file. See http://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.html
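
Once the corrected zip is in S3, you can recreate the library from it; the bucket name below is a placeholder and the credentials follow the same form as in the question. If the import still fails, the svl_udf_log view mentioned in the error message shows the Python log output for the failing query id:

-- recreate the library from the corrected zip (placeholder bucket)
create or replace library nltk
language plpythonu
from 's3://my-bucket/nltk.zip'
credentials 'aws_access_key_id=*****;aws_secret_access_key=****';

-- inspect UDF log messages for the query id reported in the error
select * from svl_udf_log where query = 69145;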

