pyflink, ImportError: No module named pyflink - apache-flink

I am testing PyFlink with the following setup:
os: centos7
flink version: flink-1.14.3
virtualenv python version: Python 3.6.8
pip list:
apache-beam 2.27.0
apache-flink 1.14.3
apache-flink-libraries 1.14.3
avro-python3 1.9.2.1
certifi 2021.10.8
charset-normalizer 2.0.11
cloudpickle 1.2.2
crcmod 1.7
dill 0.3.1.1
docopt 0.6.2
fastavro 0.23.6
future 0.18.2
grpcio 1.43.0
hdfs 2.6.0
httplib2 0.17.4
idna 3.3
mock 2.0.0
numpy 1.19.5
oauth2client 4.1.3
pandas 1.1.5
pbr 5.8.1
pip 21.3.1
protobuf 3.17.3
py4j 0.10.8.1
pyarrow 2.0.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pydot 1.4.2
pyflink 1.0
pymongo 3.12.3
pyparsing 3.0.7
python-dateutil 2.8.0
pytz 2021.3
requests 2.27.1
rsa 4.8
setuptools 59.6.0
six 1.16.0
typing-extensions 3.7.4.3
urllib3 1.26.8
wheel 0.37.1
I tried to run this command:
(virtualenv) [myuser#myvm flink-1.14.3] ./bin/flink run -py examples/python/table/word_count.py
And got the following error:
Caused by: java.io.IOException: Failed to execute the command: python -c import pyflink;import os;print(os.path.join(os.path.abspath(os.path.dirname(pyflink.__file__)), 'bin'))
output: Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: No module named pyflink
I am sure the pyflink package is already installed. Does anyone know why this happens?

To install PyFlink, you only need to execute:
python -m pip install apache-flink
and make sure you have a compatible Python version (>= 3.5).
The problem may be with the Python virtual environment; see https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/faq/#preparing-python-virtual-environment
You could also try adding the option '-pyexec venv.zip/venv/bin/python3'.
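One way to narrow this down is to check what the bare `python` command resolves to, since that is the command shown in the error. The traceback wording (`ImportError: No module named pyflink`, with no quotes around the module name) is what Python 2 prints, so `python` may be resolving to a system Python 2 rather than the virtualenv interpreter. The sketch below is only a diagnostic aid; the helper name `can_import` is made up for illustration.

```python
import shutil
import subprocess
import sys


def can_import(interpreter, module):
    """Return True if `interpreter -c "import <module>"` exits successfully."""
    result = subprocess.run(
        [interpreter, "-c", "import " + module],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # The failing command in the error starts with plain `python`,
    # so look that name up on PATH.
    on_path = shutil.which("python")
    print("python on PATH:", on_path)
    print("current interpreter:", sys.executable)
    if on_path is not None:
        print("PATH python can import pyflink:", can_import(on_path, "pyflink"))
    print("current interpreter can import pyflink:",
          can_import(sys.executable, "pyflink"))
```

If the two interpreters differ, or only one of them can import pyflink, pointing Flink at the right interpreter (for example with '-pyexec') is the likely fix.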

Check that pyflink is installed correctly in your venv.
Also check that a Flink cluster is running.
If it is not, start it with:
start-cluster.sh
The full PyFlink documentation is here:
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/python/overview/

Related

Django deployment using lightsail site-packages are not updated

I am trying to deploy my Django application on AWS Lightsail.
I've been following the AWS tutorial and thought I understood it quite well.
I've successfully tested the application on 0.0.0.0:8000 and the app works fine.
However, when I deployed with Apache, it keeps giving me Internal Server Error 500, so I looked at the error logs to investigate.
Based on my investigation, it seems that none of the pip packages I installed for my app are in the system Python's site-packages.
This is the list of packages I get when I run pip list:
pip list
Package Version
appdirs 1.4.4
argon2-cffi 20.1.0
asgiref 3.2.10
astroid 2.4.2
async-generator 1.10
attrs 20.2.0
backcall 0.2.0
bleach 3.2.1
boto3 1.15.18
botocore 1.18.18
cffi 1.14.3
decorator 4.4.2
defusedxml 0.6.0
distlib 0.3.1
Django 3.1
django-carton 1.2.1
django-extensions 3.0.9
django-storages 1.10.1
entrypoints 0.3
filelock 3.0.12
importlib-metadata 2.0.0
ipykernel 5.3.4
ipython 7.16.1
ipython-genutils 0.2.0
ipywidgets 7.5.1
isort 5.6.4
jedi 0.17.2
Jinja2 2.11.2
jmespath 0.10.0
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 6.1.7
jupyter-console 6.2.0
jupyter-core 4.6.3
jupyterlab-pygments 0.1.2
lazy-object-proxy 1.4.3
MarkupSafe 1.1.1
mccabe 0.6.1
mistune 0.8.4
mysqlclient 1.4.6
nbclient 0.5.1
nbconvert 6.0.7
nbformat 5.0.8
nest-asyncio 1.4.1
notebook 6.1.4
packaging 20.4
pandocfilters 1.4.2
parso 0.7.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.0.0
pip 20.2.1
prometheus-client 0.8.0
prompt-toolkit 3.0.8
psycopg2 2.8.5
ptyprocess 0.6.0
pycparser 2.20
Pygments 2.7.1
pylint 2.6.0
pyparsing 2.4.7
pyrsistent 0.17.3
python-dateutil 2.8.1
pytz 2020.1
pyzmq 19.0.2
qtconsole 4.7.7
QtPy 1.9.0
s3transfer 0.3.3
Send2Trash 1.5.0
setuptools 46.4.0
six 1.15.0
sorl-thumbnail 12.6.3
sqlparse 0.4.1
terminado 0.9.1
testpath 0.4.4
toml 0.10.1
tornado 6.0.4
traitlets 4.3.3
typed-ast 1.4.1
urllib3 1.25.10
virtualenv 20.0.30
wcwidth 0.2.5
webencodings 0.5.1
widgetsnbextension 3.5.1
wrapt 1.12.1
zipp 3.3.1
However, these are not shown in the site-packages folder.
bitnami#ip-172-26-7-56:/opt/bitnami/python/lib/python3.8/site-packages$ ls
appdirs-1.4.4-py3.8.egg mysqlclient-1.4.6-py3.8.egg-info setuptools-46.4.0-py3.8.egg
asgiref MySQLdb setuptools.pth
asgiref-3.2.10.dist-info pip-20.2.1-py3.8.egg six-1.15.0-py3.8.egg
distlib-0.3.1-py3.8.egg psycopg2 sqlparse
django psycopg2-2.8.5-py3.8.egg-info sqlparse-0.3.1.dist-info
Django-3.1-py3.8.egg-info pytz virtualenv-20.0.30-py3.8.egg
easy-install.pth pytz-2020.1.dist-info
filelock-3.0.12-py3.8.egg README.txt
As a result, when the app is run by Apache, there is an internal server error and the packages I installed are not recognised.
I have spent about 3 days in a row trying to figure out deployment. For me, deployment is the most difficult task and I don't know where I am going wrong. Could anyone recommend the easiest way to deploy a Django application?
Thanks a million!!
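A mismatch like this usually means that the `pip` on the shell's PATH and the Python that Apache runs are looking at different directories. A minimal sketch for checking that, assuming you can run the same snippet both from your shell and from inside the deployed app (the function name `interpreter_report` is hypothetical):

```python
import site
import sys


def interpreter_report():
    """Collect where this interpreter lives and where it looks for packages."""
    get_site = getattr(site, "getsitepackages", None)
    get_user = getattr(site, "getusersitepackages", None)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Some older virtualenv layouts lack these helpers, hence the guards.
        "site_packages": list(get_site()) if get_site else [],
        "user_site": get_user() if get_user else None,
        "user_site_enabled": bool(getattr(site, "ENABLE_USER_SITE", False)),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(key, "=", value)
```

If the two reports show different executables or site-packages directories, installing with `python -m pip install ...` using the interpreter Apache runs is one way to keep them aligned.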

PyFlink - Kafka - Missing module

I am trying to get started with PyFlink and Kafka, but I get the error below.
Thanks for your support !
Installation
python -m pip install apache-flink
pip install pyFlink
Code
from pyFlink.datastream import StreamExecutionEnvironment
Error
ModuleNotFoundError: No module named 'pyFlink'
To install PyFlink, you only need to execute:
python -m pip install apache-flink
and make sure you have a compatible Python version (>= 3.5).
Imports are case-sensitive; the error is thrown because the package name is "pyflink", not "pyFlink". So, instead, you can try:
from pyflink.datastream import StreamExecutionEnvironment
If you're going to use Kafka, please remember to also add the required (JAR) dependencies, using:
# t_env is an existing TableEnvironment
config = t_env.get_config().get_configuration()
config.set_string("pipeline.jars", "file:///path/to/jar/jarfile.jar")
You can read more about handling connectors and other dependencies in the PyFlink documentation.
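To illustrate the case-sensitivity point, here is a small sketch that checks whether a module name is importable as written or only in lowercase. It uses the standard library's `importlib.util.find_spec`; the helper name `find_case_variant` is made up for this example.

```python
import importlib.util


def find_case_variant(name):
    """Return an importable spelling of a top-level module name, or None.

    Tries the name as given first, then its all-lowercase form.
    """
    for candidate in (name, name.lower()):
        if importlib.util.find_spec(candidate) is not None:
            return candidate
    return None


# The standard library module is spelled "json", so "JSON" itself is not found.
print(find_case_variant("JSON"))
```

With apache-flink installed, `find_case_variant("pyFlink")` would be expected to return "pyflink" in the same way.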

Getting HTTP Client error using the s3 client

Operating system is Ubuntu 16.04
Python version is 3
Installed the AWS CLI using pip
AWS CLI version is aws-cli/1.16.309 Python/3.6.9 Linux/4.20.17-042017-generic botocore/1.13.45
I set the HMAC creds correctly and typed the following on my command line:
ncheaz#thinkburger:~/Projects/drupal-interact$ aws --endpoint-url 'https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints' s3 ls
An HTTP Client raised and unhandled exception: __init__() got an unexpected keyword argument 'ssl_context'
Check your urllib3 version. Upgrading urllib3 to 1.25.11 (latest) solved the problem for me. https://pypi.org/project/urllib3/1.25.11/
You can check the installed urllib3 version from the Python interpreter. Note that the package to check is urllib3, not the standard-library urllib, which has no __version__ attribute in Python 3:
$ python
>>> import urllib3
>>> urllib3.__version__
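Once you have the version string, compare it numerically rather than as text, because plain string comparison orders "1.9.1" after "1.25.11". A minimal sketch, assuming plain dotted release numbers with no pre-release suffixes; the function names are illustrative, and "1.9.1" is just a made-up installed version:

```python
def version_tuple(version):
    """Turn a dotted release string such as '1.25.11' into (1, 25, 11)."""
    return tuple(int(part) for part in version.split("."))


def is_older(installed, required):
    """Return True if `installed` is a lower release than `required`."""
    return version_tuple(installed) < version_tuple(required)


# 1.25.11 is the version the answer above reports as working.
print(is_older("1.9.1", "1.25.11"))
```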

pipenv install glob fails

I tried to install glob in my virtual Python (version 3.5) environment. This is the error I got. I found similar questions on this site, but they were not much help.
$ pipenv install glob
Installing glob…
Collecting glob
Error: An error occurred while installing glob!
Could not find a version that satisfies the requirement glob (from versions: )
No matching distribution found for glob
The issue is that pipenv looks up the package at the URL specified under [[source]] in the Pipfile, and glob is not there. However, glob is part of the Python standard library, so you do not need to install it via pipenv; just write 'import glob' in your script and it should work.
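As a quick check that nothing needs installing, the following sketch uses only standard-library modules. The file names are made up for the example, and a temporary directory is used so that nothing is left behind.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files to match against.
    for name in ("a.txt", "b.txt", "notes.md"):
        open(os.path.join(tmp, name), "w").close()

    # glob.glob returns full paths; keep only the file names for display.
    matches = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(tmp, "*.txt"))
    )
    print(matches)
```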
You are using Python 3.x.
Here are the glob packages for each Python version.
For Python 2.7:
sudo pip install glob2
For Python 3.7:
sudo pip3 install glob3

Pandas and Numpy Import error when using Apache2, Anaconda and Django

I am getting the following error: Missing required dependencies ['numpy']
Standalone and via Django, without Apache2 integration, the code works like a charm. However, things start to fall apart when it is used with Apache2: it refuses to import pandas or numpy, giving one error after another.
I am using Apache2, libapache2-mod-wsgi-py3, Python 3.5 and Anaconda 2.3.0.
Request Method: GET
Request URL: http://127.0.0.1/api/users/0/
Django Version: 1.10.5
Exception Type: ImportError
Exception Value:
Missing required dependencies ['numpy']
Exception Location: /home/fractaluser/anaconda3/lib/python3.4/site-packages/pandas/__init__.py in <module>, line 18
Python Executable: /usr/bin/python3
Python Version: 3.5.2
Python Path:
['/home/fractaluser/anaconda3/lib/python3.4/site-packages',
'/home/fractaluser/anaconda3/lib/python3.4/site-packages/Sphinx-1.3.1-py3.4.egg',
'/home/fractaluser/anaconda3/lib/python3.4/site-packages/setuptools-27.2.0-py3.4.egg',
'/usr/lib/python35.zip',
'/usr/lib/python3.5',
'/usr/lib/python3.5/plat-x86_64-linux-gnu',
'/usr/lib/python3.5/lib-dynload',
'/usr/local/lib/python3.5/dist-packages',
'/usr/lib/python3/dist-packages',
'/var/www/html/cgmvp']
Server time: Fri, 9 Jun 2017 11:12:37 +0000
You can't force a mod_wsgi built against the system Python version to use a Python virtual environment built for a different Python version, or a different Python installation. That is what it appears you are doing. You would need to uninstall mod_wsgi and install it from source code, compiling it against the Anaconda Python distribution. It is best to use the pip install method and follow the steps to integrate it into your existing Apache installation. See:
https://pypi.python.org/pypi/mod_wsgi
Also see the following documentation for setting up a Python virtual environment with mod_wsgi, as it appears you aren't doing that in the recommended way either.
http://modwsgi.readthedocs.io/en/develop/user-guides/virtual-environments.html
First task though is to reinstall mod_wsgi.
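The mismatch is visible in the error page above: the executable is Python 3.5.2, but the first entries on the Python path point into a python3.4 tree. A rough sketch for spotting such entries, assuming the usual `pythonX.Y` directory naming (the function name `mismatched_paths` is hypothetical; the sample paths are taken from the question):

```python
import re
import sys


def mismatched_paths(paths, version=None):
    """Return entries whose 'pythonX.Y' directory differs from `version`.

    `version` is a (major, minor) pair; it defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info[:2]
    wanted = "python%d.%d" % tuple(version)
    pattern = re.compile(r"python\d+\.\d+")
    bad = []
    for path in paths:
        found = pattern.findall(path)
        if found and any(name != wanted for name in found):
            bad.append(path)
    return bad


reported = [
    "/home/fractaluser/anaconda3/lib/python3.4/site-packages",
    "/usr/lib/python3.5",
    "/usr/local/lib/python3.5/dist-packages",
    "/usr/lib/python3/dist-packages",
]
print(mismatched_paths(reported, (3, 5)))
```

Inside a running application, `mismatched_paths(sys.path)` would apply the same check to the live interpreter.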
I had the same problem using apache2 with mod_wsgi in a 64-bit Python 3.6 environment. The numpy version in use was 1.13; simply switching to the previous version made it work:
pip3 install numpy==1.12
