Is it possible to use PyFlink on Windows? - apache-flink

Has anyone ever had success running Flink with Python on Windows?
I'm trying the following command:
.\bin\pyflink.bat examples\python\WordCount.py
and getting the following error
Starting execution of program
Usage: ./bin/pyflink<2/3>.[sh/bat] <pathToScript>[ <pathToPackage1>[ <pathToPackageX]][ - <parameter1>[ <parameterX>]]
The program didn't contain a Flink job. Perhaps you forgot to call execute() on the execution environment.

The program didn't contain a Flink job. Perhaps you forgot to call
execute() on the execution environment.
It looks like you forgot to add
env.execute(local=True)
when running it locally. Without this call the Flink job cannot execute. See the simple example sketch below.
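A minimal sketch of a complete program, assuming the legacy batch Python API that pyflink.bat targets (module paths may differ slightly across Flink versions):

from flink.plan.Environment import get_environment

env = get_environment()
data = env.from_elements("hello", "world", "hello")
data.output()  # the plan needs at least one sink

# Without this call the plan is never submitted, which produces the
# "The program didn't contain a Flink job" error.
env.execute(local=True)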

I got this error on Windows 10. Maybe it is not supported at present.

In the upcoming release 1.11, PyFlink will support Windows in local mode. https://issues.apache.org/jira/browse/FLINK-12717

Related

What environment_config for Beam launching flink

I am hoping for guidance on how to set --environment_config when running the Beam wordcount.py demo.
It runs fine with the DirectRunner. Flink's own wordcount also runs fine (i.e. running Flink via flink run).
I would like to run Beam with the Flink runner against a "separate Flink cluster" as described in the Beam documentation. I can't use Docker, so I plan to use --environment_type=PROCESS.
I am using the following inside the python code to set environment_config:
import json
import platform

environment_config = dict()
environment_config['os'] = platform.system().lower()
environment_config['arch'] = platform.machine()
environment_config['command'] = 'ls'
ec = "--environment_config={}".format(json.dumps(environment_config))
Obviously the command is incorrect. When I run this, Flink does receive and successfully process the DataSource sub-tasks. It eventually times out on the CHAIN MapPartitions.
Could someone provide guidance (or links) as to how to set environment_config? I am running Beam within a Singularity container.
For environment_type=DOCKER, almost everything is taken care of for you, but in PROCESS mode you have to do a lot of the setup yourself. The command you're looking for is sdks/python/container/build/target/launcher/linux_amd64/boot. You will need both that executable (which you can build from source using ./gradlew :sdks:python:container:build) and a Python installation including Beam and other dependencies on all of your worker machines.
The best example I know of is here: https://github.com/apache/beam/blob/cbf8a900819c52940a0edd90f59bf6aec55c817a/sdks/python/test-suites/portable/py2/build.gradle#L146-L165
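As a concrete sketch of the options described above (the boot binary path and job endpoint are placeholders you must adapt to your own setup):

import json

environment_config = {
    # Path to the boot executable built with ./gradlew :sdks:python:container:build;
    # it must be present on every worker machine.
    "command": "/path/to/beam/sdks/python/container/build/target/launcher/linux_amd64/boot"
}

pipeline_args = [
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",  # assumed address of your Flink job server
    "--environment_type=PROCESS",
    "--environment_config={}".format(json.dumps(environment_config)),
]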

Flink 1.5-SNAPSHOT web interface doesn't work

I recently came across a bug in Flink, reported it (https://issues.apache.org/jira/browse/FLINK-8685), and found out that it had already been reported and a pull request had been created (https://github.com/apache/flink/pull/5174).
Now I clone 1.5-SNAPSHOT, apply the patch, and build Flink. Even though it builds (whether or not the patch is applied), when I run Flink (using start-cluster.sh), the web dashboard doesn't work, and the command
tail log/flink-*-jobmanager-*.log returns "tail: cannot open 'log/flink-*-jobmanager-*.log' for reading: No such file or directory".
I tested with a batch program and, surprisingly, it returned results in the terminal, but streaming programs and other things still don't work.
Any suggestions on this issue?
Thank you.
In case the Flink dashboard does not start, change the port in the conf file and restart; the default Flink port could be occupied by another process on Windows.
Also change the Flink log level to debug, as sketched below.
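For example (the key names vary by version: rest.port is the 1.5+ name, older releases use web.port):

# conf/flink-conf.yaml -- pick a port that is not already in use (8081 is the default)
rest.port: 8082

# conf/log4j.properties -- raise the log level to get more detail on startup failures
log4j.rootLogger=DEBUG, file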

Could I use QPython as a library to interpret a necessary Python subroutine?

I have an app which needs a Python library to support it,
so I would like to use QPython to execute Python code.
Has anyone tried to do this before?
I am looking for a way to use the relevant part of QPython for this purpose, instead of executing scripts in its terminal, and finally get the execution result back to my app.
Thanks~
Yes, you could call QPython's open API:
http://www.qpython.org/en/guide_extend.html

Process leaked file descriptors error on JENKINS

I am getting this error when I configured a job to stop and start a Tomcat server:
Process leaked file descriptors. See http://wiki.jenkins-ci.org/display/JENKINS/Spawning+processes+from+build for more information
When I googled it, the recommended solution was to set BUILD_ID=dontKillMe.
Is this the right solution?
If yes, where do I need to set BUILD_ID? Inside the Ant/post-build script?
Can anyone please clarify this?
Yes, setting a fake BUILD_ID for the process tells Jenkins to ignore it when detecting spawned processes, so the process will not be killed after the job finishes.
Usage: put BUILD_ID=dontKillMe before your command, for example in an Execute shell build step:
BUILD_ID=dontKillMe nohup ./yourStartScript.sh &
Note: See also nohup
By default, Jenkins will kill all spawned processes at the completion of a build.
To override this, you need to create the environment variable BUILD_ID.
Go to Jenkins -> Manage Jenkins -> Configure System.
Under the Global properties section, under Environment variables, click the Add button to add a new environment variable.
Give name=BUILD_ID and value=allow_to_run_as_daemon start_my_service
Click the Save button, and you are done.
Now the spawned process will continue to execute even after the build has completed.
Add this flag to JAVA_ARGS when you start your Jenkins server (I put mine in /etc/default/jenkins on my Ubuntu box):
-Dhudson.util.ProcessTree.disable=true
And you're done.
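For example, on a Debian/Ubuntu install the line in /etc/default/jenkins might look like this (the headless flag is the distribution default and is only shown for context):

JAVA_ARGS="-Djava.awt.headless=true -Dhudson.util.ProcessTree.disable=true"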
You are calling a command from Jenkins that spawns another process -
the tomcat-start command ends, but its child-process is still running
(this is the actual tomcat web-server you attempted to start).
Jenkins sometimes identifies this situation as a possible problem,
but the page you have mentioned also explains how to solve it
(in short: Don't start tomcat from Jenkins unless you know how).
I tried different suggestions but none of the options worked for me. Finally I switched to a previous version of Jenkins, from 2.3 to 1.581, and it worked.

How to run a MATLAB script on a server? Or is there an online MATLAB interpreter?

I have a MATLAB script, say Temp_script.m, which I want to execute on a remote server.
The remote server I am using is free online hosting which gives me 1.5 GB of storage.
Since the server is remote, I have no access to it to install MATLAB or a runtime environment.
Locally I can run the MATLAB script on my own machine, obviously, because I have MATLAB installed on my system.
My question: "Is there a method to run the script online, or is there any online interpreter for MATLAB?"
Thanks in advance
-Ryaan Dias
You can compile your project using deploytool.
This gives you several options: you can make a DLL and probably even an EXE.
However, the program is not going to run by itself, so if you want it to run automatically on the server you need to have a framework there. An example could be .NET, but I guess there are easier options.
AFAIK there isn't a web interface for MATLAB, and I doubt that the MATLAB license would cover such a use case. However, you could always try the open-source equivalent, Octave, which can execute MATLAB code with only minor modifications.
A quick Google search for "Octave web server" yielded many results. This was the first hit: http://knn.mimuw.edu.pl/weboctave-project/
