I'd like to know how to change the maximum number of concurrent tasks in a queue. I know that this can somehow be done in the yaml files, but is this possible using gcloud commands from the terminal?
You can use the following command to set the maximum number of concurrent tasks:
gcloud tasks queues update [QUEUE_ID] \
--max-dispatches-per-second=[DISPATCH_RATE] \
--max-concurrent-dispatches=[MAX_RUNNING]
To only set the maximum number:
gcloud tasks queues update [QUEUE_ID] \
--max-concurrent-dispatches=20
Related
If I want to run a Flink job on YARN, the command is
./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar
but this command starts a default cluster with 2 TaskManagers.
If I am only submitting a single job, why is the default number of TaskManagers set to 2?
And when do I need multiple TaskManagers for a single job?
The basic idea of any distributed data processing framework is to run the same job across multiple compute nodes. That way, an application that processes too much data for a single node can simply scale out to multiple nodes and could, in theory, process arbitrarily large amounts of data. I suggest reading up on the basic concepts of Flink.
Btw, there is no particular reason to have a default of 2 though. It could be any number, but it happens to be 2.
I am running a Flink streaming job on an AWS YARN cluster with the configuration below:
Master Node - 1, Core Node - 1, Task Nodes - 3
And I enabled
jobmanager.execution.failover-strategy: region
One of my task nodes is failing and trying to restart at the region level (in my case, at the task node level). I enabled the fixed-delay restart strategy with 5 attempts and a 5-minute delay, and my checkpoints are disabled.
Reference Image
If you look at the image, it is restarting more often than expected.
Can anybody help me understand why it is behaving like this?
The documentation has a section about the "Restart Pipelined Region Failover Strategy" [1]. The bottom line is, if you have a streaming job with an operator that physically partitions the stream, such as keyBy, all tasks will end up being in the same region, and therefore all tasks will be restarted as a whole. For batch jobs, you need to configure the ExecutionMode [2] to be BATCH or BATCH_FORCED.
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/task_failure_recovery.html#restart-pipelined-region-failover-strategy
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/org/apache/flink/api/common/ExecutionMode.html
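To illustrate why a keyBy collapses everything into one region, here is a minimal sketch in Python. It is a simplified model, not Flink's actual implementation: it assumes a region is just a connected component of tasks linked by pipelined edges, and the task names and the pipelined_regions helper are made up for the example.

```python
from collections import defaultdict


def pipelined_regions(tasks, pipelined_edges):
    """Group tasks into connected components over pipelined edges (union-find)."""
    parent = {t: t for t in tasks}

    def find(t):
        # Walk up to the root, compressing the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in pipelined_edges:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for t in tasks:
        groups[find(t)].add(t)
    return list(groups.values())


# Hypothetical job with parallelism 3: three source subtasks, three map subtasks.
sources = [f"source-{i}" for i in range(3)]
maps = [f"map-{i}" for i in range(3)]
tasks = sources + maps

# Forward connection: source-i only feeds map-i.
forward_edges = [(f"source-{i}", f"map-{i}") for i in range(3)]

# keyBy: every source subtask may send records to every map subtask.
keyby_edges = [(s, m) for s in sources for m in maps]

print(len(pipelined_regions(tasks, forward_edges)))  # 3
print(len(pipelined_regions(tasks, keyby_edges)))    # 1
```

With only forward connections, a failure in one subtask restarts just its own region. With the keyBy edges, every task lands in the same component, so any failure restarts all of them.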
I set up Apache Flink by following: quick_start
I am not able to perform the last step, i.e. 'Analyze the Result', because there is no result file inside the kmeans folder.
If you look at the above screenshot of the Flink JobManager, you can see the status FAILED for the KMeans Example. Maybe this failed status is the reason there is no result file inside the kmeans folder.
Now on clicking the KMeans Example, I get the following visualization:
And below is the screenshot of exceptions:
Could you please tell me what I am doing wrong?
The problem is that the cluster has been started with a single TaskManager that has only a single slot, while you are trying to execute the KMeans job with a parallelism of 4.
In order to run the job with a parallelism of 4, you have to increase either the number of TaskManagers in your cluster or the number of slots on each TaskManager. The latter can be set in the Flink configuration file flink-conf.yaml with taskmanager.numberOfTaskSlots: 4. For the former, you can modify the conf/slaves file to add new machines for the additional TaskManagers.
Alternatively, you can decrease the parallelism of your job to 1. You can control the parallelism with the command line option -p, e.g. bin/flink run -p 1 -c JobClass job.jar.
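The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration, assuming default slot sharing (so a job needs as many slots as its parallelism); can_schedule is a hypothetical helper, not part of Flink.

```python
def can_schedule(parallelism: int, task_managers: int, slots_per_tm: int) -> bool:
    """Return True if the cluster offers at least `parallelism` task slots."""
    total_slots = task_managers * slots_per_tm
    return total_slots >= parallelism


# The failing setup: one TaskManager with one slot, job parallelism 4.
print(can_schedule(4, task_managers=1, slots_per_tm=1))  # False

# Fix 1: raise taskmanager.numberOfTaskSlots to 4.
print(can_schedule(4, task_managers=1, slots_per_tm=4))  # True

# Fix 2: add TaskManagers (four machines with one slot each).
print(can_schedule(4, task_managers=4, slots_per_tm=1))  # True

# Fix 3: lower the job's parallelism with -p 1.
print(can_schedule(1, task_managers=1, slots_per_tm=1))  # True
```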
I know that there was some functionality for controlling the max idle instances and max latency after which a new instance was started. Now I can't find those in the new console.
Is this completely gone, or has the mechanism changed?
Those settings have all moved into your application's configuration files. Here are the Python docs on it: https://cloud.google.com/appengine/docs/python/modules/#Python_Instance_scaling_and_class
I'm considering moving from AppEngine to EC2/Elastic Beanstalk as I need my servers located within the EU [AppEngine doesn't offer a server location option AFAIK]. I've run the Elastic Beanstalk sample application, which is good as far as it goes; however, one of the AppEngine features I rely on heavily is the offline task queues / cron facility, as I periodically fetch a lot of data from other sites. I'm wondering what I would need to set up on Elastic Beanstalk / EC2 to replicate this task queue facility, whether there are any best practices yet, how much work it would take, etc.
Thanks!
A potential problem with cron services in Beanstalk is that a given scheduled command might be invoked by more than one service if the application is running on more than one instance. Coordination is needed between the running Tomcat instances to ensure that jobs are run by only one, and that if one of them dies the cron service doesn't get interrupted.
How I'm implementing it is like this:
Package the cron job "config file" with the WAR. This file should contain frequencies and URLs (as each actual cron is simply an invocation of a specific URL, the way AE does it).
Use a single database table to maintain coordination. It requires at least two columns:
a primary or unique key (string) that holds the command along with its frequency (e.g. "#daily http://your-app/some/cron/handler/url"),
a second column that holds the last execution time.
Each Tomcat instance runs a cron thread that reads the configuration from the WAR and sleeps until the next scheduled invocation. When the time comes, the instance first attempts to "claim" the invocation by reading the last execution time for that command from the database, then updating it to get the "lock".
query(SELECT last_execution_time FROM crontable WHERE command = ?)
if(NOW() - last_execution_time < reasonable window) skip;
query(UPDATE crontable SET last_execution_time = NOW() WHERE command = ? AND last_execution_time = ?)
if(number of rows updated == 0) skip;
run task()
The key element here is that we also include the last_execution_time in the WHERE clause, ensuring that if some other instance updates it between when we SELECT and UPDATE, the update will return that no rows were affected and this instance will skip executing that task.
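For anyone who wants to try the claim logic in isolation, here is a minimal sketch in Python using an in-memory SQLite database. It is an illustration under assumptions, not the Tomcat code: the table layout follows the description above, while WINDOW_SECONDS, compare_and_set, try_claim and the integer timestamps are names and choices made up for the example.

```python
import sqlite3

WINDOW_SECONDS = 60  # assumed "reasonable window"; tune for your schedule


def compare_and_set(conn, command, expected_last, now):
    """Update the timestamp only if nobody changed it since we read it."""
    cur = conn.execute(
        "UPDATE crontable SET last_execution_time = ? "
        "WHERE command = ? AND last_execution_time = ?",
        (now, command, expected_last),
    )
    conn.commit()
    return cur.rowcount == 1


def try_claim(conn, command, now, window=WINDOW_SECONDS):
    """Return True if this instance won the right to run `command`."""
    row = conn.execute(
        "SELECT last_execution_time FROM crontable WHERE command = ?",
        (command,),
    ).fetchone()
    if row is None:
        return False  # unknown command, nothing to claim
    last = row[0]
    if now - last < window:
        return False  # another instance ran it recently, skip
    return compare_and_set(conn, command, last, now)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crontable ("
    "command TEXT PRIMARY KEY, last_execution_time INTEGER NOT NULL)"
)
cmd = "#daily http://your-app/some/cron/handler/url"
conn.execute("INSERT INTO crontable VALUES (?, ?)", (cmd, 0))
conn.commit()

first = try_claim(conn, cmd, now=1000)         # wins the claim
second = try_claim(conn, cmd, now=1001)        # inside the window, skips
stale = compare_and_set(conn, cmd, 0, 2000)    # stale read, zero rows updated
print(first, second, stale)  # True False False
```

The third call simulates an instance that read the old timestamp before another instance updated it: its UPDATE matches no rows, so it skips the task.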
If you're moving your app, you're probably better off simply using TyphoonAE or AppScale. Both are alternate environments in which you can run your App Engine app unmodified, and both support EC2.