How to cancel a job in DolphinDB

I use the submitJob function to get the jobId and try to cancel the job with the cancelJob function, but it fails to stop the job. What function should I use to stop it?
I use the code below:
submitJob("aa", "a1", replay, [ds], [sink], date, `time, 10)
cancelJob(aa)

The submitJob function returns the actual jobId, which may differ from the jobId you passed in, so use the jobId returned by submitJob when calling cancelJob.
DolphinDB runs jobs on a thread pool, so if a job is simple and contains no sub-tasks, it still cannot be cancelled once it has started running.
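For example, a minimal sketch using the arguments from the question (the jobId variable name and the getRecentJobs() check are just illustrative):
jobId = submitJob("aa", "a1", replay, [ds], [sink], date, `time, 10)
cancelJob(jobId)
// optionally confirm the job's status afterwards
getRecentJobs()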

Related

TASKS in Snowflake

I have created two tasks that run once a day:
create or replace task TESTDB.TESTSCHEMA.TASK_EXTERNAL_REFRESH
warehouse=W_TEST_DEVELOPER
schedule='USING CRON 0 4 * * * UTC'
TIMESTAMP_INPUT_FORMAT='YYYY-MM-DD HH24'
as
call TESTDB.TESTSCHEMA.TEST_EXTERNAL_TABLE_REFRESH();
create or replace task TESTDB.TESTSCHEMA.TASK_LOAD_TABLES
warehouse=W_TEST_DEVELOPER
schedule='USING CRON 0 5 * * * UTC'
TIMESTAMP_INPUT_FORMAT='YYYY-MM-DD HH24'
as
call TESTDB.TESTSCHEMA.TEST_LOAD_TABLES();
Now I want to ensure that TESTDB.TESTSCHEMA.TASK_EXTERNAL_REFRESH runs before TASK_LOAD_TABLES runs.
How should I do this?
Also, should the error details from a task run be captured in config tables? What if "TESTDB.TESTSCHEMA.TASK_EXTERNAL_REFRESH" fails? If it fails, the next task should not run.
The precedence rule should be added to the second task instead of a schedule:
ALTER TASK TESTDB.TESTSCHEMA.TASK_LOAD_TABLES
ADD AFTER TESTDB.TESTSCHEMA.TASK_EXTERNAL_REFRESH;
From the CREATE TASK documentation:
AFTER string [ , string , ... ]
Specifies one or more predecessor tasks for the current task. Use this option to create a DAG of tasks or add this task to an existing DAG. A DAG is a series of tasks that starts with a scheduled root task and is linked together by dependencies.
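In practice (a hedged sketch; a task with a predecessor cannot also keep its own schedule, and a task normally has to be suspended before it can be altered), the full sequence might look like this:
ALTER TASK TESTDB.TESTSCHEMA.TASK_LOAD_TABLES SUSPEND;
ALTER TASK TESTDB.TESTSCHEMA.TASK_LOAD_TABLES UNSET SCHEDULE;
ALTER TASK TESTDB.TESTSCHEMA.TASK_LOAD_TABLES ADD AFTER TESTDB.TESTSCHEMA.TASK_EXTERNAL_REFRESH;
ALTER TASK TESTDB.TESTSCHEMA.TASK_LOAD_TABLES RESUME;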
Related: Snowflake - Many tasks dependencies for a task
To chain predecessor and successor tasks, use the AFTER <task name> option:
create task task2
after task1
as
insert into t1(ts) values(current_timestamp);
https://docs.snowflake.com/en/sql-reference/sql/create-task.html#single-sql-statement
A few options to check the status of the task and decide on the successor/child task execution are given below.
You can use the SUSPEND_TASK_AFTER_NUM_FAILURES = <num> parameter to automatically suspend tasks after failed runs:
https://docs.snowflake.com/en/user-guide/tasks-intro.html#automatically-suspend-tasks-after-failed-runs
Create a task that calls a UDF to check the ACCOUNT_USAGE.TASK_HISTORY view or the INFORMATION_SCHEMA.TASK_HISTORY table function for task status (a query sketch follows this list).
You can use external tools to check the status of the task and integrate it.
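For instance, a hedged sketch of the first option, assuming a threshold of one failed run and that the parameter is set on the root task from the question:
ALTER TASK TESTDB.TESTSCHEMA.TASK_EXTERNAL_REFRESH
SET SUSPEND_TASK_AFTER_NUM_FAILURES = 1;
And a sketch of checking the latest run of that task through the INFORMATION_SCHEMA.TASK_HISTORY table function (adjust the task name as needed):
select name, state, error_message, scheduled_time
from table(information_schema.task_history(task_name => 'TASK_EXTERNAL_REFRESH'))
order by scheduled_time desc
limit 1;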

Asynchronous cursor execution in Snowflake

(Submitting on behalf of a Snowflake user)
At the time of query execution on Snowflake, I need its query ID, so I am using the following code snippet:
cursor.execute(query, _no_results=True)
query_id = cursor.sfqid
cursor.query_result(query_id)
This code snippet works fine for short-running queries, but for a query that takes more than 40-45 seconds to execute, the query_result function fails with KeyError: u'rowtype'.
Stack trace:
File "snowflake/connector/cursor.py", line 631, in query_result
self._init_result_and_meta(data, _use_ijson)
File "snowflake/connector/cursor.py", line 591, in _init_result_and_meta
for column in data[u'rowtype']:
KeyError: u'rowtype'
Why would this error occur? How to solve this problem?
Any recommendations? Thanks!
The Snowflake Python Connector allows for async SQL execution by using cur.execute(sql, _no_results=True)
This "fire and forget" style of SQL execution allows the parent process to continue without waiting for the SQL command to complete (think long-running SQL that may time out).
If this is used, many developers will write code that captures the unique Snowflake Query ID (like you have in your code) and then use that Query ID to "check back on the query status later" in some sort of looping process. When you check back and the query has finished, you can retrieve its results from that query_id using the result_scan() function.
https://docs.snowflake.net/manuals/sql-reference/functions/result_scan.html
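A minimal sketch of that pattern, assuming an open connection conn, a made-up table name, and a connector version that exposes get_query_status / is_still_running (older versions would need a different status check, e.g. polling QUERY_HISTORY):
import time

cur = conn.cursor()
cur.execute("select count(*) from my_large_table", _no_results=True)  # fire and forget
query_id = cur.sfqid  # unique Snowflake query ID

# poll until the query is no longer running
while conn.is_still_running(conn.get_query_status(query_id)):
    time.sleep(5)

# fetch the finished result set via RESULT_SCAN
cur.execute("select * from table(result_scan('{}'))".format(query_id))
rows = cur.fetchall()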
I hope this helps...Rich

How to execute a sample just before thread shutdown in JMeter?

Is there a way in JMeter to execute a sample just before thread shutdown?
For example, I have a test plan that inserts data into a database, and autocommit is disabled on the connection. Each thread spawns its own connection to the database. The plan runs on a schedule (i.e. I don't know the sample count) and I want to commit all inserted rows at the end of the test. Is there a way to do that?
The easiest option is a tearDown Thread Group, which is designed for performing clean-up actions.
The harder way is to add a separate Thread Group with 1 thread and 1 iteration and a JSR223 Sampler with the following Groovy code:
class ShutdownListener implements Runnable {
    @Override
    public void run() {
        // your code which needs to be executed before the test ends
    }
}

// register the listener so it runs when the JVM shuts down at the end of the test
Runtime.getRuntime().addShutdownHook(new Thread(new ShutdownListener()))
Try running the commit sampler based on an If condition on the elapsed duration or the iteration number.
For example, if you are supposed to run 100 iterations, an If Controller with the condition
${__groovy(${__iterationNum} == 100)}
should help.
OK, this might not be the most optimal approach, but it could be workable.
Add the following code in a JSR223 Sampler inside a Once Only Controller:
def scenarioStartTime = System.currentTimeMillis();
def timeLimit= ctx.getThreadGroup().getDuration()-10; //Timelimit to execute the commit sampler
vars.put("scenarioStartTime",scenarioStartTime.toString());
vars.put("timeLimit",timeLimit.toString());
Now, after your DB insert sampler, add the following condition in an If Controller and put the commit sampler under it:
${__groovy(System.currentTimeMillis()-Long.valueOf(vars.get("scenarioStartTime"))>=Long.valueOf(vars.get("timeLimit"))*1000)}
This condition should let you execute the commit sampler just before the end of test duration.

Polling a RESTful service in Talend

I am building a job in Talend that queries a RESTful service. In the job, I call the service to initiate a remote job and get a job ID back. I then query a status service and need to wait for that job to complete. How would I go about doing this in Talend? I have been playing around with the tLoop, tFlowToIterate, tIterateToFlow and tJavaRow components to try to get this to work, but am not sure how to configure them.
Here's a summary of what I'm trying to do:
1. tRest: Start a job and get job ID
|
--> 2. tRest: Poll status of job
|
--> 3. tUnknown?: If the job is running, sleep and re-run Step 2.
|
--> 4. tRest: when the job is complete, retrieve the results
How would I set up step 3 above?
Basically you want something like:
tInfiniteLoop --iterate--> (subjob that queries the service and determines whether the result is ready) --if (result is ready)--> (subjob that fetches the result) --on subjob ok--> tJava with "counter_tInfiniteLoop_1 = -1;" to leave the loop (I don't know of a better alternative)
I would also advise implementing a timeout or a maximum number of lookups, and maybe even an automatically increasing sleep time.
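As a rough illustration only (the attempt counter and the limits are assumptions, and counter_tInfiniteLoop_1 is the trick mentioned above, not an official API), the body of a tJava inside the loop could look like this:
// hedged sketch: give up after a maximum number of status lookups, with a growing sleep time
Integer attempt = (Integer) globalMap.get("attempt");
if (attempt == null) attempt = 0;
if (attempt >= 20) {                          // assumed maximum number of lookups
    counter_tInfiniteLoop_1 = -1;             // leave the tInfiniteLoop, as described above
} else {
    try {
        Thread.sleep(1000L * (attempt + 1));  // increasing wait between status calls
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
    globalMap.put("attempt", attempt + 1);
}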

How long can the map call last?

I want to do some heavy processing in the map() call of the mapper.
I was going through the source file MapReduceServlet.java:
// Amount of time to spend on actual map() calls per task execution.
public static final int PROCESSING_TIME_PER_TASK_MS = 10000;
Does it mean the map call can last only for 10 seconds? What happens after 10 seconds?
Can I increase this to a larger value, like 1 minute or 10 minutes?
-Aswath
MapReduce operations are executed in tasks using Push Queues, and as stated in the documentation the task deadline is currently 10 minutes (a limit after which you will get a DeadlineExceededException).
If a task fails to execute, by default App Engine retries it until it succeeds. If you need a deadline longer than 10 minutes, you can use a Backend to execute your tasks.
Looking at the actual usage of PROCESSING_TIME_PER_TASK_MS in Worker.java, this value is used to limit the number of map calls done in a single task.
After each map call has been executed, if more than 10 s have elapsed since the beginning of the task, the worker spawns a new task to handle the rest of the map calls:
1. Worker.scheduleWorker spawns a new task for each given shard.
2. Each task calls Worker.processMapper.
3. processMapper executes one map call.
4. If less than PROCESSING_TIME_PER_TASK_MS has elapsed since step 2, go back to step 3.
5. Otherwise, if processing is not finished, reschedule a new worker task.
In the worst-case scenario, the default task request deadline (10 minutes) should apply to each of your individual map calls.
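A simplified sketch of that loop (an illustration of the behaviour described above, not the actual Worker.java; hasNext, map, scheduleContinuation, and checkpoint are made-up names):
// simplified illustration: PROCESSING_TIME_PER_TASK_MS bounds how long one task keeps mapping
long startTime = System.currentTimeMillis();
while (shard.hasNext()) {
    mapper.map(shard.next());  // a single map() call; it is not interrupted mid-call
    if (System.currentTimeMillis() - startTime >= PROCESSING_TIME_PER_TASK_MS) {
        // roughly 10 s of map calls spent in this task: push the remaining work to a new task
        scheduleContinuation(shard.checkpoint());
        break;
    }
}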
