I'm trying to figure out how to get System.abortJob() to actually work. My assumption (which could be wrong) is that when I pass the current job ID to System.abortJob(), the job will stop right there and abort. Here is my test that isn't working: I am still seeing the System.debug() output in my logs.
Executing from execute anonymous:
queueableTest tst = new queueableTest();
System.enqueueJob(tst);
Queueable class:
public class queueableTest implements Queueable {
    public static void execute(QueueableContext Context)
    {
        ID jobID = Context.getJobId();
        System.abortJob(jobID);
        shouldntExecute();
    }

    public static void shouldntExecute()
    {
        System.debug('Why is this executing?');
    }
}
Any help/feedback greatly appreciated!
The System.abortJob() call will only take effect after its execution context has completed. Since you are calling abortJob from the same context that you want to abort, by the time it takes effect your code has already finished executing, which makes the System.abortJob() call irrelevant.
If you want to abort the current job, you need to use return; or System.assert(false, 'Aborting'). In the first case your job will terminate with a status of 'Completed', and in the second case with a 'Failed' status. Throwing an exception has the same result as the failed assert.
Related
I am running a batch to update Opportunity records. The query fetches around 1.1 million records, but the batch processes only around 100k of them. I have checked the query in the query editor and it works fine. Even when I process these records separately, using the same code the batch uses, it works as expected. I'm not sure why the batch is not processing everything.
global class BatchAssignment implements Database.Batchable<sObject> {
    global Database.QueryLocator start(Database.BatchableContext BC) {
        return Database.getQueryLocator('SELECT id,Booking_Domain__c,Region__c,Primary_LOB__c,NA_Contract_Sales_Type__c,Owner_Sales_Group__c,Connected_Technologies_Opportunity__c FROM Opportunity where CreatedDate > LAST_N_YEARS:3 ORDER BY CreatedDate DESC');
    }

    global void execute(Database.BatchableContext BC, List<Opportunity> oppList) {
        try {
            if (!oppList.isEmpty()) {
                CustomBSNAReportingHandler.updateOpportunityBSNAReporting(oppList);
            }
            if (Test.isRunningTest()) { throw new DMLException(); }
        } catch (Exception ex) {
        }
    }

    global void finish(Database.BatchableContext BC) {
        // Perform post-transaction update
    }
}
Is there any restriction with Batch Apex? Why is it behaving like this?
Is the batch job still running? Check Setup -> Jobs to see if it finished.
You have an empty try-catch that swallows any exceptions. Remove it and see what kind of errors you get.
Yes, it will mean that if there's one bad Opportunity, the other 199 in the same execute() call will fail to update. You can mitigate that by setting a smaller chunk size (Database.executeBatch(new BatchAssignment(), 10)) and by clever use of Database.update(myRecords, false) inside the method you're calling.
Maybe even look into https://developer.salesforce.com/docs/atlas.en-us.apexcode.meta/apexcode/apex_batch_platformevents.htm for raising errors (and handling them with some monitoring software, or a trigger that converts them into Tasks related to the problematic Opportunities?).
I have a job running, and I want to use only one restart attempt, because while the Flink restart has not yet been triggered, I have a thread that tries to solve the underlying problem; once the problem is solved, Flink restarts. But sometimes the thread takes longer than usual to fix the issue, the restart strategy is triggered while the issue is still present, and the job is stopped. The thread may then still run another iteration, so the application never dies, because I'm running it as a jar application. So, my question:
Is there any way to know the status of the job from Java code? Something like (JobStatus.CANCELED == true).
Thanks in advance!
Kind regards
Thanks a lot Felipe. This is what I needed, and thanks to you it is done. I'm sharing the code here in case someone else needs it.
Prepare the listener
final StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(...);
final AtomicReference<JobID> jobIdReference = new AtomicReference<>();
// Environment configurations
env.registerJobListener(new JobListener() {
    @Override
    public void onJobSubmitted(@Nullable JobClient jobClient, @Nullable Throwable throwable) {
        assert jobClient != null;
        jobIdReference.set(jobClient.getJobID());
        // keep the JobClient reachable later, e.g. in a public static field of the main class
    }

    @Override
    public void onJobExecuted(@Nullable JobExecutionResult jobExecutionResult, @Nullable Throwable throwable) {
        assert jobExecutionResult != null;
        synchronized (jobExecutionResult) {
            jobExecutionResult.notify();   // wake up any thread waiting on the result object
        }
    }
});
Use the code:
Preconditions.checkNotNull(jobClient);
final String status = jobClient.getJobStatus().get().name();
if (status.equals(JobStatus.FAILED.name())) System.exit(1);
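For completeness, here is a minimal sketch (mine, not part of the original answer) of polling the stored JobClient until the job reaches a terminal state; it assumes the same jobClient reference saved by the listener above and an enclosing method that declares the checked exceptions:
// Wait until the job reaches a globally terminal state, then react to the final status.
// getJobStatus() returns a CompletableFuture<JobStatus>, so get() may throw
// InterruptedException/ExecutionException (declare or handle them in the caller).
JobStatus status;
do {
    Thread.sleep(1_000);                      // simple polling interval
    status = jobClient.getJobStatus().get();
} while (!status.isGloballyTerminalState());

if (status == JobStatus.FAILED || status == JobStatus.CANCELED) {
    System.exit(1);                           // stop the standalone jar once the job is gone
}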
I want to measure the time elapsed between an event reaching the system and the event being finished, and I think getting the ingestion time will help, but how do I get it?
You probably want to use latency tracking. Alternatively, you can add the processing time directly after the source in a chained process function (with Context->TimerService#currentProcessingTime()).
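As a side note, here is a minimal sketch of enabling the built-in latency tracking mentioned above (the 1000 ms interval is an arbitrary example value, and env is assumed to be your StreamExecutionEnvironment):
// Emit latency markers from the sources every second; Flink then exposes
// per-operator latency histograms through its metrics system.
env.getConfig().setLatencyTrackingInterval(1000);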
Based on the reply from David, to get the ingest time we can chain a process function to the source.
The code below shows how to get the ingest time. In case the same value needs to be exposed as a metric for the difference between ingest time and event time, I have used a histogram metric to do that.
The snippet below might help you understand better.
DataStream<EventDataMapping> text = env
    .fromSource(source, WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(5)), "Kafka Source")
    .process(new ProcessFunction<EventDataMapping, EventDataMapping>() {

        private transient DescriptiveStatisticsHistogram eventVsIngestionTimeLag;
        private static final int EVENT_TIME_LAG_WINDOW_SIZE = 10_000;

        @Override
        public void open(Configuration parameters) throws Exception {
            super.open(parameters);
            eventVsIngestionTimeLag = getRuntimeContext().getMetricGroup().histogram("eventVsIngestionTimeLag",
                new DescriptiveStatisticsHistogram(EVENT_TIME_LAG_WINDOW_SIZE));
        }

        @Override
        public void processElement(EventDataMapping eventDataMapping, Context context, Collector<EventDataMapping> collector) throws Exception {
            LOG.info("process element event time " + context.timestamp() + " current ingestTime " + context.timerService().currentProcessingTime());
            eventVsIngestionTimeLag.update(context.timerService().currentProcessingTime() - context.timestamp());
        }
    }).returns(EventDataMapping.class);
I am implementing a RichParallelSourceFunction which reads files over SFTP. RichParallelSourceFunction inherits cancel() from SourceFunction and close() from RichFunction. As far as I understand it, both cancel() and close() are invoked before the source is torn down, so in both of them I have to add logic for stopping the endless loop which reads files.
When I set the parallelism of the source to 1 and run the Flink job from the IDE, the Flink runtime invokes close() right after it invokes open(), and the whole job is stopped. I didn't expect this.
When I set the parallelism of the source to 1 and I run the Flink job in a cluster, the job runs as usual.
If I leave the parallelism of the source to the default (in my case 4), the job runs as usual.
Using Flink 1.7.
public class SftpSource<TYPE_OF_RECORD>
    extends RichParallelSourceFunction<TYPE_OF_RECORD>
{
    private final SftpConnection mConnection;
    private boolean mSourceIsRunning;

    @Override
    public void open(Configuration parameters) throws Exception
    {
        mConnection.open();
    }

    @Override
    public void close()
    {
        mSourceIsRunning = false;
    }

    @Override
    public void run(SourceContext<TYPE_OF_RECORD> aContext)
    {
        while (mSourceIsRunning)
        {
            synchronized ( aContext.getCheckpointLock() )
            {
                // use mConnection
                // aContext.collect() ...
            }
            try
            {
                Thread.sleep(1000);
            }
            catch (InterruptedException ie)
            {
                mLogger.warn("Thread error: {}", ie.getMessage());
            }
        }
        mConnection.close();
    }

    @Override
    public void cancel()
    {
        mSourceIsRunning = false;
    }
}
So I have workarounds and the question is more about the theory. Why is close() invoked if parallelism is 1 and the job is run from the IDE (i.e. from the command line)?
Also, do close() and cancel() do the same in a RichParallelSourceFunction?
Why is close() invoked if parallelism is 1 and the job is run from the IDE?
close() is called after the last call to the main working methods (e.g. map or join). This method can be used for clean-up work.
It will be called regardless of the parallelism you configured.
Also, do close() and cancel() do the same in a RichParallelSourceFunction?
They aren't the same thing; take a look at how cancel() is described:
Cancels the source. Most sources will have a while loop inside the run(SourceContext) method. The implementation needs to ensure that the source will break out of that loop after this method is called.
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/functions/source/SourceFunction.html#cancel--
The following link may help you to understand the task lifecycle:
https://ci.apache.org/projects/flink/flink-docs-stable/internals/task_lifecycle.html#operator-lifecycle-in-a-nutshell
I think the javadocs are more than self-explanatory:
Gracefully Stopping Functions
Functions may additionally implement the {@link org.apache.flink.api.common.functions.StoppableFunction} interface. "Stopping" a function, in contrast to "canceling", means a graceful exit that leaves the state and the emitted elements in a consistent state.
-- SourceFunction.cancel
Cancels the source. Most sources will have a while loop inside the run(SourceContext) method. The implementation needs to ensure that the source will break out of that loop after this method is called.
A typical pattern is to have an "volatile boolean isRunning" flag that is set to false in this method. That flag is checked in the loop condition.
When a source is canceled, the executing thread will also be interrupted (via Thread.interrupt()). The interruption happens strictly after this method has been called, so any interruption handler can rely on the fact that this method has completed. It is good practice to make any flags altered by this method "volatile", in order to guarantee the visibility of the effects of this method to any interruption handler.
-- SourceContext.close
This method is called by the system to shut down the context.
Note: you cancel a SourceFunction, but you close a SourceContext.
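To make the quoted pattern concrete, here is a minimal, generic sketch (not the OP's class) of a source with a volatile running flag that run() checks and cancel() flips:
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class PollingSource implements SourceFunction<String> {

    // volatile so the change made by cancel() is visible to the thread executing run()
    private volatile boolean isRunning = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (isRunning) {
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect("tick");   // emit records while holding the checkpoint lock
            }
            Thread.sleep(1000);
        }
    }

    @Override
    public void cancel() {
        isRunning = false;   // breaks the loop in run(); the task thread may also be interrupted afterwards
    }
}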
I found a bug in my code. Here is the fix:
public void open(Configuration parameters) throws Exception
{
    mConnection.open();
    mSourceIsRunning = true;
}
Now close() is not invoked until I decide to stop the workflow, in which case cancel() is invoked first and then close(). I am still wondering how parallelism affected the behaviour.
I have fully isolated this problem to a very simple Play app.
I think it has to do with some DB caching, but I can't figure it out.
BasicTest.java
==========
import org.junit.*;
import play.test.*;
import play.Logger;
import models.*;
import play.mvc.Http.*;

public class BasicTest extends FunctionalTest {

    @Before
    public void setUp() {
        Fixtures.deleteDatabase();
        Fixtures.loadModels("data.yml");
        Logger.debug("countFromSetup=%s", User.count());
    }

    @Test
    public void test() {
        Response response = GET("/");
        Logger.debug("countFromTest=%s", User.count());
        assertIsOk(response);
    }
}
Uncommented Configs
================
%prod.application.mode=prod
%test.application.mode=dev
%test.db.url=jdbc:h2:mem:play;MODE=MYSQL;LOCK_MODE=0
%test.db=mysql:root:xxx#t_db
%test.jpa.ddl=create
%test.mail.smtp=mock
application.mode=dev
application.name=test
application.secret=jXKw4HabjhaNvosxgzq39to9BJECtOr39EXrEabsQAZKi7YoWAwQWo3B BFUOQnJw
attachments.path=data/attachments
date.format=yyyy-MM-dd
db=mysql:root:xxx#db
mail.smtp=mock
Application.java
============
package controllers;

import play.*;
import play.mvc.*;
import models.*;

public class Application extends Controller {

    public static void index() {
        Logger.debug("countFromIndex=%s", User.count());
        render();
    }
}
>play test
Output of log after running the BasicTest http://localhost:9000/#tests
==================================================
11:54:59,008 DEBUG ~ countFromSetup=1
11:54:59,021 DEBUG ~ countFromIndex=0
11:54:59,034 DEBUG ~ countFromTest=1
point to browser=> http://localhost:9000
12:25:59,781 DEBUG ~ countFromIndex=1
What happened to the record during the
Response response = GET("/");
call? This 'bug' almost makes my test cases useless.
It probably has something to do with transactions. I came across a similar case once with the Spring/JUnit combination.
Here is the transactional execution of the test (I think):
Start transaction t1.
Execute setup; the result is fetched from the cache.
Execute the test.
Start transaction t2 for the controller execution GET("/").
The result is fetched from the database, but since t1 hasn't been committed, it isn't visible.
Close transaction t2 and commit it.
Close transaction t1 and commit it.
By the way, that is not really a functional test. In functional tests you are not supposed to check such data, only the HTTP status; turn to unit tests for that. Looking at the source code of the functional tests, you can see that all the checks implemented are for response/HTTP checking.
I think it's the default behavior of JUnit: the @Before annotation makes the method run before every test:
When writing tests, it is common to find that several tests need similar objects created before they can run. Annotating a public void method with @Before causes that method to be run before the Test method. The @Before methods of superclasses will be run before those of the current class.
From : http://junit.sourceforge.net/javadoc/org/junit/Before.html
If you want the setup to be run once, you can use the @BeforeClass annotation: http://junit.sourceforge.net/javadoc/org/junit/BeforeClass.html
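For illustration, a sketch of how the setUp() in BasicTest.java above could be turned into a once-per-class fixture load with @BeforeClass (note that @BeforeClass methods must be public static):
@BeforeClass
public static void setUpClass() {
    // runs once for the whole test class instead of before every @Test method
    Fixtures.deleteDatabase();
    Fixtures.loadModels("data.yml");
}
Whether this fits depends on whether your tests mutate the fixture data, since it is no longer reset between test methods.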
In the Play Framework there are n+1 threads for prod and 1 thread for the test or compile profile. So if you have a dual-core CPU, there are 3 threads if you are running in prod, and one thread if you started the application with "test".
Now, one more interesting fact: there is one transaction per execution. Thus, when your application starts and you launch your very first test, here is what happens:
Play starts with one thread.
The JUnitRunner starts and the first test myTest gets executed. It's an HTTP connection to the application. The reason why you see 0 is that the Response GET is executed before the @Before statement.
The @Before gets executed and creates your entries, and the result count is accurate in the @Before because it's done in the same transaction.
So what I suggest is that you either use @BeforeClass, or perform the setup not in a @Before but with a direct call in myTest for the very specific test case with Response.
I assume that if you replace this code
@Test
public void myTest() {
    Response response = GET("/test");
}
with this
@Test
public void myTest() {
    assertEquals(1, User.count());
}
Correct?
So the reason why you get this is not a bug; it's simply because of the one-thread configuration we have for the test environment.
Nicolas