I am pulling messages continuously from Google Pub/Sub. Everything works, except that fetching a message takes around 12 to 15 seconds, which is not acceptable in our case. These are my CallTiming settings:
public CallSettings GetPullSetting()
{
CallTiming timing = CallTiming.FromRetry(new RetrySettings(
retryBackoff: new BackoffSettings(new TimeSpan(0, 0, 0, 0, 50), new TimeSpan(0, 0, 5), 1),
timeoutBackoff: new BackoffSettings(new TimeSpan(0, 0, 0, 18, 0), new TimeSpan(0, 0, 20), 1),
totalExpiration: Google.Api.Gax.Expiration.FromTimeout(TimeSpan.FromMilliseconds(600000))));
return CallSettings.FromCallTiming(timing);
}
I have tried all kinds of combinations to reduce this latency to at most 3 seconds.
One observation: whenever a message is pulled successfully and another message is already waiting on Pub/Sub in the very next iteration, that message is fetched immediately. In other words, latency is very low when messages are found in consecutive pulls.
The problem is this: say one iteration ends with a Deadline exceeded exception because Pub/Sub has no message, and I then publish a message for the next iteration. At that point the pull takes a long time (13 to 16 seconds). So the condition to reproduce the issue is one failed attempt to pull a message.
Code pasted here:
public void PullTest()
{
var cont = true;
SubscriberSettings settings = new SubscriberSettings()
{
PullSettings = GetPullSetting()
};
SubscriberClient subscriberClient = SubscriberClient.Create(settings: settings);
var subscriberName = new SubscriptionName("project-name", "subscription-name");
while (cont)
{
try
{
PullResponse response = subscriberClient.Pull(subscriberName, returnImmediately: false, maxMessages: 1);
System.Diagnostics.Trace.WriteLine(">>>>>> " + DateTime.Now.ToString());
System.Diagnostics.Trace.WriteLine(">>>>>> " + "Job Received" + response.ReceivedMessages.ToList().FirstOrDefault());
subscriberClient.Acknowledge(subscriberName, new List<string>() { response.ReceivedMessages.ToList().FirstOrDefault().AckId });
}
catch (Exception ex)
{
System.Diagnostics.Trace.WriteLine(">>>>>> " + DateTime.Now.ToString());
System.Diagnostics.Trace.WriteLine(">>>>>> " + ex.Message);
}
}
}
The problem is the backoff between consecutive pulls after the timeout deadline is exceeded. You are essentially looking for a long-polling solution, for which you need either to reduce the backoff timings to close to 0 or to use a custom connection/client that retries without delay.
To minimize latency, you are going to want to have multiple pull requests outstanding simultaneously. Depending on your throughput requirements, dozens of outstanding requests at once may be required. If your throughput is low, you'll still want to have multiple pull requests at once (at least two or three). As soon as any one of them returns, whether with a deadline exceeded or with a message, start another pull request. The goal is to always have a pull request outstanding, waiting to receive messages that have been published.
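The pattern described above can be sketched independently of any Pub/Sub client library. The following is a minimal Java sketch, not the client library's API: `pull` is a hypothetical stand-in for a blocking pull call that returns a message, or null when the call reached its deadline with nothing to deliver, and `totalPulls` exists only so the demo terminates (a real subscriber would loop until shut down).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Sketch: keep a fixed number of blocking pulls in flight at all times.
// `pull` is a hypothetical placeholder for a real blocking pull call.
final class OutstandingPulls {
    static List<String> run(Supplier<String> pull, int outstanding, int totalPulls) {
        ExecutorService pool = Executors.newFixedThreadPool(outstanding);
        List<String> received = new CopyOnWriteArrayList<>();
        AtomicInteger remaining = new AtomicInteger(totalPulls);
        for (int i = 0; i < outstanding; i++) {
            pool.submit(() -> {
                // Each worker issues a new pull as soon as its previous one returns,
                // whether that pull produced a message or timed out (null).
                while (remaining.getAndDecrement() > 0) {
                    String message = pull.get();
                    if (message != null) {
                        received.add(message);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return received;
    }
}
```

With `outstanding` set to 2 or 3, a message published right after one pull times out is still picked up by another pull that is already waiting, so no single timeout leaves the subscription unattended.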
I'm encountering a similar issue to "Flink EventTime Processing Watermark is always coming as -9223372036854725808". However, the suggested solutions (setting parallelism and disabling checkpointing) do not have any effect. In this example, I'm simply streaming 1000 events 1 second apart and comparing each event timestamp to ctx.timerService().currentWatermark():
>>> v=(61538659200000,0), watermark=-9223372036854775808
>>> v=(61538659201000,1), watermark=-9223372036854775808
>>> v=(61538660198000,998), watermark=-9223372036854775808
>>> v=(61538660199000,999), watermark=-9223372036854775808
public void watermarks()
throws Exception
{
final var env = StreamExecutionEnvironment.createLocalEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.STREAMING);
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
env.setMaxParallelism(1);
final long startMs = new Date(2020, 1, 1).getTime();
final var events = new ArrayList<Tuple2<Long, Integer>>();
for (var ii = 0; ii < 1000; ++ii ) {
events.add(new Tuple2<Long, Integer>(startMs + ii * 1000, ii));
}
env.fromCollection(events)
.assignTimestampsAndWatermarks(
WatermarkStrategy.<Tuple2<Long, Integer>>forMonotonousTimestamps()
.withTimestampAssigner((event, ts) -> event.f0))
.setParallelism(1)
.keyBy(row -> row.f1 % 2)
.process(new ProcessFunction<Tuple2<Long, Integer>, String>()
{
@Override
public void processElement(
final Tuple2<Long, Integer> value,
final Context ctx,
final Collector<String> out)
throws Exception
{
out.collect("v=" + value + ", watermark=" + ctx.timerService().currentWatermark());
}
})
.setParallelism(1)
.print()
.setParallelism(1);
final var result = env.execute();
System.out.println(result);
}
forMonotonousTimestamps is a periodic watermark generator that only generates watermarks when triggered by a timer. By default this timer fires every 200 msec (this is the autoWatermarkInterval). Your job doesn't run long enough for this timer to fire.
Bounded sources do generate a watermark with its timestamp set to MAX_WATERMARK when they reach the end of their input -- just before shutting down the job. You're not seeing this watermark in the output from your job because there are no events that follow it.
If you want to generate watermarks with every event, you can implement a custom watermark strategy that emits a watermark in the onEvent method of the WatermarkGenerator (docs). This is usually a bad idea in production, as you'll waste CPU cycles and network bandwidth on these extra watermarks, but sometimes for testing this is helpful.
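To make the difference concrete, here is a small plain-Java model of the two emission styles. It is not the Flink API: the class and method names are hypothetical, and the rule that the watermark trails the largest timestamp seen by 1 ms is an assumption about how a monotonous-timestamps generator behaves.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model (not the Flink API) of periodic versus per-event watermark emission.
final class WatermarkModel {
    /**
     * Replays the events and returns the watermark visible when each one is processed.
     * emitOnEvent = true models a generator that emits from its per-event callback;
     * false models a periodic generator whose timer never fires during the run.
     */
    static List<Long> observedWatermarks(long[] eventTimestamps, boolean emitOnEvent) {
        long currentWatermark = Long.MIN_VALUE; // value before any watermark was emitted
        List<Long> observed = new ArrayList<>();
        for (long ts : eventTimestamps) {
            observed.add(currentWatermark);
            if (emitOnEvent) {
                // Assumed rule: the watermark trails the largest timestamp seen by 1 ms.
                currentWatermark = Math.max(currentWatermark, ts - 1);
            }
        }
        return observed;
    }
}
```

In this model, a run in which the periodic timer never fires reports Long.MIN_VALUE (-9223372036854775808) for every event, which is the value in the output above, while per-event emission lets each event see the watermark produced by the event before it.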
According to source code comments:
/**
* Creates a new enriched {@link WatermarkStrategy} that also does idleness detection in the
* created {@link WatermarkGenerator}.
*
* <p>Add an idle timeout to the watermark strategy. If no records flow in a partition of a
* stream for that amount of time, then that partition is considered "idle" and will not hold
* back the progress of watermarks in downstream operators.
*
* <p>Idleness can be important if some partitions have little data and might not have events
* during some periods. Without idleness, these streams can stall the overall event time
* progress of the application.
*/
default WatermarkStrategy<T> withIdleness(Duration idleTimeout) ...
So you can try using WatermarkStrategy.forMonotonousTimestamps().withIdleness(...).
I am trying to use queryable state on Flink (version 1.4.2) but unfortunately I keep getting the following error:
INFO my.test.flink.QueryableState - Params are a96438fa12879b7598c9cf32684e2669, kafka-cluster_jobmanager_1, 6123
INFO my.test.flink.QueryableState - Before the call java.util.concurrent.CompletableFuture@26aa12dd[Not completed]
java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: readerIndex(0) + length(4) exceeds writerIndex(0): PooledUnsafeDirectByteBuf(ridx: 0, widx: 0, cap: 0)
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at my.test.flink.QueryableState.main(QueryableState.java:67)
Caused by: java.lang.IndexOutOfBoundsException: readerIndex(0) + length(4) exceeds writerIndex(0): PooledUnsafeDirectByteBuf(ridx: 0, widx: 0, cap: 0)
at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1166)
at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf.readInt(AbstractByteBuf.java:619)
at org.apache.flink.queryablestate.network.messages.MessageSerializer.deserializeHeader(MessageSerializer.java:231)
at org.apache.flink.queryablestate.network.ClientHandler.channelRead(ClientHandler.java:76)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at org.apache.flink.shaded.netty4.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
On the client side I am using flink-queryable-state-client-java_2_11.jar and the relevant part of code for the queryable client is
QueryableStateClient client = new QueryableStateClient(jobManagerHost, jobManagerPort);
TypeInformation<MyEvent> typeInformation = TypeInformation.of(new TypeHint<MyEvent>() {});
ListStateDescriptor<MyEvent> descriptor = new ListStateDescriptor<MyEvent>("myEvents",
typeInformation.createSerializer(new ExecutionConfig()));
CompletableFuture<ListState<MyEvent>> resultFuture =
client.getKvState(JobID.fromHexString(jobIdParam),"myEvents", "1",
BasicTypeInfo.STRING_TYPE_INFO , descriptor );
logger.info("Before the call " + resultFuture);
try {
logger.info("Finished"+ resultFuture.get());
} catch(Exception ex) {
ex.printStackTrace();
}
Finally, the job running on Flink has a ListState configured, as can be seen below. Note that data are keyed on ListState by String.
TypeInformation<MyEvent> typeInformation = TypeInformation.of(new TypeHint<MyEvent>() {});
ListStateDescriptor<MyEvent> eventState =
new ListStateDescriptor<MyEvent>("myEvents",typeInformation);
eventState.setQueryable("myEvents");
eventListState = getRuntimeContext().getListState(eventState);
It looks like a serialization error to me, but I do not know what I need to do to fix it. Does anybody have an idea what might be wrong with the code above? Am I missing something?
I ran into that exact same problem when updating this queryable state demo for Flink 1.4. If I recall correctly, the important part is dealing with the CompletableFuture correctly -- you can't just call get() straightaway.
See the code for a working example, the key part of which looks something like this:
try {
CompletableFuture<FoldingState<BumpEvent, Long>> resultFuture =
client.getKvState(jobId, EventCountJob.ITEM_COUNTS, key,
BasicTypeInfo.STRING_TYPE_INFO, countingState);
resultFuture.thenAccept(response -> {
try {
Long count = response.get();
// now we could do something with the value
} catch (Exception e) {
e.printStackTrace();
}
});
resultFuture.get(5, TimeUnit.SECONDS);
} catch (Exception e) {
e.printStackTrace();
}
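The same handling pattern can be tried with only the JDK, without a Flink cluster. This is a minimal sketch under assumptions: the class and method names are hypothetical, and a plain CompletableFuture<String> stands in for the future returned by the queryable state client.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Sketch: attach a callback to the future, then wait with a bounded timeout.
final class FutureHandling {
    static String awaitWithCallback(CompletableFuture<String> future, long timeoutMillis) {
        AtomicReference<String> seen = new AtomicReference<>("no result");
        // The callback runs only if the future completes normally.
        CompletableFuture<Void> handled = future.thenAccept(value -> seen.set("got " + value));
        try {
            // Waiting on `handled` also guarantees the callback has finished.
            handled.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            // TimeoutException if nothing arrived in time, ExecutionException on failure.
            seen.set("failed: " + e.getClass().getSimpleName());
        }
        return seen.get();
    }
}
```

One design choice here differs from the snippet above: the sketch waits on the future returned by thenAccept rather than on the original future, so the result of the callback is visible once the wait returns.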
I'm having trouble getting consistent scaling in my WinForms application that uses a Metafile with millimeters as the unit of measure. I wrote a small sample application to illustrate the problem.
This is how the application looks on a Windows 7 desktop machine:
This is how the application looks on a Windows 8 laptop machine:
The source code:
private void MainForm_Paint(object sender, PaintEventArgs e)
{
var g = e.Graphics;
var blueBrush = new SolidBrush(Color.Blue);
var bluePen = new Pen(blueBrush);
g.DrawRectangle(bluePen, 0, 0, 200, 200);
g.DrawLine(bluePen, 100, 0, 100, 200);
g.DrawLine(bluePen, 0, 100, 200, 100);
g.DrawString(g.DpiX+" dpi", new Font("Arial", 10), blueBrush, 0, 205);
Metafile metafile;
var size = new Size(200, 200);
using (var stream = new MemoryStream())
{
using (Graphics offScreenBufferGraphics = Graphics.FromHwndInternal(IntPtr.Zero))
{
IntPtr deviceContextHandle = offScreenBufferGraphics.GetHdc();
metafile = new Metafile(stream, deviceContextHandle, new RectangleF(0, 0, size.Width, size.Height), MetafileFrameUnit.Millimeter, EmfType.EmfPlusOnly);
offScreenBufferGraphics.ReleaseHdc();
using (Graphics mg = Graphics.FromImage(metafile))
{
mg.PageUnit = GraphicsUnit.Millimeter;
var redPen = new Pen(new SolidBrush(Color.Red));
const float scaleFactor = 0.75f;
mg.ScaleTransform(scaleFactor, scaleFactor);
mg.DrawLine(redPen, 0, 0, 200, 200);
mg.DrawLine(redPen, 0, 200, 200, 0);
}
}
}
g.DrawImage(metafile, 0, 0, 200, 200);
}
Both machines are set on 96dpi, yet the Win8 machine renders the metafile (the red cross) smaller.
The scale factor 0.75 is calculated from the difference between the standard 72 dpi and the current 96 dpi, 72/96=0.75, is this correct? Edit: See answer below why this will not work.
But mostly, why is it scaled differently on the Win8 machine and what setting can I fetch to compensate? Seems like the Win8 machine needs a scale factor around 1.25 to make the red cross align with the blue rectangle.
Thanks!
Found the answer myself after reading this CodeProject article. It turns out that the screen size of the machine (1920x1200 on the desktop and 1600x900 on the laptop) affects the resolution of the metafile. The assumption of 72 dpi that was used to calculate the scale factor 0.75 was somewhat correct on my desktop machine, but not on the Win8 laptop.
The metafile resolution can be fetched from the metafile header, and then used to calculate the correct scaling factor:
var metafileHeader = metafile.GetMetafileHeader();
float sx = metafileHeader.DpiX/g.DpiX;
float sy = metafileHeader.DpiY/g.DpiY;
mg.ScaleTransform(sx, sy);
The complete code can be found here.
Then I get correct scaling on both machines:
I'm trying to add an event to a calendar on an Android device, and I'm using MonoDroid. I found the following example in Java: http://www.androidcookbook.com/Recipe.seam?recipeId=3852
I tried to translate the first code snippet to C#, but I have trouble setting the "beginTime" and "endTime" fields, especially translating from Calendar.getTimeInMillis() to System.DateTime. This is my code:
DateTime epoch = new System.DateTime(1970, 1, 1, 0, 0, 0, 0);
TimeSpan startSpan = fromDate - epoch;
TimeSpan endSpan = toDate - epoch;
Intent intent = new Intent(Intent.ActionEdit);
intent.SetType("vnd.android.cursor.item/event");
intent.PutExtra("beginTime", startSpan.TotalMilliseconds);
intent.PutExtra("endTime", endSpan.TotalMilliseconds);
The result is that the from and to fields are filled with today's date and a time slot with length of one hour.
How do I correctly set begin/end time of the event?
I have used a helper method in the past that has worked out pretty well. Here is a quick sample that should set the date and time properly.
protected override void OnCreate(Bundle bundle)
{
base.OnCreate(bundle);
// Set our view from the "main" layout resource
SetContentView(Resource.Layout.Main);
AddEvent(this, "Sample Event", DateTime.UtcNow, DateTime.UtcNow.AddHours(5));
}
public void AddEvent(Context ctx, String title, DateTime start, DateTime end)
{
var intent = new Intent(Intent.ActionEdit);
intent.SetType("vnd.android.cursor.item/event");
intent.PutExtra("title", title);
intent.PutExtra("beginTime", TimeInMillis(start));
intent.PutExtra("endTime", TimeInMillis(end));
intent.PutExtra("allDay", false);
ctx.StartActivity(intent);
}
private readonly static DateTime jan1970 = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
private static Int64 TimeInMillis(DateTime dateTime)
{
return (Int64)(dateTime - jan1970).TotalMilliseconds;
}
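For comparison, the value the Java recipe passes comes from Calendar.getTimeInMillis(), which is a long count of milliseconds since 1970-01-01 UTC. The sketch below computes the same quantity with java.time; the class and method names are hypothetical. The cast to Int64 in the helper above presumably matters for the same reason: TotalMilliseconds is a double, and the extras are likely expected to be long values.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Sketch: epoch milliseconds as a long, the type Calendar.getTimeInMillis() returns.
final class EpochMillis {
    // Milliseconds since 1970-01-01T00:00:00Z for a date-time interpreted as UTC.
    static long toEpochMillis(LocalDateTime utcDateTime) {
        return utcDateTime.toInstant(ZoneOffset.UTC).toEpochMilli();
    }
}
```

For example, EpochMillis.toEpochMillis(LocalDateTime.of(1970, 1, 1, 0, 0, 1)) is 1000, one second after the epoch.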
I am writing an app to read in a very large tif file (15000x10000), then chop it up into 256x256 tiles which get saved as jpegs. I am trying to use the WPF System.Windows.Media.Imaging objects to do the chopping. The example below runs fine and will extract a number of tiles for me (it just gets multiple copies of the same tile for this example), but the app uses up memory and that memory never gets released. Even forcing a GC.Collect to test this still doesn't free up the memory.
Dim croppedImage As CroppedBitmap
Dim strImagePath As String = "C:\Huge.tif"
Dim imageSource As BitmapSource = TiffBitmapDecoder.Create(New Uri(strImagePath), BitmapCreateOptions.IgnoreImageCache, BitmapCacheOption.None).Frames(0) 'CreateImage(imageBytes, 0, 0)
Dim enc As JpegBitmapEncoder
Dim stream As FileStream
For i As Integer = 0 To 5000
croppedImage = New CroppedBitmap()
croppedImage.BeginInit()
croppedImage.Source = imageSource
croppedImage.SourceRect = New Int32Rect(0, 0, 256, 256)
croppedImage.EndInit()
enc = New JpegBitmapEncoder()
enc.QualityLevel = 70
enc.Frames.Add(BitmapFrame.Create(croppedImage))
stream = New FileStream("C:\output\" & i & ".jpg", FileMode.Create, FileAccess.Write)
enc.Save(stream)
stream.Close()
enc = Nothing
stream = Nothing
croppedImage.Source = Nothing
croppedImage = Nothing
Next
imageSource = Nothing
Am I missing something fundamental here? How can I ensure that these resources are released correctly?
Thanks
More Information:
The answers provided below definitely help. Thanks for that. I have another issue to add to this now. I am trying to watermark each tile before it is saved by adding the following code:
Dim targetVisual = New DrawingVisual()
Dim targetContext = targetVisual.RenderOpen()
targetContext.DrawImage(croppedImage, New Rect(0, 0, tileWidth, tileHeight))
targetContext.DrawImage(watermarkSource, New Rect(0, 0, 256, 256))
Dim target = New RenderTargetBitmap(tileWidth, tileHeight, 96, 96, PixelFormats.[Default])
targetContext.Close()
target.Render(targetVisual)
Dim targetFrame = BitmapFrame.Create(target)
This is starting to use some serious memory. Running through the large tif uses over 1200MB of memory as reported by Task Manager. It looks like this memory gets released eventually, but I am slightly concerned that something is not right with the code, and I wonder whether there is any way to stop it consuming all this memory in the first place. Perhaps this is simply down to the issue that Franci discussed?
Andrew
Your large object is definitely promoted to gen 2, after the ~25000 object allocations in the loop. Gen 2 objects are collected only on full collections, which are done rarely, thus your object might sit in memory for a while.
You can try forcing a full collection using GC.Collect(). You can also use the other GC methods to check the allocated memory before you force the full collection, then wait for it to finish and check the allocated memory again to determine whether the large object was indeed collected.
Note that even though a full collection might occur, the memory has already been included in the process working set, and it might take a bit for the OS to react to freeing that memory and remove it from the working set. This is important to keep in mind if you are trying to judge whether the memory was properly collected by looking at the process working set in Task Manager.
I recreated your code locally and experimented a little. I found that if you make the stream local to the For loop and call enc.Save inside Using stream As New FileStream(System.IO.Path.Combine(outputDir, i.ToString() & ".jpg"), FileMode.Create, FileAccess.Write), the total private bytes stay at approximately 60 MB (against 80+ MB without the Using block).
My complete code for your reference:
Dim fileName As String = System.IO.Path.Combine(Environment.CurrentDirectory, "Image1.tif")
Dim outputDir As String = System.IO.Path.Combine(Environment.CurrentDirectory, "Out")
Dim imageSource As BitmapSource = TiffBitmapDecoder.Create(New Uri(fileName), BitmapCreateOptions.IgnoreImageCache, BitmapCacheOption.None).Frames(0)
Dim enc As JpegBitmapEncoder
Dim croppedImage As CroppedBitmap
For i As Integer = 0 To 4999
croppedImage = New CroppedBitmap()
croppedImage.BeginInit()
croppedImage.Source = imageSource
croppedImage.SourceRect = New Int32Rect(0, 0, 256, 256)
croppedImage.EndInit()
enc = New JpegBitmapEncoder()
enc.QualityLevel = 70
enc.Frames.Add(BitmapFrame.Create(croppedImage))
Using stream As New FileStream(System.IO.Path.Combine(outputDir, i.ToString() & ".jpg"), FileMode.Create, FileAccess.Write)
enc.Save(stream)
End Using
enc = Nothing
croppedImage.Source = Nothing
croppedImage = Nothing
Next
imageSource = Nothing
Try creating your TiffBitmapDecoder like so:
Dim imageSource As BitmapSource = TiffBitmapDecoder.Create(New Uri(fileName), BitmapCreateOptions.IgnoreImageCache, BitmapCacheOption.OnLoad).Frames(0)