How to get native Flink code from an Apache Beam build file? - apache-flink

I defined a stream-processing pipeline with Apache Beam and built it for the Flink runner.
I got a jar file, but when I decompiled the class files, the Java code is not native Flink code.
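For reference, such a pipeline is defined roughly along these lines (a minimal sketch using Beam's public Java API; the transforms here are illustrative, not the actual pipeline):

import org.apache.beam.runners.flink.FlinkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptors;

public class BeamOnFlink {
    public static void main(String[] args) {
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
        options.setRunner(FlinkRunner.class); // the runner translates the pipeline for Flink at submission time

        Pipeline p = Pipeline.create(options);
        p.apply(Create.of("hello", "world"))
         .apply(MapElements.into(TypeDescriptors.strings())
                .via((String s) -> s.toUpperCase())); // serializable lambdas like this are what
                                                      // produce the $deserializeLambda$ method below
        p.run().waitUntilFinish();
    }
}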
In the decompiled classes I did find some generated code like this:
// $FF: synthetic method
private static Object $deserializeLambda$(SerializedLambda lambda) {
    String var1 = lambda.getImplMethodName();
    byte var2 = -1;
    switch(var1.hashCode()) {
    case -30470307:
        if (var1.equals("lambda$main$fd9fc9ef$1")) {
            var2 = 1;
        }
        break;
    case -30470306:
        if (var1.equals("lambda$main$fd9fc9ef$2")) {
            var2 = 0;
        }
    }
    // ... (decompiler output truncated)
}
Is there any generated file with Flink code?
Regards,
Ali

Related

Jenkinsfile read file and use in a loop

I have a scripted Jenkins pipeline in a Jenkinsfile in a GitHub repo. I need to read some data and use it in my script; for this I have this piece of code:
def mydata = ['val1', 'val2']
mydata.each() {
…
}
Now I need to place the data in a .txt file in the same GitHub repository and read the data from that file. The format of the file is:
val1
val2
I tried this way:
def tmpval = readFile file: 'values.txt'
env.Mydata = tmpval
Mydata.each {
    ......
}
but it doesn't work as expected; I received
"Caused: java.io.NotSerializableException: java.util.ArrayList$Itr"
Resolved:
String[] mydata = new File("${WORKSPACE}/values.txt").text.split('\n')
mydata.each { val ->
    // use val
}
(The exception came from Jenkins' CPS engine trying to serialize a live ArrayList iterator when checkpointing the pipeline; iterating a plain String[] sidesteps that iterator.)

Codename One API to append / merge files

To merge Storage files in Codename One I came up with this solution:
/**
 * Merges the given list of Storage files into the output Storage file.
 * @param toBeMerged
 * @param output
 * @throws IOException
 */
public static synchronized void mergeStorageFiles(List<String> toBeMerged, String output) throws IOException {
    if (toBeMerged.contains(output)) {
        throw new IllegalArgumentException("The output file cannot be contained in the toBeMerged list of input files.");
    }
    // Note: the temporary file used for merging is placed in the FileSystemStorage because it offers the method
    // openOutputStream(String file, int offset) that allows appending to a stream. Storage has no such method.
    long writtenBytes = 0;
    String tempFile = FileSystemStorage.getInstance().getAppHomePath() + "/tempFileUsedInMerge";
    for (String partialFile : toBeMerged) {
        InputStream in = Storage.getInstance().createInputStream(partialFile);
        OutputStream out = FileSystemStorage.getInstance().openOutputStream(tempFile, (int) writtenBytes);
        Util.copy(in, out); // Util.copy closes both streams when it finishes
        writtenBytes = FileSystemStorage.getInstance().getLength(tempFile);
    }
    Util.copy(FileSystemStorage.getInstance().openInputStream(tempFile), Storage.getInstance().createOutputStream(output));
    FileSystemStorage.getInstance().delete(tempFile);
}
This solution is based on the API FileSystemStorage.openOutputStream(String file, int offset), which is the only API I found that allows appending the content of one file to another.
Are there other APIs that can be used to append or merge files?
Thank you
Since you end up copying everything to a Storage entry, I don't see the value of using FileSystemStorage as an intermediate merging tool.
The only reason I can think of is integrity of the output file (e.g. if a failure happens while writing), but that can happen here too. You can guarantee integrity by setting a flag, e.g. creating a file called "writeLock" and deleting it when the write has finished successfully.
To be clear, I would copy like this, which is simpler/faster:
try (OutputStream out = Storage.getInstance().createOutputStream(output)) {
    for (String partialFile : toBeMerged) {
        try (InputStream in = Storage.getInstance().createInputStream(partialFile)) {
            Util.copyNoClose(in, out, 8192);
        }
    }
}
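To make the integrity idea concrete, here is a minimal sketch combining the flag with the copy loop above (the "writeLock" name and the outputIsIntact check are illustrative conventions, not part of the Codename One API):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

import com.codename1.io.Storage;
import com.codename1.io.Util;

public class MergeWithLock {
    public static synchronized void mergeStorageFiles(List<String> toBeMerged, String output) throws IOException {
        Storage s = Storage.getInstance();
        s.writeObject("writeLock", Boolean.TRUE); // flag: a write is in progress
        try (OutputStream out = s.createOutputStream(output)) {
            for (String partialFile : toBeMerged) {
                try (InputStream in = s.createInputStream(partialFile)) {
                    Util.copyNoClose(in, out, 8192);
                }
            }
        }
        s.deleteStorageFile("writeLock"); // write finished successfully
    }

    // If the flag survives to the next launch, the last merge was interrupted
    // and the output entry should be treated as corrupt.
    public static boolean outputIsIntact() {
        return !Storage.getInstance().exists("writeLock");
    }
}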

Migrating custom dynamic partitioner from Flink 1.7 to Flink 1.9

I am trying to migrate a custom dynamic partitioner from Flink 1.7 to Flink 1.9. The original partitioner implemented the selectChannels method within the StreamPartitioner interface like this:
// Original: working for Flink 1.7
@Override
public int[] selectChannels(SerializationDelegate<StreamRecord<T>> streamRecordSerializationDelegate,
                            int numberOfOutputChannels) {
    T value = streamRecordSerializationDelegate.getInstance().getValue();
    if (value.f0.isBroadCastPartitioning()) {
        // send to all channels
        int[] channels = new int[numberOfOutputChannels];
        for (int i = 0; i < numberOfOutputChannels; ++i) {
            channels[i] = i;
        }
        return channels;
    } else if (value.f0.getPartitionKey() == -1) {
        // random partition
        returnChannels[0] = random.nextInt(numberOfOutputChannels);
    } else {
        returnChannels[0] = partitioner.partition(value.f0.getPartitionKey(), numberOfOutputChannels);
    }
    return returnChannels;
}
I am not sure how to migrate this to Flink 1.9, since the StreamPartitioner interface has changed as illustrated below:
// New: required by Flink 1.9
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> streamRecordSerializationDelegate) {
    T value = streamRecordSerializationDelegate.getInstance().getValue();
    if (value.f0.isBroadCastPartitioning()) {
        /*
         * It is illegal to call this method for broadcast channel selectors, and this method can remain
         * unimplemented in that case (for example by throwing UnsupportedOperationException).
         */
    } else if (value.f0.getPartitionKey() == -1) {
        // random partition
        returnChannels[0] = random.nextInt(numberOfChannels);
    } else {
        returnChannels[0] = partitioner.partition(value.f0.getPartitionKey(), numberOfChannels);
    }
    return returnChannels[0];
}
Note that selectChannels has been replaced by selectChannel, so it is no longer possible to return multiple output channels, as was done above for broadcast elements. In fact, selectChannel should never be invoked in that case. Any thoughts on how to tackle this?
With Flink 1.9, you cannot dynamically broadcast to all channels anymore. Your StreamPartitioner has to declare statically whether it is a broadcast via isBroadcast(); in that case, selectChannel is never invoked.
Do you have a specific use case where you'd need to switch dynamically?
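Under that constraint, a 1.9-style version of the non-broadcast path might look like the sketch below (extractPartitionKey stands in for the question's value.f0 accessors; elements that must reach every channel would go through a separate partitioner whose isBroadcast() returns true):

import java.util.Random;

import org.apache.flink.runtime.plugable.SerializationDelegate;
import org.apache.flink.streaming.runtime.partitioner.StreamPartitioner;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

public class DynamicPartitioner<T> extends StreamPartitioner<T> {

    private final Random random = new Random();

    @Override
    public boolean isBroadcast() {
        // Broadcast is now a static property of the partitioner; when it
        // returns true, selectChannel is never invoked.
        return false;
    }

    @Override
    public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
        T value = record.getInstance().getValue();
        int key = extractPartitionKey(value);
        if (key == -1) {
            // random partition, as in the original selectChannels
            return random.nextInt(numberOfChannels);
        }
        // stand-in for the question's partitioner.partition(key, numberOfChannels)
        return Math.abs(key) % numberOfChannels;
    }

    // Placeholder: the question reads the key from value.f0.getPartitionKey().
    private int extractPartitionKey(T value) {
        return -1;
    }

    @Override
    public StreamPartitioner<T> copy() {
        return this;
    }
}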

The Dart VM itself implements `eval` in `dart:mirrors` and developers use it. Are there plans to make this method public?

Here is code that uses this eval method in the Dart platform.
This is done via reflection.
runtime/lib/mirrors_impl.dart
_getFieldSlow(unwrapped) {
  // ..... Skipped
  var atPosition = unwrapped.indexOf('#');
  if (atPosition == -1) {
    // Public symbol.
    f = _eval('(x) => x.$unwrapped', null);
  } else {
    // Private symbol.
    var withoutKey = unwrapped.substring(0, atPosition);
    var privateKey = unwrapped.substring(atPosition);
    f = _eval('(x) => x.$withoutKey', privateKey);
  }
  // ..... Skipped
}
static _eval(expression, privateKey) native "Mirrors_evalInLibraryWithPrivateKey";
runtime/lib/mirrors.cc
DEFINE_NATIVE_ENTRY(Mirrors_evalInLibraryWithPrivateKey, 2) {
  GET_NON_NULL_NATIVE_ARGUMENT(String, expression, arguments->NativeArgAt(0));
  GET_NATIVE_ARGUMENT(String, private_key, arguments->NativeArgAt(1));
  const GrowableObjectArray& libraries =
      GrowableObjectArray::Handle(isolate->object_store()->libraries());
  const int num_libraries = libraries.Length();
  Library& each_library = Library::Handle();
  Library& ctxt_library = Library::Handle();
  String& library_key = String::Handle();
  if (private_key.IsNull()) {
    ctxt_library = Library::CoreLibrary();
  } else {
    for (int i = 0; i < num_libraries; i++) {
      each_library ^= libraries.At(i);
      library_key = each_library.private_key();
      if (library_key.Equals(private_key)) {
        ctxt_library = each_library.raw();
        break;
      }
    }
  }
  ASSERT(!ctxt_library.IsNull());
  return ctxt_library.Evaluate(expression);
}
runtime/vm/bootstrap_natives.h
V(Mirrors_evalInLibraryWithPrivateKey, 2) \
P.S.
I ask the question here because I cannot ask it on the Dart mailing lists.
P.S.
As we can see, it is a static private method in mirrors_impl.dart:
static _eval(expression, privateKey) native "Mirrors_evalInLibraryWithPrivateKey";
Does anyone else think this method should be public? (This is not a question, just thinking aloud.)
According to the Dart FAQ, a pure string eval like that is not likely to make it into the language, even though other dynamic features will likely be added:
So, for example, Dart isn't likely to support evaluating a string as code in the current context, but it may support loading that code dynamically into a new isolate. Dart isn't likely to support adding fields to a value, but it may (through a mirror system) support adding fields to a class, and you can effectively add methods using noSuchMethod(). Using these features will have a runtime cost; it's important to us to minimize the cost for programs that don't use them. This area is still under development, so we welcome your thoughts on what you need from runtime dynamism.

Why is this JeroMQ (ZeroMQ port) benchmark so slow?

I would like to use this library I found; it's a pure Java port (not a wrapper) of ZeroMQ.
I am trying to test it, and while it claims some good numbers, the test I am performing gives rather poor results, even though it runs locally (client and server on the same machine). I'm sure it's something I am doing wrong. It takes approx. 5 seconds to execute this 10,000-message loop.
All I did was take the Hello World example and remove the pauses and sysouts. Here is the code:
The Server:
package guide;

import org.jeromq.ZMQ;

public class hwserver {
    public static void main(String[] args) throws Exception {
        // Prepare our context and socket
        ZMQ.Context context = ZMQ.context(1);
        ZMQ.Socket socket = context.socket(ZMQ.REP);
        System.out.println("Binding hello world server");
        socket.bind("tcp://*:5555");
        while (true) {
            byte[] request = socket.recv(0);   // wait for the next request
            String replyString = "Hello";
            byte[] reply = replyString.getBytes();
            socket.send(reply, 0);             // answer it
        }
    }
}
The Client:
package guide;

import org.jeromq.ZMQ;

public class hwclient {
    public static void main(String[] args) {
        ZMQ.Context context = ZMQ.context(1);
        ZMQ.Socket socket = context.socket(ZMQ.REQ);
        socket.connect("tcp://localhost:5555");
        System.out.println("Connecting to hello world server");
        long start = System.currentTimeMillis();
        for (int request_nbr = 0; request_nbr != 10_000; request_nbr++) {
            String requestString = "Hello";
            byte[] request = requestString.getBytes();
            socket.send(request, 0);
            byte[] reply = socket.recv(0);
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
        socket.close();
        context.term();
    }
}
Is it possible to fix this code and get some decent numbers?
You're doing round-trip request-reply, and this will be just as slow with the C++ libzmq. You will only get fast performance from JeroMQ, ZeroMQ, or any I/O when you do streaming.
Round-tripping is slow due to how I/O and TCP work. On libzmq we can do about 20K messages/second using round-tripping, and 8M/sec using streaming. Streaming has additional optimizations like batching which you can't do with round-trip request-reply.
For a throughput performance test, send 10M messages from node 1 to node 2, then send back a single ACK when you get them. Time that on ZeroMQ and on JeroMQ, you should see around 3x difference in speed.
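For illustration, a throughput test along those lines might look like this (a sketch against the same org.jeromq.ZMQ API used above; PUSH/PULL sockets, and the port and message count are arbitrary; here the sink thread finishing plays the role of the single ACK):

package guide;

import org.jeromq.ZMQ;

public class throughput {
    private static final int COUNT = 1_000_000;

    public static void main(String[] args) throws Exception {
        // Sink: pull messages off the wire without replying to each one.
        Thread sink = new Thread(() -> {
            ZMQ.Context ctx = ZMQ.context(1);
            ZMQ.Socket pull = ctx.socket(ZMQ.PULL);
            pull.bind("tcp://*:5556");
            for (int i = 0; i < COUNT; i++) {
                pull.recv(0);
            }
            pull.close();
            ctx.term();
        });
        sink.start();

        // Source: stream messages one way; no per-message round trip.
        ZMQ.Context ctx = ZMQ.context(1);
        ZMQ.Socket push = ctx.socket(ZMQ.PUSH);
        push.connect("tcp://localhost:5556");

        long start = System.currentTimeMillis();
        byte[] msg = "Hello".getBytes();
        for (int i = 0; i < COUNT; i++) {
            push.send(msg, 0);
        }
        sink.join(); // all messages received
        long end = System.currentTimeMillis();
        System.out.println(COUNT * 1000L / (end - start) + " msg/s");

        push.close();
        ctx.term();
    }
}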
Please refer to the throughput test comparing synchronous and asynchronous round-trips at
https://github.com/zeromq/jeromq/blob/master/src/test/java/guide/tripping.java
The asynchronous version was about 40x faster than the synchronous round-trip.
If you want to benchmark the full speed of JeroMQ, please run perf.LocalThr and perf.RemoteThr in your environment.
