Akka doc is unclear about how to get an ExtendedActorSystem to deserialize ActorRef - akka-cluster

I'm trying to serialize/deserialize ActorRef through protobuf. According to the Akka doc, the only way to do it is to convert the ActorRef into a String, and convert it back in the remote actor system.
The doc mentions using an ExtendedActorSystem to do the deserialization (see here). However, it is unclear how to get the ExtendedActorSystem:
// Serialize
// (beneath toBinary)
val identifier: String = Serialization.serializedActorPath(theActorRef)
// Then just serialize the identifier however you like
// Deserialize
// (beneath fromBinary)
// ==== Where is this extendedSystem from? ====
val deserializedActorRef = extendedSystem.provider.resolveActorRef(identifier)
// Then just use the ActorRef
Edit
I found this question here: Akka (JVM): Serialize an actorref with protobuf within another message, which mentions casting an ActorSystem to ExtendedActorSystem. Is this the right approach? Will it always work?

Dear #stackoverflower,
whenever you use ActorSystem(...) it builds an instance of ActorSystemImpl.
The type tree looks like:
ActorSystemImpl extends ExtendedActorSystem
and
ExtendedActorSystem extends ActorSystem
so you can use a statement like
val system: ExtendedActorSystem = ActorSystem(...).asInstanceOf[ExtendedActorSystem]
to get at the correct type (with autocompletion). ActorSystemImpl itself is unfortunately scoped private[akka], so casting to ExtendedActorSystem is the way to reach it.
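If the cast happens inside a custom serializer, there is also a cleaner route: Akka instantiates serializers that declare an ExtendedActorSystem constructor parameter with the system already injected. A minimal sketch (the class name and serializer id are made up for illustration):

import akka.actor.{ActorRef, ExtendedActorSystem}
import akka.serialization.{Serialization, Serializer}

// Akka injects the ExtendedActorSystem when a serializer declares it
// as a constructor parameter, so no asInstanceOf cast is needed here.
class ActorRefStringSerializer(system: ExtendedActorSystem) extends Serializer {
  override val identifier: Int = 9001 // made-up id; must be unique across serializers
  override def includeManifest: Boolean = false

  override def toBinary(obj: AnyRef): Array[Byte] = obj match {
    case ref: ActorRef =>
      Serialization.serializedActorPath(ref).getBytes("UTF-8")
    case other =>
      throw new IllegalArgumentException(s"Cannot serialize $other")
  }

  override def fromBinary(bytes: Array[Byte], manifest: Option[Class[_]]): AnyRef =
    system.provider.resolveActorRef(new String(bytes, "UTF-8"))
}

Either way, provider.resolveActorRef on the ExtendedActorSystem performs the actual lookup, exactly as in the snippet from the docs.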

Related

Flink: TypeExtractor complains about protobuf class even though a ProtobufSerializer is registered for it

Using Flink 1.12.
My protobuf is
syntax = "proto3";
package flink.protobuf;
message TimestampedMessage {
int64 timeMs = 1;
string message = 2;
}
and tried to use it like so
final var env = StreamExecutionEnvironment.createLocalEnvironment();
env.getConfig().registerTypeWithKryoSerializer(TimestampedMessage.class, ProtobufSerializer.class);
env.fromCollection(new EventsIter(), TimestampedMessage.class)
...
But the logs show this
flink.protobuf.Test$TimestampedMessage does not contain a setter for field timeMs_
2021-08-12 06:38:19,940 INFO org.apache.flink.api.java.typeutils.TypeExtractor - Class class flink.protobuf.Test$TimestampedMessage cannot be used as a POJO type because not all fields are valid POJO fields, and must be processed as GenericType. Please read the Flink documentation on "Data Types & Serialization" for details of the effect on performance.
Seems like it is using the ProtobufSerializer despite the warning.

ObjectId is missing in Django when posted from AngularJS (MongoDB)

I am posting the following object
{
    skillName: "Professional Skills",
    _id: { $oid: "5adf23946ab671bf6cb36aff" }
}
to the Django service given below:
import json
import pprint
from StringIO import StringIO  # Python 2 here, matching the u'...' output below

from bson import json_util
from django.views.decorators.csrf import csrf_exempt  # imports inferred from the calls below
from rest_framework.decorators import api_view
from rest_framework.parsers import JSONParser

pp = pprint.PrettyPrinter()

@csrf_exempt
@api_view(['GET', 'POST'])
def saveSubjectView(request):  # this service will add & update Subject
    if request.method == 'POST':
        try:
            stream = StringIO(request.body)
            subject = JSONParser().parse(stream)
            print("The subject is ")
            pp.pprint(subject)
            serializedsubject = json.loads(json_util.dumps(subject))
            print("serializedsubject")
            pp.pprint(serializedsubject)
The output that I am getting is
'skillType': { u'_id': { }, u'skillName': u'Professional Skills'}
The ObjectId posted from the front end (AngularJS) is not printed in the service. I know that I can fix it by removing the $oid while posting from the AngularJS application, but I would like to know why this happens. I have searched the documentation and couldn't find a proper answer; maybe the keywords I used are wrong. Keywords used: "JSON serialisation of ObjectId", "$oid json serialization using Django".
Exactly. $oid, or anything else prefixed with $, is an internal and reserved format, so you cannot post field names like that. The convention comes from MongoDB Extended JSON, where such prefixes identify the BSON type for proper conversion and serve as a serializable transport, since these "types" are not supported in basic JSON.
So the solution is to actually use bson.json_util to "deserialize" the JSON string right from the start:
from bson import json_util
# serializedsubject = json.loads(json_util.dumps(subject))
serializedsubject = json_util.loads(request.body) # correct usage
Or, more succinctly and self-contained:
input = '{ "skillName" : "Professional Skills" ,"_id" : { "$oid": "5adf23946ab671bf6cb36aff"} }'
json_util.loads(input)
Returns
{u'skillName': u'Professional Skills', u'_id': ObjectId('5adf23946ab671bf6cb36aff')}
This correctly casts any keys notated with the Extended JSON syntax into their corresponding BSON types, as supported by the driver functions, and the driver will naturally convert back to BSON when sending to MongoDB.
If for some reason your request.body contains anything other than a "string" that is valid input to the function, then it is up to your code to convert it to that point. But there should be no need to "parse to JSON" and then "stringify" again just to feed the function.
NOTE: If you have not already done so within the JavaScript client side of the application, there is also a bson package available there. It allows Extended JSON "received" from the server to be translated into BSON types as JavaScript objects, and of course lets you serialize such objects back into the Extended JSON format.
This is in fact recommended where "type" information needs to be maintained with the data transmitted between client and server.

Finatra FeatureTests: How to manually deserialize returned json

I read the Finatra getting started guide and I was able to write the HelloWorld Service and its feature test.
Currently my feature test looks like
server.httpPost(
path = "/hi",
postBody = """{"name": "Foo", "dob": 136190040000}""",
andExpect = Ok,
withBody = """{"msg":"Hello Foo. You are 15780 days old today"}""")
This works fine and my tests pass. However my requirement is that I extract the json returned by the server and then manually perform asserts on the object returned.
I changed my code to
val response = server.httpPost(
path = "/hi",
postBody = """{"name": "Abhishek", "dob": 136190040000}""",
andExpect = Ok,
withBody = """{"msg":"Hello Abhishek. You are 15780 days old today"}""")
val json = response.contentString
This also works and I can see the returned JSON inside the variable json.
My question is: if I have to deserialize this JSON into an object, should I just pull in a JSON library like circe and deserialize it manually, or can I use the Jackson framework that ships with Finatra?
In all the examples I could find, Finatra handles the JSON serialization and deserialization "automatically", but in my case I want to perform it manually.
You can use the FinatraObjectMapper by calling (using your example) server.mapper. That wraps a Jackson ObjectMapper, which you can use directly if you want the Jackson library without any of the Finatra add-ons.
Or you can import a different JSON library. If you are using SBT, you can restrict libraries to certain areas of your code; if you wanted to use circe only in the test code, you could scope the dependency to "test" in your build.sbt, the same way scalatest usually is:
"org.scalatest" %% "scalatest" % "2.2.6" % "test"

How do I create a Flow with different input and output types for use inside a graph?

I am making a custom sink by building a graph on the inside. Here is a broad simplification of my code to demonstrate my question:
def mySink: Sink[Int, Unit] = Sink() { implicit builder =>
  val entrance = builder.add(Flow[Int].buffer(500, OverflowStrategy.backpressure))
  val toString = builder.add(Flow[Int, String, Unit].map(_.toString))
  val printSink = builder.add(Sink.foreach(elem => println(elem)))
  builder.addEdge(entrance.out, toString.in)
  builder.addEdge(toString.out, printSink.in)
  entrance.in
}
The problem I am having is that, while it is valid to create a Flow with the same input and output type using only a single type argument and no value argument, like Flow[Int] (which is all over the documentation), it is not valid to supply only two type parameters and zero value parameters.
According to the reference documentation for the Flow object the apply method I am looking for is defined as
def apply[I, O]()(block: (Builder[Unit]) ⇒ (Inlet[I], Outlet[O])): Flow[I, O, Unit]
and says
Creates a Flow by passing a FlowGraph.Builder to the given create function.
The create function is expected to return a pair of Inlet and Outlet which correspond to the created Flows input and output ports.
It seems like I need to deal with another level of graph builders when I am trying to make what I think is a very simple flow. Is there an easier and more concise way to create a Flow that changes the type of its input and output, without messing with its inner ports? If this is the right way to approach this problem, what would a solution look like?
BONUS: Why is it easy to make a Flow that doesn't change the type of its input from its output?
If you want to specify both the input and the output type of a flow, you indeed need to use the apply method you found in the documentation. Using it, though, is done pretty much exactly the same as you already did.
Flow[String, Message]() { implicit b =>
  import FlowGraph.Implicits._
  val reverseString = b.add(Flow[String].map[String] { msg => msg.reverse })
  val mapStringToMsg = b.add(Flow[String].map[Message](x => TextMessage.Strict(x)))
  // connect the graph
  reverseString ~> mapStringToMsg
  // expose ports
  (reverseString.inlet, mapStringToMsg.outlet)
}
Instead of just returning the inlet, you return a tuple with the inlet and the outlet. This flow can now be used (for instance inside another builder, or directly with runWith) with a specific Source or Sink.
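Applied to the sink from the question, a sketch against the same experimental graph DSL as the snippet above might look like the following. Note that Flow[Int].map[String](_.toString) already yields a Flow[Int, String, Unit], which also answers the bonus question: Flow[Int] only pins the input type, and the combinators you chain onto it determine the output type.

def mySink: Sink[Int, Unit] = Sink() { implicit builder =>
  import FlowGraph.Implicits._
  val entrance = builder.add(Flow[Int].buffer(500, OverflowStrategy.backpressure))
  // map changes the output type, so no two-type-parameter apply is needed here
  val toString = builder.add(Flow[Int].map[String](_.toString))
  val printSink = builder.add(Sink.foreach[String](println))
  entrance ~> toString ~> printSink
  entrance.inlet
}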

Intermittent Base64 Task Conversion Errors

I'm experiencing a really weird situation when passing a POJO within the payload of a Pull Queue task using Gson. Without changing the code or the POJO being set within the payload of a task, this will randomly succeed or fail.
This is the code I'm using:
PullQueueTaskPayLoad tqp = new PullQueueTaskPayLoad("id","name");
tqp.uploadURL = taskPayLoad.uploadURL;
tqp.urls = taskPayLoad.urls;
tqp.sliceQueryParameter = taskPayLoad.sliceQueryParameter;
TaskOptions task = TaskOptions.Builder.withMethod(TaskOptions.Method.PULL);
task.payload(new Gson().toJson(tqp));
q.add(task);
Using an external queue consumer I then retrieve the POJO as follows:
Type GSON_TYPE = new TypeToken<PullQueueTaskPayLoad>() {}.getType();
byte[] b = new Base64().decodeBase64(leasedTask.getPayloadBase64().getBytes());
String payload = new String(b);
logger.info("About to convert payload: "+payload);
PullQueueTaskPayLoad taskpayload = new Gson().fromJson(payload, GSON_TYPE);
So from the debugging I did, the problem seems to happen when I'm decoding the payload bytes. Encoding the same POJO (with different ids), I randomly get two different decoded payload strings, as follows:
Correct decoding:
{"id":"1786024566","sliceQueryParameter": {"queryId":786024566,"sliceStart":-1,"sliceNumber":1,"params":{"DefaultAnnotation":{"http://www.slicepedia.org/ontology#hasNumberOfBulletPoints_SIGN":["\u003d"],"http://www.slicepedia.org/ontology#hasNumberOfBulletPoints":["0"],"http://www.slicepedia.org/ontology#hasNumberOfTokens":["80"],"http://www.slicepedia.org/ontology#hasNumberOfTokens_SIGN":["\u003e"]},"VG":{"http://www.slicepedia.org/ontology#hastense":["?"],"http://www.slicepedia.org/ontology#hasroot":["?"]}}},"uploadURL":"http://3.linguabox0412.appspot.com/_ah/upload/AMmfu6YRjxX23Ks-yh-9AZs4-3I1p6hxrFd6d4ptxSQegUkQHN7y4hNZwX6u7PufIHJbwtsHLXFZJ5P-vs90mslZEOMw0T-amN2qhEOAj_6YdwuY50FXMi8/ALBNUaYAAAAAT7Towgs4M00M5RLI8xnEOMdIxouZzuGu/","status":"IN_PROGRESS","action":"SLICE_SEARCH_AND_CREATE"}
Incorrect decoding:
{"id":"1-1968382407","sliceQueryParameter":{"queryId":-1968382407,"sliceStart":-1,"sliceNumber":1,"params":{"DefaultAnnotation":{"http://www.slicepedia.org/ontology#hasNumberOfBulletPoints_SIGN":["\u003d"],"http://www.slicepedia.org/ontology#hasNumberOfBulletPoints":["0"],"http://www.slicepedia.org/ontology#hasNumberOfTokens":["80"],"http://www.slicepedia.org/ontology#hasNumberOfTokens_SIGN":["\u003e"]},"VG":{"http://www.slicepedia.org/ontology#hastense":["?K??????˜?X?\YXK?ܙ?????H?\ܛ????Ȃ%?????'W??EU$?#?&?GG???2?Ɩ?wV&??C"?7?B?6??????W??B???gSe????'u?U'd?D??6?S??4UV?D?e7?%U?&%F%f?D?$???$&vu6?fF$????EG?v??6?6դvt?D???G??&D?fdֵ6%?甦??GD????F???$?V?CuF?$?F?F֤֧f?D??u?wt?4?C$?W?"?'7FGW2#?$???$?u$U52"?&7F???#?%4Ĕ4U?4T$4???E?5$TDR'
So the second string obviously fails when using Gson to convert it back to a POJO. But I don't understand why this happens only in some cases and not others. From what I've seen, it seems to always happen after a ["?"] character string. I tried replacing the ? with other strings but it didn't change anything.
I think what is happening here is that the payload is web-safe base64 encoded. In practice, this means swapping +, / and = for -, _ and . respectively. Most base64 libraries have native support for decoding web-safe base64 strings.
Probably you are hitting one of these characters at a certain point, and that kills the decoding.
Here is some info on WebSafeBase64.
Word of warning, though: the task queue implementation actually sends padding equals (=) that you will have to convert manually before parsing.
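A minimal sketch of the decoding side (in Scala for brevity; decodePayload is a made-up helper, and it assumes the same commons-codec Base64 class used in the question, whose Base64(true) constructor selects the URL-safe alphabet):

import org.apache.commons.codec.binary.Base64

// Made-up helper: decode a leased task's payload that was encoded with
// the URL-safe alphabet (- and _ in place of + and /).
def decodePayload(payloadBase64: String): String = {
  val urlSafeCodec = new Base64(true) // true = URL-safe mode
  new String(urlSafeCodec.decode(payloadBase64), "UTF-8")
}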
