akka stream check source and upload - akka-stream

I am using the Akka HTTP fileUpload directive, which produces a Source[akka.util.ByteString, Any].
I would like to handle this source in two different threads, like this:
          ----> Future(check first rows if ok) -> insert object in db -> HTTP response 201 / 400
          |
source ---|
          |
          ----> Future(upload file to S3) -> set object to ready / delete if error...
So far, I managed to do something like this:
val f = for {
  uploadResult <- Future(sendFileToS3(filePath, source)) // uploads the file
  (extractedLines, fileSize) <- Future(readFileFromS3(filePath)) // reads the uploaded file
} yield (uploadResult, extractedLines, fileSize)
onComplete(f) {
  // the guarded case has to come first; placed after the unguarded one it would be unreachable
  case Success((uploadResult, extractedLines, fileSize)) if ... => HTTP KO
  case Success((uploadResult, extractedLines, fileSize)) => HTTP OK with id of the object created
  case Failure(ex) => HTTP KO
}
The problem here is that, for large files, the HTTP response is only returned once the upload has finished. What I would like instead is to handle the uploadResult separately from checking the first lines.
Something like:
val f = for {
  (extractedLines, fileSize) <- Future(readSource(source))
} yield (extractedLines, fileSize)
onComplete(f) {
  // the guarded case has to come first; placed after the unguarded one it would be unreachable
  case Success((extractedLines, fileSize)) if ... => HTTP KO
  case Success((extractedLines, fileSize)) =>
    Future(sendFileToS3AndHandle(filePath, source)) // send in another thread
    HTTP OK with id of the object created
  case Failure(ex) => HTTP KO
}
Has anyone had a similar issue and managed to handle it like this?
I have read about using the source twice, but it seems overcomplicated for my use case (and I did not manage to make it do what I want). I also tried akka-stream's alsoTo, but that does not solve the problem of returning the response as soon as the first-line check is completed.
Thank you for your help or suggestion.

Related

Kotlin Parsing json array with new line separator

I'm using OkHttpClient in a Kotlin app to post a file to an API, which then processes it. While the processing is running, the API sends back messages to keep the connection alive until the result is ready. So I'm receiving the following (this is what println() prints to the console):
{"status":"IN_PROGRESS","transcript":null,"error":null}
{"status":"IN_PROGRESS","transcript":null,"error":null}
{"status":"IN_PROGRESS","transcript":null,"error":null}
{"status":"DONE","transcript":"Hello, world.","error":null}
I believe these objects are separated by a newline character, not a comma.
I figured out how to extract the data as follows, but is there a more technically correct way to transform this? It works, but it seems error-prone to me.
// data class properties must be declared with val (or var)
data class Status(val status: String?, val transcript: String?, val error: String?)
val myClient = OkHttpClient().newBuilder().build()
val myBody = MultipartBody.Builder().build() // plus some stuff
val myRequest = Request.Builder().url("localhost:8090").method("POST", myBody).build()
val myResponse = myClient.newCall(myRequest).execute()
val myString = myResponse.body?.string()
val myJsonString = "[${myString!!.replace("}", "},")}]".replace(",]", "]")
// Forces the response from "{key:value}{key:value}"
// into a readable json format "[{key:value},{key:value},{key:value}]"
// but hoping there is a more technically sound way of doing this
val myTranscriptions = gson.fromJson(myJsonString, Array<Status>::class.java)
An alternative to your solution would be to use a JsonReader in lenient mode. This allows parsing JSON which does not strictly comply with the specification, such as in your case multiple top level values. It also makes other aspects of parsing lenient, but maybe that is acceptable for your use case.
You could then use a single JsonReader wrapping the response stream, repeatedly call Gson.fromJson and collect the deserialized objects in a list yourself. For example:
val gson = GsonBuilder().setLenient().create()
val myTranscriptions = myResponse.body!!.use {
    val jsonReader = JsonReader(it.charStream())
    jsonReader.isLenient = true
    val transcriptions = mutableListOf<Status>()
    while (jsonReader.peek() != JsonToken.END_DOCUMENT) {
        transcriptions.add(gson.fromJson(jsonReader, Status::class.java))
    }
    transcriptions
}
Though, if the server continuously provides status updates until processing is done, it may make more sense to process each parsed status directly instead of collecting them all in a list first.
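For comparison, the same idea (reading several top-level JSON values from one body and handling each one as soon as it is parsed) can be sketched with Python's standard json module. This only illustrates the technique and is not the Gson API; the function name iter_json_values and the sample body are made up for the example.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found in `text`, one at a time.

    Values may be separated by newlines, other whitespace, or nothing at all.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Hypothetical response body, shaped like the output in the question.
body = (
    '{"status":"IN_PROGRESS","transcript":null,"error":null}\n'
    '{"status":"DONE","transcript":"Hello, world.","error":null}\n'
)

# Because this is a generator, each status can be handled as it is parsed.
final_transcript = None
for status in iter_json_values(body):
    if status["status"] == "DONE":
        final_transcript = status["transcript"]
```

Nothing here depends on the separator being a newline, which avoids the string replacement on "}" that would break as soon as a transcript itself contains a brace.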

Gatling how to store and load a value for a later request

I'd like to build a load test where the second request is fed from the first response. The data extraction is done in a method because it takes more than one line of code. My problem is storing the value (id) and loading it later. How should the value be stored and loaded? I tried several approaches and came up with this code. The documentation has not helped me.
object First {
  val first = {
    exec(http("first request")
      .post("/graphql")
      .headers(headers_0)
      .body(RawFileBody("computerdatabase/recordedsimulation/first.json"))
      .check(bodyString.saveAs("bodyResponse"))
    )
    .exec { session =>
      val response = session("bodyResponse").as[String]
      session.set("Id", getRandomValueForKey("id", response))
      session
    }
    .pause(1)
  }
}
object Second {
  val second = {
    exec(http("Second ${Id}")
      .post("/graphql")
      .headers(headers_0)
      .body(RawFileBody("computerdatabase/recordedsimulation/second.json"))
    )
    .pause(1)
  }
}
val user = scenario("User")
  .exec(
    First.first,
    Second.second
  )
setUp(user.inject(
  atOnceUsers(1),
)).protocols(httpProtocol)
Your issue is that you're not using the Session properly.
From the documentation:
Warning
Session instances are immutable!
Why is that so? Because Sessions are messages that are dealt with in a multi-threaded concurrent way, so immutability is the best way to deal with state without relying on synchronization and blocking.
A very common pitfall is to forget that set and setAll actually return new instances.
This is exactly what you're doing:
exec { session =>
  val response = session("bodyResponse").as[String]
  session.set("Id", getRandomValueForKey("id", response))
  session
}
It should be:
exec { session =>
  val response = session("bodyResponse").as[String]
  session.set("Id", getRandomValueForKey("id", response))
}
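To see why the last expression matters, here is a toy model of an immutable session in Python. It is not Gatling's API: the Session class, broken_step and fixed_step are hypothetical names that only mimic the behaviour described in the documentation quote, namely that set returns a new instance and leaves the original untouched.

```python
class Session:
    """Toy immutable session: `set` never modifies the receiver."""

    def __init__(self, attributes=None):
        self._attributes = dict(attributes or {})

    def set(self, key, value):
        # Build and return a NEW Session that carries the extra attribute.
        return Session({**self._attributes, key: value})

    def contains(self, key):
        return key in self._attributes


def broken_step(session):
    session.set("Id", "42")  # the new session is created and then thrown away
    return session           # the ORIGINAL session, without "Id"


def fixed_step(session):
    return session.set("Id", "42")  # return the new session that holds "Id"


start = Session({"bodyResponse": "..."})
after_broken = broken_step(start)
after_fixed = fixed_step(start)
```

In this model after_broken still has no "Id" attribute while after_fixed does, which mirrors what happens when the exec block ends with session instead of the result of session.set.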

Insert multiple records into database with Vapor3

I want to be able to bulk add records to a NoSQL database in Vapor 3.
This is my struct:
struct Country: Content {
    let countryName: String
    let timezone: String
    let defaultPickupLocation: String
}
So I'm trying to pass an array of JSON objects but I'm not sure how to structure the route nor how to access the array to decode each one.
I have tried this route:
let countryGroup = router.grouped("api/country")
countryGroup.post([Country.self], at:"bulk", use: bulkAddCountries)
with this function:
func bulkAddCountries(req: Request, countries: [Country]) throws -> Future<String> {
    for country in countries {
        return try req.content.decode(Country.self).map(to: String.self) { countries in
            // creates a JSON encoder to encode the JSON data
            let encoder = JSONEncoder()
            let countryData: Data
            do {
                countryData = try encoder.encode(country) // encode the data
            } catch {
                return "Error. Data in the wrong format."
            }
            // code to save data
        }
    }
}
So how do I structure both the Route and the function to get access to each country?
I'm not sure which NoSQL database you plan on using, but the current beta versions of MongoKitten 5 and Meow 2.0 make this pretty easy.
Please note that we haven't written documentation for these two libraries yet, because we prioritized reaching a stable API first. The following code is roughly what you need with MongoKitten 5:
// Register MongoKitten to Vapor's Services
services.register(Future<MongoKitten.Database>.self) { container in
    return try MongoKitten.Database.connect(settings: ConnectionSettings("mongodb://localhost/my-db"), on: container.eventLoop)
}
// Globally, add this so that the above code can register MongoKitten to Vapor's Services
extension Future: Service where T == MongoKitten.Database {}
// An adaptation of your function
func bulkAddCountries(req: Request, countries: [Country]) throws -> Future<Response> {
    // Get a handle to MongoDB
    let database = req.make(Future<MongoKitten.Database>.self)
    // Make a `Document` for each Country
    let documents = try countries.map { country in
        return try BSONEncoder().encode(country)
    }
    // Insert the countries to the "countries" MongoDB collection
    return database["countries"].insert(documents: documents).map { success in
        return // Return a successful response
    }
}
I had a similar need and want to share my solution for bulk processing in Vapor 3. I’d love to have another experienced developer help refine my solution.
I’m going to try my best to explain what I did. And I’m probably wrong.
First, nothing special in the router. Here, I’m handling a POST to items/batch for a JSON array of Items.
router.post("items", "batch", use: itemsController.handleBatch)
Then the controller’s handler.
func handleBatch(_ req: Request) throws -> Future<HTTPStatus> {
    // Decode request to [Item]
    return try req.content.decode([Item].self)
        // flatMap collapses Future<Future<HTTPStatus>> into Future<HTTPStatus>
        .flatMap(to: HTTPStatus.self) { items in
            // Handle each item as 'itm', transforming itm to Future<HTTPStatus>
            return items.map { itm -> Future<HTTPStatus> in
                // Process itm. Here, I save, producing a Future<Item> called savedItem
                let savedItem = itm.save(on: req)
                // transform the Future<Item> to Future<HTTPStatus>
                return savedItem.transform(to: HTTPStatus.ok)
            }
            // flatten() : “Flattens an array of futures into a future with an array of results”
            // e.g. [Future<HTTPStatus>] -> Future<[HTTPStatus]>
            .flatten(on: req)
            // transform() : Maps the current future to contain the new type. Errors are carried over, successful (expected) results are transformed into the given instance.
            // e.g. Future<[.ok]> -> Future<.ok>
            .transform(to: HTTPStatus.ok)
        }
}
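The flatten step (turning a list of futures into one future holding a list of results) is a general pattern. As a rough analogy outside Vapor, here is how it might look with Python's asyncio; save and handle_batch are hypothetical stand-ins, and the numeric status codes are just placeholders for HTTPStatus values.

```python
import asyncio
from typing import List


async def save(item: str) -> str:
    # Hypothetical stand-in for an asynchronous database save.
    await asyncio.sleep(0)
    return item


async def handle_batch(items: List[str]) -> int:
    # One awaitable per item, comparable to [Future<HTTPStatus>].
    pending = [save(item) for item in items]
    # gather collapses the list of awaitables into a single awaitable
    # that resolves to the list of results (the "flatten" step).
    saved = await asyncio.gather(*pending)
    # Collapse the list of results into one overall status (the "transform" step).
    return 200 if len(saved) == len(items) else 500


status = asyncio.run(handle_batch(["a", "b", "c"]))
```

As with the Swift version, a failure in any single save propagates as an exception from the combined awaitable instead of producing a partial status.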

Gmail API .NET: Get full message

How do I get the full message, and not just the metadata, using the Gmail API?
I have a service account and I am able to retrieve a message, but only in the metadata, raw and minimal formats. How do I retrieve the message in the full format? The following code works fine:
var request = service.Users.Messages.Get(userId, messageId);
request.Format = UsersResource.MessagesResource.GetRequest.FormatEnum.Metadata;
Message message = request.Execute();
However, when I omit the format (and therefore use the default, FULL) or change it to UsersResource.MessagesResource.GetRequest.FormatEnum.Full, I get the error: Metadata scope doesn't allow format FULL
I have included the following scopes:
https://www.googleapis.com/auth/gmail.readonly,
https://www.googleapis.com/auth/gmail.metadata,
https://www.googleapis.com/auth/gmail.modify,
https://mail.google.com/
How do I get the full message?
I had to remove the scope for the metadata to be able to get the full message format.
The user from the SO post has the same error.
Try this out first.
Go to https://security.google.com/settings/security/permissions
Choose the app you are working with.
Click Remove > OK
Next time, just request exactly which permissions you need.
Another thing: try using gmailMessage.payload.parts[0].body.data. To decode it into readable text, do the following (from the SO post):
import org.apache.commons.codec.binary.Base64;
import org.apache.commons.codec.binary.StringUtils;
System.out.println(StringUtils.newStringUtf8(Base64.decodeBase64(gmailMessage.payload.parts[0].body.data)));
You can also check this for further reference.
Try something like this:
public String getMessage(string user_id, string message_id)
{
    Message temp = service.Users.Messages.Get(user_id, message_id).Execute();
    var parts = temp.Payload.Parts;
    string s = "";
    foreach (var part in parts) {
        // Body.Data is base64url-encoded, so convert it before decoding
        byte[] data = FromBase64ForUrlString(part.Body.Data);
        s += Encoding.UTF8.GetString(data);
    }
    return s;
}
public static byte[] FromBase64ForUrlString(string base64ForUrlInput)
{
    // Restore the '=' padding that URL-safe base64 may omit
    int padChars = (base64ForUrlInput.Length % 4) == 0 ? 0 : (4 - (base64ForUrlInput.Length % 4));
    StringBuilder result = new StringBuilder(base64ForUrlInput, base64ForUrlInput.Length + padChars);
    result.Append(String.Empty.PadRight(padChars, '='));
    // Map the URL-safe alphabet back to the standard one
    result.Replace('-', '+');
    result.Replace('_', '/');
    return Convert.FromBase64String(result.ToString());
}
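The helper above does two things: it restores any missing '=' padding and maps the URL-safe alphabet ('-' and '_') back to the standard one. A minimal sketch of the same decoding with Python's standard library, for comparison; decode_base64url is a made-up name and the sample string is simply "Hello, world." encoded without padding.

```python
import base64


def decode_base64url(data: str) -> str:
    """Decode a URL-safe base64 string that may lack '=' padding."""
    # Base64 text must be a multiple of 4 characters long; add what is missing.
    padding = "=" * (-len(data) % 4)
    # urlsafe_b64decode handles the '-' and '_' characters itself.
    return base64.urlsafe_b64decode(data + padding).decode("utf-8")


text = decode_base64url("SGVsbG8sIHdvcmxkLg")
```

Here text ends up as "Hello, world.". The sketch assumes the part body is UTF-8 text; attachments and other binary parts would need to stay as bytes.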

Akka stream stops after one element

My akka stream is stopping after a single element. Here's my stream:
val firehoseSource = Source.actorPublisher[FirehoseActor.RawTweet](
  FirehoseActor.props(
    auth = ...
  )
)
val ref = Flow[FirehoseActor.RawTweet]
  .map(r => ResponseParser.parseTweet(r.payload))
  .map { t => println("Received: " + t); t }
  .to(Sink.onComplete({
    case Success(_) => logger.info("Stream completed")
    case Failure(x) => logger.error(s"Stream failed: ${x.getMessage}")
  }))
  .runWith(firehoseSource)
FirehoseActor connects to the Twitter firehose and buffers messages to a queue. When the actor receives a Request message, it takes the next element and returns it:
def receive = {
  case Request(_) =>
    logger.info("Received request for next firehose element")
    onNext(RawTweet(queue.take()))
}
The problem is that only a single tweet is being printed to the console. The program doesn't quit or throw any errors, and I've sprinkled logging statements around, and none are printed.
I thought the sink would keep applying pressure to pull elements through but that doesn't seem to be the case since neither of the messages in Sink.onComplete get printed. I also tried using Sink.ignore but that only printed a single element as well. The log message in the actor only gets printed once as well.
What sink do I need to use to make it pull elements through the flow indefinitely?
Ah, I should have respected totalDemand in my actor. This fixes the issue:
def receive = {
  case Request(_) =>
    logger.info("Received request for next firehose element")
    // keep emitting until the outstanding demand is satisfied
    while (totalDemand > 0) {
      onNext(RawTweet(queue.take()))
    }
}
I was expecting to receive a Request for each element in the stream, but apparently each Flow will send a Request.
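The bookkeeping behind the fix can be shown with a small simulation in Python. This is not Akka code: DemandDrivenPublisher and its methods are hypothetical names that only model the idea that one request can signal demand for several elements, so the publisher should keep emitting until the outstanding demand reaches zero.

```python
from collections import deque


class DemandDrivenPublisher:
    """Toy publisher that tracks outstanding demand from downstream."""

    def __init__(self, items):
        self.queue = deque(items)
        self.total_demand = 0
        self.emitted = []

    def on_next(self, item):
        # Every emitted element uses up one unit of demand.
        self.total_demand -= 1
        self.emitted.append(item)

    def request(self, n):
        # A single request can ask for many elements at once.
        self.total_demand += n
        # Emit until demand is satisfied (or nothing is buffered).
        while self.total_demand > 0 and self.queue:
            self.on_next(self.queue.popleft())


publisher = DemandDrivenPublisher(["t1", "t2", "t3", "t4"])
publisher.request(3)
first_batch = list(publisher.emitted)
publisher.request(2)
```

Emitting exactly one element per request, as the original receive did, leaves demand unsatisfied after a request(3), and downstream has no reason to ask again. Unlike the blocking queue.take() in the actor, this toy stops when its buffer is empty instead of waiting.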
