Streaming text lines from multiple source files - akka-stream

I'm implementing a solution using akka-stream to read text lines from multiple files, and came up with the implementation below:
def main(args: Array[String]): Unit = {
  val g: Flow[String, Unit, NotUsed] = Flow.fromGraph(GraphDSL.create() {
    implicit builder =>
      import GraphDSL.Implicits._
      val A = builder.add(doQuery)
      val B = builder.add(analyzeResult)
      A ~> B
      FlowShape(A.in, B.out)
  })
  val files = Source(fileNames)
  // scala.io.Source is fully qualified so it does not clash with akka's Source
  val lines = files.map(file =>
    Source.fromIterator(() => scala.io.Source.fromFile(file.getName, "UTF-8").getLines)
  )
  val done = lines.runForeach(g.runWith(_, Sink.ignore))
  // implicit val ec = system.dispatcher
  // done.onComplete(_ => system.terminate())
}
val fileNames: List[File] = ???
val doQuery = Flow[String]
  .groupedWithin(1000, 100 millisecond)
  .mapAsync(4)(x =>
    Future[Seq[String]] {
      synchronized {
        // Do Something
        Nil
      }
    }
  )
val analyzeResult: Flow[Seq[String], Unit, NotUsed] = ???
Can anyone comment on whether there is a better solution?
I'd prefer not to have two Sources (the file list, and a second one for the text lines from each file). I'm wondering how to have just one Source (the list of files).
TIA!

You can use flatMapConcat to expand each file into its lines inside a single flow:
val g: Flow[File, Unit, NotUsed] = Flow.fromGraph(
  GraphDSL.create() {
    implicit builder =>
      import GraphDSL.Implicits._
      // Expand each File into its lines, one file after another
      val A = builder.add(Flow[File]
        .flatMapConcat { file =>
          Source.fromIterator(() =>
            scala.io.Source.fromFile(file.getAbsolutePath, "UTF-8").getLines)
        }
      )
      val B = builder.add(doQuery)
      val C = builder.add(analyzeResult)
      A ~> B ~> C
      FlowShape(A.in, C.out)
  }
)
val files = Source(fileNames)
val (_: NotUsed, done: Future[Done]) = g.runWith(files, Sink.ignore)
implicit val ec = system.dispatcher
done.onComplete(_ => system.terminate())
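If the GraphDSL is not otherwise needed, the same single-Source idea can be sketched with plain combinators. This is only a sketch under assumptions: the object name FileLines, the helper linesOf and the 8192-byte maximum line length are illustrative and not part of the original code. It reads through FileIO.fromPath and Framing.delimiter rather than scala.io.Source, so each file is read and closed by the stream itself.

```scala
import java.io.File

import akka.NotUsed
import akka.stream.scaladsl.{FileIO, Framing, Source}
import akka.util.ByteString

object FileLines {
  // Hypothetical helper: one Source of files in, one Source of text lines out.
  def linesOf(files: List[File]): Source[String, NotUsed] =
    Source(files).flatMapConcat { file =>
      FileIO.fromPath(file.toPath)
        // Split the byte stream on newlines; 8192 is an assumed upper bound on line length.
        // allowTruncation = true accepts a last line that has no trailing newline.
        .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 8192, allowTruncation = true))
        .map(_.utf8String)
    }
}
```

With an implicit ActorSystem and materializer in scope, FileLines.linesOf(fileNames).via(doQuery).via(analyzeResult).runWith(Sink.ignore) would then yield the Future[Done] used to terminate the system.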

Related

How to adjust function for Iterable Array?

1974,1974-06-22
1966,1966-07-20
1954,1954-06-19
1994,1994-06-27
1954,1954-06-26
2006,2006-07-04
2010,2010-07-07
1990,1990-06-30
...
It is type RDD[String].
What is wrong in the function itself?
Try
def f(v: Iterable[Array[String]]): Int = {
  val parsedDates = v.flatten.map(e => LocalDate.parse(e, formatter))
  parsedDates.max.getDayOfYear - parsedDates.min.getDayOfYear
}
which outputs
val arrays = Iterable(Array("2014-10-10","2014-12-10"))
f(arrays) // res0: Int = 61
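For reference, a self-contained version of the function above might look like the following sketch. The formatter pattern yyyy-MM-dd is an assumption inferred from the sample dates, and the object name DaySpan is illustrative. An explicit Ordering[LocalDate] is supplied so that max and min do not depend on one being found implicitly.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object DaySpan {
  // Assumed pattern, inferred from sample values such as "1974-06-22"
  val formatter: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")

  // Explicit ordering so that max/min compile regardless of Scala version
  implicit val localDateOrdering: Ordering[LocalDate] = Ordering.by(_.toEpochDay)

  def f(v: Iterable[Array[String]]): Int = {
    val parsedDates = v.flatten.map(e => LocalDate.parse(e, formatter))
    parsedDates.max.getDayOfYear - parsedDates.min.getDayOfYear
  }
}
```

Calling DaySpan.f(Iterable(Array("2014-10-10", "2014-12-10"))) gives 61, matching the result shown above.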

How to merge 2 JSON Arrays in Groovy?

I'm trying to merge two JSON arrays into one in Groovy.
def branchTags = new JsonBuilder()
branchTags branches, { String branch ->
    tag branch
    type 'b'
}
println(branchTags.toString())
//generates [{"tag":"Branch","type":"b"},{"tag":"Branch1","type":"b"}]
def releaseTags = new JsonBuilder()
releaseTags releases, { String release ->
    tag release
    type 'r'
}
println(releaseTags.toString())
//generates [{"tag":"Release","type":"r"},{"tag":"Rel1","type":"r"}]
/*def newTags = new JsonBuilder()
branchTags.each {k,v -> newTags.}*/
def slurper = new JsonSlurper()
def input = slurper.parseText(branchTags.toString())
def res = slurper.parseText(releaseTags.toString())
def joined = [input, res].flatten()
println joined.toString()
//this generates [{"tag":"Branch","type":"b"},{"tag":"Branch1","type":"b"}][{"tag":"Release","type":"r"},{"tag":"Rel1","type":"r"}]
I need:
[
{"tag":"Branch","type":"b"},
{"tag":"Branch1","type":"b"},
{"tag":"Release","type":"r"},
{"tag":"Rel1","type":"r"}
]
TIA,
In your case, after parsing the JSON you have two arrays.
Just use + to concatenate the two arrays into one:
import groovy.json.*
def branchTags = '[{"tag":"Branch","type":"b"},{"tag":"Branch1","type":"b"}]'
def releaseTags = '[{"tag":"Release","type":"r"},{"tag":"Rel1","type":"r"}]'
def slurper = new JsonSlurper()
def bArr = slurper.parseText(branchTags)
def rArr = slurper.parseText(releaseTags)
def res = bArr+rArr
println new JsonBuilder(res).toString()

How to update the sql server's identity column using play2.6 slick3?

I get the following error when I try to run my sbt Scala project against SQL Server:
[SQLServerException: Cannot update identity column 'ID'.]
I am using Play 2.6, Scala 2.12 and Slick 3.
My update function is:
def update(id: Long): Action[AnyContent] = Action.async { implicit request =>
  topicForm.bindFromRequest.fold(
    formWithErrors => Future.successful(BadRequest(html.editForm(id, formWithErrors))),
    topic => {
      val futureTopUpdate = dao.update(id, topic.copy(id = Some(id)))
      futureTopUpdate.map { result =>
        Home.flashing("success" -> "Topic %s has been updated".format(topic.code))
      }.recover {
        case ex: TimeoutException =>
          Logger.error("Problem found in topic update process")
          InternalServerError(ex.getMessage)
      }
    })
}
and the DAO:
override def update(id: Long, topic: Topic): Future[Int] =
  try db.run(filterQuery(id).update(topic))
  finally db.close
Any idea?
Can you show the filterQuery(id) implementation? Something similar in one of our DAOs works well this way:
override def update(id: Long, topic: Topic): Future[Int] = {
  db.run(filterQuery(id).update(topic.copy(id)))
}
Notice: topic.copy(id)
And filterQuery is:
def filterQuery(id: Int) = themes.filter(_.id === id)
We are using Play 2.6, Scala 2.12 and Slick 3 with MySQL.
Update # 1:
-> Entity:
case class CategoryRow(id: Int, name: String, description: String)
-> Mapping:
trait CategoryMapping {
  self: HasDatabaseConfigProvider[JdbcProfile] =>
  import dbConfig.profile.api._
  private[models] class CategoryTable(tag: Tag)
    extends Table[CategoryRow](tag, "category") {
    def id = column[Int]("id", O.AutoInc, O.PrimaryKey)
    def name = column[String]("name", O.Length(TextMaxLength_250))
    def description = column[String]("description", Nullable)
    def categoryNameAgencyIndex = index("categoryName_agency_idx", (name, agencyId), unique = true)
    override def * = (
      id,
      name,
      description
    ) <> (CategoryRow.tupled, CategoryRow.unapply)
  }
  private[models] val Categories = TableQuery[CategoryTable]
  private[models] val CategoriesInsertQuery = Categories returning Categories.map(_.id)
}
-> REPO
trait CategoryRepository {
  //...
  def update(id: Int, category: CategoryRow)(implicit agencyId: Int): Future[Int]
  //...
}
-> REPOImpl:
@Singleton
class CategoryRepositoryImpl @Inject()(protected val dbConfigProvider: DatabaseConfigProvider)(implicit ec: RepositoryExecutionContext)
  extends CategoryRepository with HasDatabaseConfigProvider[JdbcProfile] with CategoryMapping {
  import dbConfig.profile.api._
  //....
  def update(id: Int, category: CategoryRow)(implicit agencyId: Int): Future[Int] =
    db.run(filter(id).update(category.copy(id)))
  private def filter(id: Int) = Categories.filter(_.id === id)
  //....
}
-> RepositoryExecutionContext
class RepositoryExecutionContext @Inject()(actorSystem: ActorSystem) extends CustomExecutionContext(actorSystem, "repository.dispatcher")
and application.conf:
# db connections = ((physical_core_count * 2) + effective_spindle_count)
fixedConnectionPool = 5
repository.dispatcher {
  executor = "thread-pool-executor"
  throughput = 1
  thread-pool-executor {
    fixed-pool-size = ${fixedConnectionPool}
  }
}
There's some more information about fold in Chapter 3.3 of Essential Slick.
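On SQL Server specifically, updating the whole row projection includes the identity column in the generated UPDATE, which is what the error message complains about. One possible way around it is to project only the non-identity columns before calling update. The sketch below is an assumption-laden illustration: the table name TOPIC and the columns CODE and NAME are hypothetical, since the original Topic mapping is not shown; only the ID column comes from the error message.

```scala
import slick.jdbc.SQLServerProfile.api._

object TopicUpdates {
  // Hypothetical entity and mapping; the real Topic table is not shown in the question
  case class Topic(id: Option[Long], code: String, name: String)

  class Topics(tag: Tag) extends Table[Topic](tag, "TOPIC") {
    def id   = column[Long]("ID", O.PrimaryKey, O.AutoInc)
    def code = column[String]("CODE")
    def name = column[String]("NAME")
    def * = (id.?, code, name) <> (Topic.tupled, Topic.unapply)
  }

  val topics = TableQuery[Topics]

  // Select the row by ID, but write only the non-identity columns
  def updateWithoutId(id: Long, topic: Topic) =
    topics
      .filter(_.id === id)
      .map(t => (t.code, t.name))
      .update((topic.code, topic.name))
}
```

The resulting action can be passed to db.run as before; the ID column then appears only in the WHERE clause and not in the SET list.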

Filter an array of objects by distance Kotlin

I'm new to Kotlin and trying to convert some Swift code to Kotlin.
Here is my Swift function. It filters out array objects that are over a specific distance from the user.
func filterByDistance(_ events:[Event]) -> [Event] {
    let filteredEvents = events.filter { event -> Bool in
        if let lat = event.venue?.location?.latitude,
            let long = event.venue?.location?.longitude,
            let userLocation = UserLocation.shared.location {
            let eventLocation = CLLocation(latitude: lat, longitude: long)
            let distance = eventLocation.distance(from: userLocation)
            let convertedDistance = distance * 0.000621371
            if convertedDistance <= maxDistance {
                return true
            }
        }
        return false
    }
    return filteredEvents
}
Below is what I've got so far using Kotlin:
fun LatLng.toLocation() = Location(LocationManager.GPS_PROVIDER).also {
    it.latitude = latitude
    it.longitude = longitude
}
fun filterByDistance(events: Array<Events>): Array<Events> {
    val filteredEvents = events.filter<Events> { event ->
        val lat = event.venue?.location?.latitude
        val long = event.venue?.location?.longitude
        val userLocation = LatLng(latitude, longitude)
        val eventLocation = LatLng(lat, long)
        val distance = eventLocation.toLocation().distanceTo(userLocation.toLocation())
        val convertedDistance = distance * 0.000621371
        if (convertedDistance <= 500) {
            return true
        } else {
            return false
        }
    }
    return filterEvents(events)
}
I'm getting an error asking me to change my return type to a Bool, but I need to return an array of filtered events. Can someone help me out?
EDIT: Thanks to JB Nizet I was able to get this working. I had to change the object from Array to List. Here is the working code.
fun fetchJson() {
    val url = "URL String"
    val request = Request.Builder().url(url).build()
    val client = OkHttpClient()
    client.newCall(request).enqueue(object:Callback{
        override fun onResponse(call: Call?, response: Response?) {
            val body = response?.body()?.string()
            val gson = GsonBuilder().create()
            val eventss = gson.fromJson(body, Array<Events>::class.java)
            val events = eventss.toList()
            val filteredEvents = filterByDistance(events)
            runOnUiThread {
                recyclerView.adapter = MainAdaptor(filteredEvents)
            }
        }
        override fun onFailure(call: Call?, e: IOException?) {
            println("failed")
        }
    })
}
fun LatLng.toLocation() = Location(LocationManager.GPS_PROVIDER).also {
    it.latitude = latitude
    it.longitude = longitude
}
fun filterByDistance(events: List<Events>): List<Events> {
    val filteredEvents = events.filter { event ->
        val lat = event.venue?.location?.latitude
        val long = event.venue?.location?.longitude
        val userLocation = LatLng(latitude, longitude)
        val eventLocation = LatLng(lat, long)
        val distance = eventLocation.toLocation().distanceTo(userLocation.toLocation())
        val convertedDistance = distance * 0.000621371
        convertedDistance <= maxDistance
    }
    return filteredEvents
}
And the class, if it helps anyone:
class Events(val type: String,
             val venue: Venue,
             val time: String,
             val name: String,
             val summary: String,
             val activity: String,
             val image_link: String,
             val membership_link: String,
             val description: String
)
class Venue(val type: String,
            val name: String,
            val address: String,
            val location: Location
)
class Location(val type: String,
               val latitude: Double,
               val longitude: Double)
Replace
if (convertedDistance <= 500) {
    return true
} else {
    return false
}
by
convertedDistance <= 500
and
return filterEvents(events)
by
return filteredEvents
See https://kotlinlang.org/docs/reference/lambdas.html#lambda-expression-syntax for an explanation of the lambda syntax:
We can explicitly return a value from the lambda using the qualified return syntax. Otherwise, the value of the last expression is implicitly returned. Therefore, the two following snippets are equivalent:
ints.filter {
    val shouldFilter = it > 0
    shouldFilter
}
ints.filter {
    val shouldFilter = it > 0
    return@filter shouldFilter
}

What is the Best way to loop over an array in scala

I'm new to Scala and I'm trying to refactor the code below. I want to eliminate the "index" variable and loop over the array to fetch the data.
subgroupMetricIndividual.instances.foreach { instanceIndividual =>
  val MetricContextListBuffer: ListBuffer[Context] = ListBuffer()
  var index = 0
  contextListBufferForSubGroup.foreach { contextIndividual =>
    MetricContextListBuffer += Context(
      entity = contextIndividual,
      value = instanceIndividual(index).toString
    )
    index += 1
  }
}
For instance, if the values of variables are as below:
contextListBufferForSubGroup = ("context1","context2")
subgroupMetricIndividual.instances = {{"Inst1","Inst2",1},{"Inst3","Inst4",2}}
Then Context should be something like:
{
  entity: "context1",
  value: "Inst1"
},
{
  entity: "context2",
  value: "Inst2"
},
{
  entity: "context1",
  value: "Inst3"
},
{
  entity: "context2",
  value: "Inst4"
}
Note:
instanceIndividual can have more elements than contextListBufferForSubGroup. In that case the extra trailing elements of instanceIndividual must be ignored.
You can zip the two lists into a list of tuples and then map over that, e.g.:
subgroupMetricIndividual.instances.foreach { instanceIndividual =>
  val MetricContextListBuffer = contextListBufferForSubGroup.zip(instanceIndividual).map {
    case (contextIndividual, instanceIndividualIndex) => Context(
      entity = contextIndividual,
      value = instanceIndividualIndex.toString
    )
  }
}
If Context can be called like a function, i.e. Context(contextIndividual, instanceIndividualIndex.toString), then you can write this even more briefly:
subgroupMetricIndividual.instances.foreach { instanceIndividual =>
  val MetricContextListBuffer = contextListBufferForSubGroup
    .zip(instanceIndividual.map(_.toString)).map(Context.tupled)
}
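The requirement to ignore extra trailing elements is covered by zip itself, because it stops at the end of the shorter collection. A minimal sketch using the sample values from the question (the object name ZipTruncation is illustrative):

```scala
object ZipTruncation {
  val contexts: List[String] = List("context1", "context2")
  // The instance has one element more than there are contexts
  val instance: List[Any] = List("Inst1", "Inst2", 1)

  // zip pairs elements positionally and drops whatever the longer side has left over
  val pairs: List[(String, Any)] = contexts.zip(instance)
  // pairs == List(("context1", "Inst1"), ("context2", "Inst2"))
}
```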
Without knowing your exact datatypes, I've mocked up something which is probably close to what you want, and is slightly more functional, using maps and immutable collections:
case class Context(entity: String, value: String)
val contextListBufferForSubGroup = List("context1", "context2")
val subgroupMetricIndividualInstances = List(List("Inst1", "Inst2", 1), List("Inst3", "Inst4", 2))
val result: List[Context] = subgroupMetricIndividualInstances.flatMap { instanceIndividual =>
  // zip stops at the shorter collection, so the extra trailing element is dropped
  contextListBufferForSubGroup.zip(instanceIndividual).map { case (entity, value) =>
    Context(
      entity = entity,
      value = value.toString
    )
  }
}
