I'm struggling with Slick's lifted embedding and mapped tables. The API feels strange to me, maybe just because it is structured in a way that's unfamiliar to me.
I want to build a Task/Todo-List. There are two entities:
Task: Each task has an optional reference to the next task. That way a linked list is built. The intention is that the user can order the tasks by priority; this order is represented by the references from task to task.
TaskList: Represents a TaskList with a label and a reference to the first Task of the list.
case class Task(id: Option[Long], title: String, nextTask: Option[Task])
case class TaskList(label: String, firstTask: Option[Task])
Now I tried to write a data access object (DAO) for these two entities.
import scala.slick.driver.H2Driver.simple._
import slick.lifted.MappedTypeMapper
implicit val session: Session = Database.threadLocalSession
val queryById = Tasks.createFinderBy( t => t.id )
def task(id: Long): Option[Task] = queryById(id).firstOption
private object Tasks extends Table[Task]("TASKS") {
  def id = column[Long]("ID", O.PrimaryKey, O.AutoInc)
  def title = column[String]("TITLE")
  def nextTaskId = column[Option[Long]]("NEXT_TASK_ID")
  def nextTask = foreignKey("NEXT_TASK_FK", nextTaskId, Tasks)(_.id)
  def * = id ~ title ~ nextTask <> (Task, Task.unapply _)
}

private object TaskLists extends Table[TaskList]("TASKLISTS") {
  def label = column[String]("LABEL", O.PrimaryKey)
  def firstTaskId = column[Option[Long]]("FIRST_TASK_ID")
  def firstTask = foreignKey("FIRST_TASK_FK", firstTaskId, Tasks)(_.id)
  def * = label ~ firstTask <> (TaskList, TaskList.unapply _)
}
Unfortunately it does not compile. The problems are in the * projections of both tables, at nextTask and firstTask respectively.
could not find implicit value for evidence parameter of type
scala.slick.lifted.TypeMapper[scala.slick.lifted.ForeignKeyQuery[SlickTaskRepository.this.Tasks.type,justf0rfun.bookmark.model.Task]]
could not find implicit value for evidence parameter of type scala.slick.lifted.TypeMapper[scala.slick.lifted.ForeignKeyQuery[SlickTaskRepository.this.Tasks.type,justf0rfun.bookmark.model.Task]]
I tried to solve that with the following TypeMapper, but that does not compile either.
implicit val taskMapper = MappedTypeMapper.base[Option[Long], Option[Task]](
  option => option match {
    case Some(id) => task(id)
    case _ => None
  },
  option => option match {
    case Some(task) => task.id
    case _ => None
  })
could not find implicit value for parameter tm: scala.slick.lifted.TypeMapper[Option[justf0rfun.bookmark.model.Task]]
not enough arguments for method base: (implicit tm: scala.slick.lifted.TypeMapper[Option[justf0rfun.bookmark.model.Task]])scala.slick.lifted.BaseTypeMapper[Option[Long]]. Unspecified value parameter tm.
Main question: How do I use Slick's lifted embedding and mapped tables the right way? How do I get this to work?
Thanks in advance.
The short answer is: Use ids instead of object references and use Slick queries to dereference ids. You can put the queries into methods for re-use.
That would make your case classes look like this:
case class Task(id: Option[Long], title: String, nextTaskId: Option[Long])
case class TaskList(label: String, firstTaskId: Option[Long])
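For illustration, here is a rough, untested sketch of how the Tasks table and a reusable dereferencing query could then look (Slick 1.x lifted embedding, matching your H2Driver import; the method name nextTaskOf is mine). The foreign key stays out of the * projection, and the reference is resolved by a query:

private object Tasks extends Table[Task]("TASKS") {
  def id = column[Long]("ID", O.PrimaryKey, O.AutoInc)
  def title = column[String]("TITLE")
  def nextTaskId = column[Option[Long]]("NEXT_TASK_ID")
  // kept for referential integrity, but not part of the * projection
  def nextTask = foreignKey("NEXT_TASK_FK", nextTaskId, Tasks)(_.id)
  def * = id.? ~ title ~ nextTaskId <> (Task, Task.unapply _)
}

private val queryById = Tasks.createFinderBy(_.id)

// dereference the stored id with a query, wrapped in a method for re-use
def nextTaskOf(t: Task): Option[Task] =
  t.nextTaskId.flatMap(id => queryById(id).firstOption)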
I'll publish an article about this topic at some point and link it here.
Related
I am writing a Spark 3 UDF to mask an attribute in an Array field.
My data (in parquet, but shown in a JSON format):
{"conditions":{"list":[{"element":{"code":"1234","category":"ABC"}},{"element":{"code":"4550","category":"EDC"}}]}}
case class:
case class MyClass(conditions: Seq[MyItem])
case class MyItem(code: String, category: String)
Spark code:
import spark.implicits._
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.col

val data = Seq(MyClass(conditions = Seq(MyItem("1234", "ABC"), MyItem("4550", "EDC"))))
val rdd = spark.sparkContext.parallelize(data)
val ds = rdd.toDF().as[MyClass]

val maskedConditions: Column = updateArray.apply(col("conditions"))

ds.withColumn("conditions", maskedConditions)
  .select("conditions")
  .show(2)
I tried the following UDF:
def updateArray = udf((arr: Seq[MyItem]) => {
  for (i <- 0 to arr.size - 1) {
    // Line 3: cast to GenericRowWithSchema
    val a = arr(i).asInstanceOf[org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema]
    // val a = arr(i)  // alternative without the cast
    println(a.getAs[MyItem](0))
    // TODO: How to make code = "XXXX" here
    // a.code = "XXXX"
  }
  arr
})
Goal:
I need to set 'code' field value in each array item to "XXXX" in a UDF.
Issue:
I am unable to modify the array fields.
Also, I get the following error if I remove line 3 in the UDF (the cast to GenericRowWithSchema).
Error:
Caused by: java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast to MyItem
Question: How to capture Array of Structs in a function and how to return a modified array of items?
Welcome to Stack Overflow!
There is a small JSON linting error in your data: I assumed that you wanted to close the [] square brackets of the list array. So, for this example I used the following data (which is otherwise the same as yours):
{"conditions":{"list":[{"element":{"code":"1234","category":"ABC"}},{"element":{"code":"4550","category":"EDC"}}]}}
You don't need UDFs for this: a simple map operation will be sufficient! The following code does what you want:
import spark.implicits._
import org.apache.spark.sql.Encoders

case class MyItem(code: String, category: String)
case class MyElement(element: MyItem)
case class MyList(list: Seq[MyElement])
case class MyClass(conditions: MyList)

val df = spark.read.json("./someData.json").as[MyClass]

val transformedDF = df.map {
  case MyClass(MyList(list)) => MyClass(MyList(list.map {
    case MyElement(item) => MyElement(MyItem(code = "XXXX", item.category))
  }))
}

transformedDF.show(false)
transformedDF.show(false)
+--------------------------------+
|conditions |
+--------------------------------+
|[[[[XXXX, ABC]], [[XXXX, EDC]]]]|
+--------------------------------+
As you can see, we're doing some simple pattern matching on the case classes we've defined and replacing all of the code fields' values with "XXXX". If you want to get JSON back, you can call the to_json function like so:
transformedDF.select(to_json($"conditions")).show(false)
+----------------------------------------------------------------------------------------------------+
|structstojson(conditions) |
+----------------------------------------------------------------------------------------------------+
|{"list":[{"element":{"code":"XXXX","category":"ABC"}},{"element":{"code":"XXXX","category":"EDC"}}]}|
+----------------------------------------------------------------------------------------------------+
Finally, a very small remark about the data. If you have any control over how the data gets made, I would add the following suggestions:
The conditions JSON object seems to have no function here, since it just contains a single array called list. Consider making conditions itself the array, which would allow you to discard the list name and would simplify your structure.
The element object does nothing except contain a single item. Consider removing one level of abstraction there too.
With these suggestions, your data would contain the same information but look something like:
{"conditions":[{"code":"1234","category":"ABC"},{"code":"4550","category":"EDC"}]}
With these suggestions, you would also remove the need for the MyElement and MyList case classes! But very often we're not in control of the data we receive, so this is just a small disclaimer :)
Hope this helps!
EDIT: After your addition of simplified data according to the above suggestions, the task gets even easier. Again, you only need a map operation here:
import spark.implicits._
import org.apache.spark.sql.Encoders

case class MyItem(code: String, category: String)
case class MyClass(conditions: Seq[MyItem])

val data = Seq(MyClass(conditions = Seq(MyItem("1234", "ABC"), MyItem("4550", "EDC"))))
val df = data.toDF.as[MyClass]

val transformedDF = df.map {
  case MyClass(conditions) => MyClass(conditions.map {
    item => MyItem("XXXX", item.category)
  })
}

transformedDF.show(false)
transformedDF.show(false)
+--------------------------+
|conditions |
+--------------------------+
|[[XXXX, ABC], [XXXX, EDC]]|
+--------------------------+
I was able to find a simple solution with Spark 3.1+, as new column functions were added in that version.
Updated code:
import spark.implicits._
import org.apache.spark.sql.functions._

val data = Seq(
  MyClass(conditions = Seq(MyItem("1234", "ABC"), MyItem("234", "KBC"))),
  MyClass(conditions = Seq(MyItem("4550", "DTC"), MyItem("900", "RDT")))
)

val ds = data.toDF()

val updatedDS = ds.withColumn(
  "conditions",
  transform(
    col("conditions"),
    x => x.withField("code", updateArray(x.getField("code")))))

updatedDS.show()
UDF:
def updateArray = udf((oldVal: String) => {
  if (oldVal.contains("1234"))
    "XXX"
  else
    oldVal
})
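As a side note, the same Spark 3.1+ column functions can probably express the masking without any UDF, using when/otherwise directly on the nested field. A rough, untested sketch (the variable name updatedDS2 is mine):

import org.apache.spark.sql.functions._

// replace the code field inside every array element, no UDF involved
val updatedDS2 = ds.withColumn(
  "conditions",
  transform(
    col("conditions"),
    x => x.withField(
      "code",
      when(x.getField("code").contains("1234"), lit("XXX"))
        .otherwise(x.getField("code")))))

updatedDS2.show()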
I need to perform a binary search on an array of a custom case class. This should be as simple as calling the search function defined in scala.collection.Searching: if the collection on which I call the search method is an indexed sequence, a binary search is performed.
Now, I need to create my custom Ordering[B] parameter and I want to pass it explicitly to the search function (I don't want it to pick up any implicit parameter inferred from the context).
I have the following code:
// File 1
case class Person(name: String, id: Int)

object Person {
  val orderingById: Ordering[Person] = Ordering.by(e => e.id)
}

// File 2 (same package)
for (i <- orderedId.indices) {
  // orderedId is an array of Int
  // listings is an array of Person
  val listingIndex = listings.search(orderedId(i))(Person.orderingById)
  ...
}
I get the following error:
Type mismatch. Required: Ordering[Any], found: Ordering[Nothing]
So, I tried to change the implementation in this way:
// File 1
object Person {
  implicit def orderingById[A <: Person]: Ordering[A] = {
    Ordering.by(e => e.id)
  }
}

// File 2 as before
This time I get the following error:
Type mismatch. Required: Ordering[Any], found: Ordering[Person]
Why does this happen? At least in the second case, shouldn't it convert from Any to Person?
Follow the type specifications.
If you want to .search() on a collection of Person elements then the first search parameter should be a Person (or a super-class thereof).
val listingIndex =
listings.search(Person("",orderedId(i)))(Person.orderingById)
Or, to put it in a more complete and succinct context:
import scala.collection.Searching.SearchResult

case class Person(name: String, id: Int)

val listings: Array[Person] = ...
val orderedId: Array[Int] = ...

for (id <- orderedId) {
  val listingIndex: SearchResult =
    listings.search(Person("", id))(Ordering.by(_.id))
}
I'll add a bit just to elaborate about your error. First, please note that Searching.search is deprecated, with deprecation message:
Search methods are defined directly on SeqOps and do not require scala.collection.Searching any more.
search is now defined on IndexedSeqOps. Let's look at the signature:
final def search[B >: A](elem: B)(implicit ord: Ordering[B])
When you call:
listings.search(orderedId(i))(Person.orderingById)
The result of orderedId(i) is Int. Therefore, B in the signature above is Int. The definition of Int is:
final abstract class Int private extends AnyVal
A is Person, because listings is of type Array[Person]. Therefore, search is looking for a common root of both Int and Person. This common root is Any, hence you are getting this error. One way to overcome it is to define an implicit conversion from Int to Person:
object Person {
  val orderingById: Ordering[Person] = Ordering.by(e => e.id)

  implicit def apply(id: Int): Person = {
    Person("not defined", id)
  }
}
Then the following:
import scala.collection.Searching.{Found, InsertionPoint}

val listings = Array(Person("aa", 1), Person("bb", 2), Person("dd", 4))
val orderedId = 1.to(6).toArray

for (i <- orderedId.indices) {
  // orderedId is an array of Int
  // listings is an array of Person
  listings.search[Person](orderedId(i))(Person.orderingById) match {
    case Found(foundIndex) =>
      println("foundIndex: " + foundIndex)
    case InsertionPoint(insertionPoint) =>
      println("insertionPoint: " + insertionPoint)
  }
}
will produce:
foundIndex: 0
foundIndex: 1
insertionPoint: 2
foundIndex: 2
insertionPoint: 3
insertionPoint: 3
Code run in Scastie.
I am trying to get a Seq[String] containing the field names of a case class, and another Seq[String] containing the values of the case class, in a generic way. I think I will have to map the values with a Poly1 function to turn the arbitrary types into Strings.
But right now, I'm not able to extract the keys and values from the LabelledGeneric.
def apply[T, R <: HList](value: T)(implicit gen: LabelledGeneric.Aux[T, R],
                                   keys: Keys[R],
                                   valuesR: Values[R]) {
  val hl = gen.to(value)
  val keys = hl.keys ...
  val values = hl.values.map ...
}
I'm not sure whether I have to ask for the Keys and Values implicits, or whether it's possible to get them from the LabelledGeneric alone.
I have tried to map the following Poly over the keys to get an HList of Strings, but it seems the keys are not Witness instances:
object PolyWitnesToString extends Poly1 {
  implicit def witnessCase = at[Witness] { w => w.toString }
}
I'm a little bit lost now.
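For reference, the usual way to wire this together is to ask for the Keys, Values, Mapper and ToTraversable instances explicitly instead of calling .keys/.values on the HList directly. A rough, untested sketch (shapeless 2.x; the helper name fieldsAndValues and the Poly valueToString are illustrative, not from the question):

import shapeless._
import shapeless.ops.hlist.{Mapper, ToTraversable}
import shapeless.ops.record.{Keys, Values}

// Poly1 that renders any value as a String
object valueToString extends Poly1 {
  implicit def default[A]: Case.Aux[A, String] = at[A](_.toString)
}

def fieldsAndValues[T, R <: HList, K <: HList, V <: HList, S <: HList](value: T)(
  implicit
  gen: LabelledGeneric.Aux[T, R],               // case class -> labelled record
  keys: Keys.Aux[R, K],                         // key HList (tagged Symbols)
  values: Values.Aux[R, V],                     // value HList
  mapper: Mapper.Aux[valueToString.type, V, S], // map every value to a String
  keysToList: ToTraversable.Aux[K, List, Symbol],
  valsToList: ToTraversable.Aux[S, List, String]
): (Seq[String], Seq[String]) = {
  val record = gen.to(value)
  (keys().toList.map(_.name), values(record).map(valueToString).toList)
}

// e.g., for case class Person(name: String, age: Int):
// fieldsAndValues(Person("Ada", 36)) should yield (List("name", "age"), List("Ada", "36"))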
I know there are more elaborate ways to achieve this in Java, but Groovy should have a concise way to do the same as per http://groovy.codehaus.org/Looping
Class Currency.groovy
class Currency {
    String name
    double rate
}
CurrencyController
def select() {
    List<Currency> selectedCurrencies = Currency.getAll(params.currencies)
    selectedCurrencies.eachWithIndex { obj, i -> obj.rate = update(obj.name) }
    [selectedCurrencies: selectedCurrencies]
}

def update(String sym) {
    return sym
}
The above code throws:
No signature of method: currencychecker.CurrencyController$_$tt__select_closure12.doCall() is applicable for argument types: (currencychecker.Currency)
Thanks to @dmahapatro, the issue was that I was indexing with obj[i], even though obj itself is the iterated object. The rest is correct!
I experimented with selectedCurrencies.each instead of selectedCurrencies.eachWithIndex as well; however, the right one in this case is eachWithIndex.
I don't understand why the compiler cannot handle the case pattern match on a tuple when I try to use it with a generic Array[T].
class Variable[T](val p: Prototype[T], val value: T)

class Prototype[T](val name: String)(implicit m: Manifest[T])

// Columns to variable converter
implicit def columns2Variables[T](columns: Array[(String, Array[T])]): Iterable[Variable[Array[T]]] = {
  columns.map {
    case (name, value) =>
      new Variable[Array[T]](new Prototype[Array[T]](name), value)
  }.toIterable
}
The error says:
error: constructor cannot be instantiated to expected type;
found : (T1, T2)
required: fr.geocite.simExplorator.data.Variable[Array[T]]
case(name,value) =>
I'm also not sure about the wording of the error, but first of all, you will need the manifest for T because it is required for constructing new Prototype[Array[T]] (the array manifest can be automatically generated if a manifest for its type parameter is in scope).
Is there any reason you absolutely need arrays? They come with the irregularity of Java's type system, they are mutable, and they offer very little advantage over, for example, Vector. Lastly, and that's probably why you carry around the manifests: unlike arrays, standard collections do not require manifests for construction.
class Variable[T](val p: Prototype[T], val value: T)

class Prototype[T](val name: String)

implicit def cols2v[T](cols: Vector[(String, Vector[T])]): Vector[Variable[Vector[T]]] =
  cols.map {
    case (name, value) => new Variable(new Prototype(name), value)
  }