Flink Scala API functions on generic parameters


This is a follow-up to the question Flink Scala API "not enough arguments".
I'd like to be able to pass Flink DataSets around and operate on them, but the element type of the DataSet is generic.
Here's the problem I have now:
import org.apache.flink.api.scala.ExecutionEnvironment
import org.apache.flink.api.scala._
import scala.reflect.ClassTag
object TestFlink {
def main(args: Array[String]) {
val env = ExecutionEnvironment.getExecutionEnvironment
val text = env.fromElements(
"Who's there?",
"I think I hear them. Stand, ho! Who's there?")
val split = text.flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } }
id(split).print()
env.execute()
}
def id[K: ClassTag](ds: DataSet[K]): DataSet[K] = ds.map(r => r)
}
I have this error for ds.map(r => r):
Multiple markers at this line
- not enough arguments for method map: (implicit evidence$256: org.apache.flink.api.common.typeinfo.TypeInformation[K], implicit
evidence$257: scala.reflect.ClassTag[K])org.apache.flink.api.scala.DataSet[K]. Unspecified value parameters evidence$256, evidence$257.
- not enough arguments for method map: (implicit evidence$4: org.apache.flink.api.common.typeinfo.TypeInformation[K], implicit evidence
$5: scala.reflect.ClassTag[K])org.apache.flink.api.scala.DataSet[K]. Unspecified value parameters evidence$4, evidence$5.
- could not find implicit value for evidence parameter of type org.apache.flink.api.common.typeinfo.TypeInformation[K]
Of course, the id function here is just an example; I'd like to be able to do something more complex with it.
How can this be solved?

You also need TypeInformation as a context bound on the type parameter K (this requires import org.apache.flink.api.common.typeinfo.TypeInformation):
def id[K: ClassTag: TypeInformation](ds: DataSet[K]): DataSet[K] = ds.map(r => r)
The reason is that Flink analyses the types used in your program and creates a TypeInformation instance for each of them. If you want to write generic operations, you need to make sure a TypeInformation for the generic type is available by adding a context bound. This way, the Scala compiler makes sure an instance is available at the call site of the generic function.
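A context bound is, in effect, an extra evidence parameter that the compiler fills in at the call site. As a rough analogy only (this is TypeScript, not Flink code, and DataSet, TypeInformation, id and stringInfo below are hypothetical stand-ins written for this sketch), the same shape looks like this when the evidence is passed by hand:

```typescript
// Hypothetical stand-ins; these are NOT the Flink types.
interface TypeInformation<K> {
  readonly typeName: string
  serialize(value: K): string
}

class DataSet<K> {
  readonly items: ReadonlyArray<K>
  readonly info: TypeInformation<K>

  constructor(items: ReadonlyArray<K>, info: TypeInformation<K>) {
    this.items = items
    this.info = info
  }

  // Like the map in the question, this needs evidence for the result type R.
  map<R>(f: (k: K) => R, ev: TypeInformation<R>): DataSet<R> {
    return new DataSet(this.items.map(f), ev)
  }
}

// Without the `ev` parameter this generic function could not call `map`,
// because nothing inside it knows what K will be.
function id<K>(ds: DataSet<K>, ev: TypeInformation<K>): DataSet<K> {
  return ds.map(r => r, ev)
}

// The call site knows K is string, so it can supply the evidence.
const stringInfo: TypeInformation<string> = {
  typeName: 'string',
  serialize: s => JSON.stringify(s),
}
const words = new DataSet(['who', 's', 'there'], stringInfo)
const same = id(words, stringInfo)
console.log(same.items.join(' ')) // "who s there"
```

In Scala the `K: TypeInformation` bound plays the role of the explicit `ev` parameter here; the difference is that the compiler threads it through for you.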

Related

Refactoring after deprecation of getFoldableComposition, option, array et al

I spent some time last year trying to learn fp-ts. I've finally come around to using it in a project, and a lot of my sample code has broken due to the recent refactoring. I've fixed a few of the breakages but am struggling with the others. It highlights a massive hole in my FP knowledge, no doubt!
I had this:
import { strict as assert } from 'assert';
import { array } from 'fp-ts/Array';
import { getFoldableComposition, } from 'fp-ts/Foldable';
import { Monoid as MonoidString } from 'fp-ts/string'
import { none,some, option } from 'fp-ts/Option';
const F = getFoldableComposition(array, option)
assert.strictEqual(F.reduce([some('a'), none, some('c')], '', MonoidString.concat), 'ac')
getFoldableComposition, option and array are now deprecated. The comments on getFoldableComposition say to use reduce, foldMap or reduceRight instead, so, amongst other things, I tried this.
import { strict as assert } from 'assert';
import { reduceRight } from 'fp-ts/Foldable';
import { Monoid as MonoidString } from 'fp-ts/string'
import { some } from 'fp-ts/Option';
assert.strictEqual(reduceRight([some('a'), none, some('c')], '', MonoidString.concat), 'ac')
That's not even compiling, so obviously I'm way off base.
Could someone please show me the correct way to replace getFoldableComposition and, while we're at it, explain what is meant by 'Use small, specific instances instead' as well for option and array? Also, anything else I'm obviously doing wrong?
Thank you!

Let's start with your question
what is meant by 'Use small, specific instances instead' as well for option and array?
Prior to fp-ts v2.10.0, type class instances were grouped together as a single record implementing the interfaces of multiple classes, and the type class record was named after the data type for which the classes were defined. So for the Array module, array was exported containing all the instances; it had map for Functor and ap for Apply etc. For Option, the option record was exported with all the instances. And so on.
Many functions, like getFoldableComposition and sequenceT are defined very generically using "higher-kinded types" and require you to pass in the type class instance for the data type you wanted the function to use. So, e.g., sequenceT requires you to pass an Apply instance like
assert.deepEqual(
sequenceT(O.option)([O.some(1), O.none]),
O.none
)
Requiring these big records of type class instances to be passed around like that ended up making fp-ts not tree-shake well in application and library code, because JS bundlers couldn't statically tell which members of the type class record were being accessed and which weren't, so they ended up including all of them even if only one was used. That increases bundle size, which ultimately makes your app load slower for users and/or increases the bundle size of libraries consuming your library.
The solution to this problem was to break the big type class records apart and give each type class its own record. So now each data type module exports small, individual type class instances and eventually the mega-instance record will be removed. So now you would use sequenceT like
assert.deepEqual(
sequenceT(O.Apply)([O.some(1), O.none]),
O.none
)
Now the bundler knows that only Apply methods are being used, and it can remove unused instances from the bundle.
So the upshot of all this is to just not use the mega instance record anymore and only use the smaller instance records.
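To make the difference between the two styles concrete, here is a minimal library-free sketch. Everything in it (the tiny Option type, map, reduce, the option record, Functor, Foldable and sumWith) is a hypothetical miniature written for illustration, not the fp-ts source:

```typescript
// Hypothetical miniature of an Option module; not the real fp-ts source.
type Option<A> = { _tag: 'None' } | { _tag: 'Some'; value: A }
const none: Option<never> = { _tag: 'None' }
const some = <A>(value: A): Option<A> => ({ _tag: 'Some', value })

const map = <A, B>(fa: Option<A>, f: (a: A) => B): Option<B> =>
  fa._tag === 'None' ? none : some(f(fa.value))

const reduce = <A, B>(fa: Option<A>, b: B, f: (b: B, a: A) => B): B =>
  fa._tag === 'None' ? b : f(b, fa.value)

// Old style: one record carrying every operation for the data type.
const option = { map, reduce }

// New style: one small record per type class.
const Functor = { map }
const Foldable = { reduce }

// A generic helper declares exactly which capability it needs.
const sumWith =
  (F: { reduce: typeof reduce }) =>
  (fa: Option<number>): number =>
    F.reduce(fa, 0, (b, a) => b + a)

// Both calls type-check and give the same result, but the first one
// drags `map` along because it is a member of the same record.
console.log(sumWith(option)(some(2)))   // 2
console.log(sumWith(Foldable)(some(2))) // 2
console.log(Functor.map(some(2), n => n + 1))
```

Passing `Foldable` rather than `option` is what gives a bundler a chance to drop `map` when nothing else references it.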
Now for your code.
The first thing I'll say is talk to the compiler. Your code should give you a compile error. What I'm seeing is this:
So you passed reduceRight too many arguments, so let's look at the signature:
export declare function reduceRight<F extends URIS, G extends URIS>(
F: Foldable1<F>,
G: Foldable1<G>
): <B, A>(b: B, f: (a: A, b: B) => B) => (fga: Kind<F, Kind<G, A>>) => B
The first thing to note is that this function is curried and takes three separate calls to evaluate fully. The first call takes the type class instances, the second takes the accumulator and the reducing function, and the third takes the data we are reducing.
So first it takes a Foldable instance for a type of kind Type -> Type, and another Foldable instance for another (or the same) type of kind Type -> Type. This is where the small vs big instance record comes into play. You'll pass SomeDataType.Foldable instead of SomeDataType.someDataType.
Then it takes polymorphic type B of kind Type as the initial value for the reduce (aka the "accumulator") and a binary function which takes polymorphic type A of kind Type and B and returns B. This is the typical signature of a reduceRight.
Then it takes a scary looking type which is making use of higher-kinded types. I would pronounce it as "F of G of A" or F<G<A>>. And finally it returns B, the reduced value.
Sounds complicated, but hopefully after this it won't seem so bad.
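As a rough picture of what the composed function does once F and G are fixed, here is a library-free sketch. It hard-codes "Array of Option" instead of taking instances, so only the second and third calls of the curried shape remain; the Option type, reduceRightOption and reduceRightArrayOption are hypothetical stand-ins, not the fp-ts implementation:

```typescript
// Minimal stand-ins, not the fp-ts definitions.
type Option<A> = { _tag: 'None' } | { _tag: 'Some'; value: A }
const none: Option<never> = { _tag: 'None' }
const some = <A>(value: A): Option<A> => ({ _tag: 'Some', value })

// reduceRight for a single Option: apply f once if there is a value.
const reduceRightOption = <A, B>(fa: Option<A>, b: B, f: (a: A, b: B) => B): B =>
  fa._tag === 'None' ? b : f(fa.value, b)

// The instance-taking call is omitted (Array and Option are hard-coded).
// First call here: the accumulator and the reducing function.
// Second call here: the nested data, "Array of Option of A".
const reduceRightArrayOption =
  <A, B>(b: B, f: (a: A, b: B) => B) =>
  (fga: ReadonlyArray<Option<A>>): B =>
    // Outer fold walks the array from the right; inner fold handles each Option.
    fga.reduceRight<B>((acc, ga) => reduceRightOption(ga, acc, f), b)

const concat = (x: string, y: string): string => x + y
const joined = reduceRightArrayOption('', concat)([some('a'), none, some('c')])
console.log(joined) // "ac"
```

The outer fold visits some('c') first, then none, then some('a'), which is why the result reads "ac" with each value placed in front of the accumulator.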
From looking at your code, it appears you want to reduce an Array<Option<string>> into a string. Array<Option<string>> is the higher-kinded type you want to specify. You just replace "F of G of A" with "Array of Option of string". So in the signature of reduceRight, F is the Foldable instance for Array and G is the Foldable instance for Option.
If we pass those instances, we'll get back a reduceRight function specialized for an array of options.
import * as A from 'fp-ts/Array'
import * as O from 'fp-ts/Option'
import { reduceRight } from 'fp-ts/Foldable'
const reduceRightArrayOption: <B, A>(
b: B,
f: (a: A, b: B) => B) => (fga: Array<O.Option<A>>) => B =
reduceRight(A.Foldable, O.Foldable)
Then we call this reduce with the initial accumulator and a reducing function that takes the value inside Array<Option<?>> which is string and the type of the accumulator, which is also string. In your initial code, you were using concat for string. That will work here, and you'll find it on the Monoid<string> instance in the string module.
import * as A from 'fp-ts/Array'
import * as O from 'fp-ts/Option'
import { reduceRight } from 'fp-ts/Foldable'
import * as string from 'fp-ts/string'
const reduceRightArrayOption: <B, A>(
b: B,
f: (a: A, b: B) => B) => (fga: Array<O.Option<A>>) => B
= reduceRight(A.Foldable, O.Foldable)
const reduceRightArrayOptionStringToString: (fga: Array<O.Option<string>>) => string
= reduceRightArrayOption("", string.Monoid.concat)
Finally, it's ready to take our Array<O.Option<string>>.
import * as assert from 'assert'
import * as A from 'fp-ts/Array'
import * as O from 'fp-ts/Option'
import { reduceRight } from 'fp-ts/Foldable'
import * as string from 'fp-ts/string'
const reduceRightArrayOption: <B, A>(
b: B,
f: (a: A, b: B) => B) => (fga: Array<O.Option<A>>) => B
= reduceRight(A.Foldable, O.Foldable)
const reduceRightArrayOptionStringToString: (fga: Array<O.Option<string>>) => string
= reduceRightArrayOption("", string.Monoid.concat)
const result = reduceRightArrayOptionStringToString([
O.some('a'),
O.none,
O.some('c'),
])
assert.strictEqual(result, "ac")
To simplify all of this, we can use the more idiomatic pipe approach to calling reduceRight:
import * as assert from "assert"
import { reduceRight } from "fp-ts/Foldable"
import * as string from "fp-ts/string"
import * as O from "fp-ts/Option"
import * as A from "fp-ts/Array"
import { pipe } from "fp-ts/lib/function"
assert.strictEqual(
pipe(
[O.some("a"), O.none, O.some("c")],
reduceRight(A.Foldable, O.Foldable)(string.empty, string.Monoid.concat)
),
"ac"
)
I know that was a lot, but hopefully it provides a little clarity about what's going on. reduceRight is very generic, in a way that almost no other TypeScript libraries attempt to be, so it's totally normal if it takes you a while to get your head around it. Higher-kinded types are not a built-in feature of TypeScript, and the way fp-ts does it is admittedly a bit of a hack to work around the limitations of TS. But keep playing around and experimenting. It'll all start to click eventually.

Is there any way to use flow to restrict specific string patterns?

I'm using Flow on a React webapp and I'm currently facing a use case where I ask the user to input certain time values in "HH:mm" format. Is there any way to describe the pattern that the strings follow?
I've been looking around for a solution, but the general consensus, which I agree with up to a point, seems to be that you don't need to handle this kind of thing with Flow; instead you use validating functions and rely on the UI code to supply values that follow the correct pattern. Still, I was wondering if there is any way to achieve this in order to make the code as descriptive as possible.

You want to create a validator function, but enhanced using Opaque Type Aliases: https://flow.org/en/docs/types/opaque-types/
Or, more specifically, Opaque Type Aliases with Subtyping Constraints: https://flow.org/en/docs/types/opaque-types/#toc-subtyping-constraints
You should write a validator function in the same file where you define the opaque type. It will accept the primitive type as an argument and return a value typed as the opaque type with subtyping constraint.
Now, in a different file, you can type some variables as the opaque type, for example in function arguments. Flow will enforce that you only pass values that go through your validator function, but these could be used just as if they were the primitive type.
Example:
exports.js:
export opaque type ID: string = string;
export function validateID(x: string): ID | void {
if ( /* some validity check passes */ ) {
return x;
}
return undefined;
}
import.js:
import type {ID} from './exports';
function formatID(x: ID): string {
return "ID: " + x; // Ok! IDs are strings.
}
function toID(x: string): ID {
return x; // Error: strings are not IDs.
}
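For the "HH:mm" case from the question, the same validator-plus-restricted-type idea might look as follows. This sketch is TypeScript rather than Flow, and it swaps Flow's opaque type for a "branded" string type, which is a related but different mechanism; TimeString, parseTime, describeTime and the regular expression are all assumptions made for illustration:

```typescript
// Hypothetical names throughout. A TypeScript brand stands in for Flow's
// opaque type: the brand exists only at the type level, never at runtime.
declare const timeBrand: unique symbol
type TimeString = string & { readonly [timeBrand]: true }

// "HH:mm" in 24-hour form: 00:00 through 23:59.
const HH_MM = /^([01]\d|2[0-3]):[0-5]\d$/

// The only place where a plain string is allowed to become a TimeString.
function parseTime(input: string): TimeString | undefined {
  return HH_MM.test(input) ? (input as TimeString) : undefined
}

// Consumers can demand a validated value but still use it as a string.
function describeTime(t: TimeString): string {
  return 'Time: ' + t
}

const ok = parseTime('09:30')
const bad = parseTime('24:00')
console.log(ok !== undefined ? describeTime(ok) : 'invalid') // "Time: 09:30"
console.log(bad) // undefined
```

Unlike a Flow opaque type, a brand can be bypassed with a cast anywhere in the codebase, so the guarantee here rests on the convention that only parseTime performs that cast.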

How to serialize/unserialize an Array of Custom object in Kotlin?

In my Kotlin Android project, I made a FileItem class which implements Serializable:
class FileItem(<parameters>) : Serializable, Comparable<FileItem> {
So I needed to Serialize instances of this class into a Bundle
val arguments:Bundle = Bundle()
arguments.putSerializable("folders", folders as Serializable)
where folders has been declared as :
folders:Array<FileItem> (method parameter)
The serialization code above compiles without any warning. However, the problem comes when I need to deserialize the folders items:
val arguments: Bundle? = getArguments()
if (arguments != null){
foldersItems = arguments.getSerializable("folders") as Array<FileItem>
where foldersItems is declared as
var foldersItems: Array<FileItem>?
I get the following warning, that I can't manage to solve without suppress_warning annotation :
w: <Path to my class>: (78, 28): Unchecked cast: java.io.Serializable! to kotlin.Array<com.loloof64.android.chess_positions_archiver.main_file_explorer.FileItem>
This kind of code compiles in Java/Groovy without warning (folderItems is then a FileItem[]), so how can I modify the kotlin code for the compiler to be "satisfied" ?
I noticed in the official Kotlin documentation that Kotlin's Array does not extend Serializable and is not open for inheritance. Is it possible, meanwhile, to "add" it via some kind of extension method?

In fact, the cast is not unchecked; the compiler's warning is misleading.
This happens because in Kotlin arrays are represented by the generic class Array<T>, and the compiler treats it as an ordinary generic class whose type parameters are erased at runtime.
But on the JVM arrays have reified element types, so when you cast something as Array<SomeType>, the generated bytecode really checks that the element type is SomeType in addition to checking that the value is an Array<*>. For any other generic class, only the latter check would happen.
This example shows that the array cast is checked:
val a: Any = Array<Int>(1) { 0 }
val i = a as Array<Int>
val d = a as Array<Double> // gets checked and throws ClassCastException
The easiest solution is indeed to use @Suppress("UNCHECKED_CAST"), because there should not actually be any warning here.
I filed an issue describing the problem in Kotlin issue tracker.

The cast here is unchecked because the compiler can't ensure the nullability of the array's generic type parameter.
Consider the following example:
fun castAsArrayOfString(param: Any) = param as Array<String>
castAsArrayOfString(arrayOf("a")) // is Array<String>, all ok
castAsArrayOfString(arrayOf("a", null)) // is Array<String>, but contains null
So the compiler warns you about potential type safety problems this cast could introduce.

How do I create a Flow with a different input and output types for use inside of a graph?

I am making a custom sink by building a graph on the inside. Here is a broad simplification of my code to demonstrate my question:
def mySink: Sink[Int, Unit] = Sink() { implicit builder =>
val entrance = builder.add(Flow[Int].buffer(500, OverflowStrategy.backpressure))
val toString = builder.add(Flow[Int, String, Unit].map(_.toString))
val printSink = builder.add(Sink.foreach(elem => println(elem)))
builder.addEdge(entrance.out, toString.in)
builder.addEdge(toString.out, printSink.in)
entrance.in
}
The problem I am having is this: it is valid to create a Flow whose input and output types are the same using a single type argument and no value argument, as in Flow[Int] (which is all over the documentation), but it is not valid to supply two type parameters and zero value parameters.
According to the reference documentation for the Flow object the apply method I am looking for is defined as
def apply[I, O]()(block: (Builder[Unit]) ⇒ (Inlet[I], Outlet[O])): Flow[I, O, Unit]
and says
Creates a Flow by passing a FlowGraph.Builder to the given create function.
The create function is expected to return a pair of Inlet and Outlet which correspond to the created Flows input and output ports.
It seems like I need to deal with another level of graph builders when I am trying to make what I think is a very simple flow. Is there an easier and more concise way to create a Flow that changes the type between its input and output and that doesn't require messing with its inside ports? If this is the right way to approach this problem, what would a solution look like?
BONUS: Why is it easy to make a Flow whose output type is the same as its input type?

If you want to specify both the input and the output type of a flow, you indeed need to use the apply method you found in the documentation. Using it, though, is done pretty much exactly the same as you already did.
Flow[String, Message]() { implicit b =>
import FlowGraph.Implicits._
val reverseString = b.add(Flow[String].map[String] { msg => msg.reverse })
val mapStringToMsg = b.add(Flow[String].map[Message]( x => TextMessage.Strict(x)))
// connect the graph
reverseString ~> mapStringToMsg
// expose ports
(reverseString.inlet, mapStringToMsg.outlet)
}
Instead of just returning the inlet, you return a tuple of the inlet and the outlet. This flow can now be used (for instance inside another builder, or directly with runWith) with a specific Source or Sink.

Why can't I create an array of generic type?

This does not work:
def giveArray[T](elem:T):Array[T] = {
new Array[T](1)
}
But this does:
def giveList[T](elem:T):List[T] = {
List.empty[T]
}
I am sure this is a pretty basic thing and I know that Arrays can behave a bit unusual in Scala.
Could someone explain to me how to create such an Array and also why it doesn't work in the first place?

This is due to JVM type erasure. Manifests were introduced to handle this; they cause type information to be attached to the type T. This will compile:
def giveArray[T: Manifest](elem:T):Array[T] = {
new Array[T](1)
}
There are near-duplicate questions on this. Let me see if I can dig them up.
See http://www.scala-lang.org/docu/files/collections-api/collections_38.html for more details. I quote (replace evenElems with elem in your case)
What's required here is that you help the compiler out by providing some runtime hint what the actual type parameter of evenElems is
In particular, you can also use ClassManifest (superseded by ClassTag from Scala 2.10 onward):
def giveArray[T: ClassManifest](elem:T):Array[T] = {
new Array[T](1)
}
Similar questions:
cannot find class manifest for element type T
What is a Manifest in Scala and when do you need it?
About Scala generics: cannot find class manifest for element type T
