flink MultipleLinearRegression fit takes 3 params - apache-flink

I followed the example at
https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/batch/libs/ml/multiple_linear_regression.html
but in the example the fit function only needs one parameter, while in my code fit requires three parameters:
mlr.fit(training, fitParameters, fitOperation);
I thought fitParameters might be an alternative to setIterations() and setStepsize(),
but what is fitOperation?

The fitOperation parameter is actually an implicit parameter which is filled in automatically by the Scala compiler. It encapsulates the MLR logic.
Since your fit function has 3 parameters, I suspect that you're using FlinkML with Flink's Java API. I would highly recommend using the Scala API, because otherwise you will have to construct the ML pipelines manually. If you still want to do it, then take a look at the FitOperations defined in the MultipleLinearRegression companion object.
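For comparison, here is a minimal Scala sketch along the lines of the linked documentation, where the implicit FitOperation is resolved by the compiler; the tiny training set below is invented for illustration:
import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector
import org.apache.flink.ml.regression.MultipleLinearRegression

val env = ExecutionEnvironment.getExecutionEnvironment

// Illustrative training data; in practice this comes from your data source.
val training: DataSet[LabeledVector] = env.fromCollection(Seq(
  LabeledVector(1.0, DenseVector(1.0, 2.0)),
  LabeledVector(2.0, DenseVector(2.0, 4.0))
))

val mlr = MultipleLinearRegression()
  .setIterations(10)
  .setStepsize(0.5)

// Only the DataSet is passed explicitly; fitParameters defaults to an empty
// ParameterMap and the FitOperation is supplied implicitly by the compiler.
mlr.fit(training)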

Related

How to design generic backward compatible API for embedded software application library interface in C?

I am tasked to assist with the design of a dynamic library (exposed through a C interface) intended to be used by embedded software applications on various embedded platforms (Android, Windows, Linux).
The main requirements are speed and decoupling.
For the decoupling part: one of our requirements is to facilitate integration and so permit backward compatibility and resilience.
My library has some entry points that should be called by the integrating software (like an initialize constructor to provide options such as where to log, how to behave, etc.) and could also call some callbacks in the application (an event to inform when a task is finished).
So I have come up with several propositions, but as none of them seems great, I am looking for advice on better or more standard ways to achieve decoupling and backward compatibility than the three approaches I have come up with.
The first option I could think of is a generic interface for my exposed entry points, for example a hashmap of key/value pairs for the parameters of my functions, so in pseudo code it gives something like:
myLib.Initialize(Key_Value_Option_Array_Here);
Another option is to provide generic functions for passing all the options to the library:
myLib.SetOption(Key_Of_Option, Value_OfOption);
myLib.SetCallBack(Key_Of_Callbak, FunctionPointer);
When presenting my options, a colleague asked me why not use a Google protobuf argument as the interface between the library and the embedded software, but that seems weird to me, as there will be a performance hit on each call for serialization and deserialization.
Are there any more efficient or standard ways that you could think of?
You could have a struct for optional arguments:
#include <stdint.h>

typedef struct {
    uint8_t optArg1;
    float   optArg2;
} MyLib_InitOptArgs_T;

void MyLib_Init(int16_t arg1, uint32_t arg2, MyLib_InitOptArgs_T const *optionalArgs);
Then you could use compound literals on function call:
MyLib_Init(1, 2, &(MyLib_InitOptArgs_T){ .optArg2=1.2f });
All non-specified values would have zero-ish value (0, NULL, NaN), and would be considered unused. Similarly, when passing NULL for struct pointer, all optional arguments would be considered unused.
The downside of this method is that if you expect to add many new arguments in the future, the structure could grow too big. But whether that is an issue depends on what your limits are.
Another option is to simply have multiple smaller initialization functions for initializing different subsystems. This could be combined with the optional-arguments system above.
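For illustration, a hedged sketch of what such per-subsystem initializers combined with optional-argument structs might look like; every name below is hypothetical:
/* Hypothetical sketch; assumes the MyLib_InitOptArgs_T struct shown above. */
#include <stdint.h>

typedef struct {
    const char *logPath;    /* NULL => keep the default log target */
    uint8_t     verbosity;  /* 0    => default verbosity           */
} MyLib_LogOptArgs_T;

typedef void (*MyLib_TaskDoneCb)(int taskId, void *userData);

/* Core must be initialized first; the other subsystems are optional. */
void MyLib_InitCore(uint32_t mode, MyLib_InitOptArgs_T const *optionalArgs);
void MyLib_InitLogging(MyLib_LogOptArgs_T const *optionalArgs);
void MyLib_SetTaskDoneCallback(MyLib_TaskDoneCb callback, void *userData);
New options then become new struct members or new small functions, so existing callers keep compiling and linking against newer library versions.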

Difference between Data_Wrap_Struct and TypedData_Wrap_Struct?

I'm wrapping a C struct in a Ruby C extension, but I can't find the difference between Data_Wrap_Struct and TypedData_Wrap_Struct in the docs. What's the difference between the two functions?
It's described pretty well in the official documentation.
The tl;dr is that Data_Wrap_Struct is deprecated and just lets you set the class and the mark/free functions for the wrapped data. TypedData_Wrap_Struct instead lets you set the class and then takes a pointer to a rb_data_type_struct structure that allows for more advanced options to be set for the wrapping:
the mark/free functions as before, but also
an internal label to identify the wrapped type
a function for calculating memory consumption
arbitrary data (basically letting you wrap data at a class level)
additional flags for garbage collection optimization
Check my unofficial documentation for a couple examples of how this is used.
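To make that concrete, here is a small hedged sketch of a TypedData-based wrapper; the my_thing names are invented for illustration:
#include <ruby.h>

typedef struct {
    int value;
} my_thing_t;

static void my_thing_free(void *ptr) {
    xfree(ptr);
}

static size_t my_thing_memsize(const void *ptr) {
    return sizeof(my_thing_t);                 /* reported memory consumption */
}

static const rb_data_type_t my_thing_type = {
    .wrap_struct_name = "my_thing",            /* internal label for the wrapped type */
    .function = {
        .dmark = NULL,                         /* nothing to mark: no embedded Ruby objects */
        .dfree = my_thing_free,
        .dsize = my_thing_memsize,
    },
    .data  = NULL,                             /* arbitrary class-level data */
    .flags = RUBY_TYPED_FREE_IMMEDIATELY,      /* GC optimization flag */
};

static VALUE my_thing_alloc(VALUE klass) {
    my_thing_t *thing;
    VALUE obj = TypedData_Make_Struct(klass, my_thing_t, &my_thing_type, thing);
    thing->value = 0;
    return obj;
}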

How to get attribute value from Xml using C

I have an XML file as given below:
<fmiModelDescription numberOfEventIndicators="0" variableNamingConvention="structured" generationDateAndTime="2015-06-22T14:46:19Z" generationTool="Dassault Systemes FMU Export from Simulink, ver. 2.1.1 (MATLAB 8.7 (R2014b) 08-Sep-2014)" version="1.4" author="Dan Henriksson" description="S-function with FMI generated from Simulink model BouncingBalls" guid="{76da271a-0d11-469c-bc24-0343629fb38e}" modelName="BouncingBalls_sf" fmiVersion="2.0">
  <CoSimulation canHandleVariableCommunicationStepSize="true" modelIdentifier="BouncingBalls_sf"/>
  <DefaultExperiment stepSize="0.001" stopTime="10.0" startTime="0.0"/>
  <ModelVariables>
I want to fetch an attribute value, e.g. the guid given in the above XML. How can I do that using C?
Well the only valid answer is: use a library!
Probably the best one (in terms of feature completeness) is libxml. Use this if there aren't any other concerns. There's good documentation, too.
If you need something small, there are a LOT of options, all with their limitations. I recently created badxml for this purpose. There are many other options, such as ezxml which I discovered just today in a question here.
But as I said, if size is not a concern, just use libxml, because it is widely used, well tested and feature-complete.
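As an illustration, a minimal hedged sketch using libxml2 to read the guid attribute of the root element; the file name is an assumption:
#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

int main(void) {
    xmlDocPtr doc = xmlReadFile("modelDescription.xml", NULL, 0);
    if (doc == NULL) {
        fprintf(stderr, "Failed to parse document\n");
        return 1;
    }

    xmlNodePtr root = xmlDocGetRootElement(doc);            /* <fmiModelDescription> */
    xmlChar *guid = xmlGetProp(root, (const xmlChar *)"guid");
    if (guid != NULL) {
        printf("guid = %s\n", guid);
        xmlFree(guid);
    }

    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}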

Passing array of bytes to an ActiveX

I want to pass an array of bytes to an ActiveX component. I am using Delphi 7 and an in-process server (DLL).
I am passing a pointer to the array of bytes together with the size of the array to the in-process server. It is working well. I did this because I need performance. Does anyone see any trouble with this approach?
I saw a post that is very similar, "What data type is suitable to handle binary data in ActiveX method?", but nobody gave this answer.
Passing the byte array as pointer together with size information is just fine.
However, some programming languages support only a small subset of all possible types. For example, Visual Basic for Applications (not VB.NET) can only handle Automation-compatible data types (see http://msdn.microsoft.com/en-us/library/cc237562(v=prot.20).aspx), and even then not all of them (no support for 16-bit unsigned integers, for example). To be on the safe side, I always use SAFEARRAYs whenever there is no good argument against it.
Also note that using non-automation compatible interfaces forces you to provide your own marshalling code in case you wanted to use your component OutProc. Since you mention that you intend to use your component only InProc, this should not worry you.
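For illustration (in C rather than Delphi, since the OLE Automation API is the same), a hedged sketch of packing a byte buffer into a SAFEARRAY of VT_UI1; the helper name is invented:
#include <windows.h>
#include <oleauto.h>
#include <string.h>

/* Copy a raw byte buffer into a SAFEARRAY(VT_UI1), the Automation-compatible
 * way to move binary data across a COM boundary. Returns NULL on failure. */
SAFEARRAY *BytesToSafeArray(const BYTE *data, ULONG size)
{
    SAFEARRAY *psa = SafeArrayCreateVector(VT_UI1, 0, size);
    if (psa == NULL)
        return NULL;

    void *dest = NULL;
    if (FAILED(SafeArrayAccessData(psa, &dest))) {
        SafeArrayDestroy(psa);
        return NULL;
    }
    memcpy(dest, data, size);
    SafeArrayUnaccessData(psa);
    return psa;
}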

Array vs ArraySeq comparison

This is a bit of a general question, but I was wondering if anybody could advise me on the advantages of working with Array vs ArraySeq. From what I have seen, Array is Scala's representation of a Java array and there are not too many members in its API, whereas ArraySeq seems to have a much richer API.
There are actually four different classes you could choose from to get mutable array-like functionality.
Array + ArrayOps
WrappedArray
ArraySeq
ArrayBuffer
Array is a plain old Java array. It is by far the best way to go for low-level access to arrays of primitives. There's no overhead. Also it can act like the Scala collections thanks to implicit conversion to ArrayOps, which grabs the underlying array, applies the appropriate method, and, if appropriate, returns a new array. But since ArrayOps is not specialized for primitives, it's slow (as slow as boxing/unboxing always is).
WrappedArray is a plain old Java array, but wrapped in all of Scala's collection goodies. The difference between it and ArrayOps is that WrappedArray returns another WrappedArray--so at least you don't have the overhead of having to re-ArrayOps your Java primitive array over and over again for each operation. It's good to use when you are doing a lot of interop with Java and you need to pass in plain old Java arrays, but on the Scala side you need to manipulate them conveniently.
ArraySeq stores its data in a plain old Java array, but it no longer stores arrays of primitives; everything is an array of objects. This means that primitives get boxed on the way in. That's actually convenient if you want to use the primitives many times; since you've got boxed copies stored, you only have to unbox them, not box and unbox them on every generic operation.
ArrayBuffer acts like an array, but you can add and remove elements from it. If you're going to go all the way to ArraySeq, why not have the added flexibility of changing length while you're at it?
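As a hedged illustration (pre-2.13 Scala collections, matching the descriptions above), here is roughly how the four classes behave:
import scala.collection.mutable.{ArrayBuffer, ArraySeq, WrappedArray}

val arr: Array[Int] = Array(1, 2, 3)        // a plain Java int[], no boxing
val doubled: Array[Int] = arr.map(_ * 2)    // ArrayOps kicks in and builds a new Array

val wrapped: WrappedArray[Int] = arr        // implicit wrapping; same backing int[]
val wrappedTwice = wrapped.map(_ * 2)       // the result is another WrappedArray

val seq = ArraySeq(1, 2, 3)                 // backed by an Array[AnyRef]; the Ints are boxed

val buf = ArrayBuffer(1, 2, 3)
buf += 4                                    // unlike the others, an ArrayBuffer can grow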
From the scala-lang.org forum:
Array[T]
Benefits: native, fast.
Limitations: few methods (only apply, update, length); you need to know T at compile time, because Java bytecode represents char[], int[] and Object[] differently.
ArraySeq[T] (the class formerly known as GenericArray[T])
Benefits: still backed by a native Array; you don't need to know anything about T at compile time (new ArraySeq[T] "just works", even if nothing is known about T); full suite of SeqLike methods; subtype of Seq[T].
Limitations: backed by an Array[AnyRef], regardless of what T is (if T is primitive, then elements will be boxed/unboxed on their way into or out of the backing Array).
ArraySeq[Any] is much faster than Array[Any] when handling primitives. In any code where you have an Array[T] with T not <: AnyRef, you'll get faster performance out of ArraySeq.
Array is a direct representation of Java's Array, and uses the exact same bytecode on the JVM.
The advantage of Array is that it's the only collection type on the JVM not to undergo type erasure. Arrays are also able to hold primitives directly without boxing, which can make them very fast under some circumstances.
Plus, you get Java's messed-up array covariance behaviour. (If you pass e.g. an Array[String] to some Java code, it can be assigned to a variable of type Array[Object], which will then throw an ArrayStoreException on trying to add anything that isn't a String.)
ArraySeq is rarely used nowadays, it's more of a historic artifact from older versions of Scala that treated arrays differently. Seeing as you have to deal with boxing anyway, you're almost certain to find that another collection type is a better fit for your requirements.
Otherwise... Arrays have exactly the same API as ArraySeq, thanks to an implicit conversion from Array to ArrayOps.
Unless you have a specific need for the unique properties of arrays, try to avoid them too.
See This Talk at around 19:30 or This Article for an idea of the sort of problems that Arrays can introduce.
After watching that video, it's interesting to note that Scala uses Seq for varargs :)
As you observed correctly, ArraySeq has a richer API as it is derived from IndexedSeq (and so on) whereas Array is a direct representation of Java arrays.
The relation between the two is roughly comparable to the relation between ArrayList and arrays in Java.
Due to its API, I would recommend using ArraySeq unless there is a specific reason not to. Using toArray, you can convert to an Array at any time.

Resources