Benchmark a real-time planning algorithm using Optaplanner - benchmarking

I'm trying to benchmark a real-time planning algorithm but can't find how to do it. Is this supported in OptaPlanner?
I've successfully run a benchmark using an offline version of my problem. I've implemented a SolutionFileIO that reads my problem instances and converts them to a solution. I've read the docs and watched the video on benchmarking but couldn't find what I'm looking for.
Alternatively, I can run the real-time algorithms using my own framework, but that would require me to manually define all the OptaPlanner heuristics that I want to run (which is quite cumbersome when using a matrix setup). Is there a way to instantiate the solvers (in Java) based on the benchmark XML definition? This would allow me to run my own real-time benchmark while still using the OptaPlanner benchmark definition.

A benchmark config that also fires ProblemFactChange events (= real-time planning) is not yet supported; vote for this jira. What would you like the benchmark config to look like?
To HACK reusing the solvers from a benchmark configuration, cast PlannerBenchmark to PlannerBenchmarkRunner and use getPlannerBenchmarkResult().getSolverBenchmarkResultList(), but that will give up a bunch of orchestration (including the report). Alternatively, if you can succeed in overriding SubSingleBenchmarkResult, you wouldn't lose that orchestration (but your hacks would be even deeper).
Whatever you end up doing, do share how you'd like the benchmark config to look, as this will give us inspiration when we implement it for a future OptaPlanner version.


NVIDIA Triton vs TorchServe for SageMaker Inference

NVIDIA Triton vs TorchServe for SageMaker inference? When to recommend each?
Both are modern, production-grade inference servers. TorchServe is the DLC default inference server for PyTorch models. Triton is also supported for PyTorch inference on SageMaker.
Does anyone have a good comparison matrix for both?
Important notes to add here on where the two serving stacks differ:
TorchServe does not provide the Instance Groups feature that Triton does (that is, stacking many copies of the same model, or even different models, onto the same GPU). This is a major advantage for both real-time and batch use cases, as the performance increase is almost proportional to the model replication count (i.e. 2 copies of the model get you almost twice the throughput and half the latency; check out a BERT benchmark of this here). It is hard to match a feature that is almost like having 2+ GPUs for the price of one.
If you are deploying PyTorch DL models, odds are you want to accelerate them with GPUs. TensorRT (TRT) is a compiler developed by NVIDIA that automatically quantizes and optimizes your model graph, which represents another huge speedup, depending on GPU architecture and model. It is probably the best way of automatically optimizing your model to run efficiently on GPUs and make good use of Tensor Cores. Triton has native integration to run TensorRT engines, as they're called (even automatically converting your model to a TRT engine via config file), while TorchServe does not (even though you can use TRT engines with it).
There is more parity between the two when it comes to other important serving features: both have dynamic batching support, you can define inference DAGs with both (not sure if the latter works with TorchServe on SageMaker without a big hassle), and both support custom code/handlers instead of just being able to serve a model's forward function.
Finally, MME on GPU (coming shortly) will be based on Triton, which is a valid argument for customers to get familiar with it so that they can quickly leverage this new feature for cost-optimization.
Bottom line: I think that Triton is just as easy (if not easier) to use, a lot more optimized/integrated for taking full advantage of the underlying hardware (and will be updated to stay that way as newer GPU architectures are released, enabling an easy move to them), and in general blows TorchServe out of the water performance-wise when its optimization features are used in combination.
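To make the Instance Groups point concrete, here is a minimal sketch that renders the `instance_group` stanza of a Triton `config.pbtxt`. The helper name `render_instance_group` and the choice of GPU 0 are illustrative assumptions, not part of Triton itself; only the stanza layout (`count`, `kind`, `gpus`) follows Triton's model configuration format.

```python
def render_instance_group(count, kind="KIND_GPU", gpus=None):
    """Render an instance_group stanza for a Triton config.pbtxt.

    Hypothetical helper for illustration: Triton only reads the text it
    produces, it does not ship this function.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    lines = [
        "instance_group [",
        "  {",
        "    count: %d" % count,   # number of model copies to load
        "    kind: %s" % kind,     # KIND_GPU or KIND_CPU
    ]
    if gpus:
        # Restrict the copies to specific GPU indices.
        lines.append("    gpus: [ %s ]" % ", ".join(str(g) for g in gpus))
    lines += ["  }", "]"]
    return "\n".join(lines)


# Two copies of the model on GPU 0 (assumed device index).
config_fragment = render_instance_group(2, gpus=[0])
print(config_fragment)
```

The printed fragment would be appended to the model's existing `config.pbtxt`. How much throughput the extra copy actually buys depends on the model and GPU, so it is worth measuring before settling on a count.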
Because I don't have enough reputation to reply in comments, I am writing this as an answer.
MME stands for multi-model endpoints. MME enables sharing GPU instances behind an endpoint across multiple models, and dynamically loads and unloads models based on the incoming traffic.
You can read more about it at this link

ECL - HPCC Testing Roxie query

I'm trying to write a Roxie query in the ECL language. Is there a way to write and test the code without constantly publishing the query?
I am assuming that you are only seeking to avoid the extra UI-oriented steps in publishing a query (switching back and forth between ECL Watch and your dev environment, for instance). You can make testing on roxie relatively painless with some build scripting and REST calls.
In HPCC, roxie and hthor are similar from an execution and runtime environment viewpoint. Their tactical execution strategies are different (roxie handles queries in OS threads, hthor handles them by forking child processes), but the rule of thumb is that if you can get code to work well in hthor then it will probably work well in roxie.
You can leverage that similarity during development. Rather than publishing the query to roxie, testing, tearing down the query, and repeating all that, you can simply submit the job to hthor (much like you would do for a thor job). You would have to hardcode some test values that would normally be parameters for the roxie query, but that is simple enough.
An additional bonus of using hthor is that it is the only engine that supports any kind of step-by-step debugging. That can be a hit-or-miss proposition, though, depending on the version of HPCC you are executing against, which you did not mention. Even if you don't use the debugger, hthor's execution graphs at least show details on a specific query's data flows, such as record counts at each step (roxie shows the graph, but there is no detailed information on individual queries).

What is a good lightweight ORM for my need using Kotlin?

Scenario:
I have an application that uses AWS Lambdas, written in Kotlin, to query data from a relational DB residing in AWS.
My problem is that I want to use an ORM for firing these queries. I don't want to use Hibernate as it is too heavy and takes too long to set up, and I need a solution that takes the least time to set up and fire from the Lambdas. I have looked at multiple ORMs like Exposed, Requery, jOOQ, Ktorm and Squash.
Does anybody have experience with any of these libraries in a serverless context? What are your experiences with them, and what would you suggest using in my scenario?
You can have a look at Exposed.
I have been using Squash with the Hikari connection pool for some large projects and I have been very happy with it. I like that it is very extensible, and my team has been able to solve any issues that came up by implementing extensions to the dialect. The simplicity of defining TableDefinition classes also makes it work well for generating code. It is very self-contained, with very few dependencies, and light on reflection, so it should be good for serverless, though I have not personally used it for that.
Squash is less an ORM than an SQL abstraction / translation layer that ties into entities, and it doesn't try to solve all the problems that something like Hibernate does. In my experience ORMs start as simple, efficient, and powerful projects and grow into heavyweight libraries that try to do too much, and their complexity begins to cause issues when the developer cannot easily see what's going on in the chain from usage through to the database / storage mechanism.
One negative about Squash that deserves mention is that, while it is an official JetBrains library created by a Kotlin developer, support is limited: orangy, the creator, is quite busy, and I have feature pull requests outstanding, with many more backed up currently. I chose it because I favored its simplicity and extensibility for a small but advanced team of developers, all capable of improving upon it.
Whichever library you choose, I hope these factors at least help you make your decision.

Flink - Building the operator graph

Good morning everybody,
I have already used Apache Storm to build topologies and I found that a good thing about the API they expose is the possibility to "manually" connect the operators in the graph topology.
You can create loops, for example.
I was wondering if there is a best practice to achieve the same "expressivity" in Flink.
Thank you so much!
Cyclic topologies are not supported in Flink. You can perform iterations through a specific operator. Except for cycles, you define your graph through the standard API and it's rather flexible compared to, for example, Spark. Many DataSet and DataStream API methods accept both functions and custom implementations of classes like RichMapFunction, RichFlatMapFunction and so on. This gives a huge degree of flexibility and customizability together with modularity and reusability. It takes some time to go beyond the standard API and learn how to customize your Flink jobs properly, but it's worth it.
Flink has an "easy mode" that resembles the API of Spark, in which you can do most of the stuff you need. When you want to express something outside the scope and use cases of the standard API, instead of doing weird workarounds like you have to in Spark, you can work directly with a layer that is partially below the standard API. There are many pieces that you can extend and customize and then plug in place of the provided operators/triggers/sources/sinks and so on. This is mostly documented feature by feature.

Stress testing with pycassa

I've been trying to write a stress tester for a rather large cassandra database. At first I was doing it from scratch, and then I found which allows you to stress test your cluster. However, like all benchmarks, the test data is unrepresentative of the loads this database will be seeing. Thus I decided to modify it to be more realistic to my usage pattern.
I'm using pycassa for most of this project. However uses the lower-level thrift interface directly, which I find rather cumbersome. Are there any projects out there which stress test cassandra using pycassa? Thanks!
I'm not aware of any existing general-purpose stress tests that make use of pycassa; I'd also love to hear about them if there are any.
In the past, I've modified to make use of pycassa. I believe I set it up to use one small ConnectionPool per process and I was pretty happy with the result; modifying the Operation class and get_client was the main chunk of work here.
It's hard to give more specific details about this without knowing what you want to do, so feel free to ask more detailed questions if you need to.
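As a rough illustration of the "one small ConnectionPool per process" setup described above, here is a minimal sketch. It is not the modified stress script from this answer: the function names (`make_pycassa_cf`, `run_worker`, `run_stress`), the 64-byte payload, and the key format are all assumptions made for the example. Only `pycassa.ConnectionPool`, `pycassa.ColumnFamily` and `ColumnFamily.insert` are pycassa calls, and pycassa is imported lazily so the rest of the code can be exercised without it.

```python
import time
from multiprocessing import Process, Queue


def make_pycassa_cf(keyspace, servers, cf_name, pool_size=2):
    """Open one small ConnectionPool and return a ColumnFamily handle."""
    import pycassa  # lazy import: only needed when hitting a real cluster
    pool = pycassa.ConnectionPool(keyspace, server_list=servers,
                                  pool_size=pool_size)
    return pycassa.ColumnFamily(pool, cf_name)


def run_worker(make_cf, worker_id, n_ops, results=None):
    """Insert n_ops rows using a column family created inside this process."""
    cf = make_cf()  # created here so each process owns its own pool
    start = time.time()
    for i in range(n_ops):
        key = "w%d-row%d" % (worker_id, i)
        # Placeholder row: swap in data shaped like your real workload.
        cf.insert(key, {"payload": "x" * 64})
    summary = (worker_id, n_ops, time.time() - start)
    if results is not None:
        results.put(summary)
    return summary


def run_stress(make_cf, n_workers=4, n_ops=1000):
    """Start n_workers processes and collect (worker_id, ops, seconds)."""
    results = Queue()
    procs = [Process(target=run_worker, args=(make_cf, w, n_ops, results))
             for w in range(n_workers)]
    for p in procs:
        p.start()
    summaries = [results.get() for _ in procs]
    for p in procs:
        p.join()
    return summaries
```

To run it against a cluster, something like `run_stress(functools.partial(make_pycassa_cf, "Keyspace1", ["localhost:9160"], "Standard1"))` should work, where the keyspace, server and column family names are placeholders. Passing a factory instead of a ready-made pool keeps connections from being shared across processes. pycassa is an older library, so check that it supports the Python version you run this on.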