I want to subscribe to a Kafka topic and ingest/transform the data in Snowflake. I would like to use the Snowpark API for Scala and the Streams & Tasks feature available in Snowflake.
I couldn't find anything in the API documentation or via Google. Is this option already available, and if not, is it planned on some roadmap?
I want to create a custom Apache Flink sink to the AWS SageMaker Feature Store, but there is no documentation for how to create custom sinks on Flink's website. There are also multiple base classes that I could potentially extend (e.g. AsyncSinkBase, RichSinkFunction), so I'm not sure which to use.
I am looking for guidelines on how to implement a custom sink (both in general and for my specific use case). For my specific use case: SageMaker Feature Store has a synchronous client with a putRecord call to send records to AWS SageMaker FS, so I am ideally looking for a way to create a custom sink that would work well with this client. Note: at-least-once processing guarantees are sufficient for me, as SageMaker FS is DynamoDB (a key-value store) under the hood, so duplicate writes simply overwrite the same record.
Java Client: https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/sagemakerfeaturestoreruntime/AmazonSageMakerFeatureStoreRuntime.html
Example of the putRecord call using the Python client: https://github.com/aws-samples/amazon-sagemaker-feature-store-streaming-aggregation/blob/main/src/lambda/StreamingIngestAggFeatures/lambda_function.py#L31
What I've Found so Far
- Some older articles which say to use org.apache.flink.streaming.api.functions.sink.RichSinkFunction and SinkFunction
- Some connectors using classes in org.apache.flink.connector.base.sink.writer (e.g. AsyncSinkWriter, AsyncSinkBase)
- This section of the Flink docs says to use the SourceReaderBase from org.apache.flink.connector.base.source.reader when creating custom sources; SourceReaderBase seems to be the source-side equivalent of the sink classes in the bullet above
Any help/guidance/insights are much appreciated, thanks.
How about extending RichAsyncFunction?
You can find a similar example here: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/asyncio/#async-io-api
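A rough, untested sketch of that idea, assuming the AWS SDK v1 client from the javadoc link in the question: the synchronous Feature Store client runs on a small thread pool inside a RichAsyncFunction. The input type (a map of feature name to value) and the feature-group name are placeholders; swap in your own event type and client configuration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

import com.amazonaws.services.sagemakerfeaturestoreruntime.AmazonSageMakerFeatureStoreRuntime;
import com.amazonaws.services.sagemakerfeaturestoreruntime.AmazonSageMakerFeatureStoreRuntimeClientBuilder;
import com.amazonaws.services.sagemakerfeaturestoreruntime.model.FeatureValue;
import com.amazonaws.services.sagemakerfeaturestoreruntime.model.PutRecordRequest;

// Each input element is a map of feature name -> value; adapt to your own event type.
public class FeatureStorePutRecordFunction extends RichAsyncFunction<Map<String, String>, Void> {

    private transient AmazonSageMakerFeatureStoreRuntime client;
    private transient ExecutorService executor;

    @Override
    public void open(Configuration parameters) {
        client = AmazonSageMakerFeatureStoreRuntimeClientBuilder.defaultClient();
        // The SageMaker FS client is synchronous, so run the blocking calls on a pool.
        executor = Executors.newFixedThreadPool(10);
    }

    @Override
    public void asyncInvoke(Map<String, String> record, ResultFuture<Void> resultFuture) {
        CompletableFuture.runAsync(() -> {
            PutRecordRequest request = new PutRecordRequest()
                    .withFeatureGroupName("my-feature-group") // placeholder
                    .withRecord(record.entrySet().stream()
                            .map(e -> new FeatureValue()
                                    .withFeatureName(e.getKey())
                                    .withValueAsString(e.getValue()))
                            .collect(Collectors.toList()));
            client.putRecord(request);
        }, executor).whenComplete((ignored, error) -> {
            if (error != null) {
                // Failing the future makes Flink fail and restart from the last
                // checkpoint, which re-sends in-flight records: at-least-once.
                resultFuture.completeExceptionally(error);
            } else {
                resultFuture.complete(Collections.emptyList());
            }
        });
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}
```

You would then attach it with AsyncDataStream.unorderedWait(stream, new FeatureStorePutRecordFunction(), 30, TimeUnit.SECONDS, 100). With checkpointing enabled, Flink's async I/O operator replays in-flight requests on recovery, which gives you at-least-once; since putRecord upserts by record identifier, replays simply overwrite the same keys.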
I am trying to find a true open-source alternative to our SAP PI ESB, which is going to be discontinued in our company.
We are using it for integrating several systems. We have around 100 interfaces: SOAP => SOAP (both synchronous and asynchronous), file (FTP) => SOAP, and SOAP => file.
I have read many blogs and watched quite a few videos about open-source ESBs, but I think I still miss the whole picture. From what I understood, I can use Apache Camel as the core integration engine with appropriate adapters for delivering and transforming the messages, then JBoss Fuse (or perhaps OpenShift) to orchestrate the individual Camel "applications".
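To make it concrete, here is roughly how I picture one of our FTP => SOAP interfaces as a Camel route (all endpoint URIs and file names are invented, just for illustration):

```java
import org.apache.camel.builder.RouteBuilder;

// One FTP => SOAP interface, sketched as a Camel route.
public class FtpToSoapRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("ftp://user@ftp-host/orders?password=secret&delete=true") // pick up incoming files
            .routeId("ftp-to-soap-orders")
            .to("xslt:transform/order-to-soap.xsl")   // transform the payload
            .to("cxf:bean:orderServiceEndpoint");     // deliver to the target SOAP service
    }
}
```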
I am still missing the last piece of the puzzle. I was not able to find any ready-to-use solution for message monitoring. If I search for monitoring, I always find monitoring of the system (processor load, memory, number of processes, number of messages...).
But I am looking for message monitoring where I can filter the messages by date/time range, interface name, sender/receiver, and status (completed, error, in queue...), and look into the message payload (XML or file) before and after the transformation. Full-text search on message content and email notifications on errors would also be great.
Can anyone give me a hint on what to look for?
Does Apache Flink's Python SDK (PyFlink) DataStream API support operators like windowing? All the examples I have seen so far for windowing with PyFlink use the Table API. The Java/Scala DataStream API does support these operators, but it looks like they are not available via PyFlink yet?
Thanks!
That's correct, PyFlink doesn't yet support the DataStream window API. Follow FLINK-21842 to track progress on this issue.
My application uses Hystrix as a circuit breaker. I want to export Hystrix metrics data to InfluxDB (or another storage service). I didn't find any documents talking about how to read this data.
Thanks!
I found this blog post very useful on the subject: http://www.nurkiewicz.com/2015/02/storing-months-of-historical-metrics.html
It talks about exporting data to Graphite, but I am sure the approach can be extended to InfluxDB as well.
If you want to write the Hystrix metrics data out yourself, you can read it straight from Hystrix's in-memory metrics registry; a sketch is below.
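A minimal sketch of that polling approach: it reads the per-command health counts via HystrixCommandMetrics.getInstances() and pushes them to InfluxDB 1.x over its HTTP line-protocol endpoint. The endpoint URL and database name below are assumptions; adjust to your setup.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import com.netflix.hystrix.HystrixCommandMetrics;

// Polls Hystrix's in-memory metrics registry once per second and pushes the
// per-command health counts to InfluxDB 1.x via the HTTP line-protocol API.
public class HystrixInfluxExporter {

    public static void start() {
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            StringBuilder lines = new StringBuilder();
            for (HystrixCommandMetrics m : HystrixCommandMetrics.getInstances()) {
                HystrixCommandMetrics.HealthCounts health = m.getHealthCounts();
                lines.append("hystrix,command=").append(m.getCommandKey().name())
                     .append(" requests=").append(health.getTotalRequests()).append('i')
                     .append(",errors=").append(health.getErrorCount()).append('i')
                     .append(",mean_ms=").append(m.getExecutionTimeMean()).append("i\n");
            }
            if (lines.length() == 0) {
                return; // no commands have executed yet
            }
            try {
                // InfluxDB 1.x write endpoint; URL and database name are assumptions.
                HttpURLConnection conn = (HttpURLConnection)
                        new URL("http://localhost:8086/write?db=metrics").openConnection();
                conn.setRequestMethod("POST");
                conn.setDoOutput(true);
                try (OutputStream out = conn.getOutputStream()) {
                    out.write(lines.toString().getBytes(StandardCharsets.UTF_8));
                }
                conn.getResponseCode(); // force the request out and drain the response
            } catch (Exception e) {
                e.printStackTrace(); // a real exporter should log and back off
            }
        }, 1, 1, TimeUnit.SECONDS);
    }
}
```

If you prefer push over polling, Hystrix also has a metrics-publisher plugin point (HystrixPlugins.getInstance().registerMetricsPublisher(...)) that you can implement to forward metrics as they are produced.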
I am trying to write a Camel route to get JMX data from an ActiveMQ server through the Jolokia REST API. I was able to successfully get the JSON object from the ActiveMQ server, but I am running into an issue where I cannot figure out how to parse the JSON object in my Camel route. Camel is integrated with Jackson, Gson, and XStream, but each of those appears to require an extra library that I do not have. Camel also has support for JSONPath, but it requires yet another library that I do not have. All of my research so far seems to point to using a new software library, so I am looking for someone who knows a solution that might save me from trying several more dead ends.
The big catch is that I am trying to parse JSON with something that already comes with Java/Camel/Spring/ActiveMQ/Apache Commons. I would prefer a solution that only uses Camel/Spring XML, but a solution using Java would also work (maybe JXPath with Apache Commons?).
The reason I am trying to stick to the libraries I currently have is the long process our company has for getting new software libraries approved. I can wait several months to get a library approved, or I can write my own specialized parser, but I am hoping there is some other way to extract the information I need from the JSON object that I am getting from the Jolokia JMX REST API in ActiveMQ.
There is no JSON library out of the box in Java itself, but there is an RFE to maybe add one in a future Java release, maybe Java 9.
So if you want to parse JSON, you need to use a 3rd-party library, and you had better get your company to approve one (a sketch of what that could look like with camel-jackson is at the end of this answer).
camel-core 2.15.x ships with a JSON schema parser that we use to parse the component docs JSON schemas. It is not a general-purpose JSON parser, but it can parse simple schemas.
It's at org.apache.camel.util.JsonSchemaHelper
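For illustration only: once a JSON library such as camel-jackson is approved, the parsing step becomes a one-liner in the route. The Jolokia URL below is a placeholder; adjust it to your broker.

```java
import java.util.Map;

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.model.dataformat.JsonLibrary;

// Polls the Jolokia REST API and unmarshals the JSON response into a Map.
public class JolokiaRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("timer:jmx-poll?period=60000")
            .to("http://activemq-host:8161/api/jolokia/read/"
                + "org.apache.activemq:type=Broker,brokerName=localhost")
            .unmarshal().json(JsonLibrary.Jackson, Map.class)
            // The Jolokia response is now a java.util.Map; the "value" entry
            // holds the requested MBean attributes.
            .log("broker attributes: ${body[value]}");
    }
}
```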