Datastream cannot read UPDATE binary log events in Google Cloud Datastream - google-datastream

I use Datastream to sync data from Oracle MySQL to BigQuery.
With MySQL version 8.0.28 it works. With MySQL version 8.0.30 it also runs, but I only see rows whose change_type is INSERT; no UPDATE-INSERT events appear.
Is there any case in which Datastream cannot read UPDATE events from the binary log? Thanks for your help.
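For context, Datastream requires row-based binary logging, so the source instance's binlog settings are worth inspecting. A minimal diagnostic sketch, assuming mysql-connector-python; the host and credentials are placeholders:

```python
# Inspect the binlog settings that CDC via Datastream depends on.
import mysql.connector

conn = mysql.connector.connect(
    host="your-mysql-host",    # placeholder
    user="your-user",          # placeholder
    password="your-password",  # placeholder
)
cur = conn.cursor()
cur.execute(
    "SHOW VARIABLES WHERE Variable_name IN "
    "('binlog_format', 'binlog_row_image', 'binlog_row_value_options')"
)
for name, value in cur:
    print(name, "=", value)
# Datastream needs binlog_format = ROW, and binlog_row_image should be FULL.
# A non-empty binlog_row_value_options (e.g. PARTIAL_JSON) produces partial
# UPDATE events that some CDC readers cannot consume.
conn.close()
```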

Related

Is there any way to stream data from Snowflake to Oracle other than Informatica

I am working on a requirement where I need to stream data from Snowflake to Oracle for some value-added processing.
The only methods I know of are unloading files to S3 and then loading them into Oracle, or using Informatica.
Both approaches require some effort, so is there a simpler way of streaming data from Snowflake to Oracle?
Snowflake cannot connect directly to Oracle. You'll need some tooling or code "in between" the two.
I came across this data migration tool the other day, and it appears to support both Snowflake and Oracle: https://github.com/markddrake/YADAMU---Yet-Another-DAta-Migration-Utility/releases/tag/v1.0
-Paul-
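For the "code in between" option, here is a minimal batch-copy sketch, assuming the snowflake-connector-python and python-oracledb packages; all connection values and the source/target table names are hypothetical:

```python
import snowflake.connector
import oracledb

# Read side: Snowflake (all connection values are placeholders).
src = snowflake.connector.connect(
    account="your_account", user="your_user", password="your_password",
    warehouse="your_wh", database="your_db", schema="your_schema",
)
# Write side: Oracle.
dst = oracledb.connect(user="ora_user", password="ora_password",
                       dsn="ora-host/orclpdb1")

read = src.cursor()
write = dst.cursor()
read.execute("SELECT id, payload FROM source_table")  # hypothetical table

# Copy in batches to keep memory bounded.
while True:
    rows = read.fetchmany(1000)
    if not rows:
        break
    write.executemany(
        "INSERT INTO target_table (id, payload) VALUES (:1, :2)", rows)
    dst.commit()

src.close()
dst.close()
```

For continuous streaming rather than a one-off copy, you would run something like this on a schedule and filter the SELECT by a watermark column (e.g. a last-modified timestamp).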

Dump BigQuery select statement results into Google Cloud SQL database table

How can I dump BigQuery SELECT statement results into a Google Cloud SQL database table? The only way I am aware of is dumping the results to Google Cloud Storage and having Cloud SQL read from there.
Is there a better way to implement this? I want this to happen every day.
You can create a cron job that will use the BigQuery API to query the data and the MySQL API to post the data.
You can use Cloud Dataflow with a BigQuery query as input, but you will need to write a custom sink (Java, Python), or find one, that dumps the results to MySQL.
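A minimal sketch of the cron-job approach, assuming google-cloud-bigquery and PyMySQL; the project, dataset, table, and connection details are hypothetical:

```python
# Query BigQuery and push the rows into a Cloud SQL (MySQL) table.
from google.cloud import bigquery
import pymysql

bq = bigquery.Client()
rows = bq.query(
    "SELECT id, name FROM `my_project.my_dataset.my_table`"  # hypothetical
).result()

conn = pymysql.connect(host="cloudsql-host", user="app",
                       password="app-password", database="appdb")
with conn.cursor() as cur:
    # REPLACE keeps the job idempotent when it runs every day.
    cur.executemany(
        "REPLACE INTO my_table (id, name) VALUES (%s, %s)",
        [(r["id"], r["name"]) for r in rows],
    )
conn.commit()
conn.close()
```

Scheduling this daily is then just a cron entry, or a Cloud Scheduler job that triggers the script.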

Using Kylin without HDFS and HBase

Is it possible to run Apache Kylin without other storage backends like HBase (plus HDFS)? In other words, can you store the raw data and the cube metadata somewhere else?
I think you could use Apache Hive with managed native tables
(Hive storage handlers).
Hive can connect to MySQL over an ODBC driver, for example.
To use Kylin, HDFS is mandatory. Both raw data and cube data are stored in HDFS.
If you want to support another NoSQL datastore such as Cassandra, you can consider another framework, FiloDB.

How to explore HBase data

I am currently building an app that loads data into HBase. I chose HBase because the data is not structured, so a column-oriented database is recommended.
Once the data is in HBase I thought of integrating Solr with it, but I found little information on the subject and no answer to my question https://stackoverflow.com/questions/36542936/integrating-solr-to-hbase
So I wanted to ask: how can I query data stored in HBase? Spark Streaming doesn't seem to be made for that.
Any help please?
Thanks in advance
Assuming that your question is about how to query data from HBase:
Apache Phoenix provides a SQL wrapper over HBase.
Hive HBase integration: Hive also provides a SQL wrapper over HBase.
The Spark HBase plugin lets your Apache Spark application interact with Apache HBase.
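If you'd rather skip the SQL layer, here is a minimal sketch of querying HBase directly over Thrift with happybase; the host, table, and column family names are hypothetical, and the HBase Thrift server must be running:

```python
import happybase

connection = happybase.Connection("hbase-thrift-host")  # placeholder host
table = connection.table("events")                      # hypothetical table

# Point lookup by row key.
row = table.row(b"row-key-1")
print(row.get(b"cf:payload"))

# Scan over a row-key range.
for key, data in table.scan(row_start=b"2016-01", row_stop=b"2016-02"):
    print(key, data[b"cf:payload"])

connection.close()
```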

How to load data from Oracle and SQL Server to HAWQ using Spring XD

Hi, I have tables in Oracle and SQL Server. I need to load data from Oracle and SQL Server into Pivotal HAWQ using Spring XD. I couldn't find this in the documentation.
You need to integrate Sqoop jobs with Spring XD. See the link below for Sqoop jobs with Spring XD:
https://github.com/tzolov/spring-xd-sqoop-job
You can use a jdbchdfs job to load the data into HDFS as CSV or any PXF-supported format. Then you can map the loaded data to HAWQ tables using PXF external table support. If you need to load this data into native HAWQ tables, you can do an INSERT ... SELECT from there, or configure the INSERT ... SELECT as another batch job that loads the data from the PXF external table into native HAWQ.
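A minimal sketch of that external-table step, assuming psycopg2 (HAWQ speaks the PostgreSQL wire protocol); the PXF URL, profile, and table definitions are hypothetical and depend on your HAWQ/PXF version:

```python
import psycopg2

conn = psycopg2.connect(host="hawq-master", dbname="analytics",
                        user="gpadmin", password="gpadmin-password")
cur = conn.cursor()

# Map the files loaded into HDFS by the jdbchdfs job to an external table.
cur.execute("""
    CREATE EXTERNAL TABLE ext_orders (id int, amount numeric)
    LOCATION ('pxf://hawq-master:51200/data/orders?PROFILE=HdfsTextSimple')
    FORMAT 'TEXT' (DELIMITER ',')
""")
# Copy from the external table into a native HAWQ table.
cur.execute("INSERT INTO orders SELECT * FROM ext_orders")
conn.commit()
conn.close()
```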
Outsourcer is another open source solution that was originally designed to load data from Oracle and SQL Server into Greenplum but was enhanced some time ago to also support HAWQ.
All of the documentation and downloads are on http://www.pivotalguru.com/
And if you are interested in seeing the source code, here it is: https://github.com/pivotalguru/outsourcer
