SF KAFKA CONNECTOR Detail: Table doesn't have a compatible schema (snowflake-kafka-connector, snowflake-cloud-data-platform)

I have set up the Snowflake Kafka connector. I set up a sample table (kafka_connector_test) in Snowflake with two fields, both of type VARCHAR: CUSTOMER_ID and PURCHASE_ID.
Here is the configuration I created for the connector:
curl -X POST \
  -H "Content-Type: application/json" \
  --data '{
    "name": "kafka_connector_test",
    "config": {
      "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
      "tasks.max": "2",
      "topics": "kafka-connector-test",
      "snowflake.topic2table.map": "kafka-connector-test:kafka_connector_test",
      "buffer.count.records": "10000",
      "buffer.flush.time": "60",
      "buffer.size.bytes": "5000000",
      "snowflake.url.name": "XXXXXXXX.snowflakecomputing.com:443",
      "snowflake.user.name": "XXXXXXXX",
      "snowflake.private.key": "XXXXXXXX",
      "snowflake.database.name": "XXXXXXXX",
      "snowflake.schema.name": "XXXXXXXX",
      "key.converter": "org.apache.kafka.connect.storage.StringConverter",
      "value.converter": "com.snowflake.kafka.connector.records.SnowflakeJsonConverter"
    }
  }'
I send data to the topic that I have configured in the connector configuration.
{"CUSTOMER_ID" : "test_id", "PURCHASE_ID" : "purchase_id_test"}
Then, when I check the Kafka Connect server, I get the error below:
[SF KAFKA CONNECTOR] Detail: Table doesn't have a compatible schema
Is there something I need to set up in either Kafka Connect or Snowflake that says which parts of the JSON go into which columns of the table? I'm not sure how to specify how it parses the JSON.
I also set up a different topic and did not create a table in Snowflake. In that case the connector was able to populate a table, but it creates the table with two columns, RECORD_METADATA and RECORD_CONTENT. I don't want to write a scheduled job to parse this; I want to insert directly into a queryable table.

The Snowflake Kafka connector writes data as JSON by design. The default columns RECORD_METADATA and RECORD_CONTENT are of type VARIANT. If you want to query them, you can create a view on top of the table to achieve your goal; you don't need a scheduled job.
So the table created by the connector would look something like this:

RECORD_METADATA              RECORD_CONTENT
{metadata fields in json}    {"CUSTOMER_ID" : "test_id", "PURCHASE_ID" : "purchase_id_test"}
You can create a view to display your data (replace kafka_connector_test with the name of the table the connector created):
create view v1 as
select RECORD_CONTENT:CUSTOMER_ID::text as CUSTOMER_ID,
       RECORD_CONTENT:PURCHASE_ID::text as PURCHASE_ID
from kafka_connector_test;
Your query will then be:
select CUSTOMER_ID, PURCHASE_ID from v1;
PS: If you want to create your own tables for the connector to write into, you need to use VARIANT as the data type instead of VARCHAR.
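For example, a minimal sketch of a table definition the connector should accept, assuming the table name from the snowflake.topic2table.map setting above:

-- both columns must be VARIANT for the connector to treat the table as compatible
-- (the table name here is assumed from the topic2table mapping in the question)
create or replace table kafka_connector_test (
    RECORD_METADATA variant,
    RECORD_CONTENT  variant
);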

Also, it looks like this is not supported at this time, in reference to this GitHub issue.

Related

Converting an array into individual rows in Snowflake

I have a Kafka topic which receives an array with multiple objects in it, as shown below.
[{"Id":2318805,"Booster Station":"Comanche County #1","TimeStamp":"2021-09-30T23:53:43.019","Total Throughput":2167.52856445125},{"Id":2318805,"Booster Station":"Comanche County #2","TimeStamp":"2020-09-30T23:53:43.019","Total Throughput":217.52856445125}]
When I load this into Snowflake, it becomes one huge row with all the objects. I would like to store each object as an individual row in Snowflake. How can I achieve this? I am open to tweaking things at the Kafka level or in the connector.
My Kafka is AWS MSK, and I am using the Snowflake connector plugin to load data into Snowflake.
You can use Snowflake's flatten table function to flatten the arrays to individual rows:
create or replace temp table T1 as
select parse_json($$[{"Id":2318805,"Booster Station":"Comanche County #1","TimeStamp":"2021-09-30T23:53:43.019","Total Throughput":2167.52856445125},
{"Id":2318805,"Booster Station":"Comanche County #2","TimeStamp":"2020-09-30T23:53:43.019","Total Throughput":217.52856445125}]$$) as JSON;

select VALUE from T1, table(flatten(JSON));
This assumes that the Kafka messages are stored as a variant type. If they are strings, you can use the parse_json function to convert them to variant.
From there, you can convert the individual objects to columns if you want:
select VALUE:"Booster Station"::string as BOOSTER_STATION
,VALUE:Id::int as ID
,VALUE:TimeStamp::timestamp as TIME_STAMP
,VALUE:"Total Throughput"::float as TOTAL_THROUGHPUT
from T1, table(flatten(JSON));
BOOSTER_STATION       ID        TIME_STAMP                     TOTAL_THROUGHPUT
Comanche County #1    2318805   2021-09-30 23:53:43.019000000  2167.528564451
Comanche County #2    2318805   2020-09-30 23:53:43.019000000  217.528564451
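Since the connector lands each Kafka message in the VARIANT column RECORD_CONTENT, the same pattern can be applied directly to the landing table; a sketch, where the table name is an assumption:

-- kafka_landing_table is a hypothetical name for the table the connector created;
-- RECORD_CONTENT is the VARIANT column holding the message payload (the array)
select f.VALUE:"Booster Station"::string as BOOSTER_STATION,
       f.VALUE:Id::int                   as ID,
       f.VALUE:TimeStamp::timestamp      as TIME_STAMP,
       f.VALUE:"Total Throughput"::float as TOTAL_THROUGHPUT
from kafka_landing_table t,
     table(flatten(t.RECORD_CONTENT)) f;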

Relation IDs mismatch - Mapping OWL to Oracle DB with Ontop

As part of my little app I am trying to map data between my ontology and an Oracle DB with Ontop. But my first mapping is not accepted by the reasoner, and it's not clear why.
As my first attempt I use the following target:
:KIS/P_PVPAT_PATIENT/{PPVPAT_PATNR} a :Patient .
and the following source:
select * from P_PVPAT_PATIENT
Here KIS is the schema, P_PVPAT_PATIENT the table, and PPVPAT_PATNR the key.
Caused by: it.unibz.inf.ontop.exception.InvalidMappingSourceQueriesException:
Error: Relation IDs mismatch: P_PVPAT_PATIENT v "KIS"."P_PVPAT_PATIENT" P_PVPAT_PATIENT
Problem location: source query of triplesMap
[id: MAP_PATIENT
target atoms: triple(s,p,o) with
s/RDF(http://www.semanticweb.org/grossmann/ontologies/kis-ontology#KIS/P_PVPAT_PATIENT/{}(TmpToVARCHAR2(PPVPAT_PATNR)),IRI), p/<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, o/<http://www.semanticweb.org/grossmann/ontologies/kis-ontology#Patient>
source query: select * from P_PVPAT_PATIENT]
As the error says, my source query was wrong because I forgot to use the schema in my SQL.
The correct SQL is:
select * from kis.P_PVPAT_PATIENT

Flink SQL : UDTF passes Row type parameters

CREATE TABLE user_log (
data ROW(id String,user_id String,class_id String)
) WITH (
'connector.type' = 'kafka',
...
);
INSERT INTO sink
SELECT * FROM user_log as tab,
LATERAL TABLE(splitUdtf(tab.data)) AS T(a,b,c);
UDTF Code:
public void eval(Row data) {...}
Can the eval method only accept Row type parameters? I want to get the keys of the Row in SQL, such as id, user_id, class_id, but the keys of a Row in Java are indexes (such as 0, 1, 2). How do I do it? Thank you!
Is your SQL able to directly convert the Kafka data to a table Row? Maybe not.
Row is a type at the DataStream level, not a type in the Table API & SQL.
If the data you receive from Kafka is in JSON format, you can use a DDL statement in Flink SQL, or the connector API, to extract the fields from the JSON directly, as long as your JSON is in key-value format.
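For example, a minimal sketch of such a DDL that exposes the JSON keys as top-level columns, so no UDTF is needed to look them up by name (the connector and format options here are assumptions for a recent Flink version, and the topic and bootstrap servers are placeholders, not taken from the question):

-- hypothetical DDL: the JSON keys id, user_id and class_id become plain columns
CREATE TABLE user_log (
    id STRING,
    user_id STRING,
    class_id STRING
) WITH (
    'connector' = 'kafka',
    'topic' = 'user_log',
    'properties.bootstrap.servers' = 'localhost:9092',
    'format' = 'json'
);

-- the fields can then be selected by name directly, without a UDTF
SELECT id, user_id, class_id FROM user_log;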

QueryException (42S02)

Illuminate \ Database \ QueryException (42S02)
SQLSTATE[42S02]: Base table or view not found: 1146 Table 'influencingquotes.posts' doesn't exist (SQL: select count(*) as aggregate from posts where quote_title = gtav hyhui)
I am not sure why this Database\QueryException is occurring :(
The error says that the posts table is missing; maybe you forgot to run:
php artisan migrate
1. If you can't find the posts table in the database (a quick SQL check for this is sketched after this list), you need to look for its migration file, which for Laravel 5 is located in project_root_dir\database\migrations, and then run, via the command line:
php artisan migrate
2. If the table does exist in the database but the name is wrong, for example the name is post, you need to specify the table name in the Post model:
protected $table = 'post';
If that doesn't help, you need to tell us:
1. what you see in the database, i.e. which tables or views exist there;
2. what files you see in the project_root_dir\database\migrations directory.
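Regarding point 1, a quick way to check whether the table exists at all, using the database name taken from the error message:

-- lists a table named posts in the influencingquotes database, if it exists
SHOW TABLES FROM influencingquotes LIKE 'posts';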

How do you delete a column's data from a PostgreSQL database using DataMapper and Sinatra?

I have a table with three columns (id, name, age). I would like to keep the name and id the same, but remove all age data and be able to reassign ages.
I.e., I want to clear the data from one column only, not delete the entire column.
I am using Sinatra, DataMapper, and PostgreSQL.
Programmatically you could do something like this:
@myvariable = MyModel.all
@myvariable.map { |m|
  m.update(:age => nil)
}
More info on Datamapper can be found here: http://datamapper.org/docs/
Or if you want to do it by hand, if your database is on Heroku, you can connect to the database instance on Heroku like so:
heroku pg:psql
Then you'll be connected to the database and you can just type out the SQL like Gus suggested:
update table_name set age = NULL
If your app is not on Heroku and you're just connecting to a local postgresql instance, it would be like so:
psql -d your_database -U your_user
More info on psql can be found here: http://www.postgresql.org/docs/9.2/static/app-psql.html
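Since the question also mentions reassigning ages later, that is just another UPDATE with a WHERE clause; a sketch, where the table name and values are placeholders:

-- reassign an age for a single row (table name and values are placeholders)
update table_name set age = 42 where id = 1;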
