I tried to load a comma-separated file into AgensGraph.
But AgensGraph does not ship with a load utility in its package.
How can I load a file into AgensGraph?
You can use a foreign data wrapper instead of a separate load utility.
First, create the file_fdw extension.
agens=# CREATE EXTENSION file_fdw;
CREATE EXTENSION
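If no graph has been created in this session yet, a graph and graph_path are also needed before the Cypher statements below will run (sample_graph is just an assumed name):
agens=# CREATE GRAPH sample_graph;
agens=# SET graph_path = sample_graph;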
Second, create a server object.
agens=# CREATE SERVER graph_import FOREIGN DATA WRAPPER file_fdw;
CREATE SERVER
Next, create a foreign table that points to the file.
agens=# CREATE FOREIGN TABLE fdwSample
agens-# (
agens(# id INT8,
agens(# name VARCHAR(256)
agens(# )
agens-# SERVER graph_import
agens-# OPTIONS
agens-# (
agens(# FORMAT 'csv',
agens(# HEADER 'false',
agens(# DELIMITER ',',
agens(# NULL '',
agens(# FILENAME 'sample.dat'
agens(# );
CREATE FOREIGN TABLE
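For reference, a sample.dat matching this foreign table would contain plain comma-separated rows; based on the query result shown at the end, it would look something like this:
1,steve
2,bill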
Finally, load the file using the LOAD clause.
agens=# LOAD FROM fdwSample AS sample
agens-# CREATE (:node {id:sample.id,name:sample.name});
GRAPH WRITE (INSERT VERTEX 2, INSERT EDGE 0)
After that, you can query the loaded data.
agens=# MATCH (n:node) RETURN n;
n
-------------------------------------
node[3.1]{"id": 1, "name": "steve"}
node[3.2]{"id": 2, "name": "bill"}
(2 rows)
Good luck.
Hi, I have declared a table like this:
create or replace table app_event (
ID varchar(36) not null primary key,
VERSION number,
ACT_TYPE varchar(255),
EVE_TYPE varchar(255),
CLI_ID varchar(36),
DETAILS variant,
OBJ_TYPE varchar(255),
DATE_TIME timestamp,
AAPP_EVENT_TO_UTC_DT timestamp,
GRO_ID varchar(36),
OBJECT_NAME varchar(255),
OBJ_ID varchar(255),
USER_NAME varchar(255),
USER_ID varchar(255),
EVENT_ID varchar(255),
FINDINGS varchar(255),
SUMMARY variant
);
The DETAILS column will contain an XML document so that I can run XML functions and extract elements of that XML.
My sample row looks like this:
dfjkghdfkjghdf8gd7f7997,0,TEST_CASE,CHECK,74356476476DFD,<?xml version="1.0" encoding="UTF-8"?><testPayload><testId>3495864795uiyiu</testId><testCode>COMPLETED</testCode><testState>ONGOING</testState><noOfNewTest>1</noOfNewTest><noOfReviewRequiredTest>0</noOfReviewRequiredTest><noOfExcludedTest>0</noOfExcludedTest><noOfAutoResolvedTest>1</noOfAutoResolvedTest><testerTypes>WATCHLIST</testerTypes></testPayload>,CASE,41:31.3,NULL,948794853948dgjd,(null),dfjkghdfkjghdf8gd7f7997,test user,dfjkghdfkjghdf8gd7f7997,NULL,(null),(null)
When I declare DETAILS as varchar I am able to load the file, but when I declare it as variant I get the below error for that column only:
Error parsing JSON:
dfjkghdfkjghdf8gd7f7997COMPLETED</status
File 'SNOWFLAKE/Sudarshan.csv', line 1, character 89 Row 1, column
"AUDIT_EVENT"["DETAILS":6]
Can you please help with this?
I cannot use varchar, because I also need to query elements of the XML in my queries.
This is how I load into the table; I use the default CSV format, and the file is available in S3.
COPY INTO demo_db.public.app_event
FROM @my_s3_stage/
FILES = ('app_Even.csv')
file_format=(type='CSV');
Based on the answer, this is how I am loading:
copy into demo_db.public.app_event from (
select
$1,$2,$3,$4,$5,
parse_xml($6),$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,parse_xml($17)
from @~/Audit_Even.csv d
)
file_format = (
type = CSV
)
But when I execute it, it says zero rows processed, and my own stage is not referenced here.
If you are using a COPY INTO statement, then you need to put in a subquery to convert the data before loading it into the table. Use parse_xml within your COPY statement's subquery, something like this:
copy into app_event from (
select
$1,
parse_xml($2) -- <---- "$2" is the column number in the CSV that contains the xml
from @~/test.csv.gz d -- <---- This is my own internal user stage. You'll need to change this to your external stage or whatever
)
file_format = (
type = CSV
)
It is hard to provide you with a good SQL statement without a full example of your existing code (your COPY / INSERT statement). In my example above, I'm copying a file from my own user stage (@~/test.csv.gz) with the default CSV file format options. You are likely using an external stage, but it should be easy to adapt this to your own example.
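Adapting that to the stage and column layout from the question, a sketch might look like this (it assumes the file still sits in the external stage from the first COPY INTO attempt):
copy into demo_db.public.app_event from (
  select
    $1, $2, $3, $4, $5,
    parse_xml($6),   -- DETAILS column holds the XML payload
    $7, $8, $9, $10, $11, $12, $13, $14, $15, $16,
    parse_xml($17)   -- SUMMARY is also declared as variant
  from @my_s3_stage/app_Even.csv d
)
file_format = (
  type = CSV
);
Once DETAILS is a VARIANT holding parsed XML, individual elements can be read with XMLGET; the element names below are taken from the sample payload in the question:
select
  xmlget(details, 'testId'):"$"::string   as test_id,
  xmlget(details, 'testCode'):"$"::string as test_code
from demo_db.public.app_event;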
I have read the linked question "Table options do not contain an option key 'connector'";
it said we should set the format.
But my scenario is datagen -> hive.
Here's my complete example (it's currently failing):
drop table if exists datagen;
CREATE TABLE datagen (
f_sequence INT,
f_random INT,
f_random_str STRING,
ts AS localtimestamp,
WATERMARK FOR ts AS ts
) WITH (
'connector' = 'datagen',
-- optional options --
'rows-per-second'='5',
'fields.f_sequence.kind'='sequence',
'fields.f_sequence.start'='1',
'fields.f_sequence.end'='50', -- this limits the total number of rows that will be generated
'fields.f_random.min'='1',
'fields.f_random.max'='50',
'fields.f_random_str.length'='10'
);
SET table.sql-dialect=hive;
drop table if exists hive_table;
CREATE TABLE hive_table (
f_sequence INT,
f_random INT,
f_random_str STRING
) PARTITIONED BY (dt STRING, hr STRING, mi STRING) STORED AS parquet TBLPROPERTIES (
'partition.time-extractor.timestamp-pattern'='$dt $hr:$mi:00',
'sink.partition-commit.trigger'='partition-time',
'sink.partition-commit.delay'='1 min',
'sink.partition-commit.policy.kind'='metastore,success-file'
);
Flink SQL> insert into hive_table select f_sequence,f_random,f_random_str ,DATE_FORMAT(ts, 'yyyy-MM-dd'), DATE_FORMAT(ts, 'HH') ,DATE_FORMAT(ts, 'mm') from datagen;
[INFO] Submitting SQL update statement to the cluster...
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.ValidationException: Table options do not contain an option key 'connector' for discovering a connector.
Is the solution from the above link suitable for this case?
I need your help, thanks!
Please use SET table.sql-dialect=default; before calling insert into hive_table .... The insert into hive_table ... statement reads from the datagen connector, which the Hive dialect does not support.
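A minimal sketch of the corrected ordering, reusing the tables defined in the question:
SET table.sql-dialect=hive;
-- CREATE TABLE hive_table (...) as in the question

SET table.sql-dialect=default;
insert into hive_table
select f_sequence, f_random, f_random_str,
       DATE_FORMAT(ts, 'yyyy-MM-dd'), DATE_FORMAT(ts, 'HH'), DATE_FORMAT(ts, 'mm')
from datagen;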
I am unable to find a way to create an external table in Azure SQL Data Warehouse (Synapse SQL Pool) with Polybase where some fields contain embedded commas.
For a csv file with 4 columns as below:
myresourcename,
myresourcelocation,
"""resourceVersion"": ""windows"",""deployedBy"": ""john"",""project_name"": ""test_project""",
"{ ""ResourceType"": ""Network"", ""programName"": ""v1""}"
I tried the following CREATE EXTERNAL FILE FORMAT and CREATE EXTERNAL TABLE statements.
CREATE EXTERNAL FILE FORMAT my_format
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS(
FIELD_TERMINATOR=',',
STRING_DELIMITER='"',
First_Row = 2
)
);
CREATE EXTERNAL TABLE my_external_table
(
resourceName VARCHAR,
resourceLocation VARCHAR,
resourceTags VARCHAR,
resourceDetails VARCHAR
)
WITH (
LOCATION = 'my/location/',
DATA_SOURCE = my_source,
FILE_FORMAT = my_format
)
But querying this table gives the following error:
Failed to execute query. Error: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Too many columns in the line.
Any help will be appreciated.
Currently this is not supported in PolyBase; you need to modify the input data accordingly to get it working.
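One way to modify the input, sketched under the assumption that you can re-export the source file: use a field terminator that never appears inside the data (the pipe below is only an illustration) and keep the rest of the definitions as above:
CREATE EXTERNAL FILE FORMAT my_format_pipe
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS(
        FIELD_TERMINATOR = '|',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2
    )
);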
I am not sure whether my issue is related to the Scala Play 2.5.x Framework or to PostgreSQL, so I am going to describe my setup.
I am using the Play 2.5.6 with Scala and PostgreSQL 9.5.4-2 from the BigSQL Sandboxes. I use the Play Framework default evolution package to manage the DB versions.
I created a new database in BigSQL Sandbox's PGSQL and PGSQL created a default schema called public. I use this schema for development.
I would like to create a table with the following script (1.sql in DB evolution config):
# Initialize the database
# --- !Ups
CREATE TABLE user (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL,
creation_date TIMESTAMP NOT NULL
);
# --- !Downs
DROP TABLE user;
Besides that, I would like to read the table with code like this:
val resultSet = statement.executeQuery("SELECT id, name, email FROM public.user WHERE id=" + id.toString)
I get an error when I execute any of the mentioned code, or even when I run the CREATE TABLE ... statement in pgAdmin. The issue is with the user table name. If I prefix it with public (i.e. public.user), everything works fine.
My questions are:
Is it normal to prefix the table name with the schema name every time? It seems too odd to me.
How can I make the public schema a default option so I do not have to qualify the table name? (e.g. CREATE TABLE user (...); will not throw an error)
I tried the following:
I set the search_path for my user: ALTER USER my_user SET search_path to public;
I set the search_path for my database: ALTER database "my_database" SET search_path TO my_schema;
search_path correctly shows this: "$user",public
I got the following errors:
In Play: p.a.d.e.DefaultEvolutionsApi - ERROR: syntax error at or near "user"
In pgadmin:
ERROR: syntax error at or near "user"
LINE 1: CREATE TABLE user (
********** Error **********
ERROR: syntax error at or near "user"
SQL state: 42601
Character: 14
This has nothing to do with the default schema. user is a reserved word.
You need to use double quotes to be able to create such a table:
CREATE TABLE "user" (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL,
creation_date TIMESTAMP NOT NULL
);
But I strongly recommend not doing that. Find a different name that does not require a quoted identifier.
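Note that if you do keep the reserved name, every later statement has to quote it the same way; for example, the Downs script and the read query from the question would become:
DROP TABLE "user";
SELECT id, name, email FROM "user" WHERE id = 1;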
I have created a non-partitioned table in DB2.
create table test (name varchar (22), cell# integer);
The table was created successfully.
Now I want to create an index on the test table in tablespace TEST_IDX.
I executed the following query:
CREATE INDEX test1 ON test (cell#) in TEST_IDX.
It gives me the following error:
[CREATE - 0 row(s), 0.000 secs] [Error Code: -109, SQL State: 42601] DB2 SQL Error: SQLCODE=-109, SQLSTATE=42601, SQLERRMC=IN
The DB2 database version is DB2/LINUXZ64 9.7.3.
I think you will have to specify that for the table, i.e.:
create table test (
...
) in <tblspc> index in TEST_IDX
See ADMIN_MOVE_TABLE ( http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.dm.doc/doc/t0054864.html?resultof=%22%6d%6f%76%65%5f%74%61%62%6c%65%22%20 ) for more information.
The index tablespace should be defined at table creation time; if it is not specified, the index will be created in the same tablespace as the table.
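A minimal sketch putting this together (DATA_TS is an assumed name for the data tablespace; the point is that the index tablespace is fixed when the table is created):
create table test (name varchar(22), cell# integer)
  in DATA_TS index in TEST_IDX;

create index test1 on test (cell#);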