When creating a flat file using a transient external table, I get a strange error:
Error: Unexpected protocol character/message (State:08S02, Native Code: F)
Code
CREATE EXTERNAL TABLE 'FilePath' USING( REMOTESOURCE 'ODBC' DELIM 199 NULLVALUE '' )
AS select * from table
After researching online, this answer helped. I ended up adding escapeChar '\' to the parameters, and the file was created successfully.
CREATE EXTERNAL TABLE 'FilePath'
USING( REMOTESOURCE 'ODBC' DELIM 199 NULLVALUE '' escapeChar '\' )
AS select * from table
Related
I'm executing select * from owner_name.tablename; where the table is an external table backed by a dat file.
While executing this, it returns no rows, and not even count(0) fetches results. I can see the error below in the log file:
error processing column ID in row 1 for datafile /external/ab/output/user/table_name.dat
ORA-01722: invalid number
...
error processing column ID in row 140489 for datafile /external/ab/output/user/table_name.dat
ORA-01722: invalid number
But the same dat file and the same table work fine in the owner schema.
I checked everything: the dat file, the DDL of the table, the file location, spaces in the data file, the delimiters; everything looks the same, but I still get ORA-01722.
What am I missing here? Previous Stack Overflow questions are about insert queries, not external tables.
DDL :
CREATE TABLE "OWNER"."TABLE_EXT"
( "ID" NUMBER(22,0),
"SOURCE_CODE" NUMBER(3,0)
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY "OWNER_EXT"
ACCESS PARAMETERS
( records delimited BY newline FIELDS TERMINATED BY "Ç" missing field VALUES are NULL
(
CLAIM_ID ,
SOURCE_CODE ,
) )
LOCATION
( "OWNER_EXT":'table.dat'
)
)
REJECT LIMIT UNLIMITED
PARALLEL 16 ;
DAT file :
Ç3Ç5278260051Ç557065100415Ç5278260051ÇÇÇÇÇÇÇ
I am unable to find a way to create an external table in Azure SQL Data Warehouse (Synapse SQL pool) with PolyBase where some fields contain embedded commas.
For a CSV file with 4 columns as below:
myresourcename,
myresourcelocation,
"""resourceVersion"": ""windows"",""deployedBy"": ""john"",""project_name"": ""test_project""",
"{ ""ResourceType"": ""Network"", ""programName"": ""v1""}"
I tried the following CREATE EXTERNAL FILE FORMAT and CREATE EXTERNAL TABLE statements.
CREATE EXTERNAL FILE FORMAT my_format
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS(
FIELD_TERMINATOR=',',
STRING_DELIMITER='"',
First_Row = 2
)
);
CREATE EXTERNAL TABLE my_external_table
(
resourceName VARCHAR,
resourceLocation VARCHAR,
resourceTags VARCHAR,
resourceDetails VARCHAR
)
WITH (
LOCATION = 'my/location/',
DATA_SOURCE = my_source,
FILE_FORMAT = my_format
)
But querying this table gives the following error:
Failed to execute query. Error: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Too many columns in the line.
Any help will be appreciated.
Currently this is not supported in PolyBase; you need to modify the input data accordingly to get it working, for example by re-exporting it with a field delimiter that never appears in the data (sketched below).
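A minimal sketch of that workaround, assuming the source data can be re-exported with a pipe delimiter; the format name, column lengths and location below are placeholders, not anything from the original post:
-- The re-exported file uses '|' as the column separator, so embedded commas no longer split fields
CREATE EXTERNAL FILE FORMAT my_pipe_format
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = '|',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2
    )
);
CREATE EXTERNAL TABLE my_external_table
(
    resourceName VARCHAR(200),
    resourceLocation VARCHAR(200),
    resourceTags VARCHAR(4000),
    resourceDetails VARCHAR(4000)
)
WITH (
    LOCATION = 'my/location/',
    DATA_SOURCE = my_source,
    FILE_FORMAT = my_pipe_format
);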
I am trying to load data from Azure Blob Storage.
The data has already been staged.
The issue arises when I try to run:
copy into random_table_name
from @stage_name_i_created
file_format = (type='csv')
pattern ='*.csv'
Below is the error I encounter:
raise error_class(
snowflake.connector.errors.ProgrammingError: 001757 (42601): SQL compilation error:
Table 'random_table_name' does not exist
Basically, it says the table does not exist, which it doesn't, but the syntax on the website is the same as mine.
COPY INTO query on Snowflake returns TABLE does not exist error
In my case the table name was case-sensitive. Snowflake converts unquoted identifiers to upper case, so I changed the database/schema/table names to all upper case and it started working.
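A quick illustration of that behaviour (a hedged sketch; the table and stage names are just the placeholders used in the question):
-- An unquoted name is stored as RANDOM_TABLE_NAME, so this COPY INTO resolves it regardless of the case you type
create table random_table_name (field1 varchar, field2 varchar);
copy into random_table_name from @stage_name_i_created file_format = (type = 'csv');
-- A quoted lower-case name is case-sensitive and must be quoted the same way everywhere
create table "random_table_name" (field1 varchar, field2 varchar);
copy into "random_table_name" from @stage_name_i_created file_format = (type = 'csv');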
First, run the query below to fetch the column headers:
select $1 FROM @stage_name_i_created/filename.csv limit 1
Assuming the header line of your CSV file looks like this:
id;first_name;last_name;email;age;location
Create a CSV file format:
create or replace file format semicolon
type = 'CSV'
field_delimiter = ';'
skip_header=1;
Then define the data types and field names as below:
create or replace table <yourtable> as
select $1::varchar as id
,$2::varchar as first_name
,$3::varchar as last_name
,$4::varchar as email
,$5::int as age
,$6::varchar as location
FROM @stage_name_i_created/yourfile.csv
(file_format => 'semicolon');
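Once the table exists, the COPY INTO from the question should also work (a sketch reusing the same placeholder stage and the file format created above):
copy into <yourtable>
from @stage_name_i_created
file_format = (format_name = 'semicolon');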
The table must exist prior to running a COPY INTO command. In your post, you say that the table does not exist...so that is your issue.
If your table exists, try fully qualifying the table path like this:
copy into <database>.<schema>.<random_table_name>
from @stage_name_i_created
file_format = (type='csv')
pattern ='*.csv'
or set the context first, step by step:
use database <database_name>;
use schema <schema_name>;
copy into <database_name>.<schema_name>.random_table_name
from @stage_name_i_created
file_format = (type='csv')
pattern ='*.csv';
rbachkaniwala, what do you mean by 'How do I create a table? (according to Snowflake syntax it is not possible to create empty tables)'?
You can just do the following to create a table:
CREATE TABLE random_table_name (FIELD1 VARCHAR, FIELD2 VARCHAR)
The table does need to exist. You should check the documentation for COPY INTO.
Other areas to consider are:
do you have the right context set for the database & schema
does the user / role have access to the table or object (a quick way to check both is sketched below)
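A minimal check, assuming the placeholder table name from the question; these are standard Snowflake commands:
-- Which database/schema/role is the session using?
select current_database(), current_schema(), current_role();
-- Is the table visible in that context?
show tables like 'RANDOM_TABLE_NAME';
-- Does the current role have privileges on it?
show grants on table random_table_name;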
It basically seems like you don't have the table defined yet. You should:
ensure the table is created
ensure all columns in the CSV exist as columns in the table
ensure the order of the columns is the same as in the CSV
I'd check data types too; comparing the table definition with a sample of the file (sketched below) covers all of these.
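One way to compare, as a sketch (the table and stage names are the placeholders used above, and my_csv_format is a hypothetical file format matching your files):
-- Columns, order and data types of the target table
describe table random_table_name;
-- A sample of the raw file columns from the stage, for comparison
select $1, $2, $3, $4
from @stage_name_i_created (file_format => 'my_csv_format')
limit 5;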
"COPY INTO" is not a query command, it is the actual data transfer execution from source to destination, which both must exist as others commented here but If you want just to query without loading the files then run the following SQL:
//Display list of files in the stage to verify stage
LIST @stage_name_i_created;
//Create a file format
CREATE OR REPLACE FILE FORMAT RANDOM_FILE_CSV
type = csv
COMPRESSION = 'GZIP' FIELD_DELIMITER = ',' RECORD_DELIMITER = '\n' SKIP_HEADER = 0 FIELD_OPTIONALLY_ENCLOSED_BY = '\042'
TRIM_SPACE = FALSE ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE ESCAPE = 'NONE' ESCAPE_UNENCLOSED_FIELD = 'NONE' DATE_FORMAT = 'AUTO' TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N');
//Now select the data in the files
Select $1 as first_col, $2 as second_col //add as many columns as necessary ...etc
from @stage_name_i_created
(FILE_FORMAT => 'RANDOM_FILE_CSV');
More information can be found in the documentation here:
https://docs.snowflake.com/en/user-guide/querying-stage.html
I tried to create this table:
create table tmp_test (
id_ string,
myelement array<struct<from:string>>
)
STORED AS PARQUET
LOCATION '/donne/tmp_test'
And I get this error:
Error while compiling statement: FAILED: ParseException line 3:23 cannot recognize input near 'from' ':' 'string' in column specification.
How can I escape the word 'from', since I have to use it?
Thanks for your help.
FROM is a reserved keyword in Hive.
Use backticks (`) to quote it:
create table tmp_test (
id_ string,
myelement array<struct<`from`:string>>
)
STORED AS PARQUET
LOCATION '/donne/tmp_test';
I get the following error message when trying to insert an object into the database:
com.ibm.db2.jcc.am.SqlIntegrityConstraintViolationException:
DB2 SQL Error: SQLCODE=-407, SQLSTATE=23502, SQLERRMC=TBSPACEID=2,
TABLEID=19, COLNO=0, DRIVER=4.15.134
How can I retrieve the table/column name for which the error is thrown?
Apparently, at the package level, DB2 only works with the IDs and not the names.
You can look them up using the following query:
SELECT C.TABSCHEMA, C.TABNAME, C.COLNAME
FROM SYSCAT.TABLES AS T,
SYSCAT.COLUMNS AS C
WHERE T.TBSPACEID = 2
AND T.TABLEID = 19
AND C.COLNO = 0
AND C.TABSCHEMA = T.TABSCHEMA
AND C.TABNAME = T.TABNAME
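As a follow-up sketch (using the same IDs taken from the SQLERRMC tokens): SQLCODE -407 is raised when a NULL is assigned to a column defined NOT NULL, so you can pull the nullability flag in the same lookup to confirm:
-- NULLS = 'N' confirms the column is defined NOT NULL, which is what SQL0407N complains about
SELECT C.TABSCHEMA, C.TABNAME, C.COLNAME, C.NULLS
FROM SYSCAT.TABLES AS T,
     SYSCAT.COLUMNS AS C
WHERE T.TBSPACEID = 2
  AND T.TABLEID = 19
  AND C.COLNO = 0
  AND C.TABSCHEMA = T.TABSCHEMA
  AND C.TABNAME = T.TABNAME;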