How to change the export format in Snowflake - snowflake-cloud-data-platform

Could you help me? I don't know how to get this result. I have to prepare an export file for a customer in a strict structure from Snowflake. I suppose the answer lies in changing or creating a new file format, or in changing the view (since this file is generated from a view).
Current export file looks like:
id,id_new,sub_building_name,building_name,building_number,price
"34106391","","FLAT THIRD FLOOR","","7","3.8963552865168741"
"34106392","","FLAT FOURTH FLOOR","","7","3.4363554835138543"
The new export file should be look like:
"id","id_new","sub_building_name","building_name","building_number","price"
34106391,,"FLAT THIRD FLOOR",,7,3.8963552865
34106392,,"FLAT FOURTH FLOOR",,7,3.4363554835
So, these changes need to be made:
enclose the header in double quotes
numeric and null values must not be enclosed in double quotes (only strings should be enclosed in "")
reduce the precision of float values from "3.8963552865168741" (16 decimal places) to 3.8963552865 (10 decimal places)
Thanks

These options will get you halfway there:
copy into @stage/stacko/out1.csv
from (
    select '1' a, '2' b, 1234.12345678901234567890 c, null d, 'a,b,c' e
)
file_format = (type = 'csv' compression = none null_if = () field_optionally_enclosed_by = '"')
header = true
overwrite = true
This produces:
"A","B","C","D","E"
"1","2",1234.1234567890123456789,,"a,b,c"
Now you will need to cast the numbers to the right precision in SQL before formatting them. Then things will look the way you want:
copy into @fhoffa_lit_stage/stacko/out1.csv
from (
    select '1'::number a, '2'::number b, 1234.12345678901234567890::number(38,5) c, null d, 'a,b,c' e
)
file_format = (type = 'csv' compression = none null_if = () field_optionally_enclosed_by = '"')
header = true
overwrite = true
This produces:
"A","B","C","D","E"
1,2,1234.12346,,"a,b,c"
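Applied back to the original question, a minimal sketch would cast the numeric columns so they unload unquoted, turn empty strings into NULLs with nullif() so they unload bare, and round the price to 10 decimal places; the view name my_export_view and the stage path are hypothetical stand-ins:
copy into @my_stage/export.csv
from (
    select id::number id,
           nullif(id_new, '') id_new,
           sub_building_name,
           nullif(building_name, '') building_name,
           building_number::number building_number,
           price::number(38,10) price
    from my_export_view
)
file_format = (type = 'csv' compression = none null_if = () field_optionally_enclosed_by = '"')
header = true
overwrite = true
With field_optionally_enclosed_by, Snowflake quotes the header and the string columns but leaves numbers and NULLs bare, which matches the target layout.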

Related

How to remove commas from string in database with SQLite in C?

I have a problem removing all commas from string data in an SQLite database.
The program is written in C, so I use SQLite's C API, e.g. sqlite3_mprintf().
I try to get rows matched with the input, and the match needs to be checked without commas (,).
SQLite's REPLACE() is used as REPLACE(data, ',', '').
Sample code is below:
sqlite3_stmt *main_stmt;
const char* comma = "','";
const char* removeComma = "''";
char *zSQL;
zSQL = sqlite3_mprintf("SELECT * FROM table WHERE REPLACE(colA||colB, %w, %w) LIKE '%%%q%%'", comma, removeComma, input);
int result = sqlite3_prepare_v2(database, zSQL, -1, &main_stmt, 0);
I referred to the sqlite3 reference: https://www.sqlite.org/printf.html
The substitution types described there strip the apostrophes from the input data, which makes the REPLACE() call behave differently from what I intended.
What I expect is SELECT * FROM table WHERE REPLACE(colA||colB, ',', '') LIKE %q
by passing ',' and '' as arguments to sqlite3_mprintf().
However, the result comes out as SELECT * FROM table WHERE REPLACE(colA||colB, ,, ) LIKE %q
so the comma is not removed from the data (colA||colB), and the result is different from what I expect.
Is there any way to pass a comma wrapped in apostrophes as the first argument of REPLACE(), and an empty string wrapped in apostrophes as the second?
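Since the comma and the empty replacement are constants here, one workaround is to embed them literally in the format string and only run the user input through %q, which escapes apostrophes safely. A minimal sketch (mytable stands in for the question's table name, which is otherwise a reserved word):
#include <sqlite3.h>

/* Build the query with the REPLACE() arguments hard-coded; only the
 * user-supplied LIKE pattern is substituted. %q escapes apostrophes in
 * input, and %% emits the literal % wildcards around it. */
char *build_query(const char *input)
{
    return sqlite3_mprintf(
        "SELECT * FROM mytable WHERE REPLACE(colA||colB, ',', '') LIKE '%%%q%%'",
        input);
}
Remember to release the returned string with sqlite3_free() once sqlite3_prepare_v2() has copied it.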

How do I unload a CSV file where only non-null values are wrapped in quotes, quotes are optionally enclosed, and null values are not quoted?

(Submitting on behalf of a Snowflake User)
For example - ""NiceOne"" LLC","Robert","GoodRX",,"Maxift","Brian","P,N and B","Jane"
I have been able to create a file format that satisfies each of these conditions individually, but not one that satisfies all three.
I've used the following recommendation:
Your first column is malformed, missing the initial ", it should be:
"""NiceOne"" LLC"
After fixing that, you should be able to load your data with almost
default settings,
COPY INTO my_table FROM @my_stage/my_file.csv FILE_FORMAT = (TYPE =
CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"');
...but the above format returns:
"""NiceOne"" LLC","Robert","GoodRX","","Maxift","Brian","P,N and B","Jane"
I don't want quotes around empty fields. I'm looking for
"""NiceOne"" LLC","Robert","GoodRX",,"Maxift","Brian","P,N and B","Jane"
Any recommendations?
If you use the following, you will not get quotes around NULL fields, but you will get quotes around '' (empty text). You can always concatenate the fields and format the resulting line manually if this doesn't suit you.
COPY INTO @my_stage/my_file.CSV
FROM (
SELECT
'"NiceOne" LLC' A, 'Robert' B, 'GoodRX' C, NULL D,
'Maxift' E, 'Brian' F, 'P,N and B' G, 'Jane' H
)
FILE_FORMAT = (
TYPE = CSV
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
NULL_IF = ()
COMPRESSION = NONE
)
OVERWRITE = TRUE
SINGLE = TRUE
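For reference, a rough sketch of the manual-concatenation fallback mentioned above, which leaves both NULLs and empty strings unquoted (the columns a and b and the table name are hypothetical):
COPY INTO @my_stage/my_file.csv
FROM (
    SELECT
        -- quote and escape only non-empty values; NULL and '' become bare fields
        IFF(a IS NULL OR a = '', '', '"' || REPLACE(a, '"', '""') || '"')
        || ',' ||
        IFF(b IS NULL OR b = '', '', '"' || REPLACE(b, '"', '""') || '"')
    FROM my_table
)
FILE_FORMAT = (
    TYPE = CSV
    FIELD_DELIMITER = NONE
    ESCAPE_UNENCLOSED_FIELD = NONE
    COMPRESSION = NONE
)
OVERWRITE = TRUE
SINGLE = TRUE
Each row is emitted as one pre-formatted line, so the quoting is entirely under your control.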

snowflake - How to use a file format to decode a CSV column?

I've got some data in a string column that is in a strange csv format. I can write a file format that correctly interprets it. How do I use my file format against data that has already been imported?
create table test_table
(
my_csv_column string
)
How do I split/flatten this column with:
create or replace file format my_csv_file_format
type = 'CSV'
RECORD_DELIMITER = '0x0A'
field_delimiter = ' '
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
VALIDATE_UTF8 = FALSE
Please assume that I cannot use split, as I want to use the rich functionality of the file format (optional escape characters, date recognition etc.).
What I'm trying to achieve is something like the below (but I cannot find how to do it):
copy into destination_Table
from
(select
s.$1
,s.$2
,s.$3
,s.$4
from test_table s
file_format = (column_name ='my_csv_column' , format_name = 'my_csv_file_format'))
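The question is left open above, but since a file format only applies when Snowflake reads files, one hedged workaround is to round-trip the column through an internal stage (the stage name my_int_stage is hypothetical):
-- 1. unload the raw column, unmodified, to an internal stage
copy into @my_int_stage/decoded/
from (select my_csv_column from test_table)
file_format = (type = 'csv' field_delimiter = none escape_unenclosed_field = none compression = none);

-- 2. load it back through the rich file format
copy into destination_table
from (
    select s.$1, s.$2, s.$3, s.$4
    from @my_int_stage/decoded/ s
)
file_format = (format_name = 'my_csv_file_format');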

Anorm SQL Folding a List into a class result

Please pardon the level of detail; I'm not completely sure how to phrase this question.
I am new to Scala and still learning the intricacies of the language. I have a project where all the data I need is contained in a table with a layout like this:
CREATE TABLE demo_data (
    table_key varchar(10),
    description varchar(40),
    data_key varchar(10),
    data_value varchar(10)
);
Where the table_key column contains the main key I'm searching on, and the description repeats for every row with that table_key. In addition there are descriptive keys and values contained in the data_key and data_value pairs.
I need to consolidate a set of these data_keys into my resulting class so that the class will end up like this:
case class Tab ( tableKey: String, description: String, valA: String, valB: String, valC: String )
object Tab {
  val simple = {
    get[String]("table_key") ~
      get[String]("description") ~
      get[String]("val_a") ~
      get[String]("val_b") ~
      get[String]("val_c") map {
        case tableKey ~ description ~ valA ~ valB ~ valC =>
          Tab(tableKey, description, valA, valB, valC)
      }
  }
  def list(tabKey: String): List[Tab] = {
    DB.withConnection { implicit connection =>
      SQL(
        """
        SELECT DISTINCT p.table_key, p.description,
               a.data_value val_a,
               b.data_value val_b,
               c.data_value val_c
        FROM demo_data p
        JOIN demo_data a ON p.table_key = a.table_key AND a.data_key = 'A'
        JOIN demo_data b ON p.table_key = b.table_key AND b.data_key = 'B'
        JOIN demo_data c ON p.table_key = c.table_key AND c.data_key = 'C'
        WHERE p.table_key = {tabKey}
        """
      ).on('tabKey -> tabKey).as(Tab.simple *)
    }
  }
}
which returns what I want. However, I have more than 30 data keys to retrieve this way, and the self-joins rapidly become unmanageable: the query ran for 1.5 hours and used 20 GB of temporary tablespace before running out of disk space.
So instead I am writing a separate class that retrieves the data keys and data values for a given table key using "where data_key in ('A','B','C',...)", and now I'd like to "flatten" the returned list into a resulting object that has valA, valB, valC, ... in it. I still want to return a list of the flattened objects to the calling routine.
Let me try to spell out what I'd like to accomplish:
Take a header result set and a detail result set, extract the keys from the detail result set, and use them to populate additional properties in the header result set, producing a list of classes that contain all the elements of the header result set plus the selected properties from the detail result set. So I get a list of TabHeader(tabKey, desc), and for each one I retrieve a list of interesting TabDetail(dataKey, dataValue). I then extract the element where dataKey == 'A' and put its dataValue in Tab(valA), and do the same for dataKey == 'B', 'C', .... When I'm done, I want to produce a Tab(tabKey, desc, valA, valB, valC, ...) in place of the corresponding TabHeader. I could quite possibly muddle through this in Java, but I'm treating this as a learning opportunity and would like to know a good way to do it in Scala.
My feeling is that Scala's collection operations should do what I need, but I haven't been able to track down exactly what.
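The question is open as posted; a minimal sketch of the grouping step in plain Scala, assuming the header and detail rows have already been fetched with two simple queries (all names here are hypothetical):
case class TabHeader(tableKey: String, description: String)
case class TabDetail(tableKey: String, dataKey: String, dataValue: String)
case class Tab(tableKey: String, description: String, values: Map[String, String])

// Group the detail rows by table_key into a key/value map per header,
// so each data_value is looked up by its data_key instead of 30 self-joins.
def flatten(headers: List[TabHeader], details: List[TabDetail]): List[Tab] = {
  val byTable: Map[String, Map[String, String]] =
    details.groupBy(_.tableKey)
      .map { case (key, ds) => key -> ds.map(d => d.dataKey -> d.dataValue).toMap }
  headers.map { h =>
    Tab(h.tableKey, h.description, byTable.getOrElse(h.tableKey, Map.empty))
  }
}
A value for data_key 'A' is then tab.values.get("A"), which avoids hard-coding 30 fields in the case class.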

Matlab read csv string array

I have a comma-separated dataset called Book2.csv and I want to extract the contents. The contents are a 496024x1 array of strings (normal, neptune, smurf).
I tried:
[text_data] = xlsread('Book2.csv');
But it just output an empty text_data array.
When trying csvread:
M = csvread('Book2.csv')
??? Error using ==> dlmread at 145
Mismatch between file and format string.
Trouble reading number from file (row 1, field 1) ==>
norma
Error in ==> csvread at 54
m=dlmread(filename, ',', r, c);
I get this error. Can anyone help?
Off the top of my head, this should get the job done, though it's possibly not the best way to do it:
fid = fopen('Book2.csv'); % open the file
% read all contents into data as a char array
% (don't forget the trailing ' to make it a row rather than a column)
data = fread(fid, '*char')';
fclose(fid);
% this returns a cell array with the individual
% entries for each string you have between the commas
entries = regexp(data, ',', 'split');
Try something like textread:
data = textread('data.csv', '', 'delimiter', ',', ...
'emptyvalue', NaN);
The easiest way for me is:
path = 'C:\folder1\folder2\';
data = 'data.csv';
data = dataset('xlsfile', fullfile(path, data));
Of course you could also do the following:
[data, path] = uigetfile('C:\folder1\folder2\*.csv');
data = dataset('xlsfile', fullfile(path, data));
Now you will have the data loaded as a dataset.
An easy way to get column 1, for example, is:
double(data(1))
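Since csvread only handles numeric data and xlsread returns text only via its second output, a minimal textscan-based sketch (assuming one label per comma-separated entry in Book2.csv) may be closer to what the question needs:
fid = fopen('Book2.csv');
% %s reads each comma-delimited token as a string
C = textscan(fid, '%s', 'Delimiter', ',');
fclose(fid);
labels = C{1}; % 496024x1 cell array of strings, e.g. 'normal'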
