How can I import a CSV file containing JSON into PostgreSQL?

I am trying to import this CSV file:
HEADER
"{
""field1"": ""123"",
""field2"": ""456""
}"
"{
""field1"": ""789"",
""field2"": ""101""
}"
into my Postgres table.
However, it seems that the \copy my_table(my_text) from 'my_file.csv' command is creating a row for each line of the file.
This is what I get:
my_text
-----------------------------------------------------
HEADER
"{
""field1"": ""123"",
""field2"": ""456""
}"
"{
""field1"": ""789"",
""field2"": ""101""
}"
(9 rows)
and what I expect:
{""field1"": ""123"", ""field2"": ""456""}
{""field1"": ""789"", ""field2"": ""101""}
(2 rows)

Escaping the new line might do the trick:
\copy my_table(my_text) from my_file.csv csv header escape E'\n' quote '"'

The default for \copy is "text" format, not "csv" format. All you have to do is tell your \copy to use csv, and that there is a header line.
\copy my_table(my_text) from 'my_file.csv' csv header
Changing the escape character as the other answer suggests is unnecessary and in fact will break things. Your data will load, but will not be valid JSON.
You probably want to make this column type JSON or JSONB, not text. This will automatically validate that the data is valid JSON, and in the case of JSONB will make future parsing of it faster.
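For completeness, here is a minimal end-to-end sketch of that approach (the table, column, and file names come from the question; the jsonb column type is the recommendation above, not something the question specified):
CREATE TABLE my_table (my_text jsonb);  -- jsonb rejects anything that is not valid JSON
\copy my_table(my_text) from 'my_file.csv' csv header
-- each loaded row is now queryable as JSON, for example:
SELECT my_text->>'field1' AS field1 FROM my_table;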

Related

Modify the delimiter of an external table with HiveQL

I'm taking a CSV file from HDFS and loading it into an external table in Hive.
The problem is that my CSV file uses ";" as the delimiter, and my second column also contains ";" as part of its data.
Can you guide me on what I should do? Are there any Hive properties that allow me to handle this, or any other solution?
By default, ROW FORMAT DELIMITED FIELDS TERMINATED BY ';' will split that column apart.
If you want the (OS) value to remain part of the second column, you need to quote that column, e.g. A;"Mozilla//5.0;(Linux)";BR. In other words, change how the file is written/stored outside of Hive.
If you cannot modify the file, you can make your queries simply concatenate those two columns, e.g. SELECT CONCAT(user_agent, ';', os) FROM data;
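If you control the table definition, another common approach is to declare the external table with the OpenCSVSerde, so that quoted fields containing ';' stay in one column. A hedged sketch follows; the table name, column names, and location are made up for illustration, and note that OpenCSVSerde treats every column as STRING:
CREATE EXTERNAL TABLE web_log (
  id STRING,
  user_agent STRING,   -- may contain ';' as long as the field is quoted in the file
  country STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("separatorChar" = ";", "quoteChar" = "\"")
LOCATION '/path/to/hdfs/dir';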

How to read multiple values in one row from csv file to post body as variable in Jmeter

How do I build a POST HTTP request body from multiple values in a single column of a CSV file?
In the CSV file, under transactional_currencies, I need to insert two or more values as required.
This is the JSON body that needs to be passed in the POST HTTP request body:
{
"country_name": "${country}",
"status": "${status}",
"transactional_currencies": ["${transactional_currencies[0]}", "${transactional_currencies[1]}"]
}
I'm not sure exactly what you're asking, but let's give it a try.
You should define a delimiter character in your CSV Data Set Config element.
For example, a single line in our CSV: sample#sample.com,testusername,testpassword
After adding a CSV Data Set Config to your project, open the configuration panel.
Choose "," as your delimiter character, since the line above is comma-separated.
Choose "email", "username" and "password" as the variable names.
Now you have split a single CSV line into 3 different variables.
In your request's body, reference these variables as ${email}, ${username}, ${password}.
Not with the CSV Data Set Config
If you have a fixed number of entries in transactional_currencies (i.e. always 2), you can use the __CSVRead() function, where you will be able to decide when to go to the next entry/row.
If the number of entries in transactional_currencies is dynamic, you can go for a JSR223 PreProcessor and build your request body using the Groovy language, as described in the Apache Groovy - Parsing and producing JSON article.

BCP - Include terminator in data insert

I'm using BCP to load a JSON file into SQL Server (yes, I know there are better ways, but I need to try this).
The problem is that the JSON document is not formed properly, because the terminator in the format file is being removed, but I want it included.
bcp db.dbo.test IN G:\JSON\json.out -f G:\JSON\formatfile.out -T
format file terminator:
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="] }" COLLATION="Latin1_General_CI_AI"/>
How can I end the file without truncating the json closing tags?
BCP is not designed for importing a file into a single column, so you run into these problems. To import a file as a single object, use the OPENROWSET(... SINGLE_BLOB) functionality, like this:
INSERT INTO JsonTable(jsonColumn)
SELECT BulkColumn
FROM OPENROWSET (BULK 'TextFile Path', SINGLE_BLOB) FileName
If you absolutely, positively must use BCP, there is one trick that is often used for XML files that should work for JSON files as well.
Do not add a Row terminator value
Make your Field terminator value something that absolutely, positively, cannot exist in the JSON file, such as '\0~\0\0~' (which is NULL + ~ + two NULLs + ~). If this could exist in the JSON, try some other value. Just ensure that it can't exist in the file.
By default, this imports an entire XML file as a unit. It should work on JSON files as well, but I cannot guarantee that.
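For reference, here is a hedged sketch of the OPENROWSET route applied to the file path from the question. The table and column names reuse the example above and are assumptions, and SINGLE_CLOB is used here instead of SINGLE_BLOB because the file is plain text:
-- read the whole JSON file as a single value and insert it into one column
INSERT INTO dbo.JsonTable (jsonColumn)
SELECT BulkColumn
FROM OPENROWSET(BULK 'G:\JSON\json.out', SINGLE_CLOB) AS src;
-- optional sanity check that the imported document is well-formed JSON (SQL Server 2016+)
SELECT ISJSON(jsonColumn) AS is_valid FROM dbo.JsonTable;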

Load table issue - BCP from flat file - Sybase IQ

I am getting the below error while trying to do bcp from a flat delimited file into Sybase IQ table.
Could not execute statement.
Non-space text found after ending quote character for an enclosed field.
I couldn't find any non-space text in the file, but this error is stopping me from doing the bulk copy. | is the column delimiter, " is the text qualifier, and \n is the row delimiter.
Below is the sample template I am using.
LOAD TABLE TABLE_NAME(a NULL('(null)'),b NULL('(null)'),c NULL('(null)'))
USING CLIENT FILE '/home/...../a.txt' //unix
QUOTES ON
FORMAT bcp
STRIP RTRIM
DELIMITED BY '|'
ROW DELIMITED BY '\n'
When I perform the same load with QUOTES OFF, it is successful, but the same statement fails with QUOTES ON. I would like to get the quotes stripped off as well.
Sample Data
12345|"abcde"|(null)
12346|"abcdf"|"zxf"
12347|(null)|(null)
12348|"abcdg"|"zyf"
Any leads would be helpful!
If IQ bcp is the same as ASE, then I think those '(null)' fields are being interpreted as strings, not fields that are NULL.
You'd need to stream edit out those (null).
You're on unix so use sed or perl -ne.
E.g. pipe the file through perl -pne 's/\(null\)//g' (the parentheses must be escaped, otherwise only the word null is removed) before it reaches the loading command, or run it over the file first.
QUOTES OFF might seem to work, but I wonder if when you look in your loaded data, you'll see double quotes inside the 2nd field, and '(null)' where you expect a field to be NULL.

Commas within CSV Data

I have a CSV file which I am importing directly into a SQL Server table. In the CSV file each column is separated by a comma, but my problem is that I have an "address" column whose data contains commas. As a result, some of the address data ends up in other columns while importing into SQL Server.
What should I do?
For this problem the solution is very simple.
In the import wizard: select Flat File Source => browse to your file => go to "Text qualifier" (by default it is none) and enter a double quote (") there => follow the rest of the wizard.
Good luck.
If there is a comma in a column, then that column should be surrounded by a single quote or double quote. Then, if inside that column there is a single or double quote, it should have an escape character before it, usually a \
Example format of CSV
ID - address - name
1, "Some Address, Some Street, 10452", 'David O\'Brian'
Newer versions of SQL Server (2017 and later) support the CSV format fully in BULK INSERT, including mixed use of " and , .
BULK INSERT Sales.Orders
FROM '\\SystemX\DiskZ\Sales\data\orders.csv'
WITH ( FORMAT='CSV');
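A slightly more explicit variant of that command (hedged: FIELDQUOTE and FIRSTROW are standard BULK INSERT options in SQL Server 2017 and later, but whether your file has a header row is an assumption):
BULK INSERT Sales.Orders
FROM '\\SystemX\DiskZ\Sales\data\orders.csv'
WITH (
    FORMAT = 'CSV',
    FIELDQUOTE = '"',   -- the text qualifier; fields containing commas must be wrapped in it
    FIRSTROW = 2        -- skip the header line
);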
I'd suggest either using another format than CSV, or trying other characters as the field separator and/or text delimiter. Try looking for a character that isn't used in your data, e.g. |, #, or ^. The format of a single row would become
|foo|,|bar|,|baz, qux|
A well-behaved parser must not interpret 'baz' and 'qux' as two columns.
Alternatively, you could write your own import voodoo that fixes any problems. For the latter, you might find a Groovy skeleton useful (not sure what languages you're fluent in, though).
Most systems, including Excel, will allow for the column data to be enclosed in single quotes...
col1,col2,col3
'test1','my test2, with comma',test3
Another alternative is to use the Macintosh version of CSV, which uses tabs as delimiters.
The best, quickest and easiest way to resolve the comma-in-data issue is to use Excel to save a comma-separated file after setting Windows' list separator to something other than a comma (such as a pipe). This will generate a pipe-separated (or whatever you chose) file that you can then import.
I don't think adding quotes will help. The best way I can suggest is replacing the commas in the content with another mark, such as a space:
replace(COLUMN,',',' ') as COLUMN
Appending a speech mark to the selected column on both sides works. You must also cast the column as NVARCHAR(MAX) to turn it into a string if the column is TEXT.
SQLCMD -S DB-SERVER -E -Q "set nocount on; set ansi_warnings off; SELECT '""' + cast ([Column1] as nvarchar(max)) + '""' As TextHere, [Column2] As NormalColumn FROM [Database].[dbo].[Table]" /o output.tmp /s "," -W
