'text' field in segments sometimes has single and sometimes double quotes - whisper

The result of the transcribe function has a field called segments.
Each segment contains a text accessible under the key text.
For some segments, the text is printed with double quotes ("), for others with single quotes (').
Why is that the case, and what is the best way to handle this?
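The quoting is almost certainly an artifact of how the result is printed, not of the data itself: Whisper's segments are ordinary Python dicts of ordinary Python strings, and Python's repr() switches quote styles depending on the string's contents. A minimal illustration (the dict below is a stand-in for a real segment, not actual Whisper output):

```python
import json

# Python's repr() prefers single quotes, but switches to double quotes
# when the string itself contains a single quote:
print(repr("hello"))      # 'hello'
print(repr("it's here"))  # "it's here"

# A printed segment dict therefore mixes quote styles depending on content:
segment = {"text": "it's here"}  # stand-in for a real Whisper segment
print(segment)

# For consistent, machine-readable output, serialize with json.dumps,
# which always emits double quotes:
print(json.dumps(segment))
```

So there is nothing to "handle" in the strings themselves; if you need a uniform textual representation, serialize with json.dumps rather than printing the dict.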

Related

Special character in JSON_VALUE in MSSQL 2017

I want to extract an element from a json in a column.
However, the key of the element I am interested in contains % (the exact name is: Use%).
Because of this, I tried to use double quotes, but I still have the same problem:
JSON_VALUE(results, '$."Use%"') as value
JSON text is not properly formatted. Unexpected character ''' is found
at position 1.
JSON_VALUE(results, '$.Use%') as value
JSON path is not properly formatted. Unexpected character '%' is found
at position 5.
How can I extract the value from my json string ?
The JSON is the following:
{'Filesystem': 'hdfs://nameservice1', 'Size': '67945349394432', 'Used': '22662944968704', 'Available': '41812184838144', 'Use%': '33%'}
The problem isn't your attempt, it's your JSON; it isn't valid JSON. JSON uses double quotes (") to delimit strings, not single quotes ('). For the example we have, simply REPLACE-ing the single quotes with double quotes fixes the problem:
DECLARE @YourJSON nvarchar(MAX) = N'{''Filesystem'': ''hdfs://nameservice1'', ''Size'': ''67945349394432'', ''Used'': ''22662944968704'', ''Available'': ''41812184838144'', ''Use%'': ''33%''}';
SELECT JSON_VALUE(REPLACE(@YourJSON, '''', '"'), '$."Use%"') AS value;
Of course, I strongly suggest you investigate how you are creating JSON which uses single quotes rather than double quotes, and fix both your existing data and the process that creates it.
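The single-quoted style suggests the string is actually a printed Python dict rather than JSON. If the producing side is Python, another option is to fix it at the source: parse the literal and re-emit real JSON. A sketch under that assumption (the variable names are illustrative):

```python
import ast
import json

# The string from the question: valid Python dict syntax, not valid JSON.
raw = "{'Filesystem': 'hdfs://nameservice1', 'Use%': '33%'}"

data = ast.literal_eval(raw)  # safely evaluates Python literals, unlike eval()
print(data["Use%"])           # 33%

# Re-emit as valid JSON (double quotes) for consumers like JSON_VALUE:
print(json.dumps(data))
```

Unlike a blanket REPLACE of quote characters, this approach also survives values that legitimately contain apostrophes.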

Varchar fields containing a single double quote get exported with two double quotes from snowflake web UI

I have a table with a single Varchar column and a row with value A"B"C.
While downloading it from the Snowflake UI as a CSV file, the result shows A""B""C.
In this case, if you are using the Data Unloading feature, you need to specify a file format in the COPY INTO statement.
In the file format, use the option FIELD_OPTIONALLY_ENCLOSED_BY.
Ref - https://docs.snowflake.com/en/user-guide/data-unload-considerations.html#empty-strings-and-null-values
When a field contains this character, escape it using the same character. For example, if the value is the double quote character and a field contains the string "A", escape the double quotes as follows: ""A"".
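The doubled quotes are standard CSV escaping (RFC 4180), not corruption: a quote inside a quoted field is escaped by doubling it, and any compliant reader restores the original value. Python's csv module shows the same round trip:

```python
import csv
import io

# Writing a value that contains double quotes:
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_ALL).writerow(['A"B"C'])
print(buf.getvalue().strip())  # "A""B""C"

# Reading it back restores the original value:
row = next(csv.reader(io.StringIO(buf.getvalue())))
print(row[0])  # A"B"C
```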

Exports have random double quotes

Why does the same field in different records have double quotes? Whenever the first field begins with a double quote, it is also padded with spaces to the right. Also, notice that some last fields end with a double quote and others do not. Another odd thing: when exporting to Flat File, Code Page 65001 (UTF-8) is auto-selected and the export fails; perhaps it is sensing this in the data? 1252 ANSI Latin 1 works for the example below. There is no difference whether I use " as the text qualifier for the exported CSV or not; the output is the same. The table data does not contain quotes.
(Screenshots of the two table definitions, Tbl Def 1 and Tbl Def 2, omitted.)
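The "random" pattern described above matches minimal CSV quoting: many exporters quote only the fields that contain the delimiter, a quote, or a line break, so the same column is quoted in some records and bare in others. A sketch with Python's csv module, whose default QUOTE_MINIMAL behaves this way:

```python
import csv
import io

buf = io.StringIO()
# QUOTE_MINIMAL (the default) quotes only the fields that need it:
csv.writer(buf).writerow(['plain', 'has,comma', 'has"quote'])
print(buf.getvalue().strip())  # plain,"has,comma","has""quote"
```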

Snowflake Copy Error due to field enclosed with double double quotes

Found character 'B' instead of field delimiter ',' File
While trying to load
123,""BigB" - some data", 345 ....
I tried tackling this with the ESCAPE_UNENCLOSED_FIELD = '"' file format parameter and was able to load without any errors, but with a few defects:
1) The particular field got loaded as BigB" - some data (the double quotes in the middle of the field were not trimmed).
2) Some other fields' formats got messed up because of ESCAPE_UNENCLOSED_FIELD = '"'.
Any idea or recommendation to overcome this scenario in Snowflake?
Try adding FIELD_OPTIONALLY_ENCLOSED_BY to your File format.
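For comparison, a row produced with standard RFC 4180 escaping, which FIELD_OPTIONALLY_ENCLOSED_BY = '"' can parse cleanly, would enclose the whole field and double the embedded quotes. A sketch in Python using the values from the question:

```python
import csv
import io

# How a compliant writer would encode the problem row:
buf = io.StringIO()
csv.writer(buf).writerow(['123', '"BigB" - some data', '345'])
print(buf.getvalue().strip())  # 123,"""BigB"" - some data",345
```

The source file in the question instead leaves the field partially quoted, which is why the loader trips over the 'B' right after the doubled quote; if you can regenerate the file with proper enclosing, the file format option works without the ESCAPE_UNENCLOSED_FIELD workaround.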

SSIS Text Qualifier not working correctly

I have a CSV file I am importing through SSIS. Below is a sample of the data in my file:
"MEM1001","OTHER","P" ,20101001,20781231,,20781231,20101001,
"Medic","General >21" ,
"A100100" ,"2210",20101001,20781231
I have added , as the column delimiter and " as the text qualifier in the connection manager.
But columns like "P" ,"Medic","General >21" ,"A100100" , are still coming through enclosed in double quotes when I preview the data, while the rest of the string columns come through without double quotes.
I am guessing it has something to do with the spaces after the quotes.
Can somebody explain why this is happening and how I can make these columns come through without double quotes when importing the data from the file to the table?
I just stumbled across this post, I had the same issues, I was trying around and could not find any other solution.
The text qualifier " only works in CSV files when the quote comes directly after the comma, with no space between the comma and the text qualifier. I have no idea why.
If you aren't able to fix the input data, an option would be to create a derived column and to replace the double quotes.
This worked for me:
How to replace double quotes in derived column transformation?
Trim(REPLACE(COLA, "\"", ""))
You should also add the Trim(); otherwise you have empty spaces before, and maybe after, the word. This could be problematic in a merge join (in my case it was).
I don't know why these extra spaces cause this issue.
Here is what I would do. It may not be the best idea, but it should work.
You will need to add a script task before the data flow task that replaces all occurrences of " , and , " with , so the qualifier sits directly against the delimiter.
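A sketch of that pre-processing step in Python (the sample line is taken from the question above; treat the exact replacements as illustrative):

```python
import csv
import io

raw = '"MEM1001","OTHER","P" ,20101001,"Medic","General >21" ,"A100100" ,"2210"'

# Normalize the stray spaces between the closing qualifier and the delimiter:
cleaned = raw.replace('" ,', '",').replace(', "', ',"')

# With the qualifier flush against the delimiter, parsing works as expected:
row = next(csv.reader(io.StringIO(cleaned)))
print(row)  # ['MEM1001', 'OTHER', 'P', '20101001', 'Medic', 'General >21', 'A100100', '2210']
```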
Why not just go to the Connection Manager for that csv file, click on Columns, and under the Column delimiter box just enter a space followed by a comma? Worked for me.