Incompatibility due to GUID - sql-server

I have a flat file with a field accountid (e.g. 123123123).
I need to import the accountid from the flat file into a database column that is also named accountid (uniqueidentifier, null); it is a GUID unique identifier.
I tried changing the metadata of the flat file column to unique identifier, but I am getting this error:
[Flat File Source [2]] Error: Data conversion failed. The data
conversion for column "Account Id" returned status value 2 and status
text "The value could not be converted because of a potential loss of
data.". [Flat File Source [2]] Error: SSIS Error Code
DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The "Flat File
Source.Outputs[Flat File Source Output].Columns[Account Id]" failed
because error code 0xC0209084 occurred, and the error row disposition
on "Flat File Source.Outputs[Flat File Source Output].Columns[Account
Id]" specifies failure on error. An error occurred on the specified
object of the specified component. There may be error messages posted
before this with more information about the failure.

Create a field for storing the source accountid as an integer. Add a derived column with generated GUID IDs (see "How to create a GUID column in SSIS") to your source data and use it as the primary key in the target.
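As a rough illustration of that idea outside SSIS, here is a minimal Python sketch that keeps the integer accountid and adds a freshly generated GUID per row. The function name `add_guid_column`, the column name `account_guid`, and the semicolon delimiter are assumptions made for the example, not something taken from the question.

```python
import csv
import io
import uuid


def add_guid_column(src_text, id_field="accountid", guid_field="account_guid"):
    """Keep the integer source id and add a newly generated GUID to each row."""
    reader = csv.DictReader(io.StringIO(src_text), delimiter=";")
    rows = []
    for row in reader:
        new_row = dict(row)
        # The original id stays available as an integer for reference.
        new_row[id_field] = int(row[id_field])
        # A random (version 4) GUID acts as the surrogate key, wrapped in braces.
        new_row[guid_field] = "{%s}" % str(uuid.uuid4()).upper()
        rows.append(new_row)
    return rows


if __name__ == "__main__":
    sample = "accountid\n123123123\n456456456\n"
    for row in add_guid_column(sample):
        print(row)
```

The generated values are random, so the mapping between integer ids and GUIDs has to be stored if later loads must reuse the same keys.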
If you are trying to import GUID values and get the same error message:
[Flat File Source [2]] Error: Data conversion failed. The data conversion for column "AccountId" returned status value 2 and status text "The value could not be converted because of a potential loss of data."...
or (the same message as produced by a Russian-language installation, translated):
[Flat File Source [177]] Error: Data conversion failed. The data conversion for column "accountid" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
[Flat File Source [177]] Error: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The "Flat File Source.Outputs[Flat File Source Output].Columns[accountid]" failed because error code 0xC0209084 occurred, and the error row disposition on "Flat File Source.Outputs[Flat File Source Output].Columns[accountid]" specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure.
Double-check that the GUID values are wrapped in curly braces. This CSV throws errors:
ReqType;accountid;contactid;
0;6E0DAA5D-CB68-4348-A7B2-AD2367190F83;FFA9D382-D534-4731-82A0-D9F36D8221B0;
This will be processed:
ReqType;accountid;contactid;
0;{6E0DAA5D-CB68-4348-A7B2-AD2367190F83};{FFA9D382-D534-4731-82A0-D9F36D8221B0};
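If the file cannot be regenerated with braces at the source, a small pre-processing step can add them. The following is a minimal Python sketch, assuming a semicolon-delimited file like the one above; `brace_guids` and `GUID_RE` are hypothetical names introduced for the example.

```python
import re

# Matches a bare GUID such as 6E0DAA5D-CB68-4348-A7B2-AD2367190F83 (no braces).
GUID_RE = re.compile(r"^[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}$")


def brace_guids(line, sep=";"):
    """Wrap every bare GUID field of one delimited line in curly braces."""
    fields = line.rstrip("\r\n").split(sep)
    return sep.join("{%s}" % f if GUID_RE.match(f) else f for f in fields)


if __name__ == "__main__":
    lines = [
        "ReqType;accountid;contactid;",
        "0;6E0DAA5D-CB68-4348-A7B2-AD2367190F83;FFA9D382-D534-4731-82A0-D9F36D8221B0;",
    ]
    for line in lines:
        print(brace_guids(line))
```

Header names and values that already carry braces do not match the pattern, so they pass through unchanged.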

I would solve this by importing your flat file into a staging table that has a varchar data type for AccountId.
Then call a stored procedure that copies the data from the staging table to your final destination and uses TRY_CONVERT() (or TRY_CAST()) to convert the AccountId column to a GUID; TRY_PARSE() only handles number and date/time types. That way you can handle the rows that don't have a valid GUID in the AccountId column without losing the rows that do.
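The same stage-then-validate pattern can be sketched in Python to show the intended behaviour: rows whose AccountId parses as a GUID go to the destination, and the rest are set aside for review. `try_parse_guid` and `split_staged_rows` are hypothetical helper names. Note that Python's `uuid.UUID` accepts more input forms than SQL Server does (for example, hex digits without hyphens), so this only illustrates the control flow and does not replace the database-side conversion.

```python
import uuid


def try_parse_guid(text):
    """Return a uuid.UUID, or None when the text is not a valid GUID."""
    try:
        return uuid.UUID(text.strip())
    except (ValueError, AttributeError):
        # ValueError: malformed string; AttributeError: value was None.
        return None


def split_staged_rows(rows):
    """Split staged rows into (convertible, rejected) by their AccountId."""
    good, bad = [], []
    for row in rows:
        guid = try_parse_guid(row.get("AccountId"))
        if guid is None:
            bad.append(row)
        else:
            good.append({**row, "AccountId": guid})
    return good, bad


if __name__ == "__main__":
    staged = [
        {"AccountId": "123123123"},
        {"AccountId": "{6E0DAA5D-CB68-4348-A7B2-AD2367190F83}"},
    ]
    good, bad = split_staged_rows(staged)
    print("loadable:", good)
    print("rejected:", bad)
```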

Related

SQL Data Warehouse External Table with String fields

I am unable to find a way to create an external table in Azure SQL Data Warehouse (Synapse SQL Pool) with Polybase where some fields contain embedded commas.
For a csv file with 4 columns as below:
myresourcename,
myresourcelocation,
"""resourceVersion"": ""windows"",""deployedBy"": ""john"",""project_name"": ""test_project""",
"{ ""ResourceType"": ""Network"", ""programName"": ""v1""}"
I tried the following CREATE EXTERNAL FILE FORMAT and CREATE EXTERNAL TABLE statements:
CREATE EXTERNAL FILE FORMAT my_format
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS(
FIELD_TERMINATOR=',',
STRING_DELIMITER='"',
First_Row = 2
)
);
CREATE EXTERNAL TABLE my_external_table
(
resourceName VARCHAR,
resourceLocation VARCHAR,
resourceTags VARCHAR,
resourceDetails VARCHAR
)
WITH (
LOCATION = 'my/location/',
DATA_SOURCE = my_source,
FILE_FORMAT = my_format
)
But querying this table gives the following error:
Failed to execute query. Error: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Too many columns in the line.
Any help will be appreciated.
Currently this is not supported in PolyBase; you need to modify the input data to get it working.
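One way to modify the input is to rewrite the file with a field terminator that never occurs in the data, so no string delimiter or quote escaping is needed. The sketch below uses Python's standard `csv` module to parse the quoted fields and re-emit them pipe-delimited. The function name `to_pipe_delimited` and the choice of `|` are assumptions; the external file format would then need its FIELD_TERMINATOR changed to match.

```python
import csv
import io


def to_pipe_delimited(csv_text, new_sep="|"):
    """Parse standard quoted CSV and re-emit it with an unquoted separator."""
    out_lines = []
    for fields in csv.reader(io.StringIO(csv_text)):
        for value in fields:
            # Refuse to write data that would be ambiguous without quoting.
            if new_sep in value or "\n" in value or "\r" in value:
                raise ValueError("field contains separator or newline: %r" % value)
        out_lines.append(new_sep.join(fields))
    return "".join(line + "\n" for line in out_lines)


if __name__ == "__main__":
    sample = (
        'myresourcename,myresourcelocation,'
        '"""resourceVersion"": ""windows"",""deployedBy"": ""john""",'
        '"{ ""ResourceType"": ""Network"", ""programName"": ""v1""}"\n'
    )
    print(to_pipe_delimited(sample), end="")
```

The embedded commas and double quotes survive as ordinary characters inside the third and fourth fields, because the pipe is now the only character with structural meaning.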

How to fix JSON data Special character issue in snowflake (UTF-8 conversion error - snowflake table loaded with �)?

Loading a JSON file from S3 into Snowflake failed. Here S3_STG_AREA_JSON is the stage and STG_TABLE_NAME_JSON is the staging table.
Statement executed:
COPY INTO STG_TABLE_NAME_JSON FROM @S3_STG_AREA_JSON FILE_FORMAT=(TYPE='json' STRIP_OUTER_ARRAY=true)
Error:
Code: 100183 State: P0000 Message: Error parsing JSON: missing first byte in UTF-8 sequence
I then tried:
COPY INTO STG_TABLE_NAME_JSON FROM @S3_STG_AREA_JSON FILE_FORMAT=(TYPE='json' STRIP_OUTER_ARRAY=true SKIP_BYTE_ORDER_MARK = TRUE)
I got the same error as above.
Then I tried:
COPY INTO STG_TABLE_NAME_JSON FROM @S3_STG_AREA_JSON FILE_FORMAT=(TYPE='json' STRIP_OUTER_ARRAY=true IGNORE_UTF8_ERRORS = TRUE)
The load completes, but now the Snowflake table contains the replacement character � (U+FFFD).
How can I fix this issue?
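The error suggests the file contains bytes that are not valid UTF-8, which is also why IGNORE_UTF8_ERRORS substitutes U+FFFD. A possible way to investigate is to locate the offending bytes in a local copy of the file and, if the real source encoding is known, re-encode the file to UTF-8 before uploading it to S3. The sketch below is only a diagnostic aid. The helper names are hypothetical, and Windows-1252 as the source encoding is purely an assumption that must be confirmed for the actual file.

```python
def first_invalid_utf8(data):
    """Return (offset, offending bytes) for the first UTF-8 error, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start:err.end]
    return None


def reencode_to_utf8(data, source_encoding="cp1252"):
    """Re-encode bytes to UTF-8, ASSUMING they were written in source_encoding."""
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # A JSON document written in Windows-1252 instead of UTF-8.
    raw = '{"name": "café"}'.encode("cp1252")
    print(first_invalid_utf8(raw))
    fixed = reencode_to_utf8(raw)
    print(first_invalid_utf8(fixed), fixed.decode("utf-8"))
```

If the guessed encoding is wrong, the re-encoded text will contain wrong characters rather than raise an error, so spot-check a few of the affected values before loading.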

Issue with derived column transformation

I have an int column and want to load data based on a condition, for example:
if the value is 1, load the column with INSERTED
if the value is 2, load the column with DELETED
if the value is 3, load the column with UPDATED
But when I tried doing this, I got the following error:
Error at Data Flow Task [Derived Column [1666]]: Attempt to parse the expression "[Copy of operation]== "1" ? "INSERTED"" failed. The expression might contain an invalid token, an incomplete token, or an invalid element. It might not be well-formed, or might be missing part of a required element such as a parenthesis.
Error at Data Flow Task [Derived Column [1666]]: Cannot parse the expression "[Copy of operation]== "1" ? "INSERTED"". The expression was not valid, or there is an out-of-memory error.
Error at Data Flow Task [Derived Column [1666]]: The expression "[Copy of operation]== "1" ? "INSERTED"" on "Derived Column.Outputs[Derived Column Output].Columns[Derived Column 1]" is not valid.
Error at Data Flow Task [Derived Column [1666]]: Failed to set property "Expression" on "Derived Column.Outputs[Derived Column Output].Columns[Derived Column 1]".
(Microsoft Visual Studio)
===================================
Exception from HRESULT: 0xC0204006 (Microsoft.SqlServer.DTSPipelineWrap)
------------------------------
Program Location:
I assume the issue is that you have an incomplete expression. The ternary operator ? : has three parts to it: (boolean expression) ? True bits : False bits
[Copy of operation]== "1" ? "INSERTED" : [Copy of operation]== "2" ? "DELETED" : [Copy of operation]== "3"? "UPDATED" : "UNKNOWN"
This expression would read
If the value of column copy of operation is 1, then return INSERTED
else If the value of column copy of operation is 2, then return DELETED
else If the value of column copy of operation is 3, then return UPDATED
else return UNKNOWN
This does assume the data type of the column Copy of operation is a string. If it's a whole number, then you'd remove the double quotes around the values 1,2,3.
In the comments, you've indicated that __$operation encodes the operation, where 1 = delete, 2 = insert, 3 = update (before change), and 4 = update (after change).
Continue the above pattern, swapping in those values (1 is delete in the comment, whereas 1 is INSERTED in the question), to generate the descriptions.
A different approach is to use a tiny lookup table. You could even define it with an inline query and use a Lookup Component to add your operation description into the data flow:
SELECT
OperationId
, OperationName
FROM
(
VALUES ('1', 'INSERTED')
, ('2', 'DELETED')
-- etc
)D(OperationId, OperationName);
Again, ensure you have your data types aligned
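To make the lookup idea concrete outside SSIS, here is a minimal Python sketch of the same mapping with a fallback for unmatched codes. The codes follow the __$operation values mentioned in the comments (1 = delete, 2 = insert, 3 and 4 = update); the description strings and the names `OPERATION_NAMES` and `describe_operation` are illustrative assumptions.

```python
# Lookup table keyed by integer operation code.
OPERATION_NAMES = {
    1: "DELETED",
    2: "INSERTED",
    3: "UPDATED (before change)",
    4: "UPDATED (after change)",
}


def describe_operation(code):
    """Map an operation code (int or numeric string) to its description."""
    try:
        key = int(code)  # align data types before the lookup
    except (TypeError, ValueError):
        return "UNKNOWN"
    return OPERATION_NAMES.get(key, "UNKNOWN")


if __name__ == "__main__":
    for code in (1, "2", 3, 4, 9, "x"):
        print(code, "->", describe_operation(code))
```

Converting the code to an integer before the lookup mirrors the data type alignment that the Lookup Component requires between the join columns.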

Getting Incorrect syntax near the keyword 'TO' When Attempting to Create Export File

I am trying to create an export file from a database view that I created.
I was able to complete the steps to define the query and populate the temp tables, but when I attempt to export the data to a file I receive the error message
"Incorrect syntax near the keyword 'TO'".
This is the complete error:
1:52:06 AM [ 2063] (15) Populate temp tables with values
(AG16_RUN0020)
1:52:06 AM ERROR : Error (156) Error in AGRExecSql:
Couldn't execute statement (156) when (25) Create export file: 42000
[Microsoft][SQL Server Native Client 11.0][SQL Server]Incorrect syntax
near the keyword 'TO'.:
I am using the following for the Create Export File step:
COPY TO EXPORT FILE ='DataFeed.txt',colsep=;,
SELECT degree__1, huid__1,stage__1, f22_schooltag, status_effective_date__1,f21_effdt,last_name__1,first_name__1, middle_name__1, f20_other_last_, name__1,f1_prefix, name_suffix__1,alias_name,client,date_of_birth__1,f3_birth_state,f4_birth_countr, gender__1,f5_marital_stat,ssn__1,f6_itin,f7_military_sta,f8_disabled_vet,us_citizenship__1,f17_citizen_sta,huit_country_code__1,country_of_citizenship__1,visa_type__1,f19_huit_visa_t,ethnicity__1,f9_ethnicity_w,f10_ethnicity_b,f11_ethnicity_a,f12_ethnicity_p,f13_ethnicity_a, f14_ethnicity_h,telephone__1, f18_primary_pho,f15_other_phone, f16_other_phone, type__1,term__1,student_pk__2, student_pk__1,student_pk
FROM $*hlptab25
ORDER BY huid__1 ASC,last_name__1 ASC,first_name__1 ASC
I have tried using just 'COPY TO' and 'COPY INTO FILE', but the syntax error persists. I also put parentheses around the whole SQL statement, and that did not resolve the issue.
Has anyone encountered this before? If so, how did you resolve it?
Thanks

Kafka-connect with sqlserver

These are the commands I am running:
bin/zookeeper-server-start etc/kafka/zookeeper.properties &
bin/kafka-server-start etc/kafka/server.properties &
bin/schema-registry-start etc/schema-registry/schema-registry.properties &
bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-jdbc/quickstart-sqlserver.properties &
bin/kafka-avro-console-consumer --new-consumer --bootstrap-server localhost:9094 --topic test3-sqlserver-jdbc-ErrorLog --from-beginning
I am trying to connect to SQL Server using the Confluent Platform (Kafka Connect) and am facing the following issues:
When I connect to the default schema, i.e. dbo, the connection is established but no data reaches the Kafka consumer. The connection details I am using are:
name=test-sqlserver-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:sqlserver://********:1433;database=AdventureWorks2012;user=****;password=****
mode=incrementing
incrementing.column.name=ErrorLogID
topic.prefix=test3-sqlserver-jdbc-
table.whitelist=ErrorLog
schema.registry=dbo
When I connect to any other schema, the producer throws an error. The connection details I am using are:
name=test-sqlserver-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:sqlserver://********:1433;database=AdventureWorks2012;user=****;password=****
mode=incrementing
incrementing.column.name=AddressID
topic.prefix=test3-sqlserver-jdbc-
table.whitelist=Address
schema.registry=Person
Error :
INFO Source task WorkerSourceTask{id=test-sqlserver-jdbc-autoincrement-0} finished
initialization and start (org.apache.kafka.connect.runtime.WorkerSourceTask:138)
[2017-03-07 17:55:47,041] ERROR Failed to run query for table
TimestampIncrementingTableQuerier{name='Address', query='null',
topicPrefix='test3-sqlserver-jdbc-', timestampColumn='null',
incrementingColumn='AddressID'}:
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid object name 'Address'.
io.confluent.connect.jdbc.JdbcSourceTask:239)
[2017-03-07 17:55:52,124] ERROR Failed to run query for table
TimestampIncrementingTableQuerier{name='Address', query='null',
topicPrefix='test3-sqlserver-jdbc-', timestampColumn='null',
incrementingColumn='AddressID'}: com.microsoft.sqlserver.jdbc.SQLServerException:
Invalid object name 'Address'. (io.confluent.connect.jdbc.JdbcSourceTask:239)
[2017-03-07 17:55:53,684] INFO Reflections took 9299 ms to scan
262 urls, producing 12112 keys and 79402 values
(org.reflections.Reflections:229)
[2017-03-07 17:55:57,181] ERROR Failed to run query for table
TimestampIncrementingTableQuerier{name='Address', query='null',
topicPrefix='test3-sqlserver-jdbc-', timestampColumn='null',
incrementingColumn='AddressID'}:
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid object name 'Address'.
(io.confluent.connect.jdbc.JdbcSourceTask:239)
