Here is my scenario.
I am using a simple copy activity in ADF to move data from one SQL Server to another.
I start with a full load and then use a SQL query that fetches data from the last 3 hours, followed by a trigger run each hour.
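The incremental query is shaped roughly like this (ModifiedDate and dbo.SourceTable are stand-ins for the actual change-tracking column and table):
-- ModifiedDate / dbo.SourceTable are placeholders for the real column and table
SELECT *
FROM dbo.SourceTable
WHERE ModifiedDate >= DATEADD(HOUR, -3, GETUTCDATE());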
My data pipeline works like this:
Source -> Azure Table storage -> Azure SQL.
The problem is happening on the hop from the source to Table storage.
It works fine for all tables except one.
For that particular table it copies only 1 row, unless I go back in and do a full load, in which case I get all the data.
Can someone please help me understand what is going on?
Please ask for clarification and more details if needed.
You are facing this issue for the following reason:
• Primary key violation when writing to SQL Server/Azure SQL Database/Azure Cosmos DB.
For example: Copy data from a SQL server to a SQL database. A primary key is defined in the sink SQL database, but no such primary key is defined in the source SQL server. The duplicated rows that exist in the source cannot be copied to the sink. Copy activity copies only the first row of the source data into the sink. The subsequent source rows that contain the duplicated primary key value are detected as incompatible and are skipped.
Refer - https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-fault-tolerance#supported-scenarios
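To confirm that duplicates are the cause, a quick check against the source (dbo.ProblemTable and Id below are placeholders for the real table and whichever column the sink's primary key is defined on) would be something like:
-- placeholders: dbo.ProblemTable, Id
SELECT Id, COUNT(*) AS DuplicateCount
FROM dbo.ProblemTable
GROUP BY Id
HAVING COUNT(*) > 1;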
Related
I am relatively new to SSIS and have to come up with an SSIS package for work such that certain tables must be dynamically moved from one SQL Server database to another SQL Server database. I have the following constraints that need to be met:
Source table names and destination table names may differ, so direct copying of tables does not work with the Transfer SQL Server Objects task.
Only certain columns may be transferred from source table to destination table.
This package needs to run every 5 minutes so it has to be relatively fast.
The transfer must be dynamic such that if there are new source tables, the package need not be reconfigured with hard coded values.
I have the following ideas for now:
Use the Transfer SQL Server Objects task, but I'm not sure whether the above requirements can be met, especially the selective transfer of tables and the dynamic mapping of columns.
Use SQLBulkCopy in a script component to perform migration.
I would appreciate it if anyone could give some direction as to how I can go about meeting these requirements, and whether my existing ideas are feasible.
I am trying to copy data from views on a trusted SQL Server 2012 to tables on a local instance of SQL Server on a scheduled transfer. What would be the best practice for this situation?
Here are the options I have come up with so far:
Write an executable program in C# or VB to delete the existing local table, query the data from the remote database, and then write the results to tables in the local database. The executable would run on a scheduled task.
Use BCP to copy data to a file and then upload it into the local table.
Use SSIS
Note: The connection between local and remote SQL Server is very slow.
Since the transfers are scheduled, I suppose you want this data to be up to date.
My recommendation would be to use SSIS and schedule it using SQL Agent. If you wrote a C# program, I think the best outcome you would achieve is a program imitating SSIS. Moreover, with SSIS it will be very easy to amend the workflow at any time.
Either way, to keep such a program/package up to date, you will have to answer an important question: is the source table updatable, or is it like a log (inserts only)?
This question is important because it determines how you will fetch the new updates from the source table. For example, if the table represents a log, you will most probably use the primary key to detect new records; if not, you might want to look for a column representing the update date/time. If you have the authority to alter the source table, you might want to add a timestamp column that represents the row version (timestamp differs from datetime).
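As a minimal sketch, assuming you are allowed to alter a hypothetical dbo.SourceTable, adding such a row-version column looks like this (rowversion is the current name for the deprecated timestamp type; SQL Server bumps it automatically on every insert and update):
-- dbo.SourceTable is a placeholder; RowVer changes automatically on every insert/update
ALTER TABLE dbo.SourceTable ADD RowVer rowversion;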
For this, an SSIS package will mainly contain the following components (a rough sketch of the underlying watermark queries follows the list):
Execute SQL Task to get the maximum value from the source table.
Execute SQL Task to get the value the load should start from on the destination side. You can get this value either by selecting the maximum value from the destination table or, if the table is pretty large, by storing that value in another table (a configuration table, for example).
Data Flow which moves the data from the source table, starting after the value fetched in step 2 and ending at the value fetched in step 1.
Execute SQL Task to write the new maximum value back to the configuration table, if you chose that technique.
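A rough sketch of those queries follows; the table, column, and configuration-table names are assumptions, not a known schema, and the @ placeholders stand for the values fetched in steps 1 and 2 (mapped to SSIS variables or declared as T-SQL variables):
-- Step 1: current maximum watermark in the source (placeholder names throughout)
SELECT MAX(ModifiedDate) AS NewWatermark FROM dbo.SourceTable;

-- Step 2: last watermark recorded on the destination side
SELECT LastWatermark FROM dbo.EtlConfig WHERE TableName = 'SourceTable';

-- Step 3: Data Flow source query, pulling only the delta
SELECT *
FROM dbo.SourceTable
WHERE ModifiedDate > @LastWatermark AND ModifiedDate <= @NewWatermark;

-- Step 4: persist the new watermark for the next run
UPDATE dbo.EtlConfig
SET LastWatermark = @NewWatermark
WHERE TableName = 'SourceTable';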
BCP can be used to export the data, compress it, and transfer it over the network; the files can then be imported into the local instance of SQL Server.
BCP exports can also be split into smaller batches of data for easier management.
https://msdn.microsoft.com/en-us/library/ms191232.aspx
https://technet.microsoft.com/en-us/library/ms190923(v=sql.105).aspx
I am a newbie in SQL so please bear with me. I am hoping you can help/guide me. I have a table on 5 MS SQL Servers that has identical columns on each, and I want to consolidate the data into a separate table on a separate MS SQL Server.
The challenge is that I only have read-only permission on the source table (across the 5 MS SQL Servers), but I have permission to create a table on the destination MS SQL Server database.
Another challenge is that I want to truncate or extract parts of the text in one column of the source table and save them into different columns of the destination table.
The next challenge is for the destination to query the source tables once a day for any updates.
Appreciate it very much if you can help/guide me. Many thanks in advance.
You'll need to set up a linked server and use either an SSIS package to pull the data into the form you need, or OPENROWSET/OPENQUERY queries with an insert on the server where you do have write privileges.
Either pre-create a table to put the new data in, or, if a permanent table is not needed, build up a temporary table or insert the data into a table variable.
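A minimal sketch of the OPENQUERY variant, assuming a linked server named SRC1 and hypothetical table/column names:
-- SRC1 is a placeholder linked server name; tables and columns are hypothetical
INSERT INTO dbo.ConsolidatedTable (Col1, Col2)
SELECT Col1, Col2
FROM OPENQUERY(SRC1, 'SELECT Col1, Col2 FROM SourceDb.dbo.SourceTable');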
To concatenate fields into a new field, use something like the examples below:
SELECT (field1 + field2) AS NewField
or
SELECT (SUBSTRING(field1, 2, 2) + SUBSTRING(field2, 3, 1)) AS NewField
Finally, you should set all of this up as a SQL Server Agent job scheduled to your needs.
Apologies if this is not as detailed as you like, but it seems there are many questions to be answered and not enough detail to help further.
Alternatively, you could also do lookup upon lookup (using SSIS):
Data flow task > download the first table completely to the destination server.
JOIN TO
Data flow task > reading from the destination server, do a lookup against the 2nd origin server (if there is a match you might update; if not, insert; the same upsert step is sketched in T-SQL below).
Repeat until all 5 of them are done.
This is NOT the most elegant or efficient solution, but it will definitely get the work done.
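For reference, the lookup's match-then-update-or-insert step, expressed in plain T-SQL instead of an SSIS Lookup transformation (all object and column names here are assumptions), would look roughly like:
-- all object names below are placeholders
MERGE dbo.DestinationTable AS dst
USING dbo.StagedSourceRows AS src
    ON dst.Id = src.Id
WHEN MATCHED THEN
    UPDATE SET dst.Col1 = src.Col1, dst.Col2 = src.Col2
WHEN NOT MATCHED THEN
    INSERT (Id, Col1, Col2) VALUES (src.Id, src.Col1, src.Col2);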
I have a table in SQL Server 2005 whose primary key is an identity column (increment 1), and I also have a default value set for one of the other columns.
When I open the table in SQL Server Management Studio and type in a new record into the table, the inserted values are not displayed, and I get the following message on save:
However, if the table has either an identity column, or one or more columns with a default value specified, the inserted value(s) should be displayed in the table after a save, and should be editable.
I frequently create test data in SSMS this way, and this issue makes it cumbersome to do some of the things I would like to do.
Is there any way around this?
Right-click on it and choose Execute SQL; it should not display the error then. It's just SQL Server's way of doing things, since it inserts the identity column value later. You should not add records that way in the first place.
You should not add records to a database that way! It can have unfortunate side effects (especially on large tables) as you have discovered.
Records for lookup tables should be added through rerunnable scripts. Those scripts should be in source control. This makes them easy to promote from dev to QA to staging to prod.
Test records should also be created in scripts (including scripts to remove the test records) so that you can run them on other environments, as well as being able to delete and recreate them if some process you are testing goes bad. These too should be in source control (as should all database changes, which also should not be done through the GUI).
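A minimal sketch of such a rerunnable lookup-table script (the table and values are hypothetical), safe to check into source control and run repeatedly on any environment:
-- dbo.OrderStatus and its values are hypothetical; the IF NOT EXISTS guard makes the script rerunnable
IF NOT EXISTS (SELECT 1 FROM dbo.OrderStatus WHERE StatusCode = 'NEW')
    INSERT INTO dbo.OrderStatus (StatusCode, Description)
    VALUES ('NEW', 'Newly created order');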
I am not at all familiar with Oracle so bear with me!
I am using Oracle 10g with the web front end called Enterprise Manager. I have been given some CSV files to import. Using the Load Data from User Files option I think I can set everything up, but when the job runs it complains about unique constraint violations, I guess because duplicate data is being inserted.
How can I get the insert to create a new primary key, similar to an MSSQL auto-increment number?
Oracle does not have an analog to the MSSQL auto incrementing field. The feature has to be simulated via triggers and Oracle sequences. Some options here are to:
create a trigger to populate the columns you want auto incremented from a sequence
delete the offending duplicate keys in the table
change the values in your CSV file.
You might look at this related SO question.
There is no autoinc type in Oracle. You have to use a sequence.
By using a before insert trigger, you could get something similar to what you get by using an autoinc in SQL Server.
You can see here how to do it.
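A minimal sketch of the sequence-plus-trigger approach on Oracle 10g (the table and column names are hypothetical):
-- my_table, my_table_seq and id are placeholder names
CREATE SEQUENCE my_table_seq START WITH 1 INCREMENT BY 1;

CREATE OR REPLACE TRIGGER my_table_bi
BEFORE INSERT ON my_table
FOR EACH ROW
BEGIN
  -- populate the primary key from the sequence before the row is inserted
  SELECT my_table_seq.NEXTVAL INTO :NEW.id FROM dual;
END;
/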