I have a SQL Server 2008 database with a table that has 575 columns in it. I have a CSV file that matches the table.
When I use SqlBulkCopy (.NET 4), only the first 256 columns get populated. The rest get nulls inserted into them. Has anyone else experienced this issue?
Thanks,
ts3
I'm going to guess that your primary key column is a tinyint that auto-increments from zero. A tinyint only holds the values 0 through 255, exactly 256 of them, which would match what you're seeing. If so, you should change that primary key column to an int or some other type that can hold more values.
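If that's what's happening, widening the column is straightforward. A sketch, with placeholder table, column, and constraint names:

-- The PK constraint must be dropped before the column's type can change.
ALTER TABLE dbo.YourTable DROP CONSTRAINT PK_YourTable;
ALTER TABLE dbo.YourTable ALTER COLUMN Id INT NOT NULL;
ALTER TABLE dbo.YourTable ADD CONSTRAINT PK_YourTable PRIMARY KEY (Id);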
After some advice. I'm using SSIS / SQL Server 2014. I have a nightly SSIS package that pulls data from non-SQL Server databases into a single table (the SQL table is truncated beforehand each time), and I then extract from this table to create a daily csv file.
Going forward, I only want to extract to csv on a daily basis the records that have changed, i.e. the deltas.
What is the best approach? I was thinking of using CDC in SSIS, but as I'm truncating the SQL table before each load, will that be the best method? Or will I need a master table in SQL with an initial load, then import into another table and extract only where the two differ? For info, the table in SQL contains a primary key.
I just want to double-check, as CDC assumes the tables are all in SQL Server, whereas my data originates outside SQL Server.
Thanks for any help.
The primary key on that table is your saving grace here. Obviously enough, the SQL Server database that you're pulling the disparate data into won't know from one table flush to the next which records have changed, but if you add two additional tables, and modify the existing table with an additional column, it should be able to figure it out by leveraging HASHBYTES.
For this example, I'll call the new table SentRows, but you can use a more meaningful name in practice. We'll call the new column in the old table HashValue.
Add the column HashValue to your table as a varbinary data type. NOT NULL as well.
Create your SentRows table with columns for all the columns in the main table's primary key, plus the HashValue column.
Create a RowsToSend table that's structurally identical to your main table, including the HashValue.
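To make that concrete, the supporting objects might look something like this (all names are invented for the example; mirror your real key columns in SentRows):

-- SHA1 output is 20 bytes; DEFAULT 0x lets the NOT NULL column be added
-- to a table that already contains rows.
ALTER TABLE dbo.MainTable ADD HashValue VARBINARY(20) NOT NULL DEFAULT 0x;

CREATE TABLE dbo.SentRows (
    KeyCol INT NOT NULL PRIMARY KEY,
    HashValue VARBINARY(20) NOT NULL
);

-- RowsToSend: structurally identical to MainTable, including HashValue.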
Modify your queries to create the HashValue by applying HASHBYTES to all of the non-key columns in the table. (This will be horribly tedious. Sorry about that.)
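The tedious part looks something like this sketch (Col1 through Col3 stand in for your real non-key columns; note that before SQL Server 2016, HASHBYTES input is capped at 8000 bytes, which matters on wide tables):

UPDATE dbo.MainTable
SET HashValue = HASHBYTES('SHA1',
        -- the delimiter keeps 'ab' + 'c' distinct from 'a' + 'bc'
        ISNULL(CAST(Col1 AS NVARCHAR(4000)), N'') + N'|' +
        ISNULL(CAST(Col2 AS NVARCHAR(4000)), N'') + N'|' +
        ISNULL(CAST(Col3 AS NVARCHAR(4000)), N''));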
Send out your full data set.
Now move all of the key values and HashValues to the SentRows table. Truncate your main table.
On the next pull, compare the key values and HashValues from SentRows to the new data in the main table.
Primary key match + hash match = Unchanged row
Primary key match + hash mismatch = Updated row
Primary key in incoming data but missing from existing data set = New row
Primary key not in incoming data but in existing data set = Deleted row
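In SQL, those four rules come together as something like this sketch (KeyCol again standing in for your real key):

SELECT COALESCE(m.KeyCol, s.KeyCol) AS KeyCol,
       CASE WHEN s.KeyCol IS NULL           THEN 'New'
            WHEN m.KeyCol IS NULL           THEN 'Deleted'
            WHEN m.HashValue <> s.HashValue THEN 'Updated'
            ELSE 'Unchanged'
       END AS RowStatus
FROM dbo.MainTable m
FULL OUTER JOIN dbo.SentRows s ON s.KeyCol = m.KeyCol;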
Pull out any changes you need to send to the RowsToSend table.
Send the changes from RowsToSend.
Move the key values and HashValues to your SentRows table. Update hashes for changed key values, insert new rows, and decide how you're going to handle deletes, if you have to deal with deletes.
Truncate the RowsToSend table to get ready for tomorrow.
If you'd like (and you'll thank yourself later if you do), add a DATETIME column to the SentRows table with a default of GETDATE(), which will tell you when the row was added.
And away you go. Nothing but deltas from now on.
Edit 2019-10-31:
Step by step (or TL;DR):
1) Flush and Fill MainTable.
2) Compare keys and hashes on MainTable to keys and hashes on SentRows to identify new/changed rows.
3) Move new/changed rows to RowsToSend.
4) Send the rows that are in RowsToSend.
5) Move all the rows from RowsToSend to SentRows.
6) Truncate RowsToSend.
Why does SQL Server sometimes skip numbers in identity columns? For example, we were inserting records into a table; up to 300 everything was sequential, but the next record was assigned id 318, an 18-number gap in the identity column.
Can anyone tell me why? I have seen this on other tables as well, with gaps of one or two numbers between consecutive rows.
No records were deleted.
Thanks
I'm pretty good around Oracle but I've been struggling to find a decent solution to a problem I'm having with Sybase.
I have a table with an IDENTITY column whose type is a User Defined Datatype (UDD) "id", defined as numeric(10,0). I've decided to replace the UDD with the native datatype, but I get an error when I do this.
I've found that the only way to do this is:
Rename the original table (table_a to table_a_backup) using the procedure sp_rename
Recreate the original table (table_a) but use native data types
Copy the contents of the backup table to the original (i.e. insert into table_a select * from table_a_backup)
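In rough terms (sketching the three steps above; the recreate step is elided since it's just the table's DDL with native types):

sp_rename table_a, table_a_backup
go
-- recreate table_a here with native datatypes instead of the UDD
insert into table_a select * from table_a_backup
go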
This works; however, I have over 10M records, so it eventually runs out of log segment and halts (I can't increase the segment any more due to physical constraints).
Does anybody have a solution, preferably one that still processes the records as one large set rather than piecemeal?
Cheers,
JLove
conceptually, something like this works (in Sybase ASE 12.5.x) ...
do an "alter table drop column" on your current ID column
do "alter table add column" stmt to add new column (w/ native datatype) with IDENTITY attribute
Note that the ID field might not have the same numbers, so be very wary of doing the above if the ID field is used as an explicit or implicit key to other tables.
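Roughly, and purely as an untested sketch (the new column keeps the name id here, but you could use a different name):

alter table table_a drop id
go
alter table table_a add id numeric(10,0) identity
go

Bear in mind that dropping a column in ASE triggers an internal data copy, so it won't be instantaneous on 10M rows.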
Using SQL Server 2005 and 2008.
I've got a potentially very large table (potentially hundreds of millions of rows) consisting of the following columns:
CREATE TABLE MyTable (
date SMALLDATETIME,
id BIGINT,
value FLOAT
)
which is being partitioned on column date in daily partitions. The question then is: should the primary key be on (date, id) or (id, date)?
I can imagine that SQL Server is smart enough to know that it's already partitioning on date and therefore, if I'm always querying for whole chunks of days, then I can have it second in the primary key. Or I can imagine that SQL Server will need that column to be first in the primary key to get the benefit of partitioning.
Can anyone lend some insight into which way the table should be keyed?
As is the standard practice, the Primary Key should be the candidate key that uniquely identifies a given row.
What you wish to do, is known as Aligned Partitioning, which will ensure that the primary key is also split by your partitioning key and stored with the appropriate table data. This is the default behaviour in SQL Server.
For full details, consult the reference Partitioned Tables and Indexes in SQL Server 2005
There is no specific need for the partition key to be the first field of any index on the partitioned table; as long as it appears within the index, the index can be aligned to the partition scheme.
With that in mind, you should apply the normal rules for index field order supporting the most queries / selectivity of the values.
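For illustration, an aligned, daily-partitioned version of the table from the question might look like this (function, scheme, and boundary values are invented; note the partition column sits second in the key):

CREATE PARTITION FUNCTION pfDaily (SMALLDATETIME)
AS RANGE RIGHT FOR VALUES ('20090101', '20090102', '20090103');

CREATE PARTITION SCHEME psDaily
AS PARTITION pfDaily ALL TO ([PRIMARY]);

CREATE TABLE MyTable (
    date SMALLDATETIME NOT NULL,
    id BIGINT NOT NULL,
    value FLOAT,
    CONSTRAINT PK_MyTable PRIMARY KEY CLUSTERED (id, date)
) ON psDaily (date);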
I'm building an ASP.Net/MVC application using SQL 2008 Developer edition and a DB in Sql2005 compatibility mode. Using Entity Framework as DAL.
My problem is that I have a table where I'm using the integer identity column like an Invoice Number; that is, it always has to be unique and never reused. So using the GUID column type won't work without substantial effort.
What I'm seeing is that the DB is filling in the gaps in the identity column. This will cause me long-term problems. Is there a setting to disable this "filling in"?
That sounds like something outside SQL Server; SQL Server does not "go back" and re-use gaps in identities unless the table has been reseeded, but even then it will blindly increment one by one and probably return a lot of duplicate key errors as it hits rows with existing values.
Are you sure the column is an identity? Is there anything else that might be re-assigning keys and/or turning on identity insert when creating rows?
SQL Server does not fill in the gaps of an identity field; by default it will just keep going up in numbers as you insert rows.
It is possible to reset the identity back to 1, in which case you may then see what you are describing.
Can I suggest you post some code / db structure that shows your problem, and search for any code you may have that may perform an identity reseed.
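For reference, a reseed looks like this (TableName is a placeholder), and the NORESEED form will report the current identity value without changing anything:

-- Reset the identity so the next insert gets 1 (assuming an increment of 1).
DBCC CHECKIDENT ('dbo.TableName', RESEED, 0);

-- Check the current identity value without modifying it.
DBCC CHECKIDENT ('dbo.TableName', NORESEED);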
Unless I am misunderstanding your issue: if you create a primary key on your identity column, or a unique constraint, you can avoid the issue of duplicate values.
For example:
create table TableName
(
InvoiceID int identity(1,1) not null primary key
)