I am looking for a solution to handle invoice numbers in a SQL Server table.
How the system works: on load the system queries a cloud-based accounting package for the last invoice number used. On retrieval of this number the system enters the number into table 2 (replacing the number that existed previously).
When processing invoices, the system should query table 2 for the last used invoice number, assign that number plus 1 to the first row in table 1 where the invoice number is NULL, and then keep incrementing by 1 for each subsequent row where the invoice number is NULL.
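Something along these lines is what I'm after, shown here only as a sketch; Invoices (table 1), LastInvoiceNumber (table 2) and the column names are placeholders, not my real schema:
-- Placeholder names: Invoices = table 1, LastInvoiceNumber = table 2.
DECLARE @last INT = (SELECT TOP (1) LastNumber FROM LastInvoiceNumber);

;WITH ToNumber AS
(
    SELECT InvoiceNumber,
           ROW_NUMBER() OVER (ORDER BY InvoiceId) AS rn
    FROM   Invoices
    WHERE  InvoiceNumber IS NULL
)
UPDATE ToNumber
SET    InvoiceNumber = @last + rn;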
I can't use an auto-increment column for the entire table, as an invoice run (a batch of invoices processed together) could have a starting invoice number that isn't equal to the last invoice number plus 1. The reason is that this system is designed solely to handle the invoices of a single customer; other invoices may have been created for other customers from within the accounts package.
I can't use SQL Server to automate the process, as invoice numbers could change in the accounts package and not be reflected in the database (there is no relationship between SQL Server and the accounts package); therefore I would need to manually run an update statement from my application.
I have looked at using RESEED as outlined in the following suggested answer - but I don't think it will solve my problem:
Reset IDENTITY value
Any ideas, help, table design suggestions are greatly appreciated.
Hello, I am currently trying different data automation processes with Python and PostgreSQL. I automated the cleaning and upload of a dataset with 40,000 data entries into my database. Due to some flaws in my process I had to truncate some tables or data entries.
I am using: Python 3.9.7 / PostgreSQL 13.3 / pgAdmin 4 v5.7
Problem
Currently I have tables whose IDs start at 44700 instead of 1 (due to my editing).
For example, a table of train stations begins at ID 41801 and ends at ID 83599.
Question
How can I reorganize my IDs so that they start from 1 instead of 41801?
After looking online I found topics like "bloat" and "reindex". I tried VACUUM and REINDEX, but neither really made a difference in my tables. As of now my tables have no relations to each other. What would be the approach to solve this in PostgreSQL? Is there some hidden function I overlooked? Maybe it's not a problem at all, but it definitely looks weird; at some point I will end up with IDs around 250,000 while only having 40,000 data entries in my table.
Do you use a sequence to generate the ID column of your table? You can check in pgAdmin under your database whether you have a Sequence object: Schemas -> public -> Sequences.
You can change the current sequence number by right-clicking the sequence and setting it to 1. But only do this if you have deleted all rows in the table, and before you start to import your data again.
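The same reset can also be done in SQL. A minimal sketch, assuming the sequence uses the default name for a serial/identity column id on a table trainstations (adjust it to whatever appears under Sequences in pgAdmin):
-- Make the next generated ID start at 1 again.
ALTER SEQUENCE trainstations_id_seq RESTART WITH 1;

-- Equivalent alternative: with is_called = false, the next nextval() returns exactly 1.
SELECT setval('trainstations_id_seq', 1, false);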
As long as you do not have any other table that references the ID column of your train station table, you can even update the IDs with an UPDATE statement like:
UPDATE trainStations SET ID = ID - 41800;  -- shifts IDs 41801..83599 down to 1..41799
In SSIS, if an incoming dataset has multiple records for the same business key, how do I load it into the dimension table as SCD Type 2 without using the SCD Wizard?
Sample dataset
Customer ID   Name     Segment     Postal Code
1             James    Corporate   50026
2             Andrew   Consumer    33311
3             Steven   Consumer    90025
2             Andrew   Consumer    33306
3             Steven   Consumer    90032
1             James    Corporate   50087
3             Steven   Consumer    90000
In my case, if I try loading the dimension table with other SSIS components (Lookup/Conditional Split), all the records show up as new rows in the table because they all arrive at the same time.
I have ‘CurrentFlag’ as the indicator of the current record.
In SSIS, if I have an incoming dataset that has multiple records for the same business key, how do I recognize these and set the CurrentFlag as necessary, whether or not a record in the target table already has that business key?
Thanks.
OK, this is a massive simplification, because SCDs are very challenging to implement correctly. You will need to sit down and think critically about this. My answer below only handles ongoing daily processing; it does not explain how to handle historical files being re-processed, which could potentially result in duplicate records with different EffectiveStartDate and EffectiveEndDate values.
By definition, you will have an existing-record source component (i.e., a query against the dimension table) and an incoming data source component (e.g., a *.csv flat file). You will need to perform a Merge Join to identify new records versus existing records. For existing records, you will need to determine whether any of the columns have changed (do this in a Derived Column transformation).
You will need to also include two columns for EffectiveStartDate and EffectiveEndDate.
IncomingEffectiveStartDate = FileDate
IncomingEffectiveEndDate = 12-31-9999
ExistingEffectiveEndDate = FileDate - 1
Note on 12-31-9999: this is effectively the Y10K bug, but it allows users to query between date ranges without having to consciously add something like ISNULL(EffectiveEndDate, GETDATE()) to the WHERE clause for the current record.
This will prevent the dates on the columns from overlapping, which could potentially result in multiple records being returned for a given date.
To determine whether a record has changed, create a new column called RecordChangedInd of type Bit, using a Derived Column expression along these lines (REPLACENULL is the SSIS expression equivalent of T-SQL's two-argument ISNULL):
(REPLACENULL(ExistingColumn1, 0) != REPLACENULL(IncomingColumn1, 0) ||
REPLACENULL(ExistingColumn2, 0) != REPLACENULL(IncomingColumn2, 0) ||
....
REPLACENULL(ExistingColumn_N, 0) != REPLACENULL(IncomingColumn_N, 0) ? 1 : 0)
Then, in your Conditional Split you can create two outputs: one for new records (this path is a plain INSERT) and one for changed records (this path is an UPDATE to deactivate the existing record, followed by an INSERT of the new version); unchanged records can simply be discarded.
You can conceivably route both outputs to the same INSERT destination, but you will need to be careful not to carry the changed record's ExistingEffectiveEndDate (the value that deactivates the old row) into the newly inserted record.
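For reference, the expire-and-insert step expressed in plain T-SQL terms would look roughly like this. It is only a sketch: DimCustomer, StagingChangedCustomers and the column names are assumed, not objects defined in the question.
-- Assumed objects: DimCustomer (the dimension), StagingChangedCustomers
-- (the changed incoming rows identified by the data flow).
DECLARE @FileDate DATE = '2024-01-01';  -- date of the incoming file

-- Deactivate the current version of each changed business key.
UPDATE d
SET    d.EffectiveEndDate = DATEADD(DAY, -1, @FileDate),
       d.CurrentFlag      = 0
FROM   DimCustomer AS d
JOIN   StagingChangedCustomers AS s ON s.CustomerID = d.CustomerID
WHERE  d.CurrentFlag = 1;

-- Insert the new version with the open-ended end date.
INSERT INTO DimCustomer (CustomerID, Name, Segment, PostalCode,
                         EffectiveStartDate, EffectiveEndDate, CurrentFlag)
SELECT s.CustomerID, s.Name, s.Segment, s.PostalCode,
       @FileDate, '9999-12-31', 1
FROM   StagingChangedCustomers AS s;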
I have a WinForms application (C#) with SQL Server 2008 as the backend, which works offline at the client location. Currently all branches are managed from a single location, so an identity column value is always unique across branches.
Now I want this application to be managed from multiple locations and to work offline as well as online. To make this possible we are putting a SQL Server on a public IP, and each branch will have a separate local SQL Server instance running. The data will be synced between the local and central servers at regular intervals.
Is this a good approach to sync the data, or is there something better I can do?
If I go with the above approach, the problem I will face is in syncing the data. There is a problem with the table structure; e.g., I have a table COURSES as follows:
COURSES ( COURSE_ID (identity column), COURSE_NAME, COURSE_BRANCH_ID)
where
COURSE_ID is an IDENTITY column
COURSE_NAME represents the name of the Standard
COURSE_BRANCH_ID represents the branch, where the course is taken.
Now each SQL Server will generate its own values for the COURSE_ID column, and those might be the same on different servers.
The unique combination is COURSE_ID and COURSE_BRANCH_ID.
Is there any way I can append COURSE_BRANCH_ID to COURSE_ID without adding a new IDENTITY column?
Approach I thought of
Remove the identity from the COURSE_ID column.
Add a new column, say ID, which will be an identity column.
After an insert on the COURSES table, a trigger updates the value of COURSE_ID to the concatenation of COURSE_BRANCH_ID and ID:
CONVERT(INT, CONVERT(VARCHAR(10), COURSE_BRANCH_ID) + CONVERT(VARCHAR(10), ID))
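For illustration, the trigger in the last step might look roughly like this (only a sketch; the trigger name is made up and it assumes COURSE_ID is an INT column):
CREATE TRIGGER trg_Courses_SetCourseId
ON COURSES
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Build COURSE_ID from the branch id followed by the new identity value.
    UPDATE c
    SET    c.COURSE_ID = CONVERT(INT,
               CONVERT(VARCHAR(10), c.COURSE_BRANCH_ID) +
               CONVERT(VARCHAR(10), c.ID))
    FROM   COURSES AS c
    JOIN   inserted AS i ON i.ID = c.ID;
END;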
But this will require a lot of effort, as I have around 19 tables with this problem. Is there anything better we can do? Any suggestions are welcome! Thank you!
There are several approaches to this issue. One is the approach you mentioned, where you concatenate the branch ID with the identity value.
You can use a GUID; the possibility of collision is practically zero (see the sketch after the example below).
Or you can set the identity seed and increment such that each branch has a different start value and all branches increment by the number of branches.
For example, if you have four branches, then on Branch1 you may set the
ID INT IDENTITY(1, 4) NOT NULL -- IDs will be 1, 5, 9...etc.
On Branch2
ID INT IDENTITY(2, 4) NOT NULL -- IDs will be 2, 6, 10 ...etc
On Branch3
ID INT IDENTITY(3, 4) NOT NULL -- IDs will be 3, 7, 11 ...etc
And on Branch4
ID INT IDENTITY(4, 4) NOT NULL -- IDs will be 4, 8, 12 ...etc.
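And for the GUID option mentioned above, a minimal sketch (the column sizes and constraint names are only illustrative):
CREATE TABLE COURSES
(
    COURSE_ID        UNIQUEIDENTIFIER NOT NULL
                     CONSTRAINT DF_COURSES_ID DEFAULT NEWSEQUENTIALID()
                     CONSTRAINT PK_COURSES PRIMARY KEY,
    COURSE_NAME      NVARCHAR(100)    NOT NULL,
    COURSE_BRANCH_ID INT              NOT NULL
);
NEWSEQUENTIALID() generates GUIDs in roughly increasing order, which keeps the clustered index less fragmented than NEWID() would.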
I am currently working on a phone directory application. For this application I get a flat file (CSV) from corporate SAP, updated daily, that I use to update a SQL database twice a day using a Windows service. Additionally, users can add themselves to the database if they do not exist (i.e., are not included in the SAP file). Thus, a contact can be one of two types: 'SAP' or 'ECOM'.
So, the Windows service downloads the file from an SAP FTP server, deletes all existing contacts in the database of type 'SAP' and then adds all the contacts in the file to the database. To insert the contacts into the database (some 30k), I load them into a DataTable and then make use of SqlBulkCopy. This works particularly well, running in only a few seconds.
The only problem is the fact that the primary key for this table is an auto-incremented identity. This means that my contact IDs grow at a rate of 60k per day. I'm still in development and my IDs are already in the area of 20 million:
http://localhost/CityPhone/Contact/Details/21026374
I started looking into reseeding the id column, but if I were to reseed the identity to the current highest number in the database, the following scenario would pose issues:
Windows Service Loads 30 000 contacts
User creates entry for himself (id = 30 001)
Windows Service deletes all SAP contacts, reseeds column to after current highest id: 30 002
Also, I frequently query for users based on this ID, so I'm concerned that using something like a GUID instead of an auto-incremented integer would cost too much in performance. I also tried looking into SqlBulkCopyOptions.KeepIdentity, but this won't work: I don't get any IDs from SAP in the file, and if I did they could easily conflict with the values of manually entered contacts. Is there any other solution to reseeding the column that would not cause the ID column values to grow at such a rate?
I suggest the following workflow.
Import into a brand new table, like tempSAPImport, with your current workflow.
Then add only the new/changed rows to your main table:
INSERT INTO ContactDetails (Detail1, Detail2)
SELECT Detail1, Detail2
FROM   tempSAPImport
EXCEPT
SELECT Detail1, Detail2
FROM   ContactDetails;
I assume your SAP data has a primary key; you can use it to detect rows that have only been updated:
Update ContactDetails ( XXX your update criteria)
This way you will import your data quickly and keep your existing identity values. Depending on your speed requirements, adding the indexes after the import will also speed up the process.
If your SQL Server version is 2012 or later, then I think the best solution for the scenario above would be using a sequence for the PK values. This way you have control over the seeding process (you can cycle values).
More details here: http://msdn.microsoft.com/en-us/library/ff878091(v=sql.110).aspx
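A minimal sketch of that approach (the object names are made up for illustration):
-- A sequence instead of IDENTITY gives full control over the numbering.
CREATE SEQUENCE dbo.ContactIdSeq
    AS INT
    START WITH 1
    INCREMENT BY 1
    MINVALUE 1
    CYCLE;  -- wraps back to MINVALUE instead of failing at the maximum

-- Use it as the default for the primary key column.
ALTER TABLE dbo.Contact
    ADD CONSTRAINT DF_Contact_Id
    DEFAULT (NEXT VALUE FOR dbo.ContactIdSeq) FOR ContactId;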
I have a Statistic table with these fields: Id, UserId, DateStamp, Data.
There is also a User table in the database which has a CreditsLeft (int) field. I need to create a function (let's name it FindNewRecordsAndUpdate) which my application will run every 10 minutes to read the Statistic table and decrease the CreditsLeft field by the number of new Statistic records found for the specified user.
My main concern is: when I execute the FindNewRecordsAndUpdate function the next time, how do I find the new records in the Statistic table and skip the already counted ones? I could add a Counted (bool) field in Statistic and set it to true for already "used" records, but maybe there is a better solution without adding a new field?
At least 3 other options:
Use a trigger, so that when rows are inserted into the Statistic table, the balance in User is automatically updated
Just do an aggregate on demand over the Statistic table to get the SUM(Data)
Use an indexed view to "pre-calculate" the SUM from point 2 (see the sketch below)
Personally, I'd go for point 2 (and point 3 depending on query frequency) to avoid denormalised data in the User table.
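A minimal sketch of the indexed view from point 3, assuming Data is a non-nullable numeric column (SUM over a nullable column is not allowed in an indexed view); the view and index names are made up:
CREATE VIEW dbo.vUserStatisticTotals
WITH SCHEMABINDING
AS
SELECT UserId,
       SUM(Data)    AS TotalData,
       COUNT_BIG(*) AS RowCnt  -- COUNT_BIG(*) is required when GROUP BY is used
FROM   dbo.Statistic
GROUP BY UserId;
GO

-- Materialise the view so the aggregate is maintained automatically.
CREATE UNIQUE CLUSTERED INDEX IX_vUserStatisticTotals
    ON dbo.vUserStatisticTotals (UserId);
Querying dbo.vUserStatisticTotals (with the NOEXPAND hint on non-Enterprise editions) then returns the per-user total without scanning the whole Statistic table.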