I have a MySQL query that computes a hash value over a table's columns, excluding the primary key.
I need to do the same thing in SQL Server. So far I have tried HashBytes, but the result does not look right.
This is the MySQL query I'm using to hash all columns except the primary key:
SELECT MD5(CONCAT(IFNULL(COLUMN1,''),IFNULL(COLUMN2,''),IFNULL(COLUMN3,''))) AS HASHVALUE, PRIMARYKEY FROM TABLE_NAME1
And this is what I have tried in SQL Server so far:
SELECT HashBytes('MD5', COLUMN1) AS HASHVALUE , PRIMARYKEY FROM TABLE_NAME1
Result from MySQL:
|HASHVALUE | Primary |
|:------------------------------- |:-------:|
|7a16284f87ab262f0047f4c0b4e50b2c | 0 |
|d41b398c603086da409e87ee35824cf6 | 2 |
|c0e7b9b9c29703282a9f192a4c0aead9 | 6 |
|ce13595c356a0a373140f3fda1eb5fb3 | 7 |
Result from SQL Server:
| HASHVALUE | PRIMARY |
| ---------- | ------- |
|ã $t J ¦OlU | 1 |
HashBytes returns varbinary, so the SQL Server result is still raw bytes. You must convert it to a hex string:
LOWER(CONVERT(varchar(32), HashBytes('MD5', COLUMN1), 2))
See it here to prove you get the same results, even after combining columns:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=b11c422fe9236043ba5edbeb7d97043d
You can use the menu at the top of the linked page to switch between MySQL and SQL Server.
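To see why the conversion is needed, here is a minimal Python sketch (not SQL Server code) of what the two engines return: HashBytes yields the 16 raw digest bytes, while MySQL's MD5() yields the same digest as 32 lowercase hex characters. The helper name row_hash is hypothetical, and the sketch assumes the column text is hashed as UTF-8. In SQL Server an nvarchar column is hashed as UTF-16 bytes, so it may need a cast to varchar before the two hashes line up.

```python
import hashlib

def row_hash(*columns):
    """Hash the concatenated columns; None stands in for SQL NULL."""
    # Mirror ISNULL(col, ''): a NULL column contributes an empty string.
    joined = "".join("" if c is None else str(c) for c in columns)
    raw = hashlib.md5(joined.encode("utf-8")).digest()  # 16 raw bytes, like HashBytes
    return raw, raw.hex()  # hex form: 32 lowercase characters, like MySQL MD5()

raw, hex_text = row_hash("a", None, "bc")
print(len(raw), hex_text)  # 16 900150983cd24fb0d6963f7d28e17f72
```

Printing raw directly would show unreadable bytes, much like the SQL Server result in the question.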
Our PostgreSQL database is a shared multi-tenant database.
We need to auto-generate a unique, sequential number in the "customerNumber" column.
The catch is that the sequence must be separate for each "hotelLocation":
For "hotelLocation" = 1, "customerNumber" should run 1, 2, 3.
For "hotelLocation" = 2, "customerNumber" should also run 1, 2, 3.
The following is a sample layout for the table:
@Entity
public class CustomerInfo {
    @Id
    @GeneratedValue(...)
    private Long idNumber;

    String hotelLocation;

    /** Looking for option where, this number needs to
        auto generated on SAVE, and need to be in separate sequence
        for each hotelLocation **/
    private Long customerNumber;
}
So finally, here's what the output should look like:
+----------+---------------+----------------+
| idNumber | hotelLocation | customerNumber |
+----------+---------------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
| 4 | 1 | 3 |
| 5 | 2 | 2 |
+----------+---------------+----------------+
I am fine with generating the number either through Hibernate or through database triggers.
While searching, I found the following:
Hibernate JPA Sequence (non-Id)
But that approach keeps generating from a single sequence, without a separate sequence for each "hotelLocation".
Any solution would be very helpful. I am sure many people with multi-tenant databases are looking for something similar.
Thanks
You can do this easily with the PostgreSQL window function row_number().
I don't know your database, but it should be something like this:
SELECT idNumber, hotelLocation,
       row_number() OVER (PARTITION BY hotelLocation ORDER BY idNumber) AS customerNumber
FROM your_table
Check more on window functions here: Understanding Window Functions
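As a quick check of the row_number() query, the following sketch runs it against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained; SQLite 3.25 or newer is needed for window functions), and the table name customer_info is hypothetical. Note that row_number() computes the number at query time; it does not store it in the table.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_info (idNumber INTEGER PRIMARY KEY, hotelLocation TEXT)"
)
conn.executemany(
    "INSERT INTO customer_info (idNumber, hotelLocation) VALUES (?, ?)",
    [(1, "1"), (2, "1"), (3, "2"), (4, "1"), (5, "2")],
)

# Number the rows separately within each hotelLocation, ordered by idNumber.
rows = conn.execute(
    """
    SELECT idNumber, hotelLocation,
           row_number() OVER (PARTITION BY hotelLocation ORDER BY idNumber) AS customerNumber
    FROM customer_info
    ORDER BY idNumber
    """
).fetchall()

for row in rows:
    print(row)
```

The printed rows should reproduce the idNumber / hotelLocation / customerNumber layout shown in the question.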
Let's say I have the following table:
Table Test:
ID | Name | Ver | Col3| Col4 | ...
01 | ABC | 2 | xxx | yyy | ...
02 | DEF | 8 | xxx | yyy | ...
03 | DEF | 8 | xxx | yyy | ...
...
The ID column is the primary key, a unique key, and the clustered index. The Ver column is nothing special.
So far my SELECT queries have looked like this:
SELECT Name, Col1, Col2 FROM Test WHERE ID = '01'
In the next version they will look like this:
SELECT Name, Col1, Col2 FROM Test WHERE ID = '01' AND Ver = '8'
Why? Because I am planning to use this condition in the system's UPDATE and DELETE statements to prevent concurrent editing conflicts. A statement can only modify the row where both ID and Ver match. If the entity changed in the meantime, the condition matches nothing, so there is nothing to update or delete and the entity is protected.
The question
Is this change going to affect my database's performance, or will the Ver column not matter, since one of the columns in the query is a unique primary key?
If it does affect record retrieval performance, should I include Ver together with ID in the clustered index? Or should Ver get a second index?
Facts and opinions are welcome.
You would need to try it and look at the query plan to make sure, but my feeling is:
Ver does not need to be in an index. When SQL Server generates the plan, it's smart enough to see that ID is unique. It will therefore fetch the record with ID = '01' and then filter that single record by Ver = '8'. As this step only acts on zero or one records, it doesn't need an index.
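The version check described in the question can be sketched as follows. This is an illustration, not SQL Server code: it uses Python's sqlite3 module as a stand-in database, the helper name update_name is hypothetical, and bumping Ver by one on every update is an assumption about how the version changes. The UPDATE itself carries the Ver condition, and the caller inspects the affected row count.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (ID TEXT PRIMARY KEY, Name TEXT, Ver INTEGER)")
conn.execute("INSERT INTO Test VALUES ('01', 'ABC', 2)")

def update_name(conn, row_id, expected_ver, new_name):
    """Update the row only if it still has the version the caller read."""
    cur = conn.execute(
        # Incrementing Ver here is an assumed versioning scheme.
        "UPDATE Test SET Name = ?, Ver = Ver + 1 WHERE ID = ? AND Ver = ?",
        (new_name, row_id, expected_ver),
    )
    conn.commit()
    return cur.rowcount == 1  # False means another writer changed the row first

first = update_name(conn, "01", 2, "XYZ")   # version matches, row is updated
second = update_name(conn, "01", 2, "QRS")  # stale version, nothing is updated
print(first, second)  # True False
```

Because ID is the primary key, the lookup touches at most one row, and the Ver comparison is applied to that single row.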
Yesterday, I was asked the same question by two different people. Their tables have a field that groups records together, like a year or location. Within those groups, they want to have a unique ID that starts at 1 and increments up sequentially. Obviously, you could search for MAX(ID), but if these applications have a lot of traffic, they'd need to lock the entire table to ensure the same ID wasn't returned multiple times. I thought about using sequences but that would mean dynamically creating a sequence for each group.
Example 1:
Records created during the year should increment by one and then restart at 1 at the beginning of the next year.
| Year | ID |
|------|----|
| 2016 | 1 |
| 2016 | 2 |
| 2017 | 1 |
| 2017 | 2 |
| 2017 | 3 |
Example 2:
A company has many locations and wants to generate a unique ID for each customer, combining the location ID with an incrementing ID.
| Site | ID |
|------|----|
| XYZ | 1 |
| ABC | 1 |
| XYZ | 2 |
| XYZ | 3 |
| DEF | 1 |
| ABC | 2 |
One often under-used trick is to create a clustered index on Site / ID or Year / ID, but with the ID column ordered DESC rather than ASC.
This way, when you need to scan the clustered index to get the next ID value, it only has to check one row. I've used this on multi-billion-record tables and it runs quite quickly. You can get even better performance by partitioning the table by Site or Year, which adds the benefit of partition elimination when you run your MAX(ID) queries.
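A minimal sketch of that idea, assuming a hypothetical table named customer: the index lists ID in descending order within each Site, and the next ID is read with MAX(ID) inside the same transaction as the insert. Python's sqlite3 module stands in for SQL Server here, so the unique index only approximates a clustered index, and real concurrent writers would still need the locking or retry behaviour appropriate to the actual database.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (site TEXT NOT NULL, id INTEGER NOT NULL)")
# Descending id within each site; uniqueness also rejects a duplicate (site, id).
conn.execute("CREATE UNIQUE INDEX ix_customer_site_id ON customer (site, id DESC)")

def insert_next(conn, site):
    """Insert a row for `site` with the next per-site id and return that id."""
    with conn:  # commits on success, rolls back on error
        (next_id,) = conn.execute(
            "SELECT COALESCE(MAX(id), 0) + 1 FROM customer WHERE site = ?",
            (site,),
        ).fetchone()
        conn.execute("INSERT INTO customer (site, id) VALUES (?, ?)", (site, next_id))
    return next_id

ids = [insert_next(conn, s) for s in ["XYZ", "ABC", "XYZ", "XYZ", "DEF", "ABC"]]
print(ids)  # [1, 1, 2, 3, 1, 2]
```

The insertion order mirrors Example 2 from the question, so the returned IDs match the Site / ID table there.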
I have the following staging table and a destination table with the same data:
ID | Name | Job | Hash
1 | A | IT | XYZ1
2 | B | Driver | XYZ2
The staging table gets truncated each time and new data gets inserted. Sometimes a person gets a second job. In that case, the staging table has 2 records with ID 2 and Name B, but with a different Job and Hash.
ID | Name | Job | Hash
1 | A | IT | XYZ1
2 | B | Driver | XYZ2
2 | B | IT | XYY4
If this happens, I need to insert all records with ID 2 into the destination table. I already have a Lookup (LKP) that checks for matching and non-matching IDs, but how can I "tell" SSIS to take ALL records from the staging table based on the IDs I get from the no match output?
You tell SSIS by linking the lookup's no match output to the destination. This assumes you have already set 'Redirect rows to no match output' under General in the Lookup editor, and that your lookup matches on ID (I'm not sure how you check for non-matching IDs). This way, the lookup sends all rows whose ID has no match to the destination.
I have a table with 2 columns, Username & Email
Username | Email
-------------------------
a@a.com | a@a.com
b@b.com | c@c.com
I want each Username value to be unique across both columns, ignoring the current record itself.
Same for the Email column.
For example, if I am trying to insert the following rows into the table, it should not be allowed.
Username | Email
e@e.com | b@b.com
c@c.com | f@f.com
Can this be done via SQL Server constraints?
Best regards