SQL Server Update with complex logic


I have a table in which I need to set a column to Y, but the update depends on results from other tables, and I am not sure how to do this.
Basically, I need to complete the following checks:
The other table must have exactly 19 rows matching the row in the table I need to update.
In those matching rows, one of the fields must not be null.
There are two other tables. I need to check that matching records exist in the latter table, and that those matching records do not contain the value "Y" in one of the fields.
My approach is to use UNIONs, but I would like someone to advise me whether this approach is correct or whether there is a much better way of doing it:
SELECT '1'
FROM t_A_Outbound
Where NOT HEADER IN (Select HEADER FROM t_B_Outbound)
UNION
SELECT '1'
FROM t_A_Outbound
Where HEADER IN (Select HEADER FROM t_B_Outbound
WHERE NOT INCOMPLETE ='Y')
UNION
Select '1'
From t_C_Outbound
Where ValueString = ''
UNION
Select '1'
From t_C_Outbound
WHERE Exists(Select Count(key), HEADER
From t_C_Outbound IN (SELECT HEADER FROM table_that_needs_updating)
Group By HEADER
Having NOT Count(Cast(key as Int)) = 19)
I thought of using 1 as a flag: if this value comes back, I update the field in the table I need to change.
Can anyone advise me?

It is rather unclear to me what unions do for you.
You want an update statement something like:
update table
set y = . . .
where header in (Select header
From t_C_Outbound
Group By HEADER
Having Count(*)= 19 and
count(KEY) = count(*)
) and
header in (select header
from other table
group by header
having sum(case when col = 'Y' then 1 else 0 end) = 0
)
and so on. You don't describe the problem clearly enough to give more detailed code.
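The pattern in this answer can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and several names are assumptions made for illustration: the table `target`, its `flag` column, and the sample rows are hypothetical, and the required row count is lowered from 19 to 3 to keep the data short. The `incomplete` column comes from the question.

```python
import sqlite3

# Assumption: 3 stands in for the 19 rows required in the question.
REQUIRED_ROWS = 3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the table that needs updating.
    CREATE TABLE target (header TEXT PRIMARY KEY, flag TEXT);
    CREATE TABLE t_C_Outbound (header TEXT, "key" TEXT);
    CREATE TABLE t_B_Outbound (header TEXT, incomplete TEXT);

    INSERT INTO target (header) VALUES ('H1'), ('H2'), ('H3'), ('H4');

    INSERT INTO t_C_Outbound VALUES
        ('H1', 'a'), ('H1', 'b'), ('H1', 'c'),   -- passes both row checks
        ('H2', 'a'), ('H2', NULL), ('H2', 'c'),  -- one key is NULL
        ('H3', 'a'), ('H3', 'b'), ('H3', 'c'),   -- fails on t_B_Outbound
        ('H4', 'a'), ('H4', 'b');                -- too few rows

    INSERT INTO t_B_Outbound VALUES
        ('H1', 'N'), ('H2', 'N'), ('H3', 'Y'), ('H4', 'N');
""")

# Each IN subquery enforces one rule per header:
#  - exact row count, with no NULL key (COUNT(col) skips NULLs)
#  - at least one related row, none of them marked 'Y'
conn.execute(
    """
    UPDATE target
    SET flag = 'Y'
    WHERE header IN (SELECT header
                     FROM t_C_Outbound
                     GROUP BY header
                     HAVING COUNT(*) = ?
                        AND COUNT("key") = COUNT(*))
      AND header IN (SELECT header
                     FROM t_B_Outbound
                     GROUP BY header
                     HAVING SUM(CASE WHEN incomplete = 'Y' THEN 1 ELSE 0 END) = 0)
    """,
    (REQUIRED_ROWS,),
)

flagged = [row[0] for row in conn.execute(
    "SELECT header FROM target WHERE flag = 'Y' ORDER BY header")]
print(flagged)  # ['H1']
```

Only H1 is updated: H2 has a NULL key, H3 has a 'Y' row in t_B_Outbound, and H4 has too few rows.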

Related

How to check whether two values exist in column [SQL Server]?

I have a BIT parameter that I am hoping to set to 1 if two values exist in a column of my temporary table.
I have looked online, and most people suggest using the CONTAINS function; however, that requires me to change settings in my environment, which isn't an option. Elsewhere I've seen LIKE mentioned, but when I tried it, it was no use: my aim is to make sure both values exist in the column, and LIKE works on a row-by-row basis. Here is what I have so far:
CREATE TABLE tempTable (Description nvarchar(100))
INSERT INTO tempTable
VALUES ('Word1'), ('Word2')
DECLARE @bValuesExist BIT
IF EXISTS(SELECT Description
FROM tempTable
WHERE Description LIKE 'Word1'
AND Description LIKE 'Word2')
SET @bValuesExist = 1
SELECT @bValuesExist
http://www.sqlfiddle.com/#!18/7fbe1c/6
The result I'd hope for from the above snippet is for the bValuesExist variable to be set to true, since both values exist in the Description column of tempTable. However, the code currently checks whether the Description column contains "Word1" and "Word2" on a row-by-row basis. How can I run this check against the whole column rather than each row?

Description LIKE 'Word1' AND Description LIKE 'Word2' can never be true, as a column's value (which is a scalar value) cannot be 2 different values at the same time. What you want here is a HAVING and a conditional aggregate:
IF EXISTS(SELECT 1
FROM tempTable --This isn't a temporary table.
HAVING COUNT(CASE Description WHEN 'Word1' THEN 1 END) > 0
AND COUNT(CASE Description WHEN 'Word2' THEN 1 END) > 0)
SELECT 1;

If I understand correctly, you want to check a list of values in the table. One method uses aggregation:
SET @bValuesExist = (SELECT (CASE WHEN COUNT(DISTINCT Description) = 2 THEN 1 ELSE 0 END)
FROM tempTable tt
WHERE Description in ('Word1', 'Word2')
);
The 2 is the number of words. The DISTINCT is to account for the possibility of duplicates in tempTable.
Note: This does not require an IF in the code.
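The conditional-aggregate idea from the answers above can be checked with a short sketch. It runs against SQLite through Python's sqlite3 module, not SQL Server, so the syntax differs slightly (the comparison is selected directly instead of being wrapped in IF EXISTS). The helper name `both_values_exist` is hypothetical.

```python
import sqlite3


def both_values_exist(conn, first, second):
    """Return True when both values occur somewhere in tempTable.Description."""
    row = conn.execute(
        """
        SELECT COUNT(CASE WHEN Description = ? THEN 1 END) > 0
           AND COUNT(CASE WHEN Description = ? THEN 1 END) > 0
        FROM tempTable
        """,
        (first, second),
    ).fetchone()
    # An aggregate without GROUP BY always yields exactly one row.
    return bool(row[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tempTable (Description TEXT)")
conn.execute("INSERT INTO tempTable VALUES ('Word1'), ('Word2')")

both_present = both_values_exist(conn, "Word1", "Word2")
one_missing = both_values_exist(conn, "Word1", "Word3")
print(both_present, one_missing)  # True False
```

Each COUNT looks at the whole column, which is what the row-by-row LIKE test in the question could not do.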

Field is being updated with same value

I have a table with a new column, and I am updating the values that should go into it. For simplicity's sake I have reduced my example table structure as well as my query. Below is how I want my output to look.
IDNumber NewColumn
1 1
2 1
3 1
4 2
5 2
WITH CTE_Split
AS
(
select
*,ntile(2) over (order by newid()) as Split
from TableA
)
Update a
set NewColumn = a.Split
from CTE_Split a
Now when I do this I get my table and it looks as such
IDNumber NewColumn
1 1
2 1
3 1
4 1
5 1
However, when I run only the SELECT I get the desired output. I have done this before to split result sets into multiple groups, and everything works within the SELECT, but now that I need to update the table I am getting this weird result. I'm not quite sure what I'm doing wrong, so any feedback is welcome.
So after a whole day of frustration I was able to compare this code and table to another on which I had already run this process. The table was being updated to all 1s because whoever made the table thought this column was supposed to be a bit flag. In reality it should be an int: in this case there are only two possible values, but in others there could be more than two.
Thank you for all your suggestions and help. This should teach me to check the data types of a table's columns when using the ntile function.

Try updating your table directly rather than updating your CTE. This makes it clearer what your UPDATE statement does.
Here is an example:
WITH CTE_Split AS
(
SELECT
*,
ntile(2) over (order by newid()) as Split
FROM TableA
)
UPDATE a
SET NewColumn = c.Split
FROM
TableA a
INNER JOIN CTE_Split c ON a.IDNumber = c.IDNumber
I assume that what you want to achieve is to group your records into two randomized partitions. This statement seems to do the job.

Update a view doesn't work

I'm working on a view which is then updated by the user. This update basically changes the value of a column. But right now it doesn't let me do that and produces this error:
Update or insert of view or function '' failed because it contains a derived or constant field.
I know this is because I have a constant in the select statement but is there a way to get around it? Please help
This is my code for the view
Create view Schema.View1
as
SELECT
Convert(Varchar(20), l.jtpName) as JobType,
Convert(Varchar(10), ' <All> ') as SubCategory,
Convert(Varchar(3), Case when a.jtpName = l.jtpName and a.subName = ' <All> ' then 'Yes' else 'No' end) as AutoProcess
from Schema.JobType l
left join Schema.Table1 a on l.jtpName = a.jtpName
UNION
SELECT
Convert(Varchar(20), a.jtpName) as JobType,
Convert(Varchar(10), a.subName) as SubCategory,
Convert(Varchar(3), Case when b.jtpName = a.jtpName and b.subName = a.subName then 'Yes' else 'No' end) as AutoProcess
from Schema.SubCategory a
left join fds.Table1 b on a.subName = b.subName
GO
Finally the update statement:
UPDATE Schema.View1 SET AUTOPROCESS = Case WHEN AUTOPROCESS = 'Yes' Then 'No' END Where JOBTYPE = 'Transport' and SUBCATEGORY= 'Cargo'
Thank You

You cannot update a column that is the result of a computation.
According to MSDN, one of the conditions for a view column to be updatable is this:
Any modifications, including UPDATE, INSERT, and DELETE statements, must reference columns from only one base table.
The columns being modified in the view must directly reference the underlying data in the table columns. The columns cannot be derived in any other way, such as through the following:
An aggregate function: AVG, COUNT, SUM, MIN, MAX, GROUPING, STDEV, STDEVP, VAR, and VARP.
A computation. The column cannot be computed from an expression that uses other columns. Columns that are formed by using the set operators UNION, UNION ALL, CROSSJOIN, EXCEPT, and INTERSECT amount to a computation and are also not updatable.
The columns being modified are not affected by GROUP BY, HAVING, or DISTINCT clauses.
TOP is not used anywhere in the select_statement of the view together with the WITH CHECK OPTION clause.
Here, not only does your view use UNION, but the AutoProcess field you are trying to update is actually the result of a CASE expression that uses two fields. It makes no sense to try to update that.
I would recommend that you use a stored procedure to perform write operations. Or, as Damien suggests, you could use an INSTEAD OF trigger on the view.

You have to create a TRIGGER and manually apply the changes from the inserted and deleted pseudo-tables against the base tables yourself.
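As a rough illustration of the INSTEAD OF trigger approach, the sketch below uses SQLite through Python's sqlite3 module. This is not SQL Server syntax: SQLite triggers fire per row and expose NEW and OLD, whereas SQL Server triggers work on the inserted and deleted pseudo-tables. The schema is a simplified, hypothetical stand-in for the question's tables, where a row in `auto_process` means "Yes".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base tables; a row in auto_process means AutoProcess = 'Yes'.
    CREATE TABLE sub_category (job_type TEXT, sub_category TEXT);
    CREATE TABLE auto_process (job_type TEXT, sub_category TEXT,
                               PRIMARY KEY (job_type, sub_category));
    INSERT INTO sub_category VALUES ('Transport', 'Cargo'), ('Transport', 'Courier');
    INSERT INTO auto_process VALUES ('Transport', 'Cargo');

    -- AutoProcess is derived by a CASE expression, so it cannot be updated directly.
    CREATE VIEW view1 AS
    SELECT s.job_type AS JobType,
           s.sub_category AS SubCategory,
           CASE WHEN a.job_type IS NOT NULL THEN 'Yes' ELSE 'No' END AS AutoProcess
    FROM sub_category s
    LEFT JOIN auto_process a
           ON a.job_type = s.job_type AND a.sub_category = s.sub_category;

    -- Translate "set to Yes" into an insert on the base table.
    CREATE TRIGGER view1_set_yes INSTEAD OF UPDATE ON view1
    WHEN NEW.AutoProcess = 'Yes'
    BEGIN
        INSERT OR IGNORE INTO auto_process VALUES (NEW.JobType, NEW.SubCategory);
    END;

    -- Translate "set to No" into a delete on the base table.
    CREATE TRIGGER view1_set_no INSTEAD OF UPDATE ON view1
    WHEN NEW.AutoProcess = 'No'
    BEGIN
        DELETE FROM auto_process
        WHERE job_type = OLD.JobType AND sub_category = OLD.SubCategory;
    END;
""")

# The user-facing statements still target the view.
conn.execute("UPDATE view1 SET AutoProcess = 'No' "
             "WHERE JobType = 'Transport' AND SubCategory = 'Cargo'")
conn.execute("UPDATE view1 SET AutoProcess = 'Yes' "
             "WHERE JobType = 'Transport' AND SubCategory = 'Courier'")

rows = conn.execute(
    "SELECT SubCategory, AutoProcess FROM view1 ORDER BY SubCategory").fetchall()
print(rows)  # [('Cargo', 'No'), ('Courier', 'Yes')]
```

The trigger bodies carry the knowledge the database cannot infer: what a change to the derived column means for the underlying rows.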

There is no way for SQL Server to work backwards from your CONVERT functions to the original fields. You cannot update a view this way.
If the view contained your jtpName and subName fields, you might be able to update just those fields.

The fastest way to check if some records exist in a database table?

I have a huge table to work with. I want to check whether there are any records whose parent_id equals a value I pass in.
Currently I implement this with "select count(*) from mytable where parent_id = :id"; if the result is > 0, the records exist.
Because this is a very large table, and I don't care about the exact number of matching records, only whether any exist, I think count(*) is a bit inefficient.
How do I implement this requirement in the fastest way? I am using Oracle 10.
According to the Hibernate Tips & Tricks page, https://www.hibernate.org/118.html#A2
it is suggested to write it like this:
Integer count = (Integer) session.createQuery("select count(*) from ....").uniqueResult();
I don't know what the magic of uniqueResult() is here. Why does it make this fast?
Compared to "select 1 from mytable where parent_id = passingId and rownum < 2", which is more efficient?

An EXISTS query is the one to go for if you're not interested in the number of records:
select 'Y' from dual where exists (select 1 from mytable where parent_id = :id)
This will return 'Y' if a record exists and nothing otherwise.
[In terms of your question on Hibernate's "uniqueResult" - all this does is return a single object when there is only one object to return - instead of a set containing 1 object. If multiple results are returned the method throws an exception.]
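For illustration, the same existence check can be sketched with Python's sqlite3 module. SQLite is used here only because it ships with Python; it has no dual table, so EXISTS is selected directly. The helper name `has_children` and the sample rows are assumptions, not part of the question.

```python
import sqlite3


def has_children(conn, parent_id):
    """Return True if at least one row has the given parent_id."""
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM mytable WHERE parent_id = ?)",
        (parent_id,),
    ).fetchone()
    # EXISTS yields 1 or 0; the database may stop at the first matching row.
    return bool(row[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, parent_id INTEGER)")
# The index is what keeps the lookup cheap on a large table.
conn.execute("CREATE INDEX idx_mytable_parent ON mytable (parent_id)")
conn.executemany("INSERT INTO mytable (parent_id) VALUES (?)", [(1,), (1,), (2,)])

found = has_children(conn, 1)
missing = has_children(conn, 99)
print(found, missing)  # True False
```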

There's no real difference between:
select 'y'
from dual
where exists (select 1
from child_table
where parent_key = :somevalue)
and
select 'y'
from mytable
where parent_key = :somevalue
and rownum = 1;
... at least in Oracle10gR2 and up. Oracle's smart enough in that release to do a FAST DUAL operation where it zeroes out any real activity against it. The second query would be easier to port if that's ever a consideration.
The real performance differentiator is whether or not the parent_key column is indexed. If it's not, then you should run something like:
select 'y'
from dual
where exists (select 1
from parent_table
where parent_key = :somevalue)

select count(*) should be lightning fast if you have an index, and if you don't, allowing the database to abort after the first match won't help much.
But since you asked:
boolean exists = session.createQuery("select parent_id from Entity where parent_id=?")
.setParameter(...)
.setMaxResults(1)
.uniqueResult()
!= null;
(Some syntax errors to be expected, since I don't have a hibernate to test against on this computer)
For Oracle, maxResults is translated into rownum by hibernate.
As for what uniqueResult() does, read its JavaDoc! Using uniqueResult instead of list() has no performance impact; if I recall correctly, the implementation of uniqueResult delegates to list().

First of all, you need an index on mytable.parent_id.
That should make your query fast enough, even for big tables (unless there are also a lot of rows with the same parent_id).
If not, you could write
select 1 from mytable where parent_id = :id and rownum < 2
which would return a single row containing 1, or no row at all. It does not need to count the rows, just find one and then quit. But this is Oracle-specific SQL (because of rownum), and you should rather not.

For DB2 there is something like select * from mytable where parent_id = ? fetch first 1 row only. I assume that something similar exists for oracle.

This query will return 1 if any record exists and 0 otherwise:
SELECT COUNT(1) FROM (SELECT 1 FROM mytable WHERE ROWNUM < 2);
It can help when you only need to know whether a table contains any data, regardless of the table's size.

Efficient checking of possible duplicate entities

I have a requirement to produce a list of possible duplicates before a user saves an entity to the database and warn them of the possible duplicates.
There are 7 criteria on which we should check for duplicates, and if at least 3 match we should flag this up to the user.
The criteria will all match on ID, so there is no fuzzy string matching needed, but my problem comes from the fact that there are many possible ways (99 ways if I've done my sums correctly) for at least 3 items to match from the list of 7 possibles.
I don't want to have to do 99 separate db queries to find my search results, nor do I want to bring the whole lot back from the db and filter on the client side. We're probably only talking about a few tens of thousands of records at present, but this will grow into the millions as the system matures.
Anyone got any thoughts on a nice efficient way to do this?
I was considering a simple OR query to get the records where at least one field matches from the db and then doing some processing on the client to filter it some more, but a few of the fields have very low cardinality and won't actually reduce the numbers by a huge amount.
Thanks
Jon

OR and CASE summing will work but are quite inefficient, since they don't use indexes.
You need to use UNION for the indexes to be usable.
If a user enters name, phone, email and address into the database, and you want to check all records that match at least 3 of these fields, you issue:
SELECT i.*
FROM (
SELECT id, COUNT(*)
FROM (
SELECT id
FROM t_info t
WHERE name = 'Eve Chianese'
UNION ALL
SELECT id
FROM t_info t
WHERE phone = '+15558000042'
UNION ALL
SELECT id
FROM t_info t
WHERE email = '42@example.com'
UNION ALL
SELECT id
FROM t_info t
WHERE address = '42 North Lane'
) q
GROUP BY
id
HAVING COUNT(*) >= 3
) dq
JOIN t_info i
ON i.id = dq.id
This will use indexes on these fields and the query will be fast.
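The UNION ALL query above can be exercised with a small sketch in Python's sqlite3 module. The table and column names follow the answer; the helper name `find_possible_duplicates`, the index names, and the three sample rows are assumptions added for illustration. SQLite's planner is not the one the answer was written for, so this shows the logic rather than the performance claim.

```python
import sqlite3


def find_possible_duplicates(conn, name, phone, email, address, min_matches=3):
    """Return ids of rows matching at least min_matches of the four values."""
    rows = conn.execute(
        """
        SELECT id
        FROM (
            SELECT id FROM t_info WHERE name = ?
            UNION ALL
            SELECT id FROM t_info WHERE phone = ?
            UNION ALL
            SELECT id FROM t_info WHERE email = ?
            UNION ALL
            SELECT id FROM t_info WHERE address = ?
        ) q
        GROUP BY id
        HAVING COUNT(*) >= ?
        ORDER BY id
        """,
        (name, phone, email, address, min_matches),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_info (id INTEGER PRIMARY KEY,
                         name TEXT, phone TEXT, email TEXT, address TEXT);
    -- One index per searched column, so each branch can be an index lookup.
    CREATE INDEX ix_info_name ON t_info (name);
    CREATE INDEX ix_info_phone ON t_info (phone);
    CREATE INDEX ix_info_email ON t_info (email);
    CREATE INDEX ix_info_address ON t_info (address);

    INSERT INTO t_info VALUES
        (1, 'Eve Chianese', '+15558000042', '42@example.com', '1 Other Street'),
        (2, 'Eve Chianese', '+15558000099', 'other@example.com', '42 North Lane'),
        (3, 'Someone Else', '+15558000042', '42@example.com', '42 North Lane');
""")

candidates = find_possible_duplicates(
    conn, "Eve Chianese", "+15558000042", "42@example.com", "42 North Lane")
print(candidates)  # [1, 3]
```

Row 2 matches only the name and the address, so it falls below the threshold.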
See this article in my blog for details:
Matching 3 of 4: how to match a record which matches at least 3 of 4 possible conditions
Also see this question the article is based upon.
If you want to have a list of DISTINCT values in the existing data, you just wrap this query into a subquery:
SELECT i1.*
FROM t_info i1
WHERE EXISTS
(
SELECT 1
FROM (
SELECT id
FROM t_info t
WHERE name = i1.name
UNION ALL
SELECT id
FROM t_info t
WHERE phone = i1.phone
UNION ALL
SELECT id
FROM t_info t
WHERE email = i1.email
UNION ALL
SELECT id
FROM t_info t
WHERE address = i1.address
) q
GROUP BY
id
HAVING COUNT(*) >= 3
)
Note that this DISTINCT is not transitive: if A matches B and B matches C, this does not mean that A matches C.

You might want something like the following:
SELECT id
FROM
(select id, CASE fld1 WHEN input1 THEN 1 ELSE 0 END AS rule1,
CASE fld2 WHEN input2 THEN 1 ELSE 0 END AS rule2,
...,
CASE fld7 WHEN input7 THEN 1 ELSE 0 END AS rule7
FROM table)
WHERE rule1+rule2+rule3+...+rule7 >= 3
This isn't tested, but it shows a way to tackle this.
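A runnable variant of this CASE-summing idea, again sketched with Python's sqlite3 module. It assumes four hypothetical fields (fld1 to fld4) instead of the seven in the question, and the helper name `match_by_score` and the sample rows are made up for the example.

```python
import sqlite3


def match_by_score(conn, inputs, min_matches=3):
    """Score each row with one CASE per field and keep rows at or above the threshold."""
    rows = conn.execute(
        """
        SELECT id
        FROM (
            SELECT id,
                   CASE WHEN fld1 = ? THEN 1 ELSE 0 END AS rule1,
                   CASE WHEN fld2 = ? THEN 1 ELSE 0 END AS rule2,
                   CASE WHEN fld3 = ? THEN 1 ELSE 0 END AS rule3,
                   CASE WHEN fld4 = ? THEN 1 ELSE 0 END AS rule4
            FROM entity
        ) scored
        WHERE rule1 + rule2 + rule3 + rule4 >= ?
        ORDER BY id
        """,
        (*inputs, min_matches),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with four ID-valued criteria columns.
    CREATE TABLE entity (id INTEGER PRIMARY KEY,
                         fld1 INTEGER, fld2 INTEGER, fld3 INTEGER, fld4 INTEGER);
    INSERT INTO entity VALUES
        (1, 10, 20, 30, 40),     -- all four match
        (2, 10, 20, 30, NULL),   -- three match; NULL falls through to ELSE 0
        (3, 10, 20, 99, 98),     -- only two match
        (4, 11, 21, 31, 41);     -- none match
""")

matches = match_by_score(conn, (10, 20, 30, 40))
print(matches)  # [1, 2]
```

This form reads every row to compute the score, which is the trade-off the previous answer points out when it recommends UNION ALL.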

What DBS are you using? Some support using such constraints by using server side code.

Have you considered using a stored procedure with a cursor? You could then do your OR query and then step through the records one-by-one looking for matches. Using a stored procedure would allow you to do all the checking on the server.
However, I think a table scan with millions of records is always going to be slow. I think you should work out which of the 7 fields are most likely to match and make sure these are indexed.

I'm assuming your system is trying to match tag ids of a certain post, or something similar. This is a many-to-many relationship and you should have three tables to handle it: one for the posts, one for the tags, and one for the post-to-tag relationship.
If my assumptions are correct then the best way to handle this is:
SELECT postid, count(tagid) as common_tag_count
FROM posts_to_tags
WHERE tagid IN (tag1, tag2, tag3, ...)
GROUP BY postid
HAVING count(tagid) >= 3;
