update and join tables without a where - sql-server

I'm trying to get the relationships of two databases straightened out, and I have an update statement like this:
update dbo1.table1
set Relationship_column = table2.id
from dbo1.table2 as table2
inner join dbo2.table3 as Table3
on table2.number = Table3.number
Basically, I join two tables and update a third table based on the results. This works in SSIS because you can just remap the entire table, but I don't want to do that.
Of course this does not work in SQL, because I need a WHERE clause; otherwise each record is repeated across the whole table. Is there a workaround for this?

Related

Netezza left outer join query performance

I have a question about Netezza query performance. I have two tables, Table A and Table B; Table B is a subset of Table A with altered data. I need to apply those new values from Table B to Table A.
There are two approaches:
1) Left outer join the tables, select the relevant columns, and insert them into the target table
2) Insert the Table A data into the target table, then update those values from Table B using a join
I tried both, and logically they are the same, but the explain plan gives different costs:
for normal select
a)Sub-query Scan table "TM2" (cost=0.1..1480374.0 rows=8 width=4864 conf=100)
update
b)Hash Join (cost=356.5..424.5 rows=2158 width=27308 conf=21)
for left outer join
Sub-query Scan table "TM2" (cost=51.0..101474.8 rows=10000000 width=4864 conf=100)
From this I feel the left outer join is better. Can anyone share some thoughts on this and guide me?
Thanks
The cost of insert into table_c select ... from table_a; update table_c set ... from table_b; is higher because you're inserting, deleting, then inserting. An update in Netezza marks the records to be updated as deleted, then inserts new rows with the updated values. Once the data is written to an extent, it's never (to my knowledge) altered.
With insert into table_c select ... from table_a join table_b using (...); you're only inserting once, thereby only updating all the zone maps once. The cost will be noticeably lower.
Netezza does an excellent job of keeping you away from the disk on reads, but it will write to the disk as often as you tell it to. In the case of updates, seemingly more so. Try to only write as often as is necessary to gain benefits of new distributions and co-located joins. Any more than that, and you're just using excess commit actions.
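To make the two approaches concrete, here is a minimal sketch that runs both and checks that they leave the same rows in the target. It uses Python's built-in sqlite3 module purely as a stand-in, so it says nothing about Netezza's cost model or storage behavior. The table names, the columns, and the use of COALESCE to prefer table_b's value are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database: a stand-in only, not Netezza.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, val TEXT);
    INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO table_b VALUES (2, 'b2');  -- subset of table_a with altered data
    CREATE TABLE target_join (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE target_update (id INTEGER PRIMARY KEY, val TEXT);
""")

# Approach 1: a single insert fed by a left outer join.
conn.execute("""
    INSERT INTO target_join (id, val)
    SELECT a.id, COALESCE(b.val, a.val)
    FROM table_a AS a
    LEFT OUTER JOIN table_b AS b ON b.id = a.id
""")

# Approach 2: copy table_a first, then update the rows that exist in table_b.
conn.execute("INSERT INTO target_update (id, val) SELECT id, val FROM table_a")
conn.execute("""
    UPDATE target_update
    SET val = (SELECT b.val FROM table_b AS b WHERE b.id = target_update.id)
    WHERE EXISTS (SELECT 1 FROM table_b AS b WHERE b.id = target_update.id)
""")

rows_join = conn.execute("SELECT id, val FROM target_join ORDER BY id").fetchall()
rows_update = conn.execute("SELECT id, val FROM target_update ORDER BY id").fetchall()
print(rows_join == rows_update)  # True
```

Both targets end up identical; the difference the answer describes is in how many times the rows are written, which this sketch does not measure.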

How to compare two tables in SSIS? (SQL Server)

I am creating an SSIS package that will compare two tables and then insert data in another table.
Which tool shall I use for that? I tried to use "Conditional Split" but it looks like it only takes one table as input and not two.
These are my tables:
TABLE1: ID, Status
TABLE2: ID, Status
TABLE3: ID, Status
I want to compare the Status field in both tables. If Status in TABLE1 is "Pending" and in TABLE2 is "Open", then insert this record into TABLE3.
If your tables are not large you can use a Lookup transformation with Full Cache, but I wouldn't recommend it because if your tables grow you will run into problems. I know I did.
I would recommend the Merge Join transformation. Your setup will include the following:
two data sources, one table each
two Sort transformations, as the Merge Join transformation needs sorted input; I guess you need to match records using ID, so this would be the sort criterion
one Merge Join transformation to connect both (left and right) data flows
one Conditional Split transformation to detect whether your tables contain the correct statuses
any additionally needed transformation (e.g. Derived Column to introduce data you have to insert into your destination table)
one data destination to insert into the destination table
This should help, as the article explains almost exactly this problem and solution.
I managed to do it by using an Execute SQL Task and writing the following query in it.
-- Select only the two columns TABLE3 expects; SELECT * would return four.
INSERT INTO TABLE3 (ID, Status)
SELECT t1.ID, t1.Status
FROM TABLE1 t1
INNER JOIN TABLE2 t2 ON t1.ID = t2.ID
WHERE t1.Status = 'Pending' AND t2.Status = 'Open'
I think this is what you are looking for.
In your case, if both tables are SQL tables, follow the steps below:
Drag in a Data Flow Task.
Edit the Data Flow Task, add an OLE DB Source, and paste the SQL code below into its SQL command.
Add an OLE DB Destination and map the columns to TABLE3.
SQL code:
select b.id,b.status
from table1 a
join table2 b on a.id = b.id
where a.status = 'Pending' and b.status = 'Open'
I think this will work for you.
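The join-and-filter logic shared by the answers above can be tried out with a small sketch like the following. It uses Python's sqlite3 module as a stand-in for SQL Server, with made-up sample rows. Inserting TABLE1's status rather than TABLE2's is an assumption, since the question does not say which one TABLE3 should hold.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (ID INTEGER PRIMARY KEY, Status TEXT);
    CREATE TABLE TABLE2 (ID INTEGER PRIMARY KEY, Status TEXT);
    CREATE TABLE TABLE3 (ID INTEGER PRIMARY KEY, Status TEXT);
    -- Hypothetical sample data: only ID 1 is Pending in TABLE1 and Open in TABLE2.
    INSERT INTO TABLE1 VALUES (1, 'Pending'), (2, 'Pending'), (3, 'Closed');
    INSERT INTO TABLE2 VALUES (1, 'Open'), (2, 'Closed'), (3, 'Open');
""")

# Join on ID, keep only the Pending/Open pairs, and insert two named columns.
conn.execute("""
    INSERT INTO TABLE3 (ID, Status)
    SELECT t1.ID, t1.Status
    FROM TABLE1 AS t1
    JOIN TABLE2 AS t2 ON t1.ID = t2.ID
    WHERE t1.Status = 'Pending' AND t2.Status = 'Open'
""")

inserted = conn.execute("SELECT ID, Status FROM TABLE3 ORDER BY ID").fetchall()
print(inserted)  # [(1, 'Pending')]
```

Naming the selected columns explicitly keeps the column count in line with the INSERT column list.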

Transfer data from one table to another in sql server

I have two tables, one of which I don't need anymore. I want to transfer the piece of data I need from the obsolete table into the table I'm going to keep. There are bookingid columns in both tables, which I can use to match the rows up. It's a 1 to 0-or-1 relationship. I've looked around and built up this query to accomplish the transfer, but I'm getting a "could not be bound" error on bookingtoupdate.bookingid:
WITH bookingtoupdate (bookingid) AS
(
SELECT bookingid
FROM bookings
)
UPDATE bookings
SET meetinglocation = (SELECT business.name
FROM abk_Locations
INNER JOIN business ON dbo.abk_Locations.IP_Number = business.businessid
WHERE
(dbo.abk_Locations.Booking_Number = bookingtoupdate.bookingid)
)
WHERE
bookingid = bookingtoupdate.bookingid
Are there any obvious issues with my code?
I referred to the following pages:
http://msdn.microsoft.com/en-us/library/ms175972.aspx
SQL Server FOR EACH Loop
You declare bookingtoupdate but you don't select anything from it. That's why it can't be bound.
Here is a simplified query that does what you need without a CTE:
UPDATE bookings
SET meetinglocation = business.name
FROM bookings
INNER JOIN abk_Locations ON abk_Locations.Booking_Number = bookings.bookingid
INNER JOIN business ON dbo.abk_Locations.IP_Number = business.businessid
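For experimenting with the same transfer outside SQL Server, here is a rough sketch using Python's sqlite3 module. It swaps the UPDATE ... FROM join for a correlated subquery guarded by EXISTS, which is a different technique chosen for portability and not the query above. The sample rows are invented; the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookings (bookingid INTEGER PRIMARY KEY, meetinglocation TEXT);
    CREATE TABLE business (businessid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE abk_Locations (Booking_Number INTEGER, IP_Number INTEGER);
    -- Hypothetical data: booking 1 has a location row, booking 2 has none.
    INSERT INTO bookings VALUES (1, NULL), (2, 'keep me');
    INSERT INTO business VALUES (10, 'Head Office');
    INSERT INTO abk_Locations VALUES (1, 10);
""")

# Correlated subquery: each bookings row looks up its own location name.
# The EXISTS guard leaves bookings without a match untouched (the "0" side
# of the 1 to 0-or-1 relationship) instead of overwriting them with NULL.
conn.execute("""
    UPDATE bookings
    SET meetinglocation = (
        SELECT business.name
        FROM abk_Locations
        INNER JOIN business ON abk_Locations.IP_Number = business.businessid
        WHERE abk_Locations.Booking_Number = bookings.bookingid
    )
    WHERE EXISTS (
        SELECT 1
        FROM abk_Locations
        INNER JOIN business ON abk_Locations.IP_Number = business.businessid
        WHERE abk_Locations.Booking_Number = bookings.bookingid
    )
""")

result = conn.execute(
    "SELECT bookingid, meetinglocation FROM bookings ORDER BY bookingid"
).fetchall()
print(result)  # [(1, 'Head Office'), (2, 'keep me')]
```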

Join multiple table performance

In my current project, I have to left join multiple tables (about 10 to 20) together. Among these, 1 to 3 are large tables with millions of rows (at most 80 million); the other tables have only thousands of rows at most.
Currently, my query is like:
SELECT *
FROM table1
LEFT JOIN table2 ON table1.A = table2.A
LEFT JOIN table3 ON table1.B = table3.B
LEFT JOIN table4 ON table1.C = table4.C
LEFT JOIN table5 ON table1.D = table5.D
....
LEFT JOIN table15 ON table1.Z = table15.Z
table1 and table2 are large tables; the others are small.
I have a clustered index on all of these tables, but the performance is still low. So I want to know if there is anything I can try to improve the performance.
P.S. I have tried creating nonclustered indexes on these tables, but the performance became worse than before.
Well, the fastest query would be one where you de-normalized your table1 so that the split-out normalized values were actually part of the table.
Another solution you might try is building a temp table that is one big collection of the 20 other small tables, and then just joining that temp table back to your table1.
First of all, do you really need all that joined data? I suppose in most situations you don't. If you do, you probably need to review your requirements and architecture.
So the trick is to get only the data you want instead of all of it, and to filter the data as early as possible (even before joining the next table, although SQL Server will do some of this optimization for you).
I would start by checking the execution plan with Ctrl+L. Try to find the "Index Scan" nodes and build indexes for them. I can't go any further without seeing your execution plan.

UPDATE query from OUTER JOINed tables or derived tables

Is there any way in MS-Access to update a table where the data is coming from an outer joined dataset or a derived table? I know how to do it in MSSQL, but in Access I always receive an "Operation must use an updateable query" error. The table being updated is updateable; the source data is not. After reading up on the error, Microsoft tells me that the error is caused when the query would violate referential integrity. I can assure you this dataset will not. This limitation is crippling when trying to update large datasets. I also read that this can supposedly be remedied by enabling cascading updates. If the relationship between my tables is defined in the query only, is this a possibility? So far, writing the dataset to a temp table and then inner joining that to the update table is my only solution, and that is incredibly clunky. I would like to do something along the lines of this:
UPDATE Table1
LEFT JOIN Table2 ON Table1.Field1=Table2.Field1
SET Table1.Field1= Table2.Field2
WHERE Table2.Field1 IS Null
or
UPDATE Table1 INNER JOIN
(
SELECT Field1, Field2
FROM Table2, Table3
WHERE Field3='Whatever'
) AS T2 ON Table1.Field1=T2.Field1
SET Table1.Field1= T2.Field2
Update Queries are very problematic in Access as you've been finding out.
The temp table idea is sometimes your only option.
Sometimes using the DISTINCTROW declaration solves the problem (Query Properties -> Unique Records to 'Yes'), and is worth trying.
Another thing to try would be to use aliases on your tables; this seems to help out the JET engine as well.
UPDATE Table3
INNER JOIN
(Table1 INNER JOIN Table2 ON Table1.uid = Table2.uid)
ON
(Table3.uid = Table2.uid)
AND
(Table3.uid = Table1.uid)
SET
Table2.field=NULL;
What I did was:
1. Created 3 tables
2. Established relationships between them
3. Used the query builder to update a field in Table2.
There seems to be a problem in the query logic. In your first example, you LEFT JOIN to Table2 on Field1, but then have
Table2.Field1 IS NULL
in the WHERE clause. So, this limits you to records where no JOIN could be made. But then you try to update Table1 with data from Table2, despite there being no matching Table2 row.
Perhaps you could explain what it is you are trying to do with this query?
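The point about the first example can be checked with a small sketch. It uses Python's sqlite3 module and not Access, with invented sample rows, so it only illustrates the join logic: after a LEFT JOIN filtered on Table2.Field1 IS NULL, every remaining row has NULL in all Table2 columns, so Table2.Field2 has nothing to offer the update.

```python
import sqlite3

# In-memory SQLite database; a stand-in for the Access tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Field1 INTEGER);
    CREATE TABLE Table2 (Field1 INTEGER, Field2 TEXT);
    -- Hypothetical data: Table1 row 3 has no partner in Table2.
    INSERT INTO Table1 VALUES (1), (2), (3);
    INSERT INTO Table2 VALUES (1, 'x'), (2, 'y');
""")

# Same join and filter as the first example, written as a SELECT so the
# values that the UPDATE would try to assign are visible.
rows = conn.execute("""
    SELECT Table1.Field1, Table2.Field2
    FROM Table1
    LEFT JOIN Table2 ON Table1.Field1 = Table2.Field1
    WHERE Table2.Field1 IS NULL
""").fetchall()
print(rows)  # [(3, None)]
```

The only row that survives the filter carries NULL for Table2.Field2, so the update as written could only ever set Table1.Field1 to NULL.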
