Compare columns of a table in two different databases - Sybase

I have one table that exists in two different databases. In one database it has 284 columns and in the other it has 281, so 3 columns are missing.
Is there a query (not a tool; I have already found something called Compare It) that can help find the missing columns?
example:
database 1
column1
column2
column3
column4
column5
column6
database 2
column1
column2
column3
column5
column6
In the above example column4 is missing. Is there a query in Sybase that can tell me which column is missing?

Create two temporary tables to hold the column names of the table in each of the two databases, say #TableColumns1 and #TableColumns2:
CREATE TABLE #TableColumns1(ColumnName VARCHAR(255))
CREATE TABLE #TableColumns2(ColumnName VARCHAR(255))
INSERT INTO #TableColumns1
SELECT SC.column_name
FROM SYS.SYSCOLUMN SC, SYS.SYSTABLE ST
WHERE SC.table_id = ST.table_id AND ST.table_name = '<DatabaseName1.TableName1>';
INSERT INTO #TableColumns2
SELECT SC.column_name
FROM SYS.SYSCOLUMN SC, SYS.SYSTABLE ST
WHERE SC.table_id = ST.table_id AND ST.table_name = '<DatabaseName2.TableName2>';
Now create one more temporary table, #MissingTableColumns, which will hold the missing columns along with the table in which each one does exist:
CREATE TABLE #MissingTableColumns(ColumnName VARCHAR(255), TableName VARCHAR(255))
-- Columns that exist in table 1 but have no match in table 2
INSERT INTO #MissingTableColumns
(ColumnName, TableName)
SELECT T1.ColumnName, '<Table1Name>'
FROM #TableColumns1 T1
WHERE NOT EXISTS (SELECT 1 FROM #TableColumns2 T2 WHERE T2.ColumnName = T1.ColumnName)
-- Columns that exist in table 2 but have no match in table 1
INSERT INTO #MissingTableColumns
(ColumnName, TableName)
SELECT T2.ColumnName, '<Table2Name>'
FROM #TableColumns2 T2
WHERE NOT EXISTS (SELECT 1 FROM #TableColumns1 T1 WHERE T1.ColumnName = T2.ColumnName)
Hope this will solve your problem.
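The same idea (collect the column names from each side, then take the difference in both directions) can be tried out quickly outside Sybase. Below is a minimal sketch in Python using the standard sqlite3 module as a stand-in. The table name my_table, the schema name db2 and the helper column_names are made up for illustration, and SQLite's PRAGMA table_info plays the role that the Sybase catalog tables play above.

```python
import sqlite3

# Two in-memory databases on one connection: "main" and an attached "db2".
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS db2")

# Same table in both databases; db2 lacks column4 (as in the question's example).
conn.execute("CREATE TABLE main.my_table (column1, column2, column3, column4, column5, column6)")
conn.execute("CREATE TABLE db2.my_table (column1, column2, column3, column5, column6)")


def column_names(conn, schema, table):
    """Return the set of column names of schema.table (hypothetical helper)."""
    rows = conn.execute(f"PRAGMA {schema}.table_info({table})").fetchall()
    return {row[1] for row in rows}  # row[1] is the column name


cols_db1 = column_names(conn, "main", "my_table")
cols_db2 = column_names(conn, "db2", "my_table")

missing_in_db2 = sorted(cols_db1 - cols_db2)  # in database 1 only
missing_in_db1 = sorted(cols_db2 - cols_db1)  # in database 2 only

print(missing_in_db2)  # ['column4']
print(missing_in_db1)  # []
```

Taking the difference in both directions matters because a column may have been added on either side.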

Related

Query using a statement within a VARCHAR2 column

Is there a way for a select statement to include in the WHERE clause a statement that is contained within the table? For example, the following table:
CREATE TABLE test_tab(
date_column DATE,
frequency NUMBER,
test_statement VARCHAR2(255)
)
/
If
MOD(SYSDATE - date_column, frequency) = 0
were contained within the column test_statement, is there a way to select the rows where this is true? The test_statement will vary and will not be the same throughout the table. I am able to do this in PL/SQL, but I am looking to do it without PL/SQL.

This kind of dynamic SQL in SQL can be created with DBMS_XMLGEN.getXML, although the query looks a bit odd, so you might want to consider a different design.
First, I created a sample table and row using your DDL. I'm not sure exactly what you're trying to do with the conditions, so I simplified them into two rows with simpler conditions. The first row matches the first condition, and neither row matches the second condition.
--Create sample table and row that matches the condition.
CREATE TABLE test_tab(
date_column DATE,
frequency NUMBER,
test_statement VARCHAR2(255)
)
/
insert into test_tab values(sysdate, 1, 'frequency = 1');
insert into test_tab values(sysdate, 2, '1=2');
commit;
Here's the large query, and it only returns the first row, which only matches the first condition.
--Find rows where ROWID is in a list of ROWIDs that match the condition.
select *
from test_tab
where rowid in
(
--Convert XMLType to relational data.
select the_rowid
from
(
--Convert CLOB to XMLType.
select xmltype(xml_results) xml_results
from
(
--Create a single XML file with the ROWIDs that match the condition.
select dbms_xmlgen.getxml('
select rowid
from test_tab where '||test_statement) xml_results
from test_tab
)
where xml_results is not null
)
cross join
xmltable
(
'/ROWSET/ROW'
passing xml_results
columns
the_rowid varchar2(128) path 'ROWID'
)
);
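To make the underlying idea concrete (a predicate stored as text in a row is spliced into a query and executed), here is a rough sketch in Python with the standard sqlite3 module rather than Oracle. It is not a translation of the query above: it checks each row only against its own stored condition, the date_column is left out for brevity, and the function name rows_matching_own_condition is invented for this example. As with any dynamic SQL, the stored text is executed as code, so it must come from a trusted source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_tab (frequency INTEGER, test_statement TEXT)")
conn.executemany(
    "INSERT INTO test_tab VALUES (?, ?)",
    [(1, "frequency = 1"), (2, "1=2")],
)


def rows_matching_own_condition(conn):
    """Return the rows whose stored test_statement is true for that same row."""
    matches = []
    rows = conn.execute(
        "SELECT rowid, frequency, test_statement FROM test_tab ORDER BY rowid"
    ).fetchall()
    for rowid, frequency, stmt in rows:
        # The stored predicate is concatenated into the SQL text (dynamic SQL);
        # the rowid is still passed as a bound parameter.
        hit = conn.execute(
            f"SELECT 1 FROM test_tab WHERE rowid = ? AND ({stmt})", (rowid,)
        ).fetchone()
        if hit:
            matches.append((rowid, frequency, stmt))
    return matches


result = rows_matching_own_condition(conn)
print(result)  # [(1, 1, 'frequency = 1')]
```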

This calls for dynamic SQL, so yes, it is PL/SQL that handles it. I don't think the SQL layer alone is capable of doing it.
I don't know what you tried so far, so - just an idea: a function that returns ref cursor might help, e.g.
SQL> create table test (date_column date, frequency number, test_statement varchar2(255));
Table created.
SQL> insert into test values (trunc(sysdate), 2, 'deptno = 30');
1 row created.
SQL> create or replace function f_test return sys_refcursor
2 is
3 l_str varchar2(200);
4 l_rc sys_refcursor;
5 begin
6 select test_statement
7 into l_str
8 from test
9 where date_column = trunc(sysdate);
10
11 open l_rc for 'select deptno, ename from emp where ' || l_str;
12 return l_rc;
13 end;
14 /
Function created.
Testing:
SQL> select f_test from dual;
F_TEST
--------------------
CURSOR STATEMENT : 1
CURSOR STATEMENT : 1
DEPTNO ENAME
---------- ----------
30 ALLEN
30 WARD
30 MARTIN
30 BLAKE
30 TURNER
30 JAMES
6 rows selected.
SQL>
A good thing about it is that you could save the whole statements into that table and run any of them using the same function.

You can try this:
select * from test_tab where mod(sysdate - date_column, frequency) = 0;

Show all and only rows in table 1 not in table 2 (using multiple columns)

I have one table (Table1) with several columns that are used in combination: Name, TestName, DevName, Dept. When all 4 of these columns have values, the record is inserted into Table2. I need to confirm that all of the Table1 records with values in each of these fields were correctly copied into Table2.
I have created a query for it:
SELECT DISTINCT wr.Name,wr.TestName, wr.DEVName ,wr.Dept
FROM table2 wr
where NOT EXISTS (
SELECT NULL
FROM TABLE1 ym
WHERE ym.Name = wr.Name
AND ym.TestName = wr.TestName
AND ym.DEVName = wr.DEVName
AND ym.Dept = wr.Dept
)
My counts are not adding up, so I believe that this is incorrect. Can you advise me on the best way to write this query for my needs?

You can use the EXCEPT set operator for this one if the table definitions are identical.
SELECT DISTINCT ym.Name, ym.TestName, ym.DEVName, ym.Dept
FROM table1 ym
EXCEPT
SELECT DISTINCT wr.Name, wr.TestName, wr.DEVName, wr.Dept
FROM table2 wr
This returns distinct rows from the first table where there is not a match in the second table. Read more about EXCEPT and INTERSECT here: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-operators-except-and-intersect-transact-sql?view=sql-server-2017
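One thing worth checking when the counts do not add up is how the two approaches treat NULLs: set operators such as EXCEPT treat two NULLs as equal, while a NOT EXISTS that compares columns with = does not. The sketch below illustrates this with Python's sqlite3 module standing in for SQL Server; the three sample rows are made up, and the behaviour shown is SQLite's, which follows the same rule for set operators.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Name TEXT, TestName TEXT, DEVName TEXT, Dept TEXT);
CREATE TABLE table2 (Name TEXT, TestName TEXT, DEVName TEXT, Dept TEXT);
-- Made-up sample rows: Bob was never copied, Cy has a NULL Dept on both sides.
INSERT INTO table1 VALUES ('Ann', 'T1', 'D1', 'QA'), ('Bob', 'T2', 'D2', 'Ops'), ('Cy', 'T3', 'D3', NULL);
INSERT INTO table2 VALUES ('Ann', 'T1', 'D1', 'QA'), ('Cy', 'T3', 'D3', NULL);
""")

# EXCEPT: rows of table1 that do not appear in table2 (NULLs compare as equal).
except_rows = conn.execute("""
    SELECT Name, TestName, DEVName, Dept FROM table1
    EXCEPT
    SELECT Name, TestName, DEVName, Dept FROM table2
""").fetchall()

# NOT EXISTS with '=': a NULL never equals a NULL, so Cy is reported as missing.
not_exists_rows = conn.execute("""
    SELECT ym.Name, ym.TestName, ym.DEVName, ym.Dept
    FROM table1 ym
    WHERE NOT EXISTS (
        SELECT 1 FROM table2 wr
        WHERE ym.Name = wr.Name AND ym.TestName = wr.TestName
          AND ym.DEVName = wr.DEVName AND ym.Dept = wr.Dept
    )
    ORDER BY ym.Name
""").fetchall()

print(except_rows)      # [('Bob', 'T2', 'D2', 'Ops')]
print(not_exists_rows)  # [('Bob', 'T2', 'D2', 'Ops'), ('Cy', 'T3', 'D3', None)]
```

If none of the four columns can be NULL in the rows being compared, both forms return the same rows.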

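Your NOT EXISTS approach should do the job, as long as the outer query reads from Table1 and the subquery from Table2. The following returns everything that is in Table1 but not in Table2: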
SELECT ym.Name, ym.TestName, ym.DEVName, ym.Dept
FROM Table1 ym
WHERE NOT EXISTS (
SELECT 1
FROM table2 wr
WHERE ym.Name = wr.Name AND ym.TestName = wr.TestName AND ym.DEVName = wr.DEVName AND ym.Dept = wr.Dept
)
If the structure of both tables is the same, EXCEPT is probably simpler.

IF OBJECT_ID(N'tempdb..#table1') IS NOT NULL drop table #table1
IF OBJECT_ID(N'tempdb..#table2') IS NOT NULL drop table #table2
create table #table1 (id int, value varchar(10))
create table #table2 (id int)
insert into #table1(id, value) VALUES (1,'value1'), (2,'value2'), (3,'value3')
--test here. Comment next line
insert into #table2(id) VALUES (1) --Comment/Uncomment
select * from #table1
select * from #table2
select #table1.*
from #table1
left JOIN #table2 on
#table1.id = #table2.id
where (#table2.id is not null or not exists (select * from #table2))

Get matching string with the percentage

Here are the details of my data:
Table 1: Table1 is small, with only a few records.
Table 2: Table2 has 50 million rows.
Requirement: I need to match any string column from table1 against table2, for example the name column against name, and get the percentage of matching (note that the column can be any string column, such as address, that has multiple words in a single cell).
Sample data:
create table table1(id int, name varchar(100), address varchar(200));
insert into table1 values(1,'Mario Speedwagon','H No 10 High Street USA');
insert into table1 values(2,'Petey Cruiser Jack','#1 Church Street UK');
insert into table1 values(3,'Anna B Sthesia','#101 No 1 B Block UAE');
insert into table1 values(4,'Paul A Molive','Main Road 12th Cross H No 2 USA');
insert into table1 values(5,'Bob Frapples','H No 20 High Street USA');
create table table2(name varchar(100), address varchar(200), email varchar(100));
insert into table2 values('Speedwagon Mario ','USA, H No 10 High Street','mario#gmail.com');
insert into table2 values('Cruiser Petey Jack','UK #1 Church Street','jack#gmail.com');
insert into table2 values('Sthesia Anna','UAE #101 No 1 B Block','Aanna#gmail.com');
insert into table2 values('Molive Paul','USA Main Road 12th Cross H No 2','APaul#gmail.com');
insert into table2 values('Frapples Bob ','USA H No 20 High Street','BobF#gmail.com');
Expected Result:
tbl1_Name            tbl2_Name            Percentage
--------------------------------------------------------
Mario Speedwagon     Speedwagon Mario     100
Petey Cruiser Jack   Cruiser Petey Jack   100
Anna B Sthesia       Sthesia Anna         around 80+
Paul A Molive        Molive Paul          around 80+
Bob Frapples         Frapples Bob         100
Note: The above is just sample data for illustration. In the actual scenario I have a few records in table1 and 50 million in table2.
My Try:
Step 1: As suggested by Shnugo, I normalized the data and stored the result in the same tables.
For table1:
ALTER TABLE table1 ADD Name_Normal VARCHAR(1000);
GO
--00:00:00 (5 row(s) affected)
UPDATE table1
SET Name_Normal=CAST('<x>' + REPLACE((SELECT LOWER(name) AS [*] FOR XML PATH('')),' ','</x><x>') + '</x>' AS XML)
.query(N'
for $fragment in distinct-values(/x/text())
order by $fragment
return $fragment
').value('.','nvarchar(1000)');
GO
For table2:
ALTER TABLE table2 ADD Name_Normal VARCHAR(1000);
GO
--01:59:03 (50000000 row(s) affected)
UPDATE table2
SET Name_Normal=CAST('<x>' + REPLACE((SELECT LOWER(name) AS [*] FOR XML PATH('')),' ','</x><x>') + '</x>' AS XML)
.query(N'
for $fragment in distinct-values(/x/text())
order by $fragment
return $fragment
').value('.','nvarchar(1000)');
GO
Step 2: Create a percentage calculation function using Levenshtein distance in Microsoft SQL Server.
Step 3: Query to get the matching percentage.
--00:00:33 (23456 row(s) affected)
SELECT t.name AS [tbl1_Name],t1.name AS [tbl2_Name],
dbo.ufn_Levenshtein(t.Name_Normal,t1.Name_Normal) percentage
into #TempTable
FROM table1 t
INNER JOIN table2 t1
ON CHARINDEX(SOUNDEX(t.Name_Normal),SOUNDEX(t1.Name_Normal))>0
--00:00:00 (23456 row(s) affected)
SELECT *
FROM #TempTable
WHERE percentage >= 50
order by percentage desc;
Conclusion: I am getting the expected result, but normalizing table2 takes around 2 hours, as noted in the timing comment above its UPDATE. Any suggestions for optimizing step 1 for table2?
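For reference, the normalize-then-compare logic of steps 1 to 3 can be sketched in Python as follows. This only illustrates the logic and is not a tuning fix for the 50 million row UPDATE. The functions normalize, levenshtein and similarity_percent are hypothetical names, and the percentage formula (1 minus distance divided by the longer length) is an assumption, since the definition of dbo.ufn_Levenshtein is not shown above.

```python
def normalize(text):
    """Lowercase, split on whitespace, drop duplicate words, sort the words."""
    return " ".join(sorted(set(text.lower().split())))


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity_percent(a, b):
    """Assumed formula: 100 * (1 - distance / length of the longer string)."""
    na, nb = normalize(a), normalize(b)
    longest = max(len(na), len(nb))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(na, nb) / longest)


exact = similarity_percent("Mario Speedwagon", "Speedwagon Mario ")
partial = similarity_percent("Anna B Sthesia", "Sthesia Anna")
print(exact)              # 100.0
print(round(partial, 1))  # 85.7
```

Because the words are sorted before comparison, word order does not affect the score, which is why the first pair comes out at 100.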

Have you tried looking into DQS (Data Quality Services)?
Depending on your SQL Server version, it is included with the installation files.
https://learn.microsoft.com/en-us/sql/data-quality-services/data-matching?view=sql-server-2017

Performance tuning on join two tables columns with patindex

Sample data:
Note:
The table tbl_test1 is a filtered table and may have fewer records, depending on the earlier filtering.
The following is just a data sample for illustration. The actual table tbl_test2 has 70 columns and 100 million records.
The WHERE condition is dynamic and can come in any combination.
The displayed columns are also dynamic, meaning one or more columns.
create table tbl_test1
(
col1 varchar(100)
);
insert into tbl_test1 values('John Mak'),('Omont Boy'),('Will Smith'),('Mak John');
create table tbl_test2
(
col1 varchar(100)
);
insert into tbl_test2 values('John Mak'),('Smith Will'),('Jack Don');
query 1: The following query takes more than 10 minutes and is still running against 100 million records.
select t2.col1
from tbl_test2 t2
inner join tbl_test1 t1 on patindex('%'+t1.col1+'%',t2.col1) > 0
query 2: This also keeps running; I am unable to get a result after waiting 10 minutes.
select t2.col1
from tbl_test2 t2
where exists
(
select * from tbl_test1 t1 where charindex(t1.col1,t2.col1) > 0
)
expected result:
col1
----------
John Mak
Smith Will
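The expected result suggests that the match should ignore word order ('Smith Will' matches 'Will Smith'), which a plain substring search cannot do. One possible direction, sketched below in Python, is to compare a precomputed normalized key with an equality lookup instead of scanning for substrings. This is a different technique from patindex and only applies if whole-value, order-insensitive matching is acceptable; token_key is a name invented for this example.

```python
def token_key(text):
    """Order-insensitive key: lowercase words, sorted and re-joined."""
    return " ".join(sorted(text.lower().split()))


# Sample data from the question.
tbl_test1 = ["John Mak", "Omont Boy", "Will Smith", "Mak John"]
tbl_test2 = ["John Mak", "Smith Will", "Jack Don"]

# Build the small side once as a set of keys, then probe it for each big-side row.
wanted_keys = {token_key(value) for value in tbl_test1}
matches = [value for value in tbl_test2 if token_key(value) in wanted_keys]

print(matches)  # ['John Mak', 'Smith Will']
```

In a database, the analogous step would be to store such a key in its own indexed column so that the join becomes an equality join rather than a scan with a leading wildcard.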

Select data from one table where a field is greater than that of another field in another table

I want to select data from TableA where Field1 is greater than Field2 in TableB.
In my head I imagine it to be something like this:
Select TableA.*
from TableA
Join TableB
On TableA.PK = TableB.FK
WHERE TableA.Field1 > TableB.Field2
I am using SQL Server 2005, and TableA.Field1 and TableB.Field2 look like:
2004102881010 - data type - varchar
My PK and FK look like:
0908232 - data type - nvarchar
The problem is that when this query is run, ALL the data is displayed, not just the rows where Field1 is greater.
Cheers :)

Seems to be working correctly for this demo code. Perhaps I'm not understanding the problem or data.
;
with TABLEA (PK, Field1) AS
(
-- Sample row that is filtered out
SELECT CAST('0908232' AS nvarchar(10)), CAST('2004102881010' AS varchar(50))
-- This is bigger than what's in B
UNION ALL SELECT CAST('0908232' AS nvarchar(10)), CAST('2005102881010' AS varchar(50))
)
, TABLEB(FK, Field2) AS
(
-- This matches row 1 above and will be excluded
SELECT CAST('0908232' AS nvarchar(10)), CAST('2004102881010' AS varchar(50))
)
SELECT TableA.*
FROM TableA
INNER JOIN TableB
ON TableA.PK = TableB.FK
WHERE TableA.Field1 > TableB.Field2
Results
PK Field1
0908232 2005102881010

This seems like a problem with missing zeroes:
20041028*0*81010
There is nothing wrong with your query; the problem is with your data.
Consider 2001-01-01 01:01:01, this would be seen as: 200111111
It should be seen as: 20010101010101

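Comparison operators (>, <) used on strings (varchar, nvarchar, etc.) compare character by character, as in alphabetical ordering, rather than by numeric value. For example, '9' > '11' is true. You might try a data type conversion. Values such as 2004102881010 are too large for int, so convert to bigint:
WHERE cast(TableA.Field1 as bigint) > cast(TableB.Field2 as bigint)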
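The difference between string and numeric comparison is easy to see in a few lines of Python, used here purely as an illustration since strings compare character by character there as well. The variable names are made up, and the last check simply confirms that a value like the one in the question does not fit in a 32-bit signed integer.

```python
# Strings compare character by character, so '9' sorts after '11'.
string_result = "9" > "11"             # True: '9' > '1' decides it
numeric_result = int("9") > int("11")  # False: 9 is less than 11

values = ["9", "11", "100"]
sorted_as_text = sorted(values)             # ['100', '11', '9']
sorted_as_numbers = sorted(values, key=int) # ['9', '11', '100']

# The sample value from the question exceeds the 32-bit signed int maximum.
exceeds_int32 = 2004102881010 > 2**31 - 1   # True

print(string_result, numeric_result)  # True False
print(sorted_as_text)                 # ['100', '11', '9']
print(sorted_as_numbers)              # ['9', '11', '100']
print(exceeds_int32)                  # True
```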
