SQL Server: Comparing dates from multiple records

When a part is created in a table ("ASC_PMA_TBL"), a number is auto-generated. Any "sub-parts" that are subsequently created then have an associated number. So for example, the "master" part might be 18245, and it may have several sub-parts, which would be "18245-50", "18245-40", etc. Sub-parts are always identified by the master part number, followed by a '-' and then a two-digit number. Each sub-part has a date associated with it ("EO_DATE"). All I want to do is display records where the "master" date doesn't match the date of one of its sub-parts. All data is in the one table, "ASC_PMA_TBL".
Normally this would be easily achieved using a join. However, in the database the sub-parts are not related to their master through foreign keys, so I'm having to find a different way of doing things.
Furthermore, the date field is a date/time field, so to compare them I first have to convert the field into a date-only field. I can do this, but then I'm unable to use the alias in my query!
Any help is much appreciated :)
I have tried creating temporary tables and using subqueries, but cannot solve this problem :(
UPDATE: Managed to solve the problem using temporary tables, truncating the part number of the sub-parts to match the master parts, and then joining the two to compare the dates. Might be messy, but it works!
SELECT
    PMA_PART_ONLY,
    CONVERT(DATE, PMA_EFFECT_DATE_OFF) AS 'EO_DATE'
INTO
    ##MParts
FROM
    ASC_PMA_TBL
WHERE
    (PMA_PROC_CODE = 'M') AND
    (PMA_EFFECT_DATE_OFF IS NOT NULL);
SELECT
    PMA_PART_ONLY,
    CONVERT(DATE, PMA_EFFECT_DATE_OFF) AS 'EO_DATE',
    SUBSTRING(PMA_PART_ONLY, 0, CHARINDEX('-', PMA_PART_ONLY, 0)) AS 'MP_NO'
INTO
    ##SParts
FROM
    ASC_PMA_TBL
WHERE
    (PMA_PROC_CODE = 'S') AND
    (PMA_EFFECT_DATE_OFF IS NOT NULL);
SELECT
    ##SParts.PMA_PART_ONLY AS 'SUB_PART_NO',
    ##MParts.EO_DATE AS 'M_PART_DATE',
    ##SParts.EO_DATE AS 'S_PART_DATE'
FROM
    ##MParts INNER JOIN ##SParts ON ##SParts.MP_NO = ##MParts.PMA_PART_ONLY
WHERE
    (##MParts.EO_DATE <> ##SParts.EO_DATE)
ORDER BY
    SUB_PART_NO DESC;
DROP TABLE ##MParts;
DROP TABLE ##SParts;

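If you want to compare just the dates and not the times, you have to convert the datetime values first:
select *
from ASC_PMA_TBL master
inner join ASC_PMA_TBL parts
    on parts.number like CAST(master.number AS VARCHAR(30)) + '-%'
where CAST(master.EO_DATE AS DATE) <> CAST(parts.EO_DATE AS DATE)
That's the main idea: get every master/part pair where the part number is like the master number followed by a hyphen (the separator described in the question), then keep the pairs whose dates differ. Here number and EO_DATE stand for your part-number and date columns.
Note that a hyphen needs no escaping in LIKE. If the separator were an underscore, you would have to write it as '[_]', because a bare _ matches any single character.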
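For what it's worth, here is a rough sketch of the same self-join written against the column names from the question's UPDATE (PMA_PART_ONLY, PMA_EFFECT_DATE_OFF, PMA_PROC_CODE), with no temporary tables. It assumes PMA_PART_ONLY is a varchar column and that master part numbers contain no LIKE wildcard characters. The CROSS APPLY (VALUES ...) lines are just one way to give the date-only value a name that can be reused in the WHERE clause, which is the alias problem mentioned in the question.

```sql
-- Sketch only: assumes PMA_PART_ONLY is varchar and sub-parts look like '<master>-NN'.
SELECT  s.PMA_PART_ONLY AS SUB_PART_NO,
        md.EO_DATE      AS M_PART_DATE,
        sd.EO_DATE      AS S_PART_DATE
FROM    ASC_PMA_TBL AS m
JOIN    ASC_PMA_TBL AS s
        ON s.PMA_PART_ONLY LIKE m.PMA_PART_ONLY + '-%'   -- sub-part starts with master number + hyphen
-- Name the date-only values once so they can be referenced below.
CROSS APPLY (VALUES (CONVERT(DATE, m.PMA_EFFECT_DATE_OFF))) AS md (EO_DATE)
CROSS APPLY (VALUES (CONVERT(DATE, s.PMA_EFFECT_DATE_OFF))) AS sd (EO_DATE)
WHERE   m.PMA_PROC_CODE = 'M'
  AND   s.PMA_PROC_CODE = 'S'
  AND   md.EO_DATE <> sd.EO_DATE    -- rows with a NULL date on either side drop out here
ORDER BY SUB_PART_NO DESC;
```

Because a comparison with NULL is never true, the explicit IS NOT NULL filters from the temp-table version are not needed in this form.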

Related

Can I use the dimension table's 'startdate' instead of the fact table's?

I'm joining a dim table and a fact table, both of which have a start date. Can I filter on the start date from the dim table instead of the one from the fact table? If so, why would we need the fact table's start date at all? Here is an example:
Select count(*)
from dim_table d
Inner join fact_table f
On d.bizkeys = f.bizkeys
Where currentind = '1'
And d.startdate = (select max(startdate) from dim_table)
With this startdate condition I get 1.8 million records. If I use this condition instead:
f.startdate = (select max(startdate) from fact_table)
I get 100 million records.
Can anyone please explain why I'm seeing this huge difference?
If you're just trying to get a list of all possible dates from the dimension (say, for a dropdown in your reporting tool that would let the user pick a date), then there's probably no reason that you would join to the fact table—unless you only want to include dates for which there's a corresponding fact record.
Without some sample data (or at least a little more information about the substance of the fact and dimension tables), I'm not sure I can give a better answer than that.

How to determine if a date falls within one of several ranges (ranges are stored in separate rows of another table)

I have a SalesFacts table, which contains Sales_Amount, Customer_ID and Invoice_Date.
In another table I have information about special agreements for some of the customers (the columns are Customer_ID, Agreement_Start_Date, Agreement_End_Date).
Now I would like to check whether each sale in the SalesFacts table occurred while a special agreement was active for the customer. This would be pretty easy if there were only one date range in which the special agreement was active. However, in my case the table of special agreement date ranges contains duplicate Customer_IDs, because one customer may have several time ranges in which a special agreement was active.
E.g. in the SalesFacts table I have 3 transactions for one customer:
In the SpacialAgreements table I can see that there are 2 date ranges in which this customer was entitled to special agreements.
I would like to create a query that adds an additional column to my SalesFacts table to indicate whether the transaction happened while a special agreement was active. So in the case shown above, it would be:
If there were only one date range with a special agreement, it would be pretty easy:
Select
    S.[Sales_Amount], S.[Customer_ID], S.[Invoice_Date],
    IIF(S.[Invoice_Date] >= A.[Agreement_Start_Date] and S.[Invoice_Date] <= A.[Agreement_End_Date], 'YES', 'NO') as AGREEMENT
From SalesFacts S left join SpacialAgreements A on S.[Customer_ID] = A.[Customer_ID]
But since there are several date ranges in the SpacialAgreements table, I don't know how to achieve that properly without risking duplicates in Sales_Amount and without losing any data.
Any ideas?
If you want to get the data exactly as shown in the question, then for the SELECT statement you can use something like this:
SELECT
    S.[Sales_Amount],
    S.[Customer_ID],
    S.[Invoice_Date],
    CASE WHEN EXISTS (SELECT 1
                      FROM SpacialAgreements A
                      WHERE A.Customer_ID = S.Customer_ID
                        AND S.[Invoice_Date] >= A.[Agreement_Start_Date]
                        AND S.[Invoice_Date] <= A.[Agreement_End_Date])
         THEN 'YES'
         ELSE 'NO'
    END as Agreement
FROM SalesFacts S
So this solution can be used if you are selecting data or creating a view from this query.
If you want the value persisted as a physical column in your SalesFacts table, you can try to solve the problem with triggers.
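To see why the EXISTS form avoids duplicates, here is a small self-contained sketch using table variables. The rows are made up purely for illustration (the question's own sample data is not shown), and the column types are assumptions. One customer has three sales and two agreement ranges; a LEFT JOIN on Customer_ID alone would return six rows, while the EXISTS version returns exactly one row per sale.

```sql
-- Hypothetical sample data; types are assumed.
DECLARE @SalesFacts TABLE (Sales_Amount DECIMAL(10,2), Customer_ID INT, Invoice_Date DATE);
DECLARE @SpacialAgreements TABLE (Customer_ID INT, Agreement_Start_Date DATE, Agreement_End_Date DATE);

INSERT INTO @SalesFacts VALUES
    (100.00, 1, '20170115'),
    (200.00, 1, '20170310'),
    (300.00, 1, '20170620');

INSERT INTO @SpacialAgreements VALUES
    (1, '20170101', '20170131'),
    (1, '20170601', '20170630');

-- One output row per sale, regardless of how many ranges the customer has.
SELECT  S.Sales_Amount,
        S.Customer_ID,
        S.Invoice_Date,
        CASE WHEN EXISTS (SELECT 1
                          FROM   @SpacialAgreements AS A
                          WHERE  A.Customer_ID = S.Customer_ID
                            AND  S.Invoice_Date >= A.Agreement_Start_Date
                            AND  S.Invoice_Date <= A.Agreement_End_Date)
             THEN 'YES'
             ELSE 'NO'
        END AS Agreement
FROM    @SalesFacts AS S
ORDER BY S.Invoice_Date;
-- Agreement column for the three rows, in date order: YES, NO, YES
```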

How to force reasonable execution plan for query with LIKE statement?

When creating ad-hoc queries to look for information in a table I have run into this issue over and over.
Let's say I have a table with a million records with fields id - int, createddatetime - timestamp, category - varchar(50) and content - varchar(max). I want to find all records in the last day that have a certain string in the content field. If I create a query like this...
select *
from table
where createddatetime > '2018-1-31'
and content like '%something%'
it may complete in a second, because in the last day there may only be 100 records, so the LIKE clause is only operating on a small number of records.
However if I add one more item to the where clause...
select *
from table
where createddatetime > '2018-1-31'
and content like '%something%'
and category = 'testing'
then it could take many minutes to complete while locking up the table.
It appears to change from evaluating all the straightforward WHERE clause items first and then applying the LIKE to the limited set of records, over to evaluating the LIKE clause first. There are even times when there are multiple LIKE conditions and adding one more causes the query to go from a split second to minutes.
The only solution I've found is to generate an intermediate table (maybe temp tables would work), insert records based on the basic WHERE clause items, and then run a separate query to filter by one or more LIKE conditions. I've tried various JOIN and CTE approaches, which usually bring no improvement. Alternatively, CHARINDEX also appears to work, though it is difficult to use when converting the logic of multiple LIKE conditions.
Is there any hint or something that can be placed in the query to tell SQL Server to wait until records are filtered by the basic WHERE clause items before filtering by the LIKE?
I actually just tried this approach and it had the same issue...
select *
from (
select *, charindex('something', content) as found
from bounce
where createddatetime > '2018-1-31'
) t
where found > 0
While the subquery on its own returns in a couple of seconds, the overall query just never returns. Why is this so bad?
Not fancy, but I've had better luck with temp tables than nested select statements... It will isolate the first data set, and then you can select just from that. If you're looking for quick and dirty, which usually serves my purposes for ad-hoc, this may help. If this is a permanent stored proc, the indexing suggestions may serve you better in the long run.
select *
into #like
from table
where createddatetime > '2018-1-31'
and content like '%something%'
select *
from #like
where category = 'testing'
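A variant of the same idea, closer to what the question describes, is to put only the cheap predicates in the first step and run every LIKE against the temp table. This is a sketch under assumptions: it uses the table name bounce from the question's last snippet, and it assumes createddatetime is a datetime-type column (the question calls it "timestamp", which in SQL Server is a synonym for rowversion rather than a date type).

```sql
-- Step 1: apply only the cheap, selective predicates and materialise the survivors.
SELECT id, createddatetime, category, content
INTO   #recent
FROM   bounce
WHERE  createddatetime > '20180131'
  AND  category = 'testing';

-- Step 2: run the expensive pattern match against the small temp table only.
-- Additional LIKE conditions can be added here without touching the big table again.
SELECT *
FROM   #recent
WHERE  content LIKE '%something%';

DROP TABLE #recent;
```

Since the second statement can only see the rows in #recent, the optimizer has no way to apply the LIKE to the full table first.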

Convert Date Stored as VARCHAR into INT to compare to Date Stored as INT

I'm using SQL Server 2014. My request I believe is rather simple. I have one table containing a field holding a date value that is stored as VARCHAR, and another table containing a field holding a date value that is stored as INT.
The date value in the VARCHAR field is stored like this: 2015M01
The date value in the INT field is stored like this: 201501
I need to compare these tables against each other using EXCEPT. My thought process was to somehow extract or TRIM the "M" out of the VARCHAR value and see if it would let me compare the two. If anyone has a better idea such as using CAST to change the date formats or something feel free to suggest that as well.
I am also concerned that even extracting the "M" out of the VARCHAR may still prevent the comparison since one will still remain VARCHAR and the other is INT. If possible through a T-SQL query to convert on the fly that would be great advice as well. :)
REPLACE the string and then CONVERT to integer
SELECT A.*, B.*
FROM TableA A
INNER JOIN
(SELECT intField
FROM TableB
) as B
ON CONVERT(INT, REPLACE(A.varcharField, 'M', '')) = B.intField
Since you say you already have the query and are using EXCEPT, you can simply change the definition of that one "date" field in the query containing the VARCHAR value so that it matches the INT format of the other query. For example:
SELECT Field1, CONVERT(INT, REPLACE(VarcharDateField, 'M', '')) AS [DateField], Field3
FROM TableA
EXCEPT
SELECT Field1, IntDateField, Field3
FROM TableB
HOWEVER, while I realize that this might not be feasible, your best option, if you can make this happen, would be to change how the data in the table with the VARCHAR field is stored so that it is actually an INT in the same format as the table with the data already stored as an INT. Then you wouldn't have to worry about situations like this one.
Meaning:
1) Add an INT field to the table with the VARCHAR field.
2) Do an UPDATE of that table, setting the INT field to the string value with the M removed.
3) Update any INSERT and/or UPDATE stored procedures used by external services (app, ETL, etc) to do that same M removal logic on the way in. Then you don't have to change any app code that does INSERTs and UPDATEs. You don't even need to tell anyone you did this.
4) Update any "get" / SELECT stored procedures used by external services (app, ETL, etc) to do the opposite logic: convert the INT to VARCHAR and add the M on the way out. Then you don't have to change any app code that gets data from the DB. You don't even need to tell anyone you did this.
This is one of many reasons that having a Stored Procedure API to your DB is quite handy. I suppose an ORM can just be rebuilt, but you still need to recompile, even if all of the code references are automatically updated. But making a datatype change (or even moving a field to a different table, or even replacing a field with a simple CASE statement) "behind the scenes" and masking it so that any code outside of your control doesn't know that a change happened is not nearly as difficult as most people might think. I have done all of these operations (datatype change, move a field to a different table, replace a field with simple logic, etc, etc) and it buys you a lot of time until the app code can be updated. That might be another team who handles that. Maybe their schedule won't allow for making any changes in that area (plus testing) for 3 months. Ok. It will be there waiting for them when they are ready. And if there are several areas to update, then they can be done one at a time. You can even create new stored procedures to run in parallel for any updated app code to have the proper INT datatype as the input parameter. And once all references to the VARCHAR value are gone, then delete the original versions of those stored procedures.
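A rough sketch of the first steps described above, reusing TableA and VarcharDateField from the earlier example. The new column name DateFieldInt is hypothetical, and the stored procedures themselves are left out because their definitions are not known.

```sql
-- 1) Add an INT column next to the existing VARCHAR one (DateFieldInt is a made-up name).
ALTER TABLE dbo.TableA ADD DateFieldInt INT NULL;
GO
-- GO is the SSMS/sqlcmd batch separator; the UPDATE must be in a later batch to see the new column.

-- 2) Back-fill it: '2015M01' becomes 201501.
UPDATE dbo.TableA
SET    DateFieldInt = CONVERT(INT, REPLACE(VarcharDateField, 'M', ''));
GO

-- 3) In the "get" procedures, rebuild the old format on the way out: 201501 becomes '2015M01'.
--    STUFF inserts 'M' at position 5 without deleting any characters.
SELECT STUFF(CONVERT(VARCHAR(6), DateFieldInt), 5, 0, 'M') AS VarcharDateField
FROM   dbo.TableA;
```

The INSERT/UPDATE procedures would apply the same REPLACE/CONVERT expression as step 2 to their incoming parameter.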
If you want everything in the first table that is not in the second, you might consider something like this:
select t1.*
from t1
where not exists (select 1
from t2
where cast(replace(t1.varcharfield, 'M', '') as int) = t2.intfield
);
This should be close enough to except for your purposes.
I should add that you might need to include other columns in the where statement. However, the question only mentions one column, so I don't know what those are.
You could create a persisted view (an indexed view, in SQL Server terms) on the table with the char column, with a calculated column where the M is removed. Then you could JOIN the view to the table containing the INT column.
CREATE VIEW dbo.PersistedView
WITH SCHEMABINDING
AS
SELECT ConvertedDateCol = CONVERT(INT, REPLACE(VarcharCol, 'M', ''))
--, other columns including the PK, etc
FROM dbo.TablewithCharColumn;
GO
CREATE UNIQUE CLUSTERED INDEX IX_PersistedView
ON dbo.PersistedView(<the PK column>);
GO
SELECT *
FROM dbo.PersistedView pv
INNER JOIN dbo.TableWithIntColumn ic ON pv.ConvertedDateCol = ic.IntDateCol;
If you provide the actual details of both tables, I will edit my answer to make it clearer.
A persisted view with a computed column will perform far better on the SELECT statement where you join the two columns compared with doing the CONVERT and REPLACE every time you run the SELECT statement.
However, a persisted view will slightly slow down inserts into the underlying table(s), and will prevent you from making DDL changes to the underlying tables.
If you're looking to not persist the values via a schema-bound view, you could create a non-persisted computed column on the table itself, then create a non-clustered index on that column. If you are using the computed column in WHERE or JOIN clauses, you may see some benefit.
By way of example:
CREATE TABLE dbo.PCT
(
PCT_ID INT NOT NULL
CONSTRAINT PK_PCT
PRIMARY KEY CLUSTERED
IDENTITY(1,1)
, SomeChar VARCHAR(50) NOT NULL
, SomeCharToInt AS CONVERT(INT, REPLACE(SomeChar, 'M', ''))
);
CREATE INDEX IX_PCT_SomeCharToInt
ON dbo.PCT(SomeCharToInt);
INSERT INTO dbo.PCT(SomeChar)
VALUES ('2015M08');
SELECT SomeCharToInt
FROM dbo.PCT;
Results: a single row, with SomeCharToInt = 201508.

Merge query using two tables in SQL server 2012

I am very new to SQL and SQL server, would appreciate any help with the following problem.
I am trying to update a share price table with new prices.
The table has three columns: share code, date, price.
The share code + date = PK
As you can imagine, if you have thousands of share codes and 10 years' data for each, the table can get very big. So I have created a separate table called a share ID table, and use a share ID instead in the first table (I was reliably informed this would speed up the query, as searching by integer is faster than string).
So, to summarise, I have two tables as follows:
Table 1 = Share_code_ID (int), Date, Price
Table 2 = Share_code_ID (int), Share_name (string)
So let's say I want to update the table/s with today's price for share ZZZ. I need to:
1) Look for the Share_code_ID corresponding to 'ZZZ' in table 2
2) If it is found, update table 1 with the new price for that date, using the Share_code_ID I just found
3) If the Share_code_ID is not found, update both tables
Let's ignore for now how the Share_code_ID is generated for a new code, I'll worry about that later.
I'm trying to use a merge query loosely based on the following structure, but have no idea what I am doing:
MERGE INTO [Table 1]
USING (VALUES (1,23-May-2013,1000)) AS SOURCE (Share_code_ID,Date,Price)
{ SEEMS LIKE THERE SHOULD BE AN INNER JOIN HERE OR SOMETHING }
ON Table 2 = 'ZZZ'
WHEN MATCHED THEN UPDATE SET Table 1.Price = 1000
WHEN NOT MATCHED THEN INSERT { TO BOTH TABLES }
Any help would be appreciated.
http://msdn.microsoft.com/library/bb510625(v=sql.100).aspx
You are using Table1 as the target table and Table2 as the source table.
You want to take an action when the given ID is not found in Table2, that is, in the source table.
In the documentation you have already read, that corresponds to the clause
WHEN NOT MATCHED BY SOURCE ... THEN <merge_matched>
and the latter is defined as
<merge_matched>::=
{ UPDATE SET <set_clause> | DELETE }
Ergo, you cannot insert into the source table there.
You could use a trigger to insert automatically whenever you insert something into Table1, but it would not be able to insert the proper Share_name; the trigger just won't know it.
So you have two options, I guess.
1) Write a T-SQL code block; look into stored procedures. I think there is also a construct for executing an anonymous code block in MS SQL, like the EXECUTE BLOCK command in Firebird SQL Server, but I don't know for sure.
2) Create an updatable SQL VIEW joining Table1 and Table2 to show the most current date, so that when you insert a row into the view, the view's on-insert trigger actually inserts rows into both tables. And when you update the data in the view, the on-update trigger modifies the data.
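As a rough illustration of option 1, the batch below looks up the share, inserts it into the lookup table if it is missing, and then uses MERGE only against the price table. Everything named here is an assumption: dbo.ShareCode stands in for Table 2 and dbo.SharePrice for Table 1, the column types are guessed, and Share_code_ID is assumed to be an IDENTITY column (the question leaves ID generation open). It is a sketch, not a concurrency-safe implementation.

```sql
-- Hypothetical names: dbo.ShareCode = Table 2, dbo.SharePrice = Table 1.
-- Assumes dbo.ShareCode.Share_code_ID is an IDENTITY column.
DECLARE @Share_name    VARCHAR(20)   = 'ZZZ',
        @Date          DATE          = '20130523',
        @Price         DECIMAL(18,4) = 1000,
        @Share_code_ID INT;

BEGIN TRANSACTION;

-- Look for the ID that belongs to the share name.
SELECT @Share_code_ID = Share_code_ID
FROM   dbo.ShareCode
WHERE  Share_name = @Share_name;

-- Not found: create the share first and pick up its new ID.
IF @Share_code_ID IS NULL
BEGIN
    INSERT INTO dbo.ShareCode (Share_name) VALUES (@Share_name);
    SET @Share_code_ID = SCOPE_IDENTITY();
END;

-- Upsert the price for that share and date.
MERGE INTO dbo.SharePrice AS tgt
USING (VALUES (@Share_code_ID, @Date, @Price)) AS src (Share_code_ID, [Date], Price)
    ON  tgt.Share_code_ID = src.Share_code_ID
    AND tgt.[Date]        = src.[Date]
WHEN MATCHED THEN
    UPDATE SET Price = src.Price
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Share_code_ID, [Date], Price)
    VALUES (src.Share_code_ID, src.[Date], src.Price);

COMMIT TRANSACTION;
```

Wrapped in a stored procedure with the three values as parameters, this keeps the "insert into both tables" case out of the MERGE itself, which can only modify its target.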
