The situation is quite complicated to express in the title. An example should be much easier to understand.
My table A:
uid id ticket created_date
001 1 movie 2015-01-23 08:23:16
002 25 TV 2012-01-13 12:02:20
003 1 movie 2015-02-01 07:15:36
004 1 movie 2014-02-15 15:38:40
What I need to achieve is to remove duplicate records that appear within 31 days of each other and retain the record that appears first. So the above table would be reduced to B:
uid id ticket created_date
001 1 movie 2015-01-23 08:23:16
002 25 TV 2012-01-13 12:02:20
004 1 movie 2014-02-15 15:38:40
because the 3rd row in A was within 31 days of row 1 and appeared later than row 1 (2015-02-01 vs 2015-01-23), so it gets removed.
Is there a clean way to do this?
I would suggest the following approach:
SELECT A.uid AS uid
INTO #tempA
FROM A
LEFT JOIN A AS B
ON A.id=B.id AND A.ticket=B.ticket
WHERE DATEDIFF(SECOND,B.created_date,A.created_date) > 0 AND
DATEDIFF(SECOND,B.created_date,A.created_date) < 31*24*60*60;
DELETE FROM A WHERE uid IN (SELECT uid FROM #tempA);
This is assuming that by 'duplicate records' you mean records that have both identical id as well as identical ticket fields. If that's not the case you should adjust the ON clause accordingly.
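As a rough, runnable illustration of the same idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server. That swap is an assumption made only so the example can be executed anywhere: DATEDIFF is replaced by SQLite's julianday(), the temp table is folded into a subquery, and the function name dedupe_within_31_days is hypothetical.

```python
import sqlite3

def dedupe_within_31_days(rows):
    """rows: (uid, id, ticket, created_date) tuples. Returns the uids that survive."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE A (uid TEXT PRIMARY KEY, id INTEGER, ticket TEXT, created_date TEXT)"
    )
    conn.executemany("INSERT INTO A VALUES (?, ?, ?, ?)", rows)
    # Delete a row when an earlier row with the same id and ticket
    # exists less than 31 days before it (self-join, as in the answer).
    conn.execute("""
        DELETE FROM A
        WHERE uid IN (
            SELECT later.uid
            FROM A AS later
            JOIN A AS earlier
              ON later.id = earlier.id AND later.ticket = earlier.ticket
            WHERE julianday(later.created_date) - julianday(earlier.created_date) > 0
              AND julianday(later.created_date) - julianday(earlier.created_date) < 31
        )
    """)
    kept = [row[0] for row in conn.execute("SELECT uid FROM A ORDER BY uid")]
    conn.close()
    return kept

sample = [
    ("001", 1, "movie", "2015-01-23 08:23:16"),
    ("002", 25, "TV", "2012-01-13 12:02:20"),
    ("003", 1, "movie", "2015-02-01 07:15:36"),
    ("004", 1, "movie", "2014-02-15 15:38:40"),
]
kept_uids = dedupe_within_31_days(sample)
print(kept_uids)  # ['001', '002', '004']
```

Note that each row is compared with every earlier row, including rows that are themselves removed, so a long chain of closely spaced records collapses to its first record.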
I have a table called Customer with columns such as National Code and Name. A customer can also have several Contact Numbers and several Recommenders, so those values need to be stored in some other table.
Also suppose I have other tables like Customer, each of which has attributes that can hold more than one value.
What is your suggestion for storing these values?
One source suggested using a table called StringValue for each such table. Does EF Core have a way to implement StringValue without writing additional code?
Example:
Customer Table:
CustomerId Name NationalCode
------------------------------------------------------------------------
1 David xxxx
------------------------------------------------------------------------
StringValue Table:
StringId CustomerId StringName Value
------------------------------------------------------------------------
10 1 PhoneNumber 915245
11 1 PhoneNumber 985452
12 1 PhoneNumber 935446
13 1 Recommenders Mr Jhon
14 1 Recommenders Mr bb
------------------------------------------------------------------------
I think it is more intuitive to create a new table for each field that can have more than one record, and then configure a one-to-many relationship between the two tables. Taking your case as an example, you can divide the customer table into three tables linked by foreign keys:
1.Customer Table:
CustomerId Name NationalCode
---------------------------------------------
1 David xxxx
2.Contact Table:
Id CustomerId PhoneNumber
---------------------------------------------
1 1 915245
2 1 985452
3 1 935446
3.Recommender Table:
Id CustomerId RecommenderName
---------------------------------------------
1 1 Mr Jhon
2 1 Mr bb
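The three tables above could be declared roughly as follows. This is only a sketch of the schema, not EF Core code: it uses Python's sqlite3 module so it can be run as-is, and the column types and constraints are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE Customer (
        CustomerId   INTEGER PRIMARY KEY,
        Name         TEXT NOT NULL,
        NationalCode TEXT NOT NULL
    );
    -- One customer, many phone numbers
    CREATE TABLE Contact (
        Id          INTEGER PRIMARY KEY,
        CustomerId  INTEGER NOT NULL REFERENCES Customer(CustomerId),
        PhoneNumber TEXT NOT NULL
    );
    -- One customer, many recommenders
    CREATE TABLE Recommender (
        Id              INTEGER PRIMARY KEY,
        CustomerId      INTEGER NOT NULL REFERENCES Customer(CustomerId),
        RecommenderName TEXT NOT NULL
    );
    INSERT INTO Customer VALUES (1, 'David', 'xxxx');
    INSERT INTO Contact VALUES (1, 1, '915245'), (2, 1, '985452'), (3, 1, '935446');
    INSERT INTO Recommender VALUES (1, 1, 'Mr Jhon'), (2, 1, 'Mr bb');
""")

phones = [r[0] for r in conn.execute(
    "SELECT PhoneNumber FROM Contact WHERE CustomerId = ? ORDER BY Id", (1,))]
recommenders = [r[0] for r in conn.execute(
    "SELECT RecommenderName FROM Recommender WHERE CustomerId = ? ORDER BY Id", (1,))]
print(phones)        # ['915245', '985452', '935446']
print(recommenders)  # ['Mr Jhon', 'Mr bb']
```

Compared with a generic StringValue table, each child table here has a typed, named column, so the database can enforce the relationship through the foreign key.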
I have two tables:
1. Accounts, with an Account and an Amount column
2. A list of related accounts
Data samples:
Account | Amount
--------+---------
001 | $100
002 | $150
003 | $200
004 | $300
Account | Related Account
--------+------------------
001 | 002
002 | 003
003 | 002
My goal is to be able to aggregate all related accounts. From table two - 001,002 & 003 are actually all related to each other. What I would like to be able to do is to get a sum of all related accounts. Possibly ID 001 to 003 as Account #1, so I can aggregate them.
Result below
ID | Account | Amount
-----+-----------+--------
#1 | 001 | $100
#1 | 002 | $150
#1 | 003 | $200
#2 | 004 | $300
I can then manipulate the above table as below (final result)
ID | Amount
-----+--------
#1 | $450
#2 | $300
I tried doing a join, but it doesn't quite achieve what I want. I still have a problem relating account 001 with 003 (they are indirectly related because 002 is related with both 001 and 003).
If anyone can point me in the right direction, it will be much appreciated.
Well, you really made this harder than it should be.
If you could change the data in the second table, so it will not contain reversed duplicates (in your sample data - 2,3 and 3,2) it would simplify the solution.
If you could refactor both tables into a single table, where the related column is a self referencing nullable foreign key, it would simplify the solution even more.
Let's assume for a minute you can't do either, and you have to work with the data as provided. So the first thing you want to do is to ignore the reversed duplicates in the second table. This can be done using a common table expression and a couple of case expressions.
First, create and populate sample tables (Please save us this step in your future questions):
DECLARE @TAccount AS TABLE
(
Account int,
Amount int
)
INSERT INTO @TAccount (Account, Amount) VALUES
(1, 100),
(2, 150),
(3, 200),
(4, 300)
DECLARE @TRelatedAccounts AS TABLE
(
Account int,
Related int
)
INSERT INTO @TRelatedAccounts (Account, Related) VALUES
(1,2),
(2,3),
(3,2)
You want to get only the first two records from the @TRelatedAccounts table.
This is the AccountAndRelated CTE.
Now, you want to left join the @TAccount table with the results of this query, so for each Account we will have the Account, the Amount, and the Related Account, or NULL if the account is not related to any other account or is the first in the relationship chain.
This is the CTERecursiveBase CTE.
Then, based on that you can create a recursive CTE (called CTERecursive), and finally select the sum of amount from the recursive CTE based on the root of the recursion.
Here is the entire script:
;WITH AccountAndRelated AS
(
SELECT DISTINCT CASE WHEN Account > Related THEN Account ELSE Related END As Account,
CASE WHEN Account > Related THEN Related ELSE Account END As Related
FROM @TRelatedAccounts
)
, CTERecursiveBase AS
(
SELECT A.Account, Related, Amount
FROM @TAccount As A
LEFT JOIN AccountAndRelated As R ON A.Account = R.Account
)
, CTERecursive AS
(
SELECT Account As Id, Account, Related, Amount
FROM CTERecursiveBase
WHERE Related IS NULL
UNION ALL
SELECT Id, B.Account, B.Related, B.Amount
FROM CTERecursiveBase AS B
JOIN CTERecursive AS R ON B.Related = R.Account
)
SELECT Id, SUM(Amount) As TotalAmount
FROM CTERecursive
GROUP BY Id
Results:
Id TotalAmount
1 450
4 300
You can see a live demo on rextester.
Now, let's assume you can modify the data of the second table. You can use the AccountAndRelated CTE to find the records you need to keep in the @TRelatedAccounts table. Once only those records remain, you can skip the AccountAndRelated CTE and use @TRelatedAccounts directly in the CTERecursiveBase CTE.
You can see a live demo of that as well.
Finally, let's assume you can refactor your database. In that case, I would recommend merging the two tables into one, so your @TAccount table would look like this:
Account Amount Related
1 100 NULL
2 150 1
3 200 2
4 300 NULL
Then you only need the recursive cte.
Here is a live demo of that option as well.
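For the refactored single-table option, the remaining recursive CTE might look like the sketch below. It runs on SQLite through Python's sqlite3 module rather than on SQL Server (an assumption made so the example is executable), so the CTE is introduced with WITH RECURSIVE and an ordinary table named TAccount replaces the table variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TAccount (
        Account INTEGER PRIMARY KEY,
        Amount  INTEGER NOT NULL,
        Related INTEGER REFERENCES TAccount(Account)  -- self-referencing, nullable
    );
    INSERT INTO TAccount (Account, Amount, Related) VALUES
        (1, 100, NULL),
        (2, 150, 1),
        (3, 200, 2),
        (4, 300, NULL);
""")

totals = conn.execute("""
    WITH RECURSIVE CTERecursive (Id, Account, Amount) AS (
        -- Anchor: accounts at the start of a chain are their own group id
        SELECT Account, Account, Amount
        FROM TAccount
        WHERE Related IS NULL
        UNION ALL
        -- Recursive step: attach accounts that point at an already visited account
        SELECT R.Id, A.Account, A.Amount
        FROM TAccount AS A
        JOIN CTERecursive AS R ON A.Related = R.Account
    )
    SELECT Id, SUM(Amount) AS TotalAmount
    FROM CTERecursive
    GROUP BY Id
    ORDER BY Id
""").fetchall()
print(totals)  # [(1, 450), (4, 300)]
```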
I have some issues with creating a database schema for the following scenario:
A shop where you configure your order. Let's say the user orders flowers and chocolate. So far I had the following structure:
OrderID FK_Flower FK_Chocolate
1 1 1
where the FKs point to entries in tables such as:
Id Name Price
1 Rose 100
The same for chocolate.
However, there is now a change: the user can order multiple different flowers. So let's say they can order 5 Roses and 3 Daisies.
What changes should I make, to solve this issue?
You want a many-to-many relationship.
Change the Orders table to something like this:
Orders:
-------
Id
1
The Flowers table stays the same:
Flowers:
------------------
Id Name Price
1 Rose 100
2 Daisy 120
Create a new table with the Order and Flower IDs as foreign keys:
Orders_Flowers:
-------------------------
FK_Order_Id FK_Flower_ID
1 1
1 1
1 1
1 1
1 1
1 2
1 2
1 2
This way, the Order with Id = 1, has 5 Roses and 3 Daisies.
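To make the layout above concrete, here is a small sketch that builds the three tables and reads the order back with a COUNT per flower. It uses Python's sqlite3 module as a convenient stand-in for whatever database the shop uses; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders  (Id INTEGER PRIMARY KEY);
    CREATE TABLE Flowers (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL, Price INTEGER NOT NULL);
    -- Junction table: one row per ordered flower
    CREATE TABLE Orders_Flowers (
        FK_Order_Id  INTEGER NOT NULL REFERENCES Orders(Id),
        FK_Flower_ID INTEGER NOT NULL REFERENCES Flowers(Id)
    );
    INSERT INTO Orders VALUES (1);
    INSERT INTO Flowers VALUES (1, 'Rose', 100), (2, 'Daisy', 120);
    INSERT INTO Orders_Flowers VALUES
        (1, 1), (1, 1), (1, 1), (1, 1), (1, 1),
        (1, 2), (1, 2), (1, 2);
""")

order_summary = conn.execute("""
    SELECT F.Name, COUNT(*) AS Quantity, SUM(F.Price) AS Subtotal
    FROM Orders_Flowers AS L
    JOIN Flowers AS F ON F.Id = L.FK_Flower_ID
    WHERE L.FK_Order_Id = ?
    GROUP BY F.Id, F.Name
    ORDER BY F.Id
""", (1,)).fetchall()
print(order_summary)  # [('Rose', 5, 500), ('Daisy', 3, 360)]
```

A common variant, not part of the answer above, is to store a single row per order and flower with an extra Quantity column instead of repeating rows.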
All tables in the database have a Date column named EffectiveDate.
Data is imported into the database using a logic which detects and inserts changed records only.
Let us assume 5 imports happened between 1/1/2014 and 5/1/2014
So Table A has:
EffectiveDate id1 column1 column2
-------------- ---- -------- --------
01/01/2014 1 ABC 123
02/01/2014 1 ABC 999
05/01/2014 1 XXX 999
01/01/2014 2 CCCC 555
03/01/2014 2 CCCC 444
04/01/2014 2 DDDD 444
01/01/2014 3 xxxxx 333
and Table B has
EffectiveDate id2 column1 column2
-------------- ----- -------- --------
01/01/2014 1 ZZZZ AAAAA
03/01/2014 1 ZZZZ AABBB
01/01/2014 2 TTTT AAAAA
05/01/2014 2 TTTT AABBB
Now the task is to create 3 sets of views for all tables:
The first set is to give the effective data as of the current date
The second set is to give the latest data
The third set is to give the data changes after today's date (just the next changes, not the latest)
Consideration:
All views should return only one row for each id with applicable effective date.
If the effective date is not available, then the maximum effective date in the table that is less than the requested effective date should be used.
I was able to come up with solution for the Effective and Latest views but not for the third set of views (Next changes)
Any idea how to address this?
You'll need to use the Row_Number function to get this. For each id, the first future row (whatever that means...) will have a row_number of 1.
with RowNumbers as
(select
id1,
effectivedate,
row_number() over (partition by id1 order by effectivedate) as RowNumber
from
a
where
effectivedate > getdate()
)
select
a.*
from
A
inner join RowNumbers
on a.id1 = Rownumbers.id1
and rownumbers.rownumber = 1
and a.effectivedate = rownumbers.effectivedate
SQL Fiddle
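The same Row_Number query can be tried out with the sketch below, which runs it on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer). Several things here are assumptions for the sake of a repeatable example: getdate() is replaced by an explicit as_of parameter, the sample dates are read as month/day/year and rewritten as ISO strings, and the function name next_changes is hypothetical.

```python
import sqlite3

def next_changes(rows, as_of):
    """Return, for each id1, the first row whose EffectiveDate is after as_of."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE A (EffectiveDate TEXT, id1 INTEGER, column1 TEXT, column2 TEXT)"
    )
    conn.executemany("INSERT INTO A VALUES (?, ?, ?, ?)", rows)
    result = conn.execute("""
        WITH RowNumbers AS (
            SELECT id1, EffectiveDate,
                   ROW_NUMBER() OVER (PARTITION BY id1 ORDER BY EffectiveDate) AS RowNumber
            FROM A
            WHERE EffectiveDate > ?      -- only future rows are numbered
        )
        SELECT A.EffectiveDate, A.id1, A.column1, A.column2
        FROM A
        JOIN RowNumbers AS R
          ON A.id1 = R.id1
         AND A.EffectiveDate = R.EffectiveDate
         AND R.RowNumber = 1            -- keep the earliest future row per id
        ORDER BY A.id1
    """, (as_of,)).fetchall()
    conn.close()
    return result

table_a = [
    ("2014-01-01", 1, "ABC", "123"),
    ("2014-02-01", 1, "ABC", "999"),
    ("2014-05-01", 1, "XXX", "999"),
    ("2014-01-01", 2, "CCCC", "555"),
    ("2014-03-01", 2, "CCCC", "444"),
    ("2014-04-01", 2, "DDDD", "444"),
    ("2014-01-01", 3, "xxxxx", "333"),
]
upcoming = next_changes(table_a, "2014-02-15")
print(upcoming)
# [('2014-05-01', 1, 'XXX', '999'), ('2014-03-01', 2, 'CCCC', '444')]
```

An id with no future rows (id 3 here) simply does not appear in the result.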
I have a huge Access mdb file which contains a single table with 20-30 columns and over 50000 rows.
It looks something like this:
columns:
id desc name phone email fax ab bc zxy sd country state zip .....
1 a ab 12 fff 12 w 2 3 2 d sd 233
2 d ab 12 fff 12 s 2 3 1 d sd 233
Here some of the column values related to addresses are repeated. Is there a way to normalize the above table so that we can remove duplicates or repeating data?
Thanks in advance.
Here's a quick answer. You just need to move your address fields to a new table (remove dups) and add a FK back to your primary table.
Table 1 (People or whatever)
id desc name phone email fax ab bc zxy sd address_id
1 a ab 12 fff 12 w 2 3 2 1
2 d ab 12 fff 12 s 2 3 1 2
3 d ab 12 fff 12 s 2 3 1 2
4 d ab 12 fff 12 s 2 3 1 1
Table 2 (Address)
address_id country state zip .....
1 d sd 233
2 e ac 123
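One way to carry out that split is sketched below: copy the distinct address combinations into the new table, then rebuild the main table with an address_id that points back to them. The sketch uses Python's sqlite3 module instead of Access (an assumption made to keep it runnable), and the table names Raw, Address and People plus the reduced column list are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Original flat table, reduced to a few columns for the example
    CREATE TABLE Raw (id INTEGER PRIMARY KEY, name TEXT, country TEXT, state TEXT, zip TEXT);
    INSERT INTO Raw VALUES
        (1, 'ab', 'd', 'sd', '233'),
        (2, 'ab', 'd', 'sd', '233'),
        (3, 'cd', 'e', 'ac', '123');

    -- Step 1: one row per distinct address
    CREATE TABLE Address (
        address_id INTEGER PRIMARY KEY,
        country TEXT, state TEXT, zip TEXT
    );
    INSERT INTO Address (country, state, zip)
    SELECT DISTINCT country, state, zip FROM Raw;

    -- Step 2: main table keeps only a foreign key to the address
    CREATE TABLE People (
        id INTEGER PRIMARY KEY,
        name TEXT,
        address_id INTEGER REFERENCES Address(address_id)
    );
    INSERT INTO People (id, name, address_id)
    SELECT r.id, r.name, a.address_id
    FROM Raw AS r
    JOIN Address AS a
      ON a.country = r.country AND a.state = r.state AND a.zip = r.zip;
""")

address_count = conn.execute("SELECT COUNT(*) FROM Address").fetchone()[0]
address_by_person = dict(conn.execute("SELECT id, address_id FROM People"))
print(address_count)      # 2
print(address_by_person)  # persons 1 and 2 share one address_id, person 3 has another
```

Rows whose address columns contain NULL would not match in the join and would need separate handling.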
Jim W has a good start, but to normalize even further, make your redundant address elements into separate tables as well.
Create the tables for which address data is repeated (Country, State, etc.) Once you have your data tables, you'll want to add columns such as StateID, CountryID, etc. to the Address table.
You now have options for fixing the existing data. You can be quick and dirty and use Update statements to set all the newly created ID fields to point to the right data table.
UPDATE Addresses SET StateID=1 WHERE STATE='AL'
You can do this fairly quickly as a batch .sql file, but I'd recommend a more programmatic solution that rolls through the Address table and tries to match the current 'State' to an entry in the new States table. If found, the StateID on the Address table is updated with the id from the corresponding row in States.
You can then delete the old State field from the address table, as it is now normalized nice and neatly into a separate States table.
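The programmatic approach described above could look roughly like this. It is a sketch only: it uses Python's sqlite3 module rather than the original database, the States.Code column and the sample rows are assumptions, and unmatched values are collected for manual review instead of being guessed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE States (StateID INTEGER PRIMARY KEY, Code TEXT UNIQUE);
    INSERT INTO States (StateID, Code) VALUES (1, 'AL'), (2, 'NY');
    CREATE TABLE Addresses (
        AddressID INTEGER PRIMARY KEY,
        State     TEXT,
        StateID   INTEGER REFERENCES States(StateID)
    );
    INSERT INTO Addresses (AddressID, State) VALUES
        (1, 'AL'), (2, 'NY'), (3, 'al '), (4, 'ZZ');
""")

# Build a lookup from normalized state text to the id in the States table
lookup = {code.upper(): state_id
          for state_id, code in conn.execute("SELECT StateID, Code FROM States")}

unmatched = []
for address_id, state in conn.execute("SELECT AddressID, State FROM Addresses").fetchall():
    state_id = lookup.get((state or "").strip().upper())
    if state_id is None:
        unmatched.append(address_id)  # leave StateID NULL and report it
    else:
        conn.execute("UPDATE Addresses SET StateID = ? WHERE AddressID = ?",
                     (state_id, address_id))
conn.commit()

state_ids = dict(conn.execute("SELECT AddressID, StateID FROM Addresses"))
print(state_ids)  # {1: 1, 2: 2, 3: 1, 4: None}
print(unmatched)  # [4]
```

Only once the unmatched list is empty is it safe to drop the old State column.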
This process can be repeated for all redundant data elements. However, IMO db normalization can be taken too far. For example, if you have a commonly used query that, after normalization, requires 10 joins to accomplish, you may see a performance reduction. This doesn't appear to be the case here, as I think you're on the right track.
From a comment above:
@Lance i wanted something similar to that, but here is the problem: i have raw data coming in the form of a single table and i need to refine it and send it to two tables. i can add the address in table 2, but i'm not understanding how you would insert the address_id in table 1
You can retrieve the newly created ID from the address table using @@IDENTITY, and update the address_ID with this value.
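The insert-then-link pattern can be sketched as follows. The example runs on SQLite through Python's sqlite3 module, where the cursor's lastrowid attribute plays the role that @@IDENTITY plays in the answer; that substitution, the table layout and the function name insert_person are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Address (
        address_id INTEGER PRIMARY KEY AUTOINCREMENT,
        country TEXT, state TEXT, zip TEXT
    );
    CREATE TABLE People (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        address_id INTEGER REFERENCES Address(address_id)
    );
""")

def insert_person(conn, name, country, state, zip_code):
    """Insert the address first, then the person row that points at it."""
    cur = conn.execute(
        "INSERT INTO Address (country, state, zip) VALUES (?, ?, ?)",
        (country, state, zip_code),
    )
    address_id = cur.lastrowid  # id generated by the insert above
    cur = conn.execute(
        "INSERT INTO People (name, address_id) VALUES (?, ?)",
        (name, address_id),
    )
    return cur.lastrowid, address_id

person_id, address_id = insert_person(conn, "ab", "d", "sd", "233")
joined = conn.execute("""
    SELECT p.name, a.country, a.state, a.zip
    FROM People AS p
    JOIN Address AS a ON a.address_id = p.address_id
    WHERE p.id = ?
""", (person_id,)).fetchone()
print(joined)  # ('ab', 'd', 'sd', '233')
```

To avoid creating duplicate addresses, the function could first look for an existing matching Address row and reuse its address_id.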