Finding invoices without matching credits - sql-server

The simplified table looks like this:
BillID|ProductID|CustomerID|Price|TypeID
------+---------+----------+-----+-------
111111|Product1 |Customer1 | 100| I
111112|Product1 |Customer1 | -100| C
111113|Product1 |Customer1 | 100| I
111114|Product1 |Customer1 | -100| C
111115|Product1 |Customer1 | 100| I
I need to find invoices (I) that have a matching credit (C), excluding "odd" invoices without a matching credit (the last record), or the other way around: only the unmatched invoices that have no corresponding credit.
So far I've got this:
SELECT Invoices.billid, Credits.billid
FROM
(SELECT B1.billid, B1.customerid, B1.productid, B1.price
FROM billing B1
WHERE B1.typeid='I') Invoices
INNER JOIN
(SELECT B2.billid, B2.customerid, B2.productid, B2.price
FROM billing B2
WHERE B2.typeid='C') Credits
ON Invoices.customerid = Credits.customerid
AND Invoices.productid = Credits.productid
AND Invoices.price = -(Credits.price)
But it obviously doesn't work, as it returns something looking like:
billid | billid2
-------+ -------
111111 | 111112
111113 | 111114
111115 | 111114
What I would like to get is a list of unmatched invoices;
billid |
-------+
111115 |
Or alternatively only the matching invoices;
billid | billid2
-------+ -------
111111 | 111112
111113 | 111114
The invoice numbers (BillID) will not necessarily be consecutive of course, it's just a simplified view.
Any help would be appreciated.

This should work. I tested by adding a few consecutive invoices before a credit. The query below shows all invoices with matching credit and shows NULL for the aliased "bar" part of the query if a match doesn't exist.
SELECT * FROM (
SELECT
ROW_NUMBER() OVER(Partition By TypeID, CustomerID, ProductID, Price ORDER BY BillID ASC) AS rownumber,
*
FROM Billing
) AS foo
LEFT JOIN
(SELECT
ROW_NUMBER() OVER(Partition By TypeID, CustomerID, ProductID, Price ORDER BY BillID ASC) AS rownumber,
*
FROM Billing
) AS bar
on foo.CustomerID = bar.CustomerID and
foo.ProductID = bar.ProductID and
foo.rownumber = bar.rownumber and
foo.Price = -1*bar.Price
where foo.Price > 0 -- keep only invoice rows (positive prices) on the left side
Here's the updated data that I used:
And Here are what my results looked like:
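As a hypothetical illustration (not the answerer's own data or results), the row-number pairing idea can be sketched against the five sample rows from the question. The sketch runs the SQL through Python's built-in sqlite3 module instead of SQL Server, filters on TypeID rather than on the sign of Price, and uses a CTE named numbered; all three choices are assumptions made to keep it self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Billing (BillID INTEGER, ProductID TEXT, "
    "CustomerID TEXT, Price INTEGER, TypeID TEXT)"
)
conn.executemany(
    "INSERT INTO Billing VALUES (?, ?, ?, ?, ?)",
    [
        (111111, "Product1", "Customer1", 100, "I"),
        (111112, "Product1", "Customer1", -100, "C"),
        (111113, "Product1", "Customer1", 100, "I"),
        (111114, "Product1", "Customer1", -100, "C"),
        (111115, "Product1", "Customer1", 100, "I"),
    ],
)

# Number invoices and credits separately within each customer/product/price
# group, then pair the n-th invoice with the n-th credit of opposite price.
query = """
WITH numbered AS (
    SELECT BillID, ProductID, CustomerID, Price, TypeID,
           ROW_NUMBER() OVER (
               PARTITION BY TypeID, CustomerID, ProductID, Price
               ORDER BY BillID
           ) AS rn
    FROM Billing
)
SELECT inv.BillID, cr.BillID
FROM numbered AS inv
LEFT JOIN numbered AS cr
       ON cr.CustomerID = inv.CustomerID
      AND cr.ProductID = inv.ProductID
      AND cr.rn = inv.rn
      AND cr.Price = -inv.Price
      AND cr.TypeID = 'C'
WHERE inv.TypeID = 'I'
ORDER BY inv.BillID
"""
pairs = conn.execute(query).fetchall()

matched = [p for p in pairs if p[1] is not None]       # invoice/credit pairs
unmatched = [p[0] for p in pairs if p[1] is None]      # invoices with no credit
print(matched)
print(unmatched)
```

The rows where the credit side comes back as NULL are the unmatched invoices; the remaining rows are the matched pairs.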

I wrote this a long time ago, so there may be better ways to solve it now. I've also attempted to adapt it to your table structure, so apologies if it's not 100% there. I assume that your BillID is sequential in date order, i.e. larger numbers were entered later. I've also assumed that invoices are always positive and credit notes always negative, so I don't bother checking the type.
Essentially the query filters out any matched items.
Anyway here goes:
select *
from billing X
/* If we are inside the number of unmatched entries then show it. e.g. if there are 3 unmatched entries, and we are in the top 3 then display */
where (
/* Number of entries from this one onwards that match this entry's Price/Product/Customer */
select count(*)
from billing Z
where Z.CustomerID = X.CustomerID and Z.ProductID = X.ProductID
and Z.Price = X.Price
and Z.BillID >= X.BillId
) <=
(
/* Number of unmatched entries for this Price/Product/Customer there are, and whether they are negative or positive. */
select abs(Y.Number)
from (
-- Works out how many unmatched billing entries for this Price/Product/Customer there are, and whether they are negative or positive
select ProductID, CustomerID, abs(Price) Price, sum(case when Price < 0 then -1 else +1 end) Number
from billing
group by ProductID, CustomerID, abs(Price)
having sum(Price) <> 0
) as Y
where X.ProductID = Y.ProductID
and X.CustomerID = Y.CustomerID
and X.Price = case when Y.Number < 0 then -1*Y.Price else Y.Price end
)

The odd/even thing concerns me a bit. But assuming this is an incremental key and your business logic is in place, try including this logic in the WHERE clause, the JOIN PREDICATE, or implementing a Lead/Lag function.
SELECT DISTINCT
Invoices.billid
,Credits.billid
FROM
(SELECT B1.billid, B1.customerid, B1.productid, B1.price
FROM billing B1
WHERE B1.typeid='I') Invoices
INNER JOIN (SELECT B2.billid, B2.customerid, B2.productid, B2.price
FROM billing B2
WHERE B2.typeid='C') Credits
ON Invoices.customerid = Credits.customerid
AND Invoices.productid = Credits.productid
AND Invoices.price = -(Credits.price)
AND (Invoices.Billid + 1) = Credits.Billid
Note: This is using your INNER JOIN, so we will get the cases where the invoices have a corresponding credit. You could also do a FULL OUTER JOIN instead, then include a WHERE CLAUSE that specifies WHERE Invoices.Billid IS NULL OR Credits.Billid IS NULL. That scenario would give you the trailing case where you don't have a match.

Related

Joining 2nd Table with Random Row to each record

I need to join table B to Table A, where Table B's records are randomly assigned, or joined. Most of the queries out there are based off of having a key between them and conditions, where I just want to randomly join records without a key.
I'm not sure where to start, as none of the queries I've found are doing this. I assume a nested join could be helpful for this, but how can I randomly assort the records on join?
**Table A**
| Associate ID| Statement|
|:----: |:------:|
| 33691| John is |
| 82451| Susie is |
| 25485| Sam is|
| 26582| Lonnie is|
| 52548| Carl is|
**Table B**
| RowID | List|
|:----: |:------:|
| 1| admirable|
| 2| astounding|
| 3| excellent|
| 4| awesome|
| 5| first class|
The result would be something like this, where items from the list are not looped through in order, but random:
**Result Table**
| Associate ID| Statement| List|
|:----: |:------:|:------:|
| 33691| John is |astounding|
| 82451| Susie is |first class|
| 25485| Sam is|admirable|
| 26582| Lonnie is|excellent|
| 52548| Carl is|awesome|
These are some of the queries I've tried:
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/aeb83251-e132-435a-8630-e5b842a69368/random-join-between-tables?forum=sqldataaccess
-This seems to loop through values from 'Table B', not random.
https://www.daveperrett.com/articles/2009/08/11/mysql-select-random-row-with-join
-This is based off of a common key between the two tables and returning one of the records with the key, which I do not have.
SQL Join help when selecting random row
- I'll be honest, I don't understand this one, but it doesn't seem to assign a random row for each row from Table A; it looks more like an overall selection, like the link above this.
Join One Table To Get Random Rows from 2nd Table
- This seems to be specific to a key, and not an overall random.
Using two CTEs, we generate a row number for each table based on a random order and then join on that row number.
To get N times the records in B, use a CTE as described here:
Repeat Rows N Times According to Column Value (not included below). Note: to get the "N", you'll need the row counts of A and B; divide one by the other and add 1.
Assuming Even Distribution
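-- The CTEs get their own names: a CTE that selects from its own name is treated as recursive
With RandA as (
SELECT *, Row_number() over (order by NewID()) RN
FROM A),
RandB as (
SELECT *, Row_number() over (order by NewID()) RN
FROM B)
SELECT *
FROM RandA
INNER JOIN RandB
on RandA.RN = RandB.RN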
Or use (assuming uneven distribution)
SELECT *
FROM A
CROSS APPLY (SELECT TOP 1 * FROM B ORDER BY NewID()) Z
This method assumes you know in advance which is the smaller table.
First it assigns an ascending row numbering from 1. This does not have to be randomized.
Then for each row in the larger table it uses the modulus operator to randomly calculate a row number in the range to join onto.
WITH Small
AS (SELECT *,
ROW_NUMBER() OVER ( ORDER BY (SELECT 0)) AS RN
FROM SmallTable),
Large
AS (SELECT *,
1 + CRYPT_GEN_RANDOM(3) % (SELECT COUNT(*) FROM SmallTable) AS RND
FROM LargeTable
ORDER BY RND
OFFSET 0 ROWS)
SELECT *
FROM Large
INNER JOIN Small
ON Small.RN = Large.RND
The ORDER BY RND OFFSET 0 ROWS is to get the random numbers materialized in advance.
This will allow a MERGE join on the smaller table. It also avoids an issue that can sometimes happen where the CRYPT_GEN_RANDOM is moved around in the plan and only evaluated once rather than once per row as required.
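The two steps described above (number the small table, then store one random row number per large-table row before joining) can be sketched as follows. This is a minimal sketch under several assumptions: it uses SQLite through Python's sqlite3 module rather than SQL Server, `random()` stands in for `CRYPT_GEN_RANDOM`, temporary tables stand in for the `ORDER BY ... OFFSET 0 ROWS` materialization trick, and the small table is cut to three rows to make the range visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LargeTable (AssociateID INTEGER, Statement TEXT)")
conn.execute("CREATE TABLE SmallTable (RowID INTEGER, List TEXT)")
conn.executemany(
    "INSERT INTO LargeTable VALUES (?, ?)",
    [(33691, "John is"), (82451, "Susie is"), (25485, "Sam is"),
     (26582, "Lonnie is"), (52548, "Carl is")],
)
conn.executemany(
    "INSERT INTO SmallTable VALUES (?, ?)",
    [(1, "admirable"), (2, "astounding"), (3, "excellent")],
)

# Step 1: number the small table 1..N; this order does not need to be random.
conn.execute("""
    CREATE TEMP TABLE Small AS
    SELECT List, ROW_NUMBER() OVER (ORDER BY RowID) AS RN
    FROM SmallTable
""")

# Step 2: store exactly one random number in 1..N per large-table row, so the
# value is fixed before the join instead of being re-evaluated during it.
conn.execute("""
    CREATE TEMP TABLE Large AS
    SELECT AssociateID, Statement,
           1 + abs(random() % (SELECT COUNT(*) FROM SmallTable)) AS RND
    FROM LargeTable
""")

rows = conn.execute("""
    SELECT Large.AssociateID, Large.Statement, Small.List
    FROM Large
    JOIN Small ON Small.RN = Large.RND
    ORDER BY Large.AssociateID
""").fetchall()

for row in rows:
    print(row)
```

Every large-table row appears exactly once, and the same small-table row may be picked more than once, which is what distinguishes this from the row-number-to-row-number join.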

SQL - Return first non-empty value for previous days

I'm currently working with an exchange rates table in SQL that has these fields:
| Country | ExchangeRateDt | ExchangeRateValue |
| DK | 202000601 | 0.2 |
| DK | 202000603 | 0.21 |
| HR | 202000601 | 0.10 |
| HR | 202000602 | 0.12 |
I don't have a value for every day of the year for each currency, because of bank holidays or simply weekends.
I need to join it with an order table where some orders are placed on weekends, so on a given day I might not have an exchange rate to calculate taxes.
I need to take the first non-missing value from the previous days (so in the example, if I have an order for 2020-06-02 in Denmark, I should convert it using the rate 0.2).
I thought about using a calendar table but I can't manage to get the job done.
Can someone help me?
Thanks in advance,
R
To get the most recent value less than or equal to the current day:
-- "order" is a reserved word in T-SQL, so the order table is aliased as ord
SELECT
<whatever columns you need from the order table>
,exchange.ExchangeRateValue
FROM
<order table> ord
LEFT JOIN
<exchange rate table> exchange
ON exchange.Country = ord.Country
AND exchange.ExchangeRateDt =
(
SELECT
MAX(ExchangeRateDt)
FROM
<exchange rate table>
WHERE
Country = ord.Country
AND ExchangeRateDt <= ord.OrderDt
)
Ensure the clustered index on the exchange rate table is (Country, ExchangeRateDt).
I have this as a left join so you will still return order results if the currency information is somehow missing. You would have to refer to business rules on how to proceed if no exchange rate was available.
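To make the "most recent rate on or before the order date" lookup concrete, here is a small sketch run through Python's sqlite3 module rather than SQL Server. The table names Orders and ExchangeRate, the OrderID and OrderDt columns, and the three sample orders are assumptions for illustration; the rates follow the question's sample, with dates written as yyyymmdd integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ExchangeRate "
    "(Country TEXT, ExchangeRateDt INTEGER, ExchangeRateValue REAL)"
)
conn.execute("CREATE TABLE Orders (OrderID INTEGER, Country TEXT, OrderDt INTEGER)")
conn.executemany(
    "INSERT INTO ExchangeRate VALUES (?, ?, ?)",
    [("DK", 20200601, 0.2), ("DK", 20200603, 0.21),
     ("HR", 20200601, 0.10), ("HR", 20200602, 0.12)],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, "DK", 20200602),   # no DK rate on this day: falls back to 20200601
     (2, "HR", 20200603),   # falls back to 20200602
     (3, "DK", 20200531)],  # earlier than every DK rate: no match at all
)

# For each order, join the rate row whose date is the latest one that is
# less than or equal to the order date, within the same country.
query = """
SELECT o.OrderID, e.ExchangeRateValue
FROM Orders AS o
LEFT JOIN ExchangeRate AS e
       ON e.Country = o.Country
      AND e.ExchangeRateDt = (
              SELECT MAX(x.ExchangeRateDt)
              FROM ExchangeRate AS x
              WHERE x.Country = o.Country
                AND x.ExchangeRateDt <= o.OrderDt
          )
ORDER BY o.OrderID
"""
rates = conn.execute(query).fetchall()
print(rates)
```

Order 3 shows the effect of the left join: the order is still returned, with no rate attached, so the business rule for a missing rate can be applied afterwards.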
You would typically create a calendar table that stores all the days you are interested in, with each date on a separate row; say it is called dates.
You would also probably have a table that lists the countries; I assumed it is called countries.
Then, one option is a lateral join (OUTER APPLY in SQL Server):
select c.country, d.date, t.ExchangeRateValue
from dates d
cross join countries c
outer apply (
select top (1) t.*
from mytable t
where t.country = c.country and t.ExchangeRateDt <= d.date
order by t.ExchangeRateDt desc
) t
If you don't have these two tables, or can't create them, then one option is a recursive query to generate the dates and a subquery to list the countries. For example, this would generate the data for the month of June:
with dates as (
select cast('20200601' as date) as date -- anchor and recursive member must have the same type
union all
select dateadd(day, 1, date) from dates where date < '20200630'
)
select c.country, d.date, t.ExchangeRateValue
from dates d
cross join (select distinct country from mytable) c
outer apply (
select top (1) t.*
from mytable t
where t.country = c.country and t.ExchangeRateDt <= d.date
order by t.ExchangeRateDt desc
) t
You should be able to map each transaction date to its exchange rate date with this query:
select TAB.primary_key, TAB.TransationDate, max(EXR.ExchangeRateDt)
from yourtable TAB
inner join exchangerate EXR
on TAB.Country = EXR.Country and TAB.TransationDate >= EXR.ExchangeRateDt
group by TAB.primary_key, TAB.TransationDate

SQL GROUP BY with columns which contain mirrored values

Sorry for the bad title. I couldn't think of a better way to describe my issue.
I have the following table:
Category | A | B
A | 1 | 2
A | 2 | 1
B | 3 | 4
B | 4 | 3
I would like to group the data by Category, return only 1 line per category, but provide both values of columns A and B.
So the result should look like this:
category | resultA | resultB
A | 1 | 2
B | 4 | 3
How can this be achieved?
I tried this statement:
SELECT category, a, b
FROM table
GROUP BY category
but obviously, I get the following errors:
Column 'a' is invalid in the select list because it is not contained
in either an aggregate function or the GROUP BY clause.
Column 'b' is invalid in the select list because it is not contained in either an
aggregate function or the GROUP BY clause.
How can I achieve the desired result?
Try this:
SELECT category, MIN(a) AS resultA, MAX(a) AS resultB
FROM table
GROUP BY category
If the values are mirrored then you can get both values using MIN, MAX applied on a single column like a.
It seems you don't really want to aggregate per category, but rather remove duplicate rows from your result (or rather rows that you consider duplicates).
You consider a pair (x,y) equal to the pair (y,x). To find duplicates, you can put the lower value in the first place and the greater in the second and then apply DISTINCT on the rows:
select distinct
category,
case when a < b then a else b end as attr1,
case when a < b then b else a end as attr2
from mytable;
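As a quick check of the normalize-then-DISTINCT idea, the following sketch runs the same query on the four sample rows from the question, using Python's sqlite3 module as a stand-in for SQL Server. The added ORDER BY is only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (category TEXT, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("A", 1, 2), ("A", 2, 1), ("B", 3, 4), ("B", 4, 3)],
)

# Put the smaller value first and the larger second, so (x, y) and (y, x)
# become the same row; DISTINCT then collapses them.
query = """
SELECT DISTINCT
       category,
       CASE WHEN a < b THEN a ELSE b END AS attr1,
       CASE WHEN a < b THEN b ELSE a END AS attr2
FROM mytable
ORDER BY category
"""
result = conn.execute(query).fetchall()
print(result)
```

Note that this returns the pair in normalized order (3, 4 for category B), not one of the original rows as stored.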
Assuming you want an arbitrary record from the duplicates in each category,
here is one trick using a table value constructor and the ROW_NUMBER window function:
;with cte as
(
SELECT *,
(SELECT Min(min_val) FROM (VALUES (a),(b))tc(min_val)) min_val,
(SELECT Max(max_val) FROM (VALUES (a),(b))tc(max_val)) max_val
FROM (VALUES ('A',1,2),
('A',2,1),
('B',3,4),
('B',4,3)) tc(Category, A, B)
)
select Category,A,B from
(
Select Row_Number()Over(Partition by category,min_val,max_val order by (select NULL)) as Rn,*
From cte
) A
Where Rn = 1

SQL Server : count ProductID to get total times sold

I'm having trouble with what seems to be a simple query. I'm trying to get the number of times each product has sold by counting and grouping by the ProductID. I've researched it online, and everywhere I go the advice is to just add a simple COUNT, but when I do, it still outputs the same number of rows.
So if I don't use COUNT (for example) it outputs 1,000 rows, and if I DO use COUNT it outputs 1,000 rows and doesn't give me the correct times sold. They are all listed as "1" and not being grouped and counted. I'm guessing it has something to do with my joins but I can't figure it out.
Here's an example below of what I'm seeing after using the COUNT (I've removed brand and date_added just to make it easier to read). ProductID's are showing more than once even though they should be grouped together and counted.
times_sold | ProductID | title
---------- | --------- | ---------
1 | 17998 | title 2
1 | 13670 | title 3
1 | 17956 | title 4
1 | 4569 | title 5
1 | 12598 | title 1
1 | 12598 | title 1
1 | 17998 | title 2
And here's the query I'm running:
SELECT TOP (100) PERCENT
COUNT(s.ProductID) AS times_sold,
s.ProductID, p.title, p.brandname, p.date_added
FROM
dbo.TBL_OrderSummary AS s
INNER JOIN
dbo.jewelry AS p ON s.ProductID = p.ProductID
INNER JOIN
dbo.sent_items AS i ON s.InvoiceID = i.ID
GROUP BY
s.ProductID, p.title, p.brandname, p.flare_type, p.date_added,
i.date_order_placed, i.ship_code, p.jewelry
HAVING
(p.title LIKE '%stone%')
AND (i.date_order_placed > CONVERT(DATETIME, '2016-01-01 00:00:00', 102))
AND (i.ship_code = N'paid')
AND (p.flare_type = 'Single flare')
AND (p.jewelry LIKE '%plugs%')
Thanks for any help!
The counts look wrong because the rows are not identical across every column you group by. If one row has product name Widget with year made 2015 and another has product name Widget with year made 2016, each gets a count of 1, because each full combination appears only once. You will need to limit your GROUP BY to get an accurate count:
GROUP BY s.ProductID, p.title
This should give you an accurate count. Your current GROUP BY lists so many columns that almost every row forms its own group. You will also have to cut down your select list to match the GROUP BY: keep only s.ProductID, p.title and the COUNT. Hope this helps.
Unless you are filtering on your aggregate function (e.g. HAVING COUNT(s.ProductID) > 2), you can move all of your selection criteria to the WHERE clause.
So you could try:
select count(s.ProductID) times_sold, s.ProductID, p.title
from dbo.TBL_OrderSummary s inner join dbo.jewelry p on s.ProductID = p.ProductID
inner join dbo.sent_items i on s.InvoiceID = i.ID
where p.title like '%stone%'
and i.date_order_placed > CONVERT(DATETIME, '2016-01-01 00:00:00', 102)
and i.ship_code = N'paid'
and p.flare_type = 'Single flare'
and p.jewelry like '%plugs%'
group by s.ProductID, p.title
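The effect of the GROUP BY column list can be seen on a tiny, made-up data set. The sketch below uses Python's sqlite3 module in place of SQL Server, keeps only a few of the question's columns, and invents five order rows; none of the data comes from the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jewelry (ProductID INTEGER, title TEXT);
CREATE TABLE sent_items (ID INTEGER, ship_code TEXT, date_order_placed TEXT);
CREATE TABLE TBL_OrderSummary (ProductID INTEGER, InvoiceID INTEGER);
INSERT INTO jewelry VALUES (12598, 'stone plug 1'), (17998, 'stone plug 2');
INSERT INTO sent_items VALUES
    (1, 'paid', '2016-02-01'), (2, 'paid', '2016-03-01'),
    (3, 'paid', '2016-03-05'), (4, 'unpaid', '2016-04-01');
INSERT INTO TBL_OrderSummary VALUES
    (12598, 1), (12598, 2), (12598, 3), (17998, 3), (17998, 4);
""")

# Shared joins and row filter; the filter lives in WHERE, not HAVING.
base = """
SELECT COUNT(s.ProductID) AS times_sold, s.ProductID, p.title
FROM TBL_OrderSummary AS s
JOIN jewelry AS p ON s.ProductID = p.ProductID
JOIN sent_items AS i ON s.InvoiceID = i.ID
WHERE i.ship_code = 'paid'
"""

# Grouping by the order date as well puts nearly every sale in its own group.
wide = conn.execute(
    base + "GROUP BY s.ProductID, p.title, i.date_order_placed"
).fetchall()

# Grouping only by the product columns gives one row per product.
narrow = conn.execute(
    base + "GROUP BY s.ProductID, p.title ORDER BY s.ProductID"
).fetchall()

print(wide)
print(narrow)
```

With the date in the GROUP BY, every row comes back with a count of 1, which matches the symptom described in the question; with only the product columns, the counts are per product.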

SQL Server: Duplicate columns in joined table, but distinct row info

So I have joined two tables to identify claims and their corresponding reversals if there are any.
The following is a simplified explanation of what I have done: join where MbrNo is the same in both tables, and where Amount=-Amount. So now I have an output table that contains duplicate column names:
MbrNo | ClaimType | Amount | MbrNo | ClaimType | Amount
xyz | Medicine | R 300 | xyz | Reversal | - R300
I cannot insert this into a table, as the column names are not unique.
But I would like to format this table to look as follows:
MbrNo | ClaimType | Amount
xyz | Medicine | R 300
xyz | Reversal | - R300
with t as
(
select *,
count(*) over(partition by [MbrNo], [DepNo], [PracticeNo], [DisciplineCd], [ServiceDt],[PayAmt]) as rownum
from Claims
)
Select * from
(Select * from t where PayAmt<0) a
left outer join
(Select * from t where PayAmt>0) b
on a.[MbrNo]=b.[MbrNo]
and a.[DepNo]=b.[DepNo]
and a.[PracticeNo]=b.[PracticeNo]
and a.[DisciplineCd]=b.[DisciplineCd]
and a.[ServiceDt]=b.[ServiceDt]
and a.[PayAmt]=-b.[PayAmt]
Basically I want to put the 2nd table in the joined table underneath the first table.
Please help:(
If I've understood your requirements correctly then I think you want the UNION operator. See if this gets you going in the right direction.
with t as
(
select *,
count(*) over(partition by [MbrNo], [DepNo], [PracticeNo], [DisciplineCd], [ServiceDt],[PayAmt]) as rownum
from Claims
)
Select t.* from t where PayAmt < 0
union all
select b.* from
(Select * from t where PayAmt < 0) a
inner join
(Select * from t where PayAmt > 0) b
on a.[MbrNo] = b.[MbrNo]
and a.[DepNo] = b.[DepNo]
and a.[PracticeNo] = b.[PracticeNo]
and a.[DisciplineCd] = b.[DisciplineCd]
and a.[ServiceDt] = b.[ServiceDt]
and a.[PayAmt] = -b.[PayAmt]
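A reduced version of the UNION ALL idea is sketched below with Python's sqlite3 module standing in for SQL Server. The Claims table is cut down to three columns, and the three sample rows (including an unreversed claim for member abc) are hypothetical; the real query would match on all of the columns listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Claims (MbrNo TEXT, ClaimType TEXT, PayAmt INTEGER)")
conn.executemany(
    "INSERT INTO Claims VALUES (?, ?, ?)",
    [("xyz", "Medicine", 300),
     ("xyz", "Reversal", -300),
     ("abc", "Medicine", 150)],   # never reversed, so it should not appear
)

# First branch: the reversals. Second branch: the claims they reverse.
# UNION ALL stacks both under a single set of column names.
query = """
SELECT MbrNo, ClaimType, PayAmt
FROM Claims
WHERE PayAmt < 0
UNION ALL
SELECT b.MbrNo, b.ClaimType, b.PayAmt
FROM Claims AS a
JOIN Claims AS b
  ON a.MbrNo = b.MbrNo
 AND a.PayAmt = -b.PayAmt
WHERE a.PayAmt < 0
  AND b.PayAmt > 0
ORDER BY MbrNo, PayAmt DESC
"""
stacked = conn.execute(query).fetchall()
print(stacked)
```

Because both branches return the same three columns, the result can be inserted into a table without any column-name clash.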
