I am trying to create a script that will group a table's contents by a series of columns that I have taken MAX() of.
It is difficult to describe without the scenario:
I have a table of bookings from which I have to create a table of single customers. The customers are split by res system, and each res system requires a different grouping using different columns.
I.E:
277000 bookings for Ipcos
289300 bookings for Daph
300000 bookings for Tard
They are all stored in the same table. I want to take the MAX() of all columns except a couple that I have to cast to integers, and SUM() some other columns.
My problem comes when I have to group by the cast values I have created and the MIN() value I have created.
I tried joining the table onto itself but that didn't work. Can someone point me in the right direction please, as I am getting very frustrated?
Code for selecting Ipcos:
SELECT
SUM(CAST(TotalCost AS MONEY)) AS TotalCost,
NettCost = Null,
Paid = Null,
Balance = Null,
Discount = Null,
Commission = Null,
Max(Adults) AS Adults,
Max(Children)AS Children,
Max(Infants) AS Infants,
Max(PAX) AS PAX,
Sum(CAST(Duration AS int)) AS Duration,
MAX([IdentityValue]) AS IdentityValue,
MAX([FileName]) AS FileName,
MAX([SheetName]) AS SheetName,
MAX([LineNum]) AS LineNum,
MAX([BookingDate]) AS BookingDate,
CAST([DepartureDate] AS Int) AS IntDepartureDate,
CAST([BookingDate] AS int) AS IntBookingDate,
MIN(LineNum) AS MinLineNum
FROM
Booking
WHERE ResID = 3
GROUP BY
MinLineNum, IntBookingDate, IntDepartureDate
The other systems are very similar to this.
Any help would be fantastic, cheers in advance.
You can use the OVER (PARTITION BY ...) clause in order to group by different columns. See the documentation for the OVER clause.
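As a rough sketch of that suggestion (not a drop-in answer): a window aggregate carries its own PARTITION BY, so the cast expressions can be partitioned on directly instead of being referenced by alias in GROUP BY. The table variable, its column types, and the sample rows below are assumptions made up for illustration; only the column names come from the question.

```sql
-- Hypothetical cut-down Booking table; the column types are assumptions.
DECLARE @Booking table
(
    ResID int NOT NULL,
    LineNum int NOT NULL,
    BookingDate datetime NOT NULL,
    DepartureDate datetime NOT NULL,
    Adults int NULL,
    TotalCost varchar(20) NULL
);

INSERT @Booking
    (ResID, LineNum, BookingDate, DepartureDate, Adults, TotalCost)
VALUES
    (3, 1, '20180101', '20180601', 2, '100.00'),
    (3, 2, '20180101', '20180601', 2, '50.00'),
    (3, 3, '20180102', '20180701', 1, '75.00');

-- Every aggregate is computed over the same partition: the two cast dates.
-- DISTINCT then collapses each partition to a single output row.
SELECT DISTINCT
    IntBookingDate = CAST(B.BookingDate AS int),
    IntDepartureDate = CAST(B.DepartureDate AS int),
    TotalCost = SUM(CAST(B.TotalCost AS money)) OVER (
        PARTITION BY CAST(B.BookingDate AS int), CAST(B.DepartureDate AS int)),
    Adults = MAX(B.Adults) OVER (
        PARTITION BY CAST(B.BookingDate AS int), CAST(B.DepartureDate AS int)),
    MinLineNum = MIN(B.LineNum) OVER (
        PARTITION BY CAST(B.BookingDate AS int), CAST(B.DepartureDate AS int))
FROM @Booking AS B
WHERE B.ResID = 3;
```

Since each window function has its own PARTITION BY, different columns could be aggregated over different groupings in the same SELECT, which a single GROUP BY cannot do.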
I have a table like below:
How can I have a column like below using Transact-SQL (Order By Date)?
I'm using SQL Server 2016.
The thing you need is called an aggregate windowing function, specifically SUM ... OVER.
The problem is that a 'running total' like this only makes sense if you can specify the order of the rows deterministically. The sample data does not include an attribute that could be used to provide this required ordering. Tables, by themselves, do not have an explicit order.
If you have something like an entry date column, a solution like the following would work:
DECLARE @T table
(
    EntryDate datetime2(0) NOT NULL,
    Purchase money NULL,
    Sale money NULL
);
INSERT @T
    (EntryDate, Purchase, Sale)
VALUES
    ('20180801 13:00:00', $1000, NULL),
    ('20180801 14:00:00', NULL, $400),
    ('20180801 15:00:00', NULL, $400),
    ('20180801 16:00:00', $5000, NULL);
-- Running total of purchases minus sales, ordered by entry date
SELECT
    T.Purchase,
    T.Sale,
    Remaining =
        SUM(ISNULL(T.Purchase, $0) - ISNULL(T.Sale, $0)) OVER (
            ORDER BY T.EntryDate
            ROWS UNBOUNDED PRECEDING)
FROM @T AS T;
Using ROWS UNBOUNDED PRECEDING in the window frame is shorthand for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. ROWS behaves differently from the default RANGE when the ORDER BY column contains duplicates (RANGE treats tied rows as peers and gives them all the same total), and ROWS generally performs better. There are strong arguments to say that ROWS ought to have been the default, but that is not what we were given 🙂.
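To make the difference with duplicates concrete, here is a small sketch that computes the same running total with both frame types. The table variable and its three rows are invented for the illustration and are not part of the question's data.

```sql
-- Hypothetical data: two rows share the same EntryDate (a tie in the ORDER BY).
DECLARE @D table
(
    EntryDate date NOT NULL,
    Amount money NOT NULL
);

INSERT @D
    (EntryDate, Amount)
VALUES
    ('20180801', $10),
    ('20180801', $20),
    ('20180802', $30);

SELECT
    D.EntryDate,
    D.Amount,
    -- ROWS accumulates one physical row at a time; the order among the
    -- two tied rows is not deterministic without a tie-breaker column.
    RowsTotal = SUM(D.Amount) OVER (
        ORDER BY D.EntryDate
        ROWS UNBOUNDED PRECEDING),
    -- RANGE includes all peers of the current row, so both 2018-08-01
    -- rows get the same total (30.00) and the last row gets 60.00.
    RangeTotal = SUM(D.Amount) OVER (
        ORDER BY D.EntryDate
        RANGE UNBOUNDED PRECEDING)
FROM @D AS D;
```

If a running total must be reproducible row by row, adding a unique tie-breaker to the ORDER BY is the usual way to make the ROWS result deterministic.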
For more information see How to Use Microsoft SQL Server 2012's Window Functions by Itzik Ben-Gan, and his excellent book on the topic.
I have many tables in my database, and each one has one or two DATE fields. This is increasing my database size, so I am thinking of storing all DATE fields in one table and adding relationships from all the other tables. Is this possible, and is it a good idea?
My database, example:
Old design
tblCustomer => CustomerID, Surname, Name, DateFirstVisit, DateStopped
tblOrder => OrderID, CustomerID, DateOrder, Order, DateShiped
tblPayment => PaymentID, CustomerID, DatePayment, Price, DateCheck
New design
tblCustomer => CustomerID, Surname, Name, DateInID, DateOutID
tblOrder => OrderID, CustomerID, DateInID, Order, DateOutID
tblPayment => PaymentID, CustomerID, DateInID, Price, DateOutID
tblDateIn => DateInID, DateIn
tblDateOut => DateOutID, DateOut
Can I combine tblDateIn and tblDateOut?
Thank you...
Technically, yes, you can further normalize your database this way. You could go so far as to have a Dates table that just has every date in it and use those dates by reference to a DateID, but this is over-normalization.
In addition to making simple queries more complicated because you will have to join to the dates table every time, I think you'll find that you don't save that much space and might possibly use more space. I don't know for certain what Access uses, but dates are typically stored internally as decimal values or an integer representing a count of seconds since a starting date. In any case, the space you would save in your tables by having an integer key versus Access' internal date value would be tiny and likely offset by having additional tables and indexes involved in foreign keys.
In the database on which I am attempting to create a FullText Search I need to construct a table with its column names coming from one column in a previous table. In my current implementation attempt the FullText indexing is completed on the first table Data and the search for the phrase is done there, then the second table with the search results is made.
The schema for the database is
**Players**
Id
PlayerName
Blacklisted
...
**Details**
Id
Name -> FirstName, LastName, Team, Substitute, ...
...
**Data**
Id
DetailId
PlayerId
Content
DetailId in the table Data relates to Id in Details, and PlayerId relates to Id in Players. If there are 1k rows in Players and 20 rows in Details, then there are 20k rows in Data.
WITH RankedPlayers AS
(
SELECT PlayerID, SUM(KT.[RANK]) AS Rnk
FROM Data c
INNER JOIN FREETEXTTABLE(dbo.Data, Content, '"Some phrase like team name and player name"')
AS KT ON c.DataID = KT.[KEY]
GROUP BY c.PlayerID
)
…
Then a table is made by selecting the rows in one column. Similar to a pivot.
…
SELECT rc.Rnk,
c.PlayerID,
PlayerName,
TeamID,
…
(SELECT Content FROM dbo.Data data WHERE DetailID = 1 AND data.PlayerID = c.PlayerID) AS [TeamName],
…
FROM dbo.Players c
JOIN RankedPlayers rc ON c.PlayerID = rc.PlayerID
ORDER BY rc.Rnk DESC
I can return a ranked table with this implementation. The aim, however, is to produce results from weighted columns, so that, say, the column PlayerName contributes more to the rank than TeamName.
I have tried making a schema bound view with a pivot, but then I cannot index it because of the pivot. I have tried making a view of that view, but it seems the metadata is inherited, plus that feels like a clunky method.
I then tried to do it as a straight query using sub queries in the select statement, but cannot due to indexing not liking sub queries.
I then tried to join multiple times, again the index on the view doesn't like self-referencing joins.
How to do this?
I have come across this article http://developmentnow.com/2006/08/07/weighted-columns-in-sql-server-2005-full-text-search/ , and other articles here on weighted columns, however nothing as far as I can find addresses weighting columns when the columns were initially row data.
A simple solution that works really well: put the weights for the required detail IDs in another table, left join that table to the table to which the full-text search was applied, and multiply the rank by the weight. Continue as previously implemented.
In code that comes out as
DECLARE @Weight TABLE
(
    DetailID INT,
    [Weight] FLOAT
);
INSERT INTO @Weight VALUES
    (1, 0.80),
    (2, 0.80),
    (3, 0.50);
WITH RankedPlayers AS
(
    -- Details without an explicit weight fall back to 0.10
    SELECT PlayerID, SUM(KT.[RANK] * ISNULL(cw.[Weight], 0.10)) AS Rnk
    FROM Data c
    INNER JOIN FREETEXTTABLE(dbo.Data, Content, 'Karl Kognition C404') AS KT ON c.DataID = KT.[KEY]
    LEFT JOIN @Weight cw ON c.DetailID = cw.DetailID
    GROUP BY c.PlayerID
)
SELECT rc.Rnk,
...
I'm using a table variable here as a proof of concept. I am considering adding a Weight column to the Details table to avoid an unnecessary table and left join.
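A possible shape for that variant is sketched below. It assumes a nullable [Weight] float column has been added to dbo.Details and that Details is keyed on Id as in the schema listing; neither has been confirmed, and the query still needs the existing full-text index on dbo.Data to run. The final SELECT is a simplified stand-in that returns only the player ID and the weighted rank.

```sql
-- Assumption: dbo.Details now has a nullable [Weight] float column.
WITH RankedPlayers AS
(
    -- Details rows with no weight set fall back to 0.10, as before.
    SELECT c.PlayerID, SUM(KT.[RANK] * ISNULL(d.[Weight], 0.10)) AS Rnk
    FROM dbo.Data AS c
    INNER JOIN FREETEXTTABLE(dbo.Data, Content, 'Karl Kognition C404') AS KT
        ON c.DataID = KT.[KEY]
    INNER JOIN dbo.Details AS d
        ON d.Id = c.DetailID
    GROUP BY c.PlayerID
)
SELECT rp.PlayerID, rp.Rnk
FROM RankedPlayers AS rp
ORDER BY rp.Rnk DESC;
```

Because every Data row already references a Details row, the LEFT JOIN can become an INNER JOIN here; the ISNULL only covers Details rows whose weight was left empty.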
I am looking to retrieve only the second (duplicate) record from a data set. For example in the following picture:
Inside the UnitID column there are two separate records for 105. I only want the returned data set to contain the second 105 record. Additionally, I want this query to return the second record for all duplicates, not just 105.
I have tried everything I can think of, although I am not that experienced, and I cannot figure it out. Any help would be greatly appreciated.
You need to use GROUP BY for this.
Here's an example (I can't read your first column name, so I'm calling it JobUnitK):
SELECT MAX(JobUnitK), Unit
FROM JobUnits
WHERE DispatchDate = 'oct 4, 2015'
GROUP BY Unit
HAVING COUNT(*) > 1
I'm assuming JobUnitK is your ordering/id field. If it's not, just replace MAX(JobUnitK) with MAX(FieldIOrderWith).
Use the RANK function. Rank the rows OVER (PARTITION BY UnitId ORDER BY ...) and pick the rows with rank 2.
For reference -
https://msdn.microsoft.com/en-IN/library/ms176102.aspx
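A minimal sketch of the RANK approach follows. The table variable and sample rows are invented for illustration, and JobUnitKeyID is assumed to be a unique column that defines which record counts as "second".

```sql
-- Hypothetical stand-in for the JobUnits table.
DECLARE @JobUnits table
(
    JobUnitKeyID int NOT NULL PRIMARY KEY,
    UnitID int NOT NULL
);

INSERT @JobUnits
    (JobUnitKeyID, UnitID)
VALUES
    (1, 101),
    (2, 105),
    (3, 105),
    (4, 110);

WITH Ranked AS
(
    SELECT
        J.JobUnitKeyID,
        J.UnitID,
        -- Numbering restarts at 1 for every UnitID.
        Rnk = RANK() OVER (PARTITION BY J.UnitID ORDER BY J.JobUnitKeyID)
    FROM @JobUnits AS J
)
SELECT R.JobUnitKeyID, R.UnitID
FROM Ranked AS R
WHERE R.Rnk = 2;   -- only the second record of each duplicated UnitID
```

Note that RANK gives tied rows the same number, so if the ordering column were not unique within a UnitID there might be no row with rank 2; ROW_NUMBER avoids that.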
Assuming SQL Server 2005 and up, you can use the Row_Number windowing function:
WITH DupeCalc AS (
    SELECT
        DupID = Row_Number() OVER (PARTITION BY UnitID ORDER BY JobUnitKeyID),
        *
    FROM JobUnits
    WHERE DispatchDate = '20151004'
)
SELECT *
FROM DupeCalc
WHERE DupID >= 2
-- ORDER BY is not allowed inside a CTE without TOP, so sort in the outer query
ORDER BY UnitID DESC
;
This is better than a solution that uses Max(JobUnitKeyID) for multiple reasons:
There could be more than one duplicate, in which case you would need Min(JobUnitKeyID) in conjunction with UnitID, joining back on UnitID where JobUnitKeyID <> MinJobUnitKeyID.
Either way, using Min or Max requires you to join back to the same data (which will be inherently slower).
If the ordering key you use turns out to be non-unique, you won't be able to pull the right number of rows with either one.
If the ordering key consists of multiple columns, the query using Min or Max explodes in complexity.
I have 2 database tables called Spend and VendorSpend. The columns used in the Spend table are called VendorID, VendorName, RecordDate, and Charges. The VendorSpend table contains VendorID and VendorName but with distinct data (one record for each unique VendorID). I need a simple way to add a column called Aug2015 to the VendorSpend table. This column will contain the SUM of each vendor's charges within that month. It will be calculated based on this query:
Select Sum(Charges)
from Spend
where RecordDate >= '2015-08-01' and RecordDate <= '2015-08-31'
Keep in mind this will need to be called whenever new data is inserted into the Spend table and the VendorSpend table will need to update based on the new data. This will happen every month so actually a new column will need to be added and the data be calculated every month.
Any assistance is greatly appreciated.
Create a user-defined function that you pass a VendorID and Date to and which does your SELECT:
Select Sum(Charges)
from Spend
where VendorID = @VendorID
AND DATEDIFF(month, RecordDate, @Date) = 0
Now personally, I would stop right there and use the function to select your data at query time, rather than adding a new column to your table.
But treating your question as academic, you can create a computed column called [Aug2015] in VendorSpend that passes [VendorID] and '08/01/2015' to this function and it will contain the desired result.
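Put together, the two options might look roughly like the sketch below. The function name dbo.VendorMonthlySpend, the parameter types, and the money return type are assumptions; the Spend and VendorSpend tables are taken to exist in the dbo schema as described in the question.

```sql
-- Hypothetical scalar function wrapping the SELECT above.
CREATE FUNCTION dbo.VendorMonthlySpend (@VendorID int, @Date date)
RETURNS money
AS
BEGIN
    RETURN
    (
        SELECT SUM(S.Charges)
        FROM dbo.Spend AS S
        WHERE S.VendorID = @VendorID
          -- Same calendar month and year as @Date
          AND DATEDIFF(month, S.RecordDate, @Date) = 0
    );
END;
GO

-- Option 1: compute the figure at query time (no schema change needed).
SELECT
    V.VendorID,
    V.VendorName,
    Aug2015 = dbo.VendorMonthlySpend(V.VendorID, CAST('20150801' AS date))
FROM dbo.VendorSpend AS V;
GO

-- Option 2: a computed column that calls the function for a fixed month.
ALTER TABLE dbo.VendorSpend
    ADD Aug2015 AS dbo.VendorMonthlySpend(VendorID, CAST('20150801' AS date));
```

With the query-time option, a new month needs only a different date argument, whereas the computed-column option requires another ALTER TABLE every month.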