SQL Server Distinct on Earliest Date in a Timestamp Column - sql-server

I have a table called 'Audit' in SQL Server 2005 like this:
Name | Last Logged On Date
--------| -----------------------
Joe | 2012-02-01 00:00:00.000
Joe | 2012-02-02 00:00:00.000
Bloggs | 2012-03-01 00:00:00.000
Bloggs | 2012-03-02 00:00:00.000
I want only one row per person, showing the first time that person logged on.
So in other words, I want to return:
Name | First Logged On Date
--------| -----------------------
Joe | 2012-02-01 00:00:00.000
Bloggs | 2012-03-01 00:00:00.000
How would I achieve this?
Help!!!

If I understand your question correctly, this should work for you:
SELECT Name, MIN([Last Logged On Date]) AS [First Logged On Date]
FROM Audit
GROUP BY Name
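To see the query against the sample rows from the question, here is a minimal runnable sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server 2005 (SQLite accepts the same [bracketed] identifiers), and it stores the dates as ISO-formatted text, so MIN compares them as strings, which matches chronological order for this format.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (an assumption
# of this sketch, not part of the original question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Audit (Name TEXT, [Last Logged On Date] TEXT)")
conn.executemany(
    "INSERT INTO Audit VALUES (?, ?)",
    [
        ("Joe", "2012-02-01 00:00:00.000"),
        ("Joe", "2012-02-02 00:00:00.000"),
        ("Bloggs", "2012-03-01 00:00:00.000"),
        ("Bloggs", "2012-03-02 00:00:00.000"),
    ],
)

# Same query as in the answer, plus an ORDER BY so the row order is deterministic.
first_logons = conn.execute(
    """
    SELECT Name, MIN([Last Logged On Date]) AS [First Logged On Date]
    FROM Audit
    GROUP BY Name
    ORDER BY [First Logged On Date]
    """
).fetchall()

for name, first_date in first_logons:
    print(name, "|", first_date)
```

GROUP BY collapses the rows to one per Name, and MIN picks the earliest date within each group, so no DISTINCT is needed.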

Related

Microsoft SQL create a summary table based on other group of tables

My question is fairly simple and there are many answers to similar ones, but mine is more about how to write the query itself under certain conditions.
I have a table like this :
Client | Date          | Employee | Last Record | Trained
JOE    | April 2020    | John Doe | May 2019    | TRUE
JOE    | February 2020 | John Doe | May 2019    | TRUE
JOE    | May 2019      | John Doe | May 2019    | FALSE
Now I want to make a simple SQL summary table like this:
Client | Date          | Inactive | Trained
JOE    | April 2020    | 1        | 1
JOE    | February 2020 | 1        | 1
JOE    | May 2019      | 0        | 0
So basically I want a count of employees grouped by client and date, counting only those where the difference between Date and Last Record is greater than, let's say, 1 month, and in another column a count of the employees whose Trained value is TRUE.
My question is: how would I go about creating a summary table where each column has its own condition, such as a date difference or a TRUE value in another column?
Before you say "use a view": I need to create this as a table for performance reasons, because the first table has millions of rows and I am querying it from a report program. It is simpler and faster to query a table that already holds the summary counts.
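One common way to put a different condition on each count is conditional aggregation, i.e. SUM(CASE WHEN ... THEN 1 ELSE 0 END) per column. The sketch below illustrates the idea under several assumptions: it uses Python's built-in sqlite3 module rather than SQL Server, the table and column names (ClientLog, ClientSummary, ReportDate, LastRecord, InactiveCount, TrainedCount) are made up, and the month values from the question are represented as first-of-month ISO dates. On SQL Server the date comparison would be written with DATEADD or DATEDIFF instead of SQLite's date(), and the summary table would typically be filled with SELECT ... INTO or INSERT ... SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table; Trained is stored as 1 (TRUE) or 0 (FALSE).
conn.execute(
    "CREATE TABLE ClientLog (Client TEXT, ReportDate TEXT, Employee TEXT, "
    "LastRecord TEXT, Trained INTEGER)"
)
conn.executemany(
    "INSERT INTO ClientLog VALUES (?, ?, ?, ?, ?)",
    [
        ("JOE", "2020-04-01", "John Doe", "2019-05-01", 1),
        ("JOE", "2020-02-01", "John Doe", "2019-05-01", 1),
        ("JOE", "2019-05-01", "John Doe", "2019-05-01", 0),
    ],
)

# Materialize the summary as a real table, one row per client and date.
# Each SUM(CASE ...) applies its own condition to the rows of the group.
conn.execute(
    """
    CREATE TABLE ClientSummary AS
    SELECT Client,
           ReportDate,
           SUM(CASE WHEN LastRecord < date(ReportDate, '-1 month')
                    THEN 1 ELSE 0 END) AS InactiveCount,
           SUM(CASE WHEN Trained = 1 THEN 1 ELSE 0 END) AS TrainedCount
    FROM ClientLog
    GROUP BY Client, ReportDate
    """
)

summary = conn.execute(
    "SELECT Client, ReportDate, InactiveCount, TrainedCount "
    "FROM ClientSummary ORDER BY ReportDate DESC"
).fetchall()

for row in summary:
    print(row)
```

The report program would then read ClientSummary instead of the large source table; how and when the summary is refreshed is left open here.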

Why does adding TOP 10 to DISTINCT SELECT timeout when DISTINCT SELECT does not?

When I run a basic SELECT DISTINCT query (NOTE: this occurs only when using a View not a Table)
SELECT DISTINCT
[DateField]
FROM MyData
I quickly (< 1 sec) get my 190 rows back. But when I add TOP 10 (either before or after the DISTINCT) it runs until it times out (5 minutes +).
SELECT DISTINCT TOP 10
[DateField]
FROM MyData
This also happens if I put the SELECT DISTINCT query as a subquery (just testing alternatives).
SELECT TOP 10
FOO.[DateField]
FROM
(SELECT DISTINCT [DateField]
FROM MyData) FOO
Here is a sample of the data returned by the SELECT DISTINCT query.
DateField
2016-12-01 00:00:00.000
2016-09-01 00:00:00.000
2016-11-01 00:00:00.000
2017-11-29 00:00:00.000
2017-07-01 00:00:00.000
2016-08-01 00:00:00.000
2017-04-24 00:00:00.000
2016-03-01 00:00:00.000
2017-03-01 00:00:00.000
2016-07-01 00:00:00.000
2016-02-01 00:00:00.000
2016-04-01 00:00:00.000
2017-01-01 00:00:00.000
2016-06-01 00:00:00.000
2016-05-01 00:00:00.000
2018-02-28 00:00:00.000
Thanks in advance!

Query to show difference over time in the data

Let's say I have a table that records attendance at a lecture. The table is very simple: it contains only the date of the lecture and the attendees.
2016-10-10 | Adam
2016-10-10 | Mike
2016-10-10 | David
2016-10-11 | Adam
2016-10-14 | Adam
2016-10-14 | David
What I would like is a query that shows, for each lecture, what percentage of its attendees were also present at the previous lecture. Can this be done in an efficient way?
The expected result would be something like this:
2016-10-11 | 1.00
2016-10-14 | 0.50
2016-10-10 would be left out since it does not have a previous lecture.
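One possible shape for such a query is to look up each lecture's previous lecture date first, and then left-join the attendance table to itself on that date and the attendee name. The following sketch uses Python's built-in sqlite3 module as a stand-in database; the table and column names (Attendance, LectureDate, Attendee) are assumptions, and it also assumes each attendee appears at most once per lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Attendance (LectureDate TEXT, Attendee TEXT)")
conn.executemany(
    "INSERT INTO Attendance VALUES (?, ?)",
    [
        ("2016-10-10", "Adam"),
        ("2016-10-10", "Mike"),
        ("2016-10-10", "David"),
        ("2016-10-11", "Adam"),
        ("2016-10-14", "Adam"),
        ("2016-10-14", "David"),
    ],
)

# PrevLecture maps every lecture date to the latest earlier lecture date.
# The inner join on PrevDate IS NOT NULL drops the first lecture, and the
# LEFT JOIN leaves prev.Attendee NULL for people who were not at the
# previous lecture, so COUNT(prev.Attendee) counts only returning attendees.
shares = conn.execute(
    """
    WITH Lectures AS (
        SELECT DISTINCT LectureDate FROM Attendance
    ),
    PrevLecture AS (
        SELECT l.LectureDate,
               (SELECT MAX(p.LectureDate)
                FROM Lectures p
                WHERE p.LectureDate < l.LectureDate) AS PrevDate
        FROM Lectures l
    )
    SELECT cur.LectureDate,
           ROUND(1.0 * COUNT(prev.Attendee) / COUNT(*), 2) AS ReturningShare
    FROM Attendance cur
    JOIN PrevLecture pl
      ON pl.LectureDate = cur.LectureDate
     AND pl.PrevDate IS NOT NULL
    LEFT JOIN Attendance prev
      ON prev.LectureDate = pl.PrevDate
     AND prev.Attendee = cur.Attendee
    GROUP BY cur.LectureDate
    ORDER BY cur.LectureDate
    """
).fetchall()

for lecture_date, share in shares:
    print(lecture_date, "|", share)
```

Whether this is efficient enough on a large table depends on indexing; an index on (LectureDate, Attendee) would be a natural first thing to try.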

MS SQL Server 2005 - Grouping Similar Data

I am trying to find a solution to the following type of groupings:
My Data
Formula # | Date
1         | 2016-01-02 12:05:00
1         | 2016-01-02 12:07:00
2         | 2016-01-02 12:10:00
2         | 2016-01-02 12:15:00
3         | 2016-01-02 12:25:00
3         | 2016-01-02 12:30:00
3         | 2016-01-02 12:50:00
3         | 2016-01-02 12:55:00
2         | 2016-01-02 13:05:00
2         | 2016-01-02 13:25:00
2         | 2016-01-02 13:40:00
And I am trying to get a result like this:
Formula | Count | Start Date          | End Date
1       | 2     | 2016-01-02 12:05:00 | 2016-01-02 12:07:00
2       | 2     | 2016-01-02 12:10:00 | 2016-01-02 12:15:00
3       | 4     | 2016-01-02 12:25:00 | 2016-01-02 12:55:00
2       | 3     | 2016-01-02 13:05:00 | 2016-01-02 13:40:00
I've tried various things, and while I can roll up the similar formula numbers, I cannot get the results into the format I've listed. I'm also not sure how to get the starting and ending date of each group of data.
Any thoughts or help would be greatly appreciated.

The query below groups by formula and returns the count along with the minimum and maximum dates. A GROUP BY on Formula alone cannot tell that Formula #2 is supposed to form 2 separate sets of data, so both sets are grouped into one row.
SELECT
[Formula]
,COUNT([Formula]) AS [COUNT]
,MIN([Date]) AS [MIN_DATE]
,MAX([Date]) AS [MAX_DATE]
FROM
#test_table
GROUP BY
[Formula]
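
If the intent is to treat each consecutive run of the same formula (in Date order) as its own group, one possible approach is to key every row by the number of earlier rows that carry a different formula. That count stays constant inside a run and changes between two runs of the same formula. The sketch below is an illustration under assumptions, not the method from the answer above: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table name FormulaLog and the column names LogDate and IslandKey are made up. On SQL Server 2005 the same idea is often written with ROW_NUMBER() instead of a correlated subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FormulaLog (Formula INTEGER, LogDate TEXT)")
conn.executemany(
    "INSERT INTO FormulaLog VALUES (?, ?)",
    [
        (1, "2016-01-02 12:05:00"),
        (1, "2016-01-02 12:07:00"),
        (2, "2016-01-02 12:10:00"),
        (2, "2016-01-02 12:15:00"),
        (3, "2016-01-02 12:25:00"),
        (3, "2016-01-02 12:30:00"),
        (3, "2016-01-02 12:50:00"),
        (3, "2016-01-02 12:55:00"),
        (2, "2016-01-02 13:05:00"),
        (2, "2016-01-02 13:25:00"),
        (2, "2016-01-02 13:40:00"),
    ],
)

# IslandKey = number of earlier rows with a different formula. Rows in the
# same consecutive run share a key; a later run of the same formula gets a
# larger key because other formulas appeared in between.
runs = conn.execute(
    """
    SELECT Formula,
           COUNT(*)     AS RowCnt,
           MIN(LogDate) AS StartDate,
           MAX(LogDate) AS EndDate
    FROM (
        SELECT t.Formula,
               t.LogDate,
               (SELECT COUNT(*)
                FROM FormulaLog o
                WHERE o.LogDate < t.LogDate
                  AND o.Formula <> t.Formula) AS IslandKey
        FROM FormulaLog t
    ) keyed
    GROUP BY Formula, IslandKey
    ORDER BY StartDate
    """
).fetchall()

for row in runs:
    print(row)
```

The correlated subquery scans earlier rows for every row, so on a large table a ROW_NUMBER()-based variant is likely to perform better.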

How to negate or minimize repeating database records(rows) with several "many to many" relationships

I am pretty new to database development and architecture. My only experience has been in college and now my project requires me to use that knowledge, however my project seems a lot more complicated with many more intricacies than what I studied.
A brief overview: My task is basically to turn paperwork that was previously done by hand into a quick computer application, which I will write in Java, but that's a long way off. I know I will need a database to accomplish my task, since these reports are frequently edited. The report is a Labor Report. Basically, it shows who was working on a specific job, on what days and for how many hours on those days, as well as their total hours, pay rate, and total amount.
I believe my current problem lies within the fact that it seems like I'm going to have several "many to many" relationships, perhaps even nested, which is what is throwing my head for a spin as I try to organize information into entity relationship diagrams and tables. (I know that there are normally much more measured and organized stages to development but I don't have that experience and I'm essentially a one man team on this)
Contract Personnel will be selected out of a pool of Employees.
A Labor Contract can have 1 to 10 personnel (For sake of space on the final printed version, jobs requiring more laborers will have another Labor Contract.)
Each personnel must have 1 Title (foreman, mechanic, etc.) These titles can change from job to job. Joe Smith can be a mechanic on job A but a foreman on job B.
Each personnel must also have on record the number of hours they worked on each day of the week; and may have overtime and double overtime. (One Labor Record per week).
I am trying to avoid repeated data, or at least keep it to a minimum but I am struggling on figuring out how to do that in this situation. The tricky thing, at least in my mind, is figuring out how to handle the fact that different employees can work several jobs at once, under different titles, and different pay rates, and recording different types of hours (straight time, OT, double OT) on each day of the week.
Can anyone make suggestions?
I hope that I have supplied adequate information and apologize if I didn't or wasn't detailed enough. Please remember to keep in mind I'm a newbie to this type of work.

First thing, take a deep breath! It looks to me like you have a pretty good handle on this, maybe more than you think! This is not an attempt to design your project for you, and I'm sure you'll have lots of details to deal with, but maybe it will give you an idea of how to approach the many many-to-many relationships swimming around in your head.
EMPLOYEES
---------
emp_id
emp_name
emp_address
JOBS
----
job_id
job_description
EMPLOYEE_JOBS
-------------
ej_id -- primary key
emp_id -- fk to employees table
job_id -- fk to jobs table
ej_title -- employee title for this job
ej_rate -- employee pay rate for this job
EMPLOYEE_JOB_HOURS
------------------
ejh_id -- primary key
ej_id -- fk to employee_jobs table
ejh_date
ejh_normal_hours -- hours worked by the employee on this job on this date, etc.
ejh_overtime_hours
ejh_double_overtime_hours
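
To show how these four tables fit together, here is a small runnable sketch. It uses Python's built-in sqlite3 module, and the column types, sample rows, pay rates, and overtime multipliers (1.5 and 2.0) are all assumptions made for illustration; only the table and column names come from the outline above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (
        emp_id INTEGER PRIMARY KEY, emp_name TEXT, emp_address TEXT);
    CREATE TABLE jobs (
        job_id INTEGER PRIMARY KEY, job_description TEXT);
    CREATE TABLE employee_jobs (
        ej_id INTEGER PRIMARY KEY,
        emp_id INTEGER REFERENCES employees(emp_id),
        job_id INTEGER REFERENCES jobs(job_id),
        ej_title TEXT,   -- title held on this particular job
        ej_rate REAL     -- pay rate on this particular job
    );
    CREATE TABLE employee_job_hours (
        ejh_id INTEGER PRIMARY KEY,
        ej_id INTEGER REFERENCES employee_jobs(ej_id),
        ejh_date TEXT,
        ejh_normal_hours REAL,
        ejh_overtime_hours REAL,
        ejh_double_overtime_hours REAL
    );

    -- Hypothetical data: Joe Smith is a mechanic on Job A, a foreman on Job B.
    INSERT INTO employees VALUES (1, 'Joe Smith', NULL);
    INSERT INTO jobs VALUES (1, 'Job A'), (2, 'Job B');
    INSERT INTO employee_jobs VALUES
        (1, 1, 1, 'mechanic', 20.0),
        (2, 1, 2, 'foreman', 25.0);
    INSERT INTO employee_job_hours VALUES
        (1, 1, '2014-01-06', 8, 2, 0),
        (2, 1, '2014-01-07', 8, 0, 0),
        (3, 2, '2014-01-08', 4, 0, 1);
    """
)

# One report line per employee and job: total hours and total amount.
# The 1.5 and 2.0 multipliers are assumed, not taken from the question.
report = conn.execute(
    """
    SELECT e.emp_name, j.job_description, ej.ej_title,
           SUM(h.ejh_normal_hours + h.ejh_overtime_hours
               + h.ejh_double_overtime_hours) AS total_hours,
           ej.ej_rate * SUM(h.ejh_normal_hours
               + 1.5 * h.ejh_overtime_hours
               + 2.0 * h.ejh_double_overtime_hours) AS total_amount
    FROM employee_job_hours h
    JOIN employee_jobs ej ON ej.ej_id = h.ej_id
    JOIN employees e ON e.emp_id = ej.emp_id
    JOIN jobs j ON j.job_id = ej.job_id
    GROUP BY ej.ej_id, e.emp_name, j.job_description, ej.ej_title, ej.ej_rate
    ORDER BY j.job_description
    """
).fetchall()

for line in report:
    print(line)
```

The employee's name is stored once, the title and rate once per job assignment, and only the hours repeat per day, which is the kind of repetition the question was trying to minimize.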

The following is a basic outline you could use to get started. Your final solution will differ based on your exact needs.
You'll need a table to store contract information. My example just shows a description but I'm sure you'll have much more than that.
contracts
id unsigned int(P)
description varchar(50)
+----+-------------+
| id | description |
+----+-------------+
| 1 | Contract A |
| 2 | Contract B |
| .. | ........... |
+----+-------------+
You'll need a table that links contracts and employees and shows what title the employee has for the given contract. In my example you can see that for Contract A John Q Public is a Foreman and Mary Jane Smith is a Mechanic. For Contract B their titles are reversed: John is a Mechanic and Mary is a Foreman. contract_id and employee_id are foreign keys to their respective tables, and together they form the primary key. If it's possible that John and Mary get paid different rates for the same title (for example, John gets 25.00/hour as Foreman while Mary gets 20.00/hour), you would add a rate column here instead of using the rate in the titles table.
contracts_employees
contract_id unsigned int(F contracts.id)--\_(P)
employee_id unsigned int(F employees.id)--/
title_id varchar(15)(F titles.id)
+-------------+-------------+----------+
| contract_id | employee_id | title_id |
+-------------+-------------+----------+
| 1 | 1 | Foreman |
| 1 | 2 | Mechanic |
| 2 | 1 | Mechanic |
| 2 | 2 | Foreman |
| ........... | ........... | ........ |
+-------------+-------------+----------+
You'll need a table for employees (you could call this personnel if you prefer). You'll probably store a lot more than just their names...
employees
id unsigned int(P)
first_name varchar(30)
middle_name varchar(30)
last_name varchar(30)
...
+----+------------+-------------+-----------+-----+
| id | first_name | middle_name | last_name | ... |
+----+------------+-------------+-----------+-----+
| 1 | John | Quincy | Public | ... |
| 2 | Mary | Jane | Smith | ... |
| .. | .......... | ........... | ......... | ... |
+----+------------+-------------+-----------+-----+
You'll need a table to track hours worked. I just store a beginning and ending date/time, leaving it up to the application to calculate elapsed time. Your application will also need to ensure there is no overlap for employees: an employee should not be able to work on more than one contract at any given time. Calculation of overtime and double overtime hours is also up to your application. If an employee's pay rate can change at any time (i.e., in the middle of a contract), you would want to store the pay rate in this table instead of using the rate from contracts_employees or titles.
hours
id unsigned int(P)
contract_id unsigned int(F contracts.id)
employee_id unsigned int(F employees.id)
beg datetime
end datetime
+----+-------------+-------------+---------------------+---------------------+
| id | contract_id | employee_id | beg | end |
+----+-------------+-------------+---------------------+---------------------+
| 1 | 1 | 1 | 2014-01-01 08:00:00 | 2014-01-01 17:00:00 |
| 2 | 1 | 2 | 2014-01-01 09:00:00 | 2014-01-01 17:30:00 |
| 3 | 1 | 1 | 2014-01-02 09:00:00 | 2014-01-02 10:00:00 |
| 4 | 1 | 2 | 2014-01-02 08:00:00 | 2014-01-02 09:00:00 |
| 5 | 2 | 1 | 2014-01-02 10:00:00 | 2014-01-02 17:30:00 |
| 6 | 2 | 2 | 2014-01-02 09:00:00 | 2014-01-02 15:00:00 |
| .. | ........... | ........... | ................... | ................... |
+----+-------------+-------------+---------------------+---------------------+
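As a rough illustration of the overlap and elapsed-time checks described above, the sketch below loads the sample rows into SQLite through Python's built-in sqlite3 module. The overlap rule it uses (two entries for the same employee overlap when each begins before the other ends) and the extra row with id 7 are assumptions added for the example; note that "end" is quoted because it is a keyword in SQLite.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hours (id INTEGER PRIMARY KEY, contract_id INTEGER, "
    'employee_id INTEGER, beg TEXT, "end" TEXT)'
)
conn.executemany(
    "INSERT INTO hours VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, "2014-01-01 08:00:00", "2014-01-01 17:00:00"),
        (2, 1, 2, "2014-01-01 09:00:00", "2014-01-01 17:30:00"),
        (3, 1, 1, "2014-01-02 09:00:00", "2014-01-02 10:00:00"),
        (4, 1, 2, "2014-01-02 08:00:00", "2014-01-02 09:00:00"),
        (5, 2, 1, "2014-01-02 10:00:00", "2014-01-02 17:30:00"),
        (6, 2, 2, "2014-01-02 09:00:00", "2014-01-02 15:00:00"),
    ],
)

# Pairs of entries for the same employee whose time ranges intersect.
# Entries that merely touch (one ends exactly when the next begins) are
# not reported because the comparisons are strict.
OVERLAP_SQL = """
    SELECT a.id, b.id
    FROM hours a
    JOIN hours b
      ON a.employee_id = b.employee_id
     AND a.id < b.id
     AND a.beg < b."end"
     AND b.beg < a."end"
    ORDER BY a.id, b.id
"""

overlaps_before = conn.execute(OVERLAP_SQL).fetchall()

# Hypothetical bad entry: employee 1 on contract 2 while also on contract 1.
conn.execute(
    "INSERT INTO hours VALUES "
    "(7, 2, 1, '2014-01-02 09:30:00', '2014-01-02 11:00:00')"
)
overlaps_after = conn.execute(OVERLAP_SQL).fetchall()


def elapsed_hours(beg: str, end: str) -> float:
    """Elapsed time between two 'YYYY-MM-DD HH:MM:SS' strings, in hours."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(beg, fmt)
    return delta.total_seconds() / 3600


first_shift = elapsed_hours("2014-01-01 08:00:00", "2014-01-01 17:00:00")

print(overlaps_before, overlaps_after, first_shift)
```

An application could run a check like this before saving a new entry and reject the entry when any pair is returned.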
And finally a table to store titles and their related pay rates. If employees can be paid different rates for the same title, you wouldn't need the rate column here, instead you would use the rate stored in the contracts_employees table.
titles
id varchar(15)(P)
rate double
+----------+-------+
| id | rate |
+----------+-------+
| Foreman | 20.00 |
| Mechanic | 15.00 |
| ........ | ..... |
+----------+-------+
