How to design a DB table for attendance

I am currently working on a school management system but can't seem to figure out the best way to design my student attendance table.
INFO
School runs for 14 weeks per term and classes are held 5 times a week. There can be up to 2000 students per term, meaning attendance can reach 14 x 5 x 2000 = 140,000 records per term.
I am developing the application for a desktop using VB.Net and MS Access.
PROGRESS SO FAR
I have so far designed something that I am skeptical about.
table name: attendance
+----+--------+----------+-----------+--------+
| id | std_id | att_week | att_date  | status |
+----+--------+----------+-----------+--------+
| 1  | 0001   | 1        | 29/9/2015 | yes    |
| 2  | 0002   | 1        | 29/9/2015 | yes    |
+----+--------+----------+-----------+--------+
I quickly realized that this design can yield 140,000 rows in a term.
I also thought of making the week days the column names, but that would result in 14 x 5 = 70 columns.
What is the best way to design this table?

I think you should construct your table like this, storing only the absentees:
+----+------------+-------+------------+
| id | student_id | class | date       |
+----+------------+-------+------------+
| 1  | 11         | 7a    | 11/11/2020 |
| 2  | 21         | 6b    | 10/12/2020 |
+----+------------+-------+------------+
and so on.
From this you can easily retrieve details such as:
1] total absentees per class
2] total absences of a student in a date range
3] a per-day attendance report for each student
This would also be extremely fast because there are far fewer records, especially if you index on class and partition the table by date range.
Thank You!
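A minimal sketch of the absentee-only idea, using Python's built-in sqlite3 module rather than MS Access so that it runs anywhere. The table name absence, the column absent_on, the ISO date format and the third sample row are my own assumptions for illustration; partitioning is left out, and presence on a day is inferred from a missing row, which assumes class was actually held that day.

```python
import sqlite3

# Hypothetical schema: only absences are stored; presence is implied.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE absence (
    id         INTEGER PRIMARY KEY,
    student_id INTEGER NOT NULL,
    class      TEXT    NOT NULL,
    absent_on  TEXT    NOT NULL,        -- ISO date, e.g. 2020-11-11
    UNIQUE (student_id, absent_on)      -- at most one absence per student per day
);
CREATE INDEX idx_absence_class_date ON absence (class, absent_on);
""")

# First two rows mirror the answer's sample (dates read as dd/mm/yyyy);
# the third row is made up so the counts are not all 1.
conn.executemany(
    "INSERT INTO absence (student_id, class, absent_on) VALUES (?, ?, ?)",
    [(11, "7a", "2020-11-11"), (21, "6b", "2020-12-10"), (11, "7a", "2020-11-12")],
)

# 1] total absentees per class
per_class = dict(conn.execute(
    "SELECT class, COUNT(*) FROM absence GROUP BY class"))

# 2] total absences of one student in a date range
(student_total,) = conn.execute(
    "SELECT COUNT(*) FROM absence "
    "WHERE student_id = ? AND absent_on BETWEEN ? AND ?",
    (11, "2020-11-01", "2020-11-30"),
).fetchone()

print(per_class)
print(student_total)
```

Storing dates as ISO strings keeps BETWEEN comparisons correct in SQLite; in Access a native Date/Time column would play the same role.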

Related

Visualize attendance over time data in a meaningful way

I have data that looks like this
+-----------+-------------+----------+------------+------------+
| Date | Time | Initials | Location 1 | Location 2 |
+-----------+-------------+----------+------------+------------+
| 8/26/2019 | 11:00:00 AM | BI | 39 | 40 |
| 8/26/2019 | 1:30:00 PM | Kk | 12 | 2 |
| 8/27/2019 | 2:30:00 PM | BH | 18 | 37 |
| 8/28/2019 | 3:30:00 AM | BH | 23 | 29 |
+-----------+-------------+----------+------------+------------+
The output should be something very similar to the Google Maps "Popular Times" graph.
I would like to be able to visualize
A graph for each location in this style (attendance over time via hour), which is the average attendance per day of the week
I would also like to be able to specify a given date ex: 8/26/2019 and pull up the exact data for that date
So I figure either there can be a different graph for every location, or maybe have the various locations data show as different colored bars on the graph.
Ultimately I have this data in a spreadsheet and I'm not sure what would be the best tool to report on it. I looked into Data Studio, Google Analytics, and just using charts inside the sheet.
However, the issue seems to be:
Since the data spans various dates and various times, I'm not sure how, or with which tools, to group the data by a given day or average the data for a given day of the week. I tried using pivot tables but I'm not sure how to report based on that.
To group the data by a given day, or average the data for a given day:
=QUERY(QUERY(A2:E,
"select A,count(A),sum(D),sum(E),sum(D)+sum(E),avg(D),avg(E),avg(D)+avg(E),max(D)+max(E),min(D)+min(E)
where A is not null
group by A", 0),
"offset 1", 0)
=QUERY(A2:E,
"select A,count(A),sum(D),sum(E),sum(D)+sum(E),avg(D),avg(E),avg(D)+avg(E),max(D)+max(E),min(D)+min(E)
where A is not null
group by A
pivot C", 0)
To take this input and arrange it by day of the week:
=ARRAYFORMULA(IF(A2:A, TEXT(A2:A, "ddd"), ))
And by hour instead of just by date:
=ARRAYFORMULA(IF(A2:A, TEXT(TIME(HOUR(B2:B), 0, 0), "hh:mm:ss"), ))
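If the sheet is exported, the same day-of-week and hour bucketing can be sketched outside Sheets. This is only an illustration in plain Python, not part of the formulas above: the rows are copied from the sample table, and the names DAYS, buckets and averages are my own. It averages Location 1 per (weekday, hour) bucket, which is the shape a "Popular Times" style chart needs.

```python
from collections import defaultdict
from datetime import datetime

# Rows copied from the sample table: (date, time, location 1, location 2)
rows = [
    ("8/26/2019", "11:00:00 AM", 39, 40),
    ("8/26/2019", "1:30:00 PM", 12, 2),
    ("8/27/2019", "2:30:00 PM", 18, 37),
    ("8/28/2019", "3:30:00 AM", 23, 29),
]

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# (weekday, hour) -> list of Location 1 counts seen in that bucket
buckets = defaultdict(list)
for date_s, time_s, loc1, _loc2 in rows:
    ts = datetime.strptime(f"{date_s} {time_s}", "%m/%d/%Y %I:%M:%S %p")
    buckets[(DAYS[ts.weekday()], ts.hour)].append(loc1)

# Average attendance per weekday and hour
averages = {key: sum(vals) / len(vals) for key, vals in buckets.items()}

for (day, hour), avg in sorted(averages.items()):
    print(day, f"{hour:02d}:00", avg)
```

With only four sample rows every bucket holds a single value; with real data each bucket would average all readings for that weekday and hour.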

SSAS - MDX calculated member

I've a fact table that details individual line amounts for orders placed by my organisation. In this fact, at line level, I've included the total order amount to be used, as it's possible we might need that level of detail at some point.
Here's an example of what I've got:-
+------------+------------+---------------+------------+---------------------+
| BookingKey | Booking_ID | Category_FKey | Line_Value | Total_Booking_Value |
+------------+------------+---------------+------------+---------------------+
| 1 | 12 | 8 | 150 | 700 |
| 2 | 12 | 4 | 150 | 700 |
| 3 | 12 | 5 | 300 | 700 |
| 4 | 12 | 4 | 100 | 700 |
+------------+------------+---------------+------------+---------------------+
As you can see, the Total_Booking_Value here is the sum of the Line_Value for the booking in the example (Booking_ID = 12).
The Category_FKey looks up to a Categories dimension.
Using this structure I've created a simple cube and this works fine, mainly.
The issue I have is that I'd like to be able to view the Total Line_Value amount, and somehow include the Total_Booking_Value alongside it.
So, for example I might add the Categories dimension as a filter and want to filter by say Category_FKey = 4.
If this was the case I'd want the aggregates to tell me that the total Line_Value was 250 (for BookingKeys 2 and 4), and the Total_Booking_Value should be 700. Using normal aggregation (ie SUM) I'm getting the Total_Booking_Value as 1400 (obviously - because it's adding 700 * 2 for the two rows the cube would return).
So, the way I see it I'd like to create an MDX calculation that somehow takes the Total_Booking_Value and gives just the value for the Booking in question.
Should this be done using some kind of average, or division by the Distinct number of items? I can't figure this out. I tried something like this:-
create member currentcube.measures.[Calculated Booking Value]
as
[Measures].[Total_Booking_Value] / count(Measures.Booking_ID);
But this isn't working.
Hopefully this makes sense and you can point me in the right direction.
I find it strange that booking_ID is a measure - intuitively it strikes me as something that would be an attribute and therefore a hierarchy - in which case you'd be able to do the count like this:
[Measures].[Total_Booking_Value]
/
COUNT(EXISTING [Booking].[Booking_ID].[Booking_ID].members)
A straightforward solution would be to have two fact tables: one with granularity booking key and one with granularity booking id. The first would contain all columns except total booking value, and the second would contain the columns booking id and total booking value.
Then both measures would be easily summable.
The relationship between the second fact table and the category dimension could be configured as many-to-many via the first fact table. Thus, you would see the full values of the involved bookings for each selected category, automatically eliminating double counting.
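To see why the two-fact-table split removes the double counting, here is a rough relational analogue in Python with sqlite3. It is not SSAS or MDX; the table names fact_booking_line and fact_booking are assumptions, and the subquery stands in for what the many-to-many relationship would do inside the cube. The data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Line-level fact: one row per booking line (no booking total here)
CREATE TABLE fact_booking_line (
    booking_key   INTEGER PRIMARY KEY,
    booking_id    INTEGER NOT NULL,
    category_fkey INTEGER NOT NULL,
    line_value    INTEGER NOT NULL
);
-- Booking-level fact: one row per booking
CREATE TABLE fact_booking (
    booking_id          INTEGER PRIMARY KEY,
    total_booking_value INTEGER NOT NULL
);
INSERT INTO fact_booking_line VALUES
    (1, 12, 8, 150), (2, 12, 4, 150), (3, 12, 5, 300), (4, 12, 4, 100);
INSERT INTO fact_booking VALUES (12, 700);
""")

category = 4

# Line values sum normally at line granularity.
(line_total,) = conn.execute(
    "SELECT SUM(line_value) FROM fact_booking_line WHERE category_fkey = ?",
    (category,)).fetchone()

# Each involved booking is counted once, however many lines matched.
(booking_total,) = conn.execute(
    """SELECT SUM(total_booking_value) FROM fact_booking
       WHERE booking_id IN (SELECT DISTINCT booking_id
                            FROM fact_booking_line
                            WHERE category_fkey = ?)""",
    (category,)).fetchone()

print(line_total, booking_total)
```

For Category_FKey = 4 this gives a line total of 250 and a booking total of 700, the figures the question asks for.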

SQL Server database design for evaluations

I'm designing this employee evaluation web page, and was wondering if my current database design is the correct one or if it could be improved.
This is my current design
Table Agenda:
+--------------+----------+----------+-----------+------+-------+-------+
| idEvaluation | Location | Employee | #Employee | Date | Date1 | Date2 |
+--------------+----------+----------+-----------+------+-------+-------+
Date is the date scheduled for the evaluation to be performed.
Date1 and Date2 define the period of time used to retrieve some metrics from another database.
Table Evaluations:
+--------------+---------+------------+------+----------+
| idEvaluation | Manager | Department | Date | Comments |
+--------------+---------+------------+------+----------+
Table Scores:
+--------------+----------+-------+
| idEvaluation | idFactor | Score |
+--------------+----------+-------+
idFactor relates to another table which contains the factor and a description of it. Is this a correct design?
My concern is this: currently there are 60 employees, 11 managers and 12 factors, and each employee is evaluated twice a year by every manager. The Agenda table is not much trouble, since it holds only one record per evaluation (60 employees = 60 records). However, the Evaluations table holds 11 records for every evaluation, so it grows to 660 records (60 employees * 11 managers = 660), and the Scores table grows even bigger, since there are 12 factors for every evaluation: 7920 records (660 evaluations * 12 factors each = 7920).
Is this normal? Am I doing it wrong? Any input is appreciated.
EDIT
Location, Employee, #Employee, Manager and Department are loaded automatically by the vb.net page; they are "imported" from Active Directory and checked before insertion, so duplicate names, misspelled names, and that sort of thing are not an issue.
The main idea is that you don't want to repeat string literals.
So if you have
id Department
1 Sales
2 IT
3 Admin
Instead of repeating Sales many times, you store only the id 1, which is smaller, so things also get faster.
Second, if you have users
id user
1 Jhon Alexander
2 Maria Jhonson
If Jhon decides to change his name, you would have to check all tables and update the name. There is also the problem that if two people have the same name, you won't know which one you are evaluating.
So go for separate tables and use the ID.
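A small sketch of that advice, in Python with sqlite3 instead of SQL Server so it is self-contained. The table names department, app_user and evaluation and the trimmed-down columns are assumptions for illustration, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE app_user   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Evaluations store ids only, never the names themselves
CREATE TABLE evaluation (
    id            INTEGER PRIMARY KEY,
    manager_id    INTEGER NOT NULL REFERENCES app_user(id),
    department_id INTEGER NOT NULL REFERENCES department(id)
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'IT'), (3, 'Admin');
INSERT INTO app_user   VALUES (1, 'Jhon Alexander'), (2, 'Maria Jhonson');
INSERT INTO evaluation VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

# A rename touches one row; every evaluation picks it up through the id.
conn.execute("UPDATE app_user SET name = 'John Alexander' WHERE id = 1")

names = [row[0] for row in conn.execute(
    """SELECT u.name FROM evaluation e
       JOIN app_user u ON u.id = e.manager_id
       ORDER BY e.id""")]
print(names)
```

Two users with the same name stay distinguishable because evaluations reference the id, not the text.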

Database Design - Drop Down Input Box Issue

I'm trying to create a friendship site. The issue I'm having is when a user joins a website they have to fill out a form. This form has many fixed drop down items the user must fill out. Here is an example of one of the drop downs.
Drop Down (Favorite Pets)
Items in Favorite Pets
1. Dog
2. Cat
3. Bird
4. Hamster
What is the best way to store this info in a database? Right now the profile table has a column for each fixed drop down. Is this correct database design? See example:
User ID | Age | Country | Favorite Pet | Favorite Season
--------------------------------------------------------------
1 | 29 | United States | Bird | Summer
Is this the correct database design? Right now I have probably 30+ columns. Most of the columns are fixed because they are drop downs and the user has to pick one of the options.
What's the correct approach to this problem?
P.S. I also thought about creating a table for each drop down, but this would really complicate the queries and lead to lots of tables.
Another approach
Profile table
ID | username | age
-------------------
1 | jason | 27
profileDropDown table:
ID | userID | dropdownID
------------------------
1 | 1 | 2
2 | 1 | 7
Drop Down table:
ID | dropdown | option
---------------------
1 | pet | bird
2 | pet | cat
3 | pet | dog
4 | pet | Hamster
5 | season | Winter
6 | Season | Summer
7 | Season | Fall
8 | Season | spring
"Best way to approach" or "correct way" will open up a lot of discussion here, which risks this question being closed. I would recommend creating a drop down table that has a column called "TYPE" or "NAME". You would then put a unique identifier of the drop down in that column to identify that set. Then have another column called "VALUE" that holds the drop down value.
For example:
ID | TYPE | VALUE
1 | PET | BIRD
2 | PET | DOG
3 | PET | FISH
4 | SEASON | FALL
5 | SEASON | WINTER
6 | SEASON | SPRING
7 | SEASON | SUMMER
Then to get your PET drop down, you just select all from this table where type = 'PET'
Will the set of questions (dropdowns) asked of every user ever change? Will you (or your successor) ever need to add or remove questions over time? If no, then a table for users with one column per question is fine, but if yes, it gets complex.
Database purists would require two tables for each question:
One table containing a list of all valid answers for that question
One table containing the many to many relation between user and answer to “this” question
If a new question is added, create new tables; if a question is removed, drop those tables (and, of course, adjust all your code. Ugh.) This would work, but it's hardly efficient.
If, as seems likely, all the questions and answer sets are similar, then a three-table model suggests itself:
A table with one row per question (QuestionId, QuestionText)
A table with one row for each answer for each Question (QuestionId, AnswerId, AnswerText)
A table with one row for each user-answered question (UserId, QuestionId, AnswerId)
Adding and removing questions is straightforward, as is identifying skipped or unanswered questions (such as, if you add a new question a month after going live).
As with most everything, there’s a whole lot of “it depends” behind this, most of which depends on what you want your system to do.
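The three-table model described above could look roughly like this. It is a sketch in Python with sqlite3; the table names question, answer and user_answer, the snake_case columns and the sample rows are assumptions based on the (QuestionId, AnswerId, UserId) outline, not a definitive design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question (
    question_id   INTEGER PRIMARY KEY,
    question_text TEXT NOT NULL
);
CREATE TABLE answer (
    question_id INTEGER NOT NULL REFERENCES question(question_id),
    answer_id   INTEGER NOT NULL,
    answer_text TEXT NOT NULL,
    PRIMARY KEY (question_id, answer_id)
);
CREATE TABLE user_answer (
    user_id     INTEGER NOT NULL,
    question_id INTEGER NOT NULL,
    answer_id   INTEGER NOT NULL,
    PRIMARY KEY (user_id, question_id),   -- one pick per user per question
    FOREIGN KEY (question_id, answer_id)
        REFERENCES answer(question_id, answer_id)
);
INSERT INTO question VALUES (1, 'Favorite Pet'), (2, 'Favorite Season');
INSERT INTO answer VALUES (1, 1, 'Dog'), (1, 2, 'Cat'), (1, 3, 'Bird'),
                          (2, 1, 'Summer'), (2, 2, 'Winter');
INSERT INTO user_answer VALUES (1, 1, 3);  -- user 1 picked Bird, skipped season
""")

# Options to render for one dropdown
pet_options = [r[0] for r in conn.execute(
    "SELECT answer_text FROM answer WHERE question_id = 1 ORDER BY answer_id")]

# Questions user 1 has not answered yet
unanswered = [r[0] for r in conn.execute(
    """SELECT q.question_text FROM question q
       LEFT JOIN user_answer ua
              ON ua.question_id = q.question_id AND ua.user_id = ?
       WHERE ua.user_id IS NULL""", (1,))]

print(pet_options, unanswered)
```

Adding a question is then a matter of inserting rows into question and answer; no table or column has to change.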

Which is a better database schema for a tracking tool?

I have to generate a view that shows tracking across each month. The ultimate view will be something like this:
| Person | Task | Jan | Feb | Mar| Apr | May | June . . .
| Joe | Roof Work | 100% | 50% | 50% | 25% |
| Joe | Basement Work | 0% | 50% | 50% | 75% |
| Tom | Basement Work | 100% | 100% | 100% | 100% |
I already have the following tables:
Person
Task
I am now creating a new table with foreign keys into the above 2 tables, and I am trying to figure out the pros and cons of creating 1 or 2 tables.
Option 1:
Create a new table with the following Columns:
Id
PersonId
TaskId
Jan2012
Feb2012
Mar2012
Apr2013
or
Option 2:
Have 2 separate tables
One table for just
Id
PersonId
TaskId
and another table for just the following columns
Id
PersonTaskId (the id from table above)
MonthYearKey
MonthYearValue
So an example record would be
| 1 | 13 | Jan2011 | 100% |
where 13 would represent a specific unique Person and Task combination. This second way would avoid having to create new columns over time (which seems right), but I also want to avoid overkill.
Which would be the more scalable schema? Any other suggestions or more elegant ways of doing this would be great as well.
You can have an m2m (many-to-many) table with data columns. I don't see a reason why you can't just put MonthYearKey and MonthYearValue on the same table as PersonId and TaskId:
Id
TaskId
PersonId
MonthYearKey
MonthYearValue
It's also possible that you would want to move the MonthYearKey values out into their own table; it really comes down to common queries and what this data is used for.
I would note that you never want to design a schema where you add columns as time passes. The first option would require constant maintenance and would also become very difficult to query.
Option 2 is definitely more scalable and is not overkill.
Option 1 would require you to add a new column every month and simple date based queries of your data would not be possible, e.g. Show me all people who worked at least 90% in any month last year.
The ultimate view would be generated from a particular query or view of your data.
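As a rough illustration of the row-per-month design and of generating the month-per-column view from a query, here is a sketch in Python with sqlite3. The table name person_task_month, the month_start and pct columns, and the use of plain names in place of PersonId and TaskId are simplifying assumptions; the sample values follow the first two months of the desired view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person_task_month (
    id          INTEGER PRIMARY KEY,
    person      TEXT    NOT NULL,   -- stands in for PersonId
    task        TEXT    NOT NULL,   -- stands in for TaskId
    month_start TEXT    NOT NULL,   -- first day of the month, ISO format
    pct         INTEGER NOT NULL,   -- allocation, 0-100
    UNIQUE (person, task, month_start)
);
INSERT INTO person_task_month (person, task, month_start, pct) VALUES
    ('Joe', 'Roof Work',     '2012-01-01', 100),
    ('Joe', 'Roof Work',     '2012-02-01',  50),
    ('Joe', 'Basement Work', '2012-01-01',   0),
    ('Joe', 'Basement Work', '2012-02-01',  50),
    ('Tom', 'Basement Work', '2012-01-01', 100),
    ('Tom', 'Basement Work', '2012-02-01', 100);
""")

# "People who worked at least 90% in any month" of a given year
busy = [r[0] for r in conn.execute(
    """SELECT DISTINCT person FROM person_task_month
       WHERE pct >= 90
         AND month_start BETWEEN '2012-01-01' AND '2012-12-01'
       ORDER BY person""")]

# Pivot rows into one column per month with conditional aggregation
view = conn.execute(
    """SELECT person, task,
              MAX(CASE WHEN month_start = '2012-01-01' THEN pct END) AS jan,
              MAX(CASE WHEN month_start = '2012-02-01' THEN pct END) AS feb
       FROM person_task_month
       GROUP BY person, task
       ORDER BY person, task""").fetchall()

print(busy)
print(view)
```

A new month only adds rows; the pivot query (or the reporting layer) decides which months appear as columns.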
