Table design for "attendance check" in NoSQL DB - database

I'm trying to use Amazon Rekognition to take attendance and insert a record (date and time) into a DynamoDB table for every day that a kid shows up to school. The problem is that I'm not familiar with NoSQL, and I'm wondering what's the best way to design the table with this in mind.
Some of the attributes that I'm using are:
Enrollment No. (Partition Key)
First Name
Last Name
Since the date/attendance is going to be a "dynamic attribute" (depending on whether or not the kid goes to school), I'm not sure if I should:
Create a new table for every day, week, or month with only the enrollment No. as an attribute, and have a Lambda trigger write a timestamp when the kid is spotted, meaning the kid attended class. (This would produce a lot of tables, defeating the purpose of DynamoDB, I believe.)
Insert the attendance into the same table as a list-type attribute (an array that gets a new timestamp every day the kid is spotted). Would this option make the item/table in DynamoDB weigh more than it should, causing it to slow down?
Any ideas on a possible way to approach this? Is there another way that's more cost- and memory-optimized?
I'm not covering the triggers, Lambda functions, or Amazon Rekognition setup needed for this to work, since they're outside the scope of this post.

The simplest solution is to have one table for all attendance records using enrollmentID as the partition key and day (ISO 8601 date string, like “2019-09-27”) as the range key.
This makes it simple to add an attendance record: just insert an enrollmentID-date pair into your table. It's also simple to query when a student attended, using a variety of key condition expressions.
All attendance for student 123: enrollmentID = 123
All attendance for student 123 in a given year: enrollmentID = 123 and begins_with(day, "2019")
All attendance for student 123 in a given month: enrollmentID = 123 and begins_with(day, "2019-09")
As a bonus, you can also find all the students who attended on a given day by creating a GSI with day as the partition key.
Any additional data (such as first name, last name, etc.) can go in a separate table if you like, or in the same table with something like "info" as the sort key value instead of a real date.
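To make the key design concrete, here is a minimal in-memory sketch that makes no AWS calls. The `table` dictionary and the helpers `put_attendance` and `query` are hypothetical stand-ins that only mimic a partition key plus a `begins_with` condition on the sort key; they are not part of any AWS SDK. The prefix queries work because ISO 8601 date strings sort lexicographically in chronological order.

```python
from collections import defaultdict

# In-memory stand-in for the table: partition key -> set of sort keys.
# Hypothetical model only; a real table would be accessed through an AWS SDK.
table = defaultdict(set)

def put_attendance(enrollment_id, day):
    """Record one attendance; re-inserting the same pair changes nothing."""
    table[enrollment_id].add(day)

def query(enrollment_id, day_prefix=""):
    """Mimic: enrollmentID = :id AND begins_with(day, :prefix)."""
    return sorted(d for d in table[enrollment_id] if d.startswith(day_prefix))

put_attendance(123, "2019-09-26")
put_attendance(123, "2019-09-27")
put_attendance(123, "2019-10-01")
put_attendance(123, "2019-09-27")  # same key pair again, still one record
put_attendance(456, "2019-09-27")

september = query(123, "2019-09")  # one month for student 123
whole_year = query(123, "2019")    # one year for student 123
```

Because the pair (enrollmentID, day) is the whole primary key, spotting the same kid twice on one day simply overwrites the same item instead of creating a duplicate.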
It is also possible for you to use a list of Booleans to represent a year of attendance. You can use enrollmentID and year as the partition and sort keys.
{
enrollmentID: 123,
year: 2019,
attendance: [ 0, 1, 1, 1, 0, 1... ],
firstName: "John",
lastName: "Smith"
}
This is a more efficient use of storage, but it limits your query options and it’s easier to accidentally ruin your data with an off-by-one error when indexing into the attendance list.
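As a rough illustration of where the off-by-one risk comes from, the sketch below maps a calendar date to a list position. The helper names (`new_year_record`, `day_index`, `mark_present`) are made up for this example, and the choice of a 0-based index with January 1 at position 0 is an assumption.

```python
from datetime import date

def new_year_record(enrollment_id, year):
    """Create an item with one 0/1 slot per day of the given year."""
    days = (date(year + 1, 1, 1) - date(year, 1, 1)).days  # 365 or 366
    return {"enrollmentID": enrollment_id, "year": year, "attendance": [0] * days}

def day_index(d):
    """0-based position in the attendance list: January 1 -> 0."""
    return d.timetuple().tm_yday - 1  # tm_yday is 1-based, hence the -1

def mark_present(record, d):
    if d.year != record["year"]:
        raise ValueError("date is outside the record's year")
    record["attendance"][day_index(d)] = 1

record = new_year_record(123, 2019)
mark_present(record, date(2019, 1, 1))
mark_present(record, date(2019, 12, 31))
```

Forgetting the `- 1` in `day_index` would shift every mark by one day and raise an IndexError on December 31, which is the kind of mistake the answer warns about.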

Related

Database model for a school timetable

I am trying to design a DB model for a school timetable, and have some issues figuring out a model that would work with my requirements.
Domain entities:
Subject - something that is taught over the course of a year. eg: English, Programming, etc.
Group - a group of students.
Lesson - a recurring event where students are taught some specific subject. Example:
every Monday at 10:00 group A is taught Programming.
The original requirements were very basic.
CRUD for Subjects.
CRUD for Groups.
Ability to create recurring events that span some period of time. (note: no editing was allowed after creation).
The current model I use is:
Subjects
id:int
name:string
Groups
id:int
name:string
Lessons
id:int
name:string
startDate:date (ex: event starts recurring from Jan 1 2021)
endDate:date (ex: event recurring ends on Dec 31 2021)
startTime:time (ex: 10:00)
endTime:time (ex: 11:00)
dayOfWeek: flag enum that takes values Monday-Sunday
(We have more fields that are responsible for recurrence, but they are omitted as they are not very relevant to my question.)
Currently, the entire series of events is stored as a single row in the database.
This was working fine, but now additional requirements were added and I am having issues adjusting my model to accommodate them.
New requirements are:
ability to grade students for each lesson
ability to edit lessons (for example shift start/end time 1 hour for event series; change the day on which event occurs etc)
ability to shift a single event in a series (say teacher woke up sick and needs to move lecture from Monday to Tuesday just for this Monday)
#1 in theory is simple - we create a new table Grades that has fields
lessonId:int
userId:int
date:date
grade:int?
but it won't work once we take into account new requirement #2 (ability to edit lessons).
Let's say we have an event that occurs each Monday. An occurrence took place on Monday, Jan 1st, and we graded some students, so we have some grades tied to that date.
Then we edit our lesson to occur on Tuesdays instead of Mondays.
It is no longer possible to map the existing grades to this lesson, as the dates no longer match.
#2 Editing lessons seems like a pretty straightforward operation until you consider grades. Basically, the only issue with this requirement is the one described in #1.
#3 I have no idea how to implement this, considering it should be compatible with the other requirements.
I need help figuring out the database model that would satisfy those requirements.
From your description I take it that the lessons table also has a subject_id and a groups_id, which you merely forgot to put in your table column list.
Let's look at the tasks:
ability to grade students for each lesson
This is not easy. So far you have groups attending lessons. So to start this, you should add a students table to the database. Then, depending on whether a student can belong to more than one group or not, you'd either have the group ID in the student table or create a bridge table group_student.
Now, what does each "lesson" mean in this requirement? So far a lesson is a recurring event (your example: "every Monday at 10:00 group A is taught Programming"). That would mean you'd want a student_lesson table to be able to store the grade. The only problem I see here is that you could store a student_lesson row for a student that doesn't attend the lesson, if you stick to single IDs. Using composite IDs would solve this. The student_lesson table would have student_id, group_id, lesson_id, and grade. And it would have foreign keys on (student_id, group_id) and on (group_id, lesson_id).
If, however, lesson means a single lesson in the recurring lessons, then you need a single_lesson table, too.
ability to edit lessons (for example shift start/end time 1 hour for event series; change the day on which event occurs etc)
Should be no problem. These are just attributes that can be changed, anyway. Maybe you want a history table to see that the Tuesday lesson took place on Mondays until a month ago, but so far there is no requirement for this.
ability to shift a single event in a series (say teacher woke up sick and needs to move lecture from Monday to Tuesday just for this Monday)
Maybe you already have a single_lesson table because of requirement #1. Then each lesson occurrence already gets its own row with a date, and sometimes that Monday would become a Tuesday. You could even store both dates: original date/time and new date/time. Maybe even a text for a reason.
If task #1 doesn't require a single_lesson table, because grades are per recurring lesson, then you only need a single_lesson_exception table for the exceptions where original date/time and new date/time are obligatory this time.
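One possible shape for the single_lesson variant, sketched with SQLite from Python. Only the table name single_lesson comes from the answer; the column names (original_date, new_date, reason) and the grade table layout are assumptions for illustration, and the students, groups, and subjects tables are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lesson (              -- the recurring series
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    day_of_week TEXT NOT NULL,
    start_time TEXT NOT NULL
);
CREATE TABLE single_lesson (       -- one row per occurrence
    id INTEGER PRIMARY KEY,
    lesson_id INTEGER NOT NULL REFERENCES lesson(id),
    original_date TEXT NOT NULL,
    new_date TEXT,                 -- NULL unless this occurrence was moved
    reason TEXT
);
CREATE TABLE grade (               -- grades hang off the occurrence, not a date
    single_lesson_id INTEGER NOT NULL REFERENCES single_lesson(id),
    student_id INTEGER NOT NULL,
    grade INTEGER,
    PRIMARY KEY (single_lesson_id, student_id)
);
""")

conn.execute("INSERT INTO lesson VALUES (1, 'Programming', 'Monday', '10:00')")
conn.execute("INSERT INTO single_lesson VALUES (1, 1, '2021-01-04', NULL, NULL)")
conn.execute("INSERT INTO grade VALUES (1, 42, 5)")

# Requirement #2: edit the series. Past occurrences and their grades are untouched.
conn.execute("UPDATE lesson SET day_of_week = 'Tuesday' WHERE id = 1")

# Requirement #3: move a single occurrence and keep both dates.
conn.execute(
    "INSERT INTO single_lesson VALUES (2, 1, '2021-01-12', '2021-01-13', 'teacher sick')"
)

graded = conn.execute("""
    SELECT COALESCE(s.new_date, s.original_date), g.student_id, g.grade
    FROM grade g JOIN single_lesson s ON s.id = g.single_lesson_id
    WHERE s.lesson_id = 1
""").fetchall()
```

Since a grade references the occurrence row rather than a (lessonId, date) pair, changing the weekday of the series no longer orphans existing grades.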

Keep referenced field data changes

I have a table Salary with a column PersonalId and a table Person with a column Name.
Salary data is saved in the first table with a PersonalId that relates it to the Person table. On a salary bill all the data is gathered together, and the person's name is looked up from the Person table.
After one year, a specific person's name changes from Michael to Maic. I want last year's salary bills to keep the previous name, Michael, and new salary bills to be generated with the new name, Maic.
How can we do that?
It could depend on what type of operation you need to do most and on how often people change their names, because the number of joins you may need to make could vary a lot.
Keep a field in Person that points to the next Person row, which represents the change of name.
Keep another key in Person that stays the same for the physical person.
Keep a limited number of names in Person that a person may have, with an index of the current name.
Keep the relations between the various names of the Person in another table.
It could depend on what rules of normalization you follow, for now I'm not thinking about that.
Anyway, with the first case you don't need to change Salary, but to reconstruct the identity of a Person you need multiple requests or at least a stored procedure.
In the second case you still don't need to change Salary because you add a field to Person, but to get all the Salary entries for that physical person you'll need some work, again probably a stored procedure to get the added field and then something that joins all the Salary entries.
The third is maybe the simplest, but also the most limited, and you need another field in Salary that tells the index of the name to use in that entry.
The last case gives you a stable identity, but it may need some work because of the added table, and there are still multiple implementations. You could have Salary reference that table instead of Person, or you could consult that table only when you need all the data, but you cannot reference its primary key from Salary because it would not make it possible to tell the names apart.
Lunadir's right in a certain way -- but all of those approaches are complex, and of rather great difficulty.
The other way -- simpler, and perhaps more correct & robust -- is to keep NAME and PAID_DATE columns in Salary or SalaryPaid, and write the actual name & date paid at the time the payment is made.
Good old batch-processing style -- and it has the benefit of actually capturing the key financial facts, of what payment was made & what name it was made to, which are the actual auditable transaction history.
Do you pay each Salary entry individually, or in bunch (PaySlip or SalaryPaid)? Put the NAME column wherever you record the actual payment & timestamp it occurred.
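A small sketch of this "write the name at payment time" approach, using SQLite from Python. The table name salary_paid, its columns, and the helper `pay` are assumptions for illustration; the answer only specifies that a name column and a paid-date column sit next to the payment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE salary_paid (
    id INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id),
    name TEXT NOT NULL,        -- copy of person.name at the time of payment
    paid_date TEXT NOT NULL,
    amount INTEGER NOT NULL
);
""")

def pay(person_id, paid_date, amount):
    """Record a payment and snapshot the person's current name with it."""
    conn.execute(
        "INSERT INTO salary_paid (person_id, name, paid_date, amount) "
        "SELECT id, name, ?, ? FROM person WHERE id = ?",
        (paid_date, amount, person_id),
    )

conn.execute("INSERT INTO person VALUES (1, 'Michael')")
pay(1, "2020-06-30", 3000)

conn.execute("UPDATE person SET name = 'Maic' WHERE id = 1")  # the name change
pay(1, "2021-06-30", 3100)

bills = conn.execute(
    "SELECT paid_date, name FROM salary_paid ORDER BY paid_date"
).fetchall()
```

Old bills read the name stored on the payment row, so they keep showing Michael, while person_id still links every payment to the same person.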

How to store timetables?

I'm working on a project that must store employees' timetables. For example, one employee works Monday through Thursday from 8am to 2pm and from 4pm to 8pm. Another employee may work Tuesday through Saturday from 6am to 3pm.
I'm looking for an algorithm or a method to store this kind of data in a MySQL database. The data will rarely be accessed, so performance is not important.
I've thought to store it as a string but I don't know any algorithm to "encode" and "decode" this string.
As many of the comments indicate, it's usually a poor idea to encode all the data into a string that is basically meaningless to the data base. It's usually better to define the data elements and their relations and represent these structures in the data base. The Wikipedia article on data models is a good overview of what's involved (although it's way more general than what you need). The problem you are describing seems simple enough that you could do this with pencil and paper.
One way to start is to write down a list of logical relationships between concepts in your problem. For instance, the list might look like this (your rules may be different):
Every employee follows a single schedule.
Every employee has a first and last name, as well as an employee ID. Different employees may have the same name, but each employee's ID is unique to that employee.
A schedule has a start and stop day of the week and a start and stop time of day.
The start and stop time is the same for every day of the schedule.
Several employees may be on the same schedule.
From this, you can list the nouns used in the rules. These are candidates for entities (columns) in the data base:
Employee
Employee ID
Employee first name
Employee last name
Schedule
Schedule start day
Schedule start time
Schedule end day
Schedule end time
For the rules I listed, schedules seem to exist independently of employees. Since there needs be a way of identifying which schedule an employee follows, it makes sense to add one more entity:
Schedule ID
If you then look at the verbs in the rules ("follows", "has", etc.), you start to get a handle on the relationships. I would group everything so far into two relationships:
Employees
ID
first_name
last_name
schedule_ID
Schedules
ID
start_day
start_time
end_day
end_time
That seems to be all that's needed by way of data structures. (A reasonable alternative to start_day and end_day for the Schedules table would be a boolean field for each day of the week.) The next step is to design the indexes. This is driven by the queries you expect to make. You might expect to look up the following:
What schedule is employee with ID=xyz following?
Who is at work on Mondays at noon?
What days have nobody at work?
Since employees and schedules are uniquely identified by their respective IDs, these should be the primary fields of their respective tables. You also probably want to have consistency rules for the data. (For instance, you don't want an employee on a schedule that isn't defined.) This can be handled by defining a "foreign key" relationship between the Employees.schedule_ID field and the Schedules.ID field, which means that Employees.schedule_ID should be indexed. However, since employees can share the same schedule, it should not be a unique index.
If you need to look up schedules by day of week and time of day, those might also be worth indexing. Finally, if you want to look up employees by name, those fields should perhaps be indexed as well.
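The two tables and the foreign key described above could look roughly like this. The sketch uses SQLite from Python rather than MySQL so it runs without a server. Encoding days as integers (0 = Monday through 6 = Sunday), storing times as 'HH:MM' text, the sample rows, and the helper `at_work` are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE schedules (
    id INTEGER PRIMARY KEY,
    start_day INTEGER NOT NULL,   -- 0 = Monday ... 6 = Sunday (assumed encoding)
    end_day INTEGER NOT NULL,
    start_time TEXT NOT NULL,     -- 'HH:MM', compares correctly as text
    end_time TEXT NOT NULL
);
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    schedule_id INTEGER NOT NULL REFERENCES schedules(id)
);
-- Non-unique index: several employees may share one schedule.
CREATE INDEX idx_employees_schedule ON employees(schedule_id);
""")

conn.execute("INSERT INTO schedules VALUES (1, 0, 3, '08:00', '14:00')")  # Mon-Thu
conn.execute("INSERT INTO schedules VALUES (2, 1, 5, '06:00', '15:00')")  # Tue-Sat
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Lee", 1), (2, "Bob", "Ray", 2), (3, "Cy", "Lee", 1)],
)

def at_work(day, time):
    """First names of employees whose schedule covers the given day and time."""
    rows = conn.execute("""
        SELECT e.first_name
        FROM employees e JOIN schedules s ON s.id = e.schedule_id
        WHERE ? BETWEEN s.start_day AND s.end_day
          AND s.start_time <= ? AND ? < s.end_time
        ORDER BY e.id
    """, (day, time, time)).fetchall()
    return [r[0] for r in rows]

monday_noon = at_work(0, "12:00")
```

Under the rule "every employee follows a single schedule", a split day such as 8am to 2pm plus 4pm to 8pm would not fit; that would need a separate table linking employees to several schedule rows.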
Assuming you're using PHP:
Store the timetable in a PHP array and then use the serialize function to transform it into a string;
to get the array back, use unserialize.
However, this form of storage is almost never a good idea.

Is this an acceptable database design?

My spider sense is tingling, but I've been thinking about it for 2 hours now and I'd like some more feedback from the hivemind.
I'm creating an application for a school. It's supposed to handle students, teachers, courses, honor rolls, grades - the works.
I was wondering how to handle the change of years after each year.
Students move up a grade (or don't).
Teachers are assigned to different grades as their homeroom teacher.
Grades are saved for the year.
There's also the matter of auditing. I need to have an easy way to pull up records from last year or the year before. See what teacher gave which course at what grade at what year.
The problem I'm having is how to handle this.
My thought was to create a new clean database for each year as they come along. So at the end of this year, I'd go to the school and create a new database for them named FooSchool2012 and programmatically let the end users change the database they want to use via a connection string.
Since I'm using an ORM it's only a matter of changing the connection string as the databases are the same.
But this reeks of bad design and crappy engineering to me.
Usually my gut is right, so hopefully you guys can let me know of some alternatives on how to handle this.
No, I would not create a new table or database for each year. It breaks first normal form. Every table will be a duplicate except for the name. It's a poor design. And a maintenance headache. Who's going to create the new database, load the schema, and then change all the URLs? If you change the schema after a few years, will you have to change all the back editions as well so people can query the historical data?
Nope, not a good design at all.
It's common to move historical information out into reporting/data warehousing databases. But the scheme you're suggesting is reminiscent of old, mainframe, VSAM flat file methodologies. I'd use relational databases the way they were intended to be used.
I'm sure your solution could be made workable, but it does seem a little needlessly complicated. Couldn't you accomplish the same thing in a single database by referencing the school year? You may want to think about which entities make sense to have "effective dates" (i.e., a start and an end time). The 3rd grade teacher may change mid year, for example, but you could handle that with effective dates.
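The "effective dates" idea can be sketched in a single database like this (SQLite from Python). The table homeroom_assignment, its columns, the sample teachers, and the helper `homeroom_teacher` are hypothetical; the point is only that a start and an end date on the assignment row replace the per-year database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE homeroom_assignment (
    teacher_id INTEGER NOT NULL REFERENCES teacher(id),
    grade_level INTEGER NOT NULL,
    effective_from TEXT NOT NULL,  -- ISO date, inclusive
    effective_to TEXT              -- ISO date, inclusive; NULL while current
);
""")
conn.executemany("INSERT INTO teacher VALUES (?, ?)", [(1, "Ms. A"), (2, "Mr. B")])
conn.executemany(
    "INSERT INTO homeroom_assignment VALUES (?, ?, ?, ?)",
    [
        (1, 3, "2011-09-01", "2012-01-15"),  # replaced in the middle of the year
        (2, 3, "2012-01-16", None),
    ],
)

def homeroom_teacher(grade_level, on_date):
    """Name of the homeroom teacher for a grade on a given ISO date, or None."""
    row = conn.execute("""
        SELECT t.name
        FROM homeroom_assignment h JOIN teacher t ON t.id = h.teacher_id
        WHERE h.grade_level = ?
          AND h.effective_from <= ?
          AND (h.effective_to IS NULL OR ? <= h.effective_to)
    """, (grade_level, on_date, on_date)).fetchone()
    return row[0] if row else None
```

Auditing "who taught grade 3 last October" then becomes a query with a date parameter, for example `homeroom_teacher(3, "2011-10-01")`, instead of a connection-string switch.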
"My thought was to create a new clean database for each year"
If you thought about this for two hours, and your best idea was to create a new database for each year, you're the wrong person to design this database.
That's an observation, not a criticism. You just need to learn a lot more of the fundamentals before you tackle a project like this one. You'll just get frustrated, and the school will suffer.
You need to spend A LOT of time on your database design. Think about maintenance in the long run; it needs to be as easy as possible. The best way is to create a relational database; research bridge, validation, and base tables. To answer your question, I would not do a table for every year. The best way is to have the student grade data mapped to a unique ID representing that student's specific course.
I would think about creating a table for each of the nouns:
Instructor - PK instructorID, instructorName..
(any other 1:1 instructor information)
Student - PK studentID, StudentName..
(any other 1:1 student information)
Course - PK CourseID, CourseName, CourseDescription..
(any other 1:1 course information)
• Teachers are assigned to different grades as their homeroom teacher.
On the 1:1 instructor table you could have a column called HomeroomGrade and then you
update that column with the current grade. If you wanted to keep a history of the grade,
you could give the instructor table a composite key with another column incrementing
for each new record.
• Students move up a grade (or don't).
You will need another table showing the relationship of students to a unique course's
grades for that year, but first you need to map the instructor to that specific course.
PK InstructorToCourseID
InstructorID - FK
CourseID - FK
Year - FK
Then yet another table maps the unique course to that student with the grade:
PK InstructorToCourseID FK from previous table
PK StudentID - FK from student information table
Grade
Sorry if I'm being general and vague, but this should give you some ideas on the relationships that can be created.
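The mapping tables outlined above might be written out as follows (SQLite from Python). Snake_case names, the table name student_grade, storing the year as a plain integer instead of a foreign key, and the helper `grades_for_year` are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE instructor_to_course (   -- who taught which course in which year
    instructor_to_course_id INTEGER PRIMARY KEY,
    instructor_id INTEGER NOT NULL REFERENCES instructor(instructor_id),
    course_id INTEGER NOT NULL REFERENCES course(course_id),
    year INTEGER NOT NULL
);
CREATE TABLE student_grade (          -- a student's grade in that yearly course
    instructor_to_course_id INTEGER NOT NULL
        REFERENCES instructor_to_course(instructor_to_course_id),
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    grade TEXT,
    PRIMARY KEY (instructor_to_course_id, student_id)
);
""")
conn.execute("INSERT INTO instructor VALUES (1, 'Ms. A')")
conn.execute("INSERT INTO student VALUES (1, 'Sam')")
conn.execute("INSERT INTO course VALUES (1, 'Math')")
conn.executemany(
    "INSERT INTO instructor_to_course VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 2011), (2, 1, 1, 2012)],
)
conn.executemany(
    "INSERT INTO student_grade VALUES (?, ?, ?)",
    [(1, 1, "B"), (2, 1, "A")],
)

def grades_for_year(year):
    """Rows of (student, course, instructor, grade) for one school year."""
    return conn.execute("""
        SELECT s.name, c.name, i.name, g.grade
        FROM student_grade g
        JOIN instructor_to_course ic
          ON ic.instructor_to_course_id = g.instructor_to_course_id
        JOIN student s ON s.student_id = g.student_id
        JOIN course c ON c.course_id = ic.course_id
        JOIN instructor i ON i.instructor_id = ic.instructor_id
        WHERE ic.year = ?
    """, (year,)).fetchall()
```

All years live in the same tables, so pulling up last year's records is a matter of passing a different year to the query.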

Lookup tables for basic user input?

Data like birth month, day and year, user's age, gender/sex, etc.: should these be stored as text or as IDs in the database? ID-based means they will have lookup values. Example usage: user signup will record age, user profiles will have a seeking-partner age, etc., so age and other data can be used in multiple places. In the backend there will be analytics, which is pushing me to use lookup tables even for small things like gender, which have only 2-3 values.
You will want to reference the data types your database has available. For MySQL: http://dev.mysql.com/doc/refman/5.0/en/data-types.html
Do not use any of the fields you mentioned as the primary key. Either create an 'id' column or use the user's username.
Here are data types for the fields:
birthDate = date
age = tinyint (Technically you don't need this since you can always determine it based on their birthDate and current date. It depends what you are doing)
gender = enum
I wouldn't bother with lookup tables for the fields you mentioned. Birth month, day, year could all be encapsulated in a single date field (w/ date type, not text), and then split out with database functions if you need. Age is just a number, which is all your id field will be, so not much point in making a lookup for that unless you want to actually limit the age range, in which case you could use a check constraint instead of a lookup if you needed to limit to an actual range (age >= 1 and age <= 20 for instance). Gender/sex is the only one I might consider a lookup on, but since there are so few possible values, a check constraint should suffice.
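A sketch of the "single date field plus check constraint" suggestion, using SQLite from Python. The table user_profile, the allowed gender codes 'F', 'M', 'X', and the helper `age_on` are assumptions for illustration. SQLite has no dedicated date type, so the birth date is stored as ISO text here; in MySQL it would be a DATE column.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE user_profile (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    birth_date TEXT NOT NULL,   -- one field instead of month/day/year columns
    gender TEXT NOT NULL CHECK (gender IN ('F', 'M', 'X'))  -- no lookup table
)
""")

def age_on(birth, today):
    """Whole years elapsed between birth and today."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)

insert = "INSERT INTO user_profile (username, birth_date, gender) VALUES (?, ?, ?)"
conn.execute(insert, ("ann", "1990-06-15", "F"))

try:
    conn.execute(insert, ("bob", "1985-01-01", "unknown"))
    rejected = False
except sqlite3.IntegrityError:  # raised when the CHECK constraint fails
    rejected = True

stored = conn.execute(
    "SELECT birth_date FROM user_profile WHERE username = 'ann'"
).fetchone()[0]
birth = date.fromisoformat(stored)
```

Age is derived from birth_date whenever it is needed, so it never goes stale, and the check constraint rejects values outside the small allowed set without a join.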
Oh, and I don't know what analytic you're using, but any analysis software worth its salt can produce field domains (lists of unique values in a table's fields) on its own, especially for the fields you mentioned. If you're building your own analytic (not sure, based on your comment to another poster's solution), you can easily make queries to do what you're talking about.
