Designing a database for categories and subcategories - database

Basically, I'm trying to figure out how Amazon architected its book section. Check out Amazon's book page here (https://www.amazon.com/s/ref=lp_2_ex_n_1?rh=n%3A283155&bbn=283155&ie=UTF8&qid=1522817105).
We are given several main categories: Arts & Photography, Biographies & Memoirs, etc.
If I click on Biographies & Memoirs, for example, I'm led to a series of subcategories, e.g. Biographies & Memoirs > Historical > Asia > Japan
Some subcategory names repeat under different parents, for example: History > Asia > Japan
How can I model this kind of information so that it is scalable?
Is the design below the wrong way to do it?
Categories table
+----+-----------------------+-----------+
| id | name | parent_id |
+----+-----------------------+-----------+
| 1 | Biographies & Memoirs | null |
| 2 | Historical | 1 |
| 3 | Asia | 2 |
| 4 | History | null |
| 5 | Asia | 4 |
| 6 | Japan | 5 |
| 7 | Japan | 3 |
+----+-----------------------+-----------+
Books
+----+-------------------------------------+----------+
| id | name | category |
+----+-------------------------------------+----------+
| 1 | The Lone Samurai | 7 |
| 2 | The Human Tradition in Modern Japan | 7 |
| 3 | Okinawa: The Last Battle | 6 |
+----+-------------------------------------+----------+
Authors
+----+---------------+----------+
| id | firstname | lastname |
+----+---------------+----------+
| 1 | James M. | Burns |
| 2 | Roy E. | Appleman |
| 3 | Russell A. | Gugeler |
| 4 | John | Stevens |
| 5 | William Scott | Wilson |
| 6 | Anne | Walthall |
+----+---------------+----------+
Authors to books (Many to many)
+---------+-----------+
| book_id | author_id |
+---------+-----------+
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
| 3 | 4 |
| 1 | 5 |
| 2 | 6 |
+---------+-----------+
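For what it's worth, the parent_id layout above (an adjacency list) can already tell the two "Japan" rows apart, because each row's full path is recoverable by walking up its parents. Below is a minimal sketch of that walk using a recursive query, run through Python's built-in sqlite3 module. The table and column names follow the question; the helper name category_path and the choice of SQLite are assumptions made only for illustration, not a claim about how Amazon does it.

```python
import sqlite3

# In-memory database holding the Categories table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES categories(id)
);
INSERT INTO categories (id, name, parent_id) VALUES
    (1, 'Biographies & Memoirs', NULL),
    (2, 'Historical', 1),
    (3, 'Asia', 2),
    (4, 'History', NULL),
    (5, 'Asia', 4),
    (6, 'Japan', 5),
    (7, 'Japan', 3);
""")

# Start at one category and follow parent_id upward until the root.
PATH_QUERY = """
WITH RECURSIVE chain(id, name, parent_id, depth) AS (
    SELECT id, name, parent_id, 0 FROM categories WHERE id = ?
    UNION ALL
    SELECT c.id, c.name, c.parent_id, chain.depth + 1
    FROM categories c
    JOIN chain ON c.id = chain.parent_id
)
SELECT name FROM chain ORDER BY depth DESC;
"""

def category_path(category_id):
    """Hypothetical helper: return 'Root > ... > Leaf' for one category id."""
    rows = conn.execute(PATH_QUERY, (category_id,)).fetchall()
    return " > ".join(name for (name,) in rows)

print(category_path(7))  # Biographies & Memoirs > Historical > Asia > Japan
print(category_path(6))  # History > Asia > Japan
```

Repeated names are not a problem in this layout, since books reference the category id (6 or 7) rather than the name.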


Generate a new dataset from two existing datasets with conditions

I have two datasets with the same columns, and I would like to create a new one in another sheet containing all rows from the first dataset plus specific rows from the second one.
My first dataset is like:
| Item Type | Item Numb | Start Date | End date |
---------------------------------------------------
| 1 | 1 | 17/02/2022 | 21/02/2022 |
| 1 | 2 | 19/02/2022 | 24/02/2022 |
| 2 | 1 | 15/02/2022 | 18/02/2022 |
| 2 | 2 | 17/02/2022 | 20/02/2022 |
| 3 | 1 | 21/02/2022 | 25/02/2022 |
And the second one is like:
| Item Type | Item Numb | Start Date | End date |
---------------------------------------------------
| 1 | 2 | 17/02/2022 | 20/02/2022 |
| 2 | 2 | 17/02/2022 | 20/02/2022 |
| 2 | 3 | 20/02/2022 | 23/02/2022 |
| 3 | 1 | 20/02/2022 | 23/02/2022 |
| 4 | 1 | 21/02/2022 | 24/02/2022 |
| 4 | 2 | 23/02/2022 | 28/02/2022 |
Now, in a new sheet, I would like to retrieve all rows from the first dataset and append the rows from the second one that are missing from the first.
If a combination of "Item Type" and "Item Numb" has already been imported, I don't want to take it from the second dataset; if that combination isn't in the first dataset, I would like to add the row.
This is the result I need:
| Item Type | Item Numb | Start Date | End date |
---------------------------------------------------
| 1 | 1 | 17/02/2022 | 21/02/2022 |
| 1 | 2 | 19/02/2022 | 24/02/2022 |
| 2 | 1 | 15/02/2022 | 18/02/2022 |
| 2 | 2 | 17/02/2022 | 20/02/2022 |
| 3 | 1 | 21/02/2022 | 25/02/2022 |
| 2 | 3 | 20/02/2022 | 23/02/2022 |
| 4 | 1 | 21/02/2022 | 24/02/2022 |
| 4 | 2 | 23/02/2022 | 28/02/2022 |
Thanks in advance for your time folks!
try:
=INDEX(ARRAY_CONSTRAIN(QUERY(SORTN(
{Sheet1!A2:D, Sheet1!A2:A&Sheet1!B2:B;
Sheet2!A2:D, Sheet2!A2:A&Sheet2!B2:B}, 9^9, 2, 5, 1),
"where Col1 is not null", 0), 9^9, 4)

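To make the intent of the formula above easier to check, here is a rough Python sketch of the same rule: keep every row of the first dataset, then append only those rows of the second whose ("Item Type", "Item Numb") pair has not been seen yet. The function name merge_unique is made up for this illustration. It compares the pair as a tuple rather than as a concatenated string, which is an assumption on my part about what is wanted (a joined key such as 1&12 and 11&2 would otherwise look identical).

```python
def merge_unique(first, second):
    """Return rows of `first`, then rows of `second` with an unseen (type, numb) key."""
    seen = {(row[0], row[1]) for row in first}
    merged = list(first)
    for row in second:
        key = (row[0], row[1])
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged

# Sample data copied from the question: (Item Type, Item Numb, Start Date, End date)
first = [
    (1, 1, "17/02/2022", "21/02/2022"),
    (1, 2, "19/02/2022", "24/02/2022"),
    (2, 1, "15/02/2022", "18/02/2022"),
    (2, 2, "17/02/2022", "20/02/2022"),
    (3, 1, "21/02/2022", "25/02/2022"),
]
second = [
    (1, 2, "17/02/2022", "20/02/2022"),
    (2, 2, "17/02/2022", "20/02/2022"),
    (2, 3, "20/02/2022", "23/02/2022"),
    (3, 1, "20/02/2022", "23/02/2022"),
    (4, 1, "21/02/2022", "24/02/2022"),
    (4, 2, "23/02/2022", "28/02/2022"),
]

result = merge_unique(first, second)
for row in result:
    print(row)
```

With the sample data this yields the eight rows of the expected result, in the same order as in the question.
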
I can't apply multiple conditions in this SQL statement

I don't know why it doesn't give me the answer for students enrolled in Database Systems but not in Operating System Design.
select student.snum, student.sname, enrolled.cname
from enrolled
inner join student ON enrolled.snum = student.snum
where enrolled.cname="Database Systems" AND enrolled.cname<>"Operating System Design";
+-----------+--------------------+------------------+
| snum | sname | cname |
+-----------+--------------------+------------------+
| 112348546 | Joseph Thompson | Database Systems |
| 115987938 | Christopher Garcia | Database Systems |
| 348121549 | Paul Hall | Database Systems |
| 322654189 | Lisa Walker | Database Systems |
| 552455318 | Ana Lopez | Database Systems |
+-----------+--------------------+------------------+
My student table.
+-----------+--------------------+------------------------+-------+------+
| snum | sname | major | level | age |
+-----------+--------------------+------------------------+-------+------+
| 51135593 | Maria White | English | SR | 21 |
| 60839453 | Charles Harris | Architecture | SR | 22 |
| 99354543 | Susan Martin | Law | JR | 20 |
| 112348546 | Joseph Thompson | Computer Science | SO | 19 |
| 115987938 | Christopher Garcia | Computer Science | JR | 20 |
| 132977562 | Angela Martinez | History | SR | 20 |
| 269734834 | Thomas Robinson | Psychology | SO | 18 |
| 280158572 | Margaret Clark | Animal Science | FR | 18 |
| 301221823 | Juan Rodriguez | Psychology | JR | 20 |
| 318548912 | Dorthy Lewis | Finance | FR | 18 |
| 320874981 | Daniel Lee | Electrical Engineering | FR | 17 |
| 322654189 | Lisa Walker | Computer Science | SO | 17 |
| 348121549 | Paul Hall | Computer Science | JR | 18 |
| 351565322 | Nancy Allen | Accounting | JR | 19 |
| 451519864 | Mark Young | Finance | FR | 18 |
| 455798411 | Luis Hernandez | Electrical Engineering | FR | 17 |
| 462156489 | Donald King | Mechanical Engineering | SO | 19 |
| 550156548 | George Wright | Education | SR | 21 |
| 552455318 | Ana Lopez | Computer Engineering | SR | 19 |
| 556784565 | Kenneth Hill | Civil Engineering | SR | 21 |
| 567354612 | Karen Scott | Computer Engineering | FR | 18 |
| 573284895 | Steven Green | Kinesiology | SO | 19 |
| 574489456 | Betty Adams | Economics | JR | 20 |
| 578875478 | Edward Baker | Veterinary Medicine | SR | 21 |
+-----------+--------------------+------------------------+-------+------+
My enrolled table
+-----------+----------------------------+
| snum | cname |
+-----------+----------------------------+
| 112348546 | Database Systems |
| 115987938 | Database Systems |
| 348121549 | Database Systems |
| 322654189 | Database Systems |
| 552455318 | Database Systems |
| 455798411 | Operating System Design |
| 552455318 | Operating System Design |
| 567354612 | Operating System Design |
| 112348546 | Operating System Design |
| 115987938 | Operating System Design |
| 322654189 | Operating System Design |
| 567354612 | Data Structures |
| 552455318 | Communication Networks |
| 455798411 | Optical Electronics |
| 301221823 | Perception |
| 301221823 | Social Cognition |
| 301221823 | American Political Parties |
| 556784565 | Air Quality Engineering |
| 99354543 | Patent Law |
| 574489456 | Urban Economics |
+-----------+----------------------------+
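The WHERE clause is evaluated one row at a time, so any row whose cname is 'Database Systems' automatically satisfies cname <> 'Operating System Design'. To exclude students who have another row for Operating System Design, you need to use NOT EXISTS, as follows:
select s.snum, s.sname, e.cname
from enrolled e
inner join student s ON e.snum = s.snum
where e.cname='Database Systems'
AND not exists
(select 1 from enrolled ee
where ee.snum = e.snum and ee.cname = 'Operating System Design');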
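As a quick sanity check of the NOT EXISTS approach, here is a small sketch that loads the relevant rows from the question into an in-memory SQLite database via Python's sqlite3 module and runs the query with the subquery filtering on ee.cname. SQLite is an assumption here (the question appears to use the MySQL client), and only the students and enrollments involved in the two courses are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student  (snum INTEGER PRIMARY KEY, sname TEXT NOT NULL);
CREATE TABLE enrolled (snum INTEGER NOT NULL, cname TEXT NOT NULL);

INSERT INTO student (snum, sname) VALUES
    (112348546, 'Joseph Thompson'),
    (115987938, 'Christopher Garcia'),
    (322654189, 'Lisa Walker'),
    (348121549, 'Paul Hall'),
    (455798411, 'Luis Hernandez'),
    (552455318, 'Ana Lopez'),
    (567354612, 'Karen Scott');

INSERT INTO enrolled (snum, cname) VALUES
    (112348546, 'Database Systems'),
    (115987938, 'Database Systems'),
    (348121549, 'Database Systems'),
    (322654189, 'Database Systems'),
    (552455318, 'Database Systems'),
    (455798411, 'Operating System Design'),
    (552455318, 'Operating System Design'),
    (567354612, 'Operating System Design'),
    (112348546, 'Operating System Design'),
    (115987938, 'Operating System Design'),
    (322654189, 'Operating System Design');
""")

# Students enrolled in Database Systems who have no row for Operating System Design.
QUERY = """
SELECT s.snum, s.sname, e.cname
FROM enrolled e
INNER JOIN student s ON e.snum = s.snum
WHERE e.cname = 'Database Systems'
  AND NOT EXISTS (
      SELECT 1
      FROM enrolled ee
      WHERE ee.snum = e.snum
        AND ee.cname = 'Operating System Design'
  );
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(348121549, 'Paul Hall', 'Database Systems')]
```

With the data from the question, Paul Hall is the only student in Database Systems without an Operating System Design enrollment.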
Compare strings using single quotation marks (' ') rather than double quotation marks (" "). Your code seems ok to me.
Remember:
Single quotes are for strings.
Double quotes are for table names and column names.
select student.snum, student.sname, enrolled.cname
from enrolled
inner join student ON enrolled.snum = student.snum
where enrolled.cname='Database Systems'
AND enrolled.cname<>'Operating System Design';
or try this
select student.snum, student.sname, enrolled.cname
from enrolled
inner join student ON enrolled.snum = student.snum
where enrolled.cname='Database Systems'

Multiple outcomes/scenarios

I have a problem that I have already solved, but I'm wondering if there's a better way of solving it. Basically, I have to create a flag for certain scenarios within a partition of ID and date. My solution involved mapping out all the possible scenarios, then writing "case when" statements for each scenario with its specific outcome. In other words, I was the one who defined the outcomes. I am wondering if there's another way, something along the lines of letting SQL derive the outcomes instead of doing it myself.
Thanks a lot!
Background:
+----+-----------+--------+-------+------+-----------------+-----------------------------------------------------------------------------------+
| ID | Month | Status | Value | Flag | Scenario Number | Scenario Description |
+----+-----------+--------+-------+------+-----------------+-----------------------------------------------------------------------------------+
| 1 | 1/01/2016 | First | 123 | No | 1 | First, second and blank exists. Do not flag |
| 1 | 1/01/2016 | Second | 456 | No | 1 | First, second and blank exists. Do not flag |
| 1 | 1/01/2016 | | 789 | No | 1 | First, second and blank exists. Do not flag |
| 1 | 1/02/2016 | Second | 123 | Yes | 2 | First does not exist, two second but have different values. Flag these as Yes |
| 1 | 1/02/2016 | Second | 456 | Yes | 2 | First does not exist, two second but have different values. Flag these as Yes |
| 1 | 1/02/2016 | Second | 123 | No | 3 | First does not exist, two second have same values. Do not flag |
| 1 | 1/02/2016 | Second | 123 | No | 3 | First does not exist, two second have same values. Do not flag |
| 1 | 1/03/2016 | Second | 123 | No | 4 | Only one entry of Second exist. Do not flag |
| 1 | 1/04/2016 | | 123 | Yes | 5 | Two blanks for the partition. Flag these as Yes |
| 1 | 1/04/2016 | | 123 | Yes | 5 | Two blanks for the partition. Flag these as Yes |
| 1 | 1/05/2016 | | | No | 6 | Only one entry of blank exist. Do not flag these |
| 1 | 1/06/2016 | First | 123 | Yes | 7 | First exist for the partition. Do not flag |
| 1 | 1/06/2016 | | 456 | Yes | 7 | First exist for the partition. Do not flag |
| 1 | 1/07/2016 | Second | 123 | Yes | 8 | First does not exist and second and blank do not have the same value. Flag these. |
| 1 | 1/07/2016 | | 456 | Yes | 8 | First does not exist and second and blank do not have the same value. Flag these. |
| 1 | 1/07/2016 | Second | 123 | Yes | 8 | First does not exist and second and blank have the same value. Flag these. |
| 1 | 1/07/2016 | | 123 | Yes | 8 | First does not exist and second and blank have the same value. Flag these. |
+----+-----------+--------+-------+------+-----------------+-----------------------------------------------------------------------------------+
Data:
+----+-----------+-------+----------+---------------+
| ID | Month | Value | Priority | Expected_Flag |
+----+-----------+-------+----------+---------------+
| 1 | 1/01/2016 | 96.01 | | Yes |
| 1 | 1/01/2016 | 96.01 | | Yes |
| 1 | 1/02/2016 | 65.2 | First | No |
| 1 | 1/02/2016 | 3.47 | Second | No |
| 1 | 1/02/2016 | 45.99 | | No |
| 11 | 1/01/2016 | 25 | | No |
| 11 | 1/02/2016 | 74.25 | Second | No |
| 11 | 1/02/2016 | 74.25 | Second | No |
| 11 | 1/02/2016 | 23.25 | | No |
| 24 | 1/01/2016 | 1.25 | First | No |
| 24 | 1/01/2016 | 1.365 | | No |
| 24 | 1/04/2016 | 1.365 | First | No |
| 24 | 1/04/2016 | 1.365 | | No |
| 24 | 1/05/2016 | 1.365 | First | No |
| 24 | 1/05/2016 | 1.365 | First | No |
| 24 | 1/06/2016 | 1.365 | Second | No |
| 24 | 1/06/2016 | 1.365 | Second | No |
| 24 | 1/07/2016 | 1.365 | Second | Yes |
| 24 | 1/07/2016 | 1.365 | | Yes |
| 24 | 1/08/2016 | 1.365 | First | No |
| 24 | 1/08/2016 | 1.365 | | No |
| 24 | 1/09/2016 | 1.365 | Second | No |
| 24 | 1/09/2016 | 1.365 | | No |
| 27 | 1/01/2016 | 0 | Second | Yes |
| 27 | 1/01/2016 | 0 | Second | Yes |
| 27 | 1/02/2016 | 45.25 | Second | No |
| 3 | 1/01/2016 | 96.01 | First | No |
| 3 | 1/01/2016 | 96.01 | First | No |
| 3 | 1/03/2016 | 96.01 | First | No |
| 3 | 1/03/2016 | 96.01 | First | No |
| 35 | 1/01/2016 | | | Yes |
| 35 | 1/01/2016 | | | Yes |
| 35 | 1/02/2016 | | First | No |
| 35 | 1/02/2016 | | Second | No |
| 35 | 1/02/2016 | | | No |
| 35 | 1/02/2016 | | | No |
| 35 | 1/03/2016 | | Second | Yes |
| 35 | 1/03/2016 | | Second | Yes |
| 35 | 1/04/2016 | | Second | No |
| 35 | 1/04/2016 | | Second | No |
+----+-----------+-------+----------+---------------+

Generate variables that move information between rows in hierarchical data with spss syntax

I was wondering if you can help me with the following problem in spss syntax.
My dataset has nested structure.
Data are nested in companies, and each company has 1 or 2 bosses, but in this case I care only about boss 1. At a previous point in time the boss graded the workers (not all of them). Currently, the ID and the grade of each worker are on that worker's own row.
I would like to move the information that was obtained during the workers' assessment and create new sets of variables (worker ID and grade) on the row of the boss.
+---------+------+----------+----------------+-----------+----------+---------+----------+
| company | boss | workerID | worker's grade | N:workID1 | N:grade1 | N:work2 | N:grade2 |
+---------+------+----------+----------------+-----------+----------+---------+----------+
| A | 1 | 1 | | 3 | A | 4 | A |
| A | 2 | 2 | | | | | |
| A | 0 | 3 | A | | | | |
| A | 0 | 4 | A | | | | |
| A | 0 | 5 | | | | | |
| B | 1 | 1 | | 3 | B | 4 | A |
| B | 0 | 2 | | | | | |
| B | 0 | 3 | B | | | | |
| B | 0 | 4 | A | | | | |
| C | 1 | 1 | | 2 | D | -1 | -1 |
| C | 0 | 2 | D | | | | |
+---------+------+----------+----------------+-----------+----------+---------+----------+
I would like to move each worker's ID and grade to the row of the boss in the NEW variables, without losing the existing workerID and worker's grade variables.
Basically, I need to feed the information forward into the new variables on the row where boss EQ 1, separately for each company.
I have no idea how to proceed with this. I assume that I need a loop that creates a new variable for each worker ID that has a valid grade and then feeds the information forward from the worker's row to the boss's newly generated variables.
Any suggestions are very welcome :-)
Take a look at VARSTOCASES (Data > Restructure)

Database design for download presets

I'm a newbie with databases and would like some advice, please.
I have agencies who can download photos.
By default, each agency can download "medium" & "large" photos.
Now I would like them to be able to create extra custom presets and manage them from their account page.
I looked at how some blog software handles categories in its database and wrapped my head around this example. Is this the right approach?
Cheers
agency 1 has preset "medium" & "large"
agency 2 has preset "medium", "large" & "Bill custom"
-----------
| presets |
-----------------------------------------------
| preset_id | preset_name | preset_dimensions |
-----------------------------------------------
| 1 | medium | 800x600 |
| 2 | large | 3000x2000 |
| 3 | Bill custom | 640x420 |
-----------------------------------------------
----------------
| preset_assoc |
------------------------------------------------------------
| presassoc_id | presassoc_preset_id | presassoc_agency_id |
------------------------------------------------------------
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 1 | 2 |
| 4 | 2 | 2 |
| 5 | 3 | 2 |
------------------------------------------------------------
------------
| agencies |
---------------------------
| agency_id | agency_name |
---------------------------
| 1 | Joe ltd |
| 2 | Bill inc |
---------------------------
The approach is right. Because you have a many-to-many relationship (one agency can have multiple presets, and the same preset can be used by multiple agencies), you need a joining table. The only questionable thing is presassoc_id: preset_assoc doesn't need it, because the other two columns can serve as a composite primary key.
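A minimal sketch of the composite-key variant described above, using Python's built-in sqlite3 module. The table and column names come from the question; the column types, the use of SQLite, and the helper name presets_for are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE agencies (
    agency_id   INTEGER PRIMARY KEY,
    agency_name TEXT NOT NULL
);
CREATE TABLE presets (
    preset_id         INTEGER PRIMARY KEY,
    preset_name       TEXT NOT NULL,
    preset_dimensions TEXT NOT NULL
);
-- No surrogate presassoc_id: the pair of foreign keys is the primary key.
CREATE TABLE preset_assoc (
    presassoc_preset_id INTEGER NOT NULL REFERENCES presets(preset_id),
    presassoc_agency_id INTEGER NOT NULL REFERENCES agencies(agency_id),
    PRIMARY KEY (presassoc_preset_id, presassoc_agency_id)
);

INSERT INTO agencies VALUES (1, 'Joe ltd'), (2, 'Bill inc');
INSERT INTO presets VALUES
    (1, 'medium', '800x600'),
    (2, 'large', '3000x2000'),
    (3, 'Bill custom', '640x420');
INSERT INTO preset_assoc VALUES (1, 1), (2, 1), (1, 2), (2, 2), (3, 2);
""")

def presets_for(agency_name):
    """Hypothetical helper: preset names available to one agency."""
    rows = conn.execute(
        """
        SELECT p.preset_name
        FROM presets p
        JOIN preset_assoc pa ON pa.presassoc_preset_id = p.preset_id
        JOIN agencies a ON a.agency_id = pa.presassoc_agency_id
        WHERE a.agency_name = ?
        ORDER BY p.preset_id
        """,
        (agency_name,),
    ).fetchall()
    return [name for (name,) in rows]

print(presets_for("Joe ltd"))   # ['medium', 'large']
print(presets_for("Bill inc"))  # ['medium', 'large', 'Bill custom']

# The composite key also rejects assigning the same preset to an agency twice.
try:
    conn.execute("INSERT INTO preset_assoc VALUES (1, 1)")
except sqlite3.IntegrityError as exc:
    print("duplicate rejected:", exc)
```

A side benefit of the composite key is that duplicate agency/preset pairs cannot be stored, which a separate presassoc_id column would not prevent on its own.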
