How to normalize three identical tables into one table?

I have three tables:
cat1:
id(PK)
name
description
cat2:
id(PK)
name
description
cat1_id1(FK)
cat3:
id(PK)
name
description
cat2_id(FK)
cat1 has one-to-many cat2, and cat2 has one-to-many cat3.
How do I normalize the three tables into one table?
For example this design:
CREATE TABLE IF NOT EXISTS public."animalGroups_animalgroup"
(
id bigint NOT NULL GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
name text,
description text COLLATE pg_catalog."default" NOT NULL,
images character varying(100) COLLATE pg_catalog."default" NOT NULL
);
CREATE TABLE IF NOT EXISTS public.category_category
(
id bigint NOT NULL GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
name character varying(100) COLLATE pg_catalog."default" NOT NULL,
description text COLLATE pg_catalog."default" NOT NULL,
images character varying(100) COLLATE pg_catalog."default" NOT NULL,
animalgroup_id bigint REFERENCES public."animalGroups_animalgroup" (id)
);
CREATE TABLE IF NOT EXISTS public.subcategory_subcategory
(
id bigint NOT NULL GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
name character varying(100) COLLATE pg_catalog."default" NOT NULL,
description text COLLATE pg_catalog."default" NOT NULL,
images character varying(100) COLLATE pg_catalog."default" NOT NULL,
category_id bigint REFERENCES public.category_category (id)
);
CREATE TABLE IF NOT EXISTS public.animal_animal
(
id bigint NOT NULL GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
name character varying(100) COLLATE pg_catalog."default" NOT NULL,
description text COLLATE pg_catalog."default" NOT NULL,
images character varying(100) COLLATE pg_catalog."default" NOT NULL,
subcategory_id bigint REFERENCES public.subcategory_subcategory (id)
);
An animal group can have one or more categories.
A category can have one or more subcategories.
A subcategory can have one or more animals.
How it would look:
The animal group is mammals.
mammals (a group can have several categories) has, for example, the categories cats and dogs.
The cats category (a category can have several subcategories) has, for example, the subcategories little cats and big cats.
little cats (a subcategory can have several animals) contains an actual cat such as ragdoll.
Is this design correct?
All four tables share the same four fields. If I want to add one more field, for example age, I have to add it to all four tables.

OK, you changed your DB design, so the join would look like this:
SELECT * -- should specify columns here
FROM cat1
LEFT JOIN cat2 on cat1.id = cat2.cat1_id1
LEFT JOIN cat3 on cat2.id = cat3.cat2_id
The difference in naming (cat1_id1 vs cat2_id) is strange -- I think that 1 might be a typo.
original answer below
I'm guessing your tables actually look like this
cat1:
id
cat2id
name
description
cat2:
id
cat3id
name
description
cat3
id
name
description
Where the 1 to many relationship is represented by the id they are related to in the columns I added.
In that case you can join them like this
SELECT * -- should have column list here
FROM cat1
LEFT JOIN cat2 on cat1.cat2id = cat2.id
LEFT JOIN cat3 on cat2.cat3id = cat3.id
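The joins above read the three tables together but leave the schema unchanged. Since the tables share the same columns, one common way to merge them is a single self-referencing table (an adjacency list), where each row points at its parent. A minimal sketch, assuming SQLite through Python's built-in sqlite3 module as a stand-in for PostgreSQL; the table name category and the column parent_id are hypothetical, and the rows come from the mammals example in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    description TEXT NOT NULL DEFAULT '',
    parent_id INTEGER REFERENCES category (id)  -- NULL marks a top-level row
);
INSERT INTO category (id, name, parent_id) VALUES
    (1, 'mammals', NULL),
    (2, 'cats', 1),
    (3, 'dogs', 1),
    (4, 'little cats', 2),
    (5, 'big cats', 2),
    (6, 'ragdoll', 4);
""")


def path_to_root(conn, category_id):
    """Return the names from the given row up to its top-level ancestor."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, parent_id, depth) AS (
            SELECT id, name, parent_id, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, c.parent_id, chain.depth + 1
            FROM category AS c
            JOIN chain ON c.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth
        """,
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(path_to_root(conn, 6))
```

With this layout a new field such as age is added once, and the depth of the hierarchy is no longer fixed by the number of tables. The trade-off is that walking the hierarchy needs a recursive query instead of a fixed chain of joins.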

Related

Is there a way to display top 10 average ratings of a business type?

I have the following tables in my database scheme
CREATE TABLE public.organization_rating
(
rating integer NOT NULL,
user_id integer NOT NULL,
organization_id integer NOT NULL,
CONSTRAINT organization_rating_pkey PRIMARY KEY (user_id, organization_id),
CONSTRAINT user_id FOREIGN KEY (user_id)
REFERENCES public.users (user_id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE,
CONSTRAINT stars CHECK (rating >= 1 AND rating < 5)
)
And
CREATE TABLE public.organization
(
org_id integer NOT NULL DEFAULT nextval('organization_org_id_seq'::regclass),
name character varying(90) COLLATE pg_catalog."default" NOT NULL,
description character varying(90) COLLATE pg_catalog."default" NOT NULL,
email text COLLATE pg_catalog."default" NOT NULL,
phone_number character varying COLLATE pg_catalog."default" NOT NULL,
bt_id integer NOT NULL,
bs_id integer NOT NULL,
CONSTRAINT organization_pkey PRIMARY KEY (org_id),
CONSTRAINT bs_id FOREIGN KEY (bs_id)
REFERENCES public.business_step (bs_id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
NOT VALID,
CONSTRAINT bt_id FOREIGN KEY (bt_id)
REFERENCES public.business_type (bt_id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE CASCADE
NOT VALID
)
I would like to implement a query that gives me the following:
Top 10 organization ratings per business type
Top 10 organizations per business stage
Top 3 organizations with worst rating
Since the queries are similar (only the sort order, DESC or ASC, changes with the requirement), I just need one query to work and I will have the other two. I tried implementing this query:
Here is my select statement:
SELECT O.org_id, O.bt_id, R.rating
FROM public.organization as O
INNER JOIN public.organization_rating as R ON O.org_id = R.organization_id
WHERE bt_id=1
GROUP by org_id, bt_id, rating
ORDER BY ROUND(AVG(rating)) DESC LIMIT 10
But in the output, various organizations are duplicated.
[Screenshots not included: the query output, and the real average ratings of the duplicated organizations]
Why are the organization ids being duplicated?
Thanks in advance.
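You see duplicated records because the ratings in organization_rating are per user, so several users can rate the same organization, and your query groups by rating as well as by organization: it returns one row per distinct rating value of each organization instead of one row per organization. Compute the average rating per organization first, then join with the organization table.
You can do something like this for bt_id = 1: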
with average_rating as (
select organization_id as org_id, avg(rating) as avg_rating
from organization_rating
group by organization_id
)
select o.org_id, o.bt_id, r.avg_rating
from average_rating r
join organization o on o.org_id = r.org_id
where o.bt_id = 1
order by r.avg_rating desc limit 10;
If you want to get all data in a single query, you could use a window function:
with average_rating as (
select organization_id as org_id, avg(rating) as avg_rating
from organization_rating
group by organization_id
),
ordered_data as (
select o.org_id, o.bt_id, r.avg_rating,
row_number() over (partition by o.bt_id order by r.avg_rating desc) as rn
from average_rating r
join organization o on o.org_id = r.org_id
)
select org_id, bt_id, avg_rating
from ordered_data
where rn <= 10
order by bt_id, avg_rating desc;
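The effect of the extra grouping column can be reproduced on a tiny data set. This is only a sketch: it uses SQLite through Python's built-in sqlite3 module as a stand-in for PostgreSQL, the tables are trimmed to the columns the query touches, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organization (org_id INTEGER PRIMARY KEY, bt_id INTEGER NOT NULL);
CREATE TABLE organization_rating (
    rating INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    organization_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, organization_id)
);
INSERT INTO organization VALUES (1, 1), (2, 1);
-- Organization 1 is rated by two users, organization 2 by one.
INSERT INTO organization_rating (rating, user_id, organization_id) VALUES
    (4, 10, 1), (2, 11, 1), (5, 12, 2);
""")

# Grouping by rating too: one row per distinct rating of each organization.
per_rating = conn.execute("""
    SELECT o.org_id, r.rating
    FROM organization AS o
    JOIN organization_rating AS r ON o.org_id = r.organization_id
    WHERE o.bt_id = 1
    GROUP BY o.org_id, o.bt_id, r.rating
    ORDER BY o.org_id, r.rating
""").fetchall()

# Grouping by organization only: one row per organization with its average.
per_org = conn.execute("""
    SELECT o.org_id, AVG(r.rating) AS avg_rating
    FROM organization AS o
    JOIN organization_rating AS r ON o.org_id = r.organization_id
    WHERE o.bt_id = 1
    GROUP BY o.org_id
    ORDER BY avg_rating DESC
    LIMIT 10
""").fetchall()

print(per_rating)  # organization 1 appears twice
print(per_org)     # each organization appears once
```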
Thanks to mihai_f87, I was able to construct this query:
with average_rating as (
SELECT organization_id as Organization_ID, ROUND(AVG(rating)) as Rating
FROM organization_rating
GROUP BY organization_id
),
ordered_data as (
select org_id, bt_id, rating, row_number() over (partition by bt_id order by rating desc) rank
from average_rating r
join organization o on o.org_id = r.organization_id
where bt_id = 1
order by bt_id, rating desc
)
select org_id, bt_id, rating
from ordered_data
where rank <= 10
With this query, I was able to retrieve the top 10 organizations per business type.

SQL to get teams with no power forward

I store data about basketball teams in my Teams table that looks like this:
CREATE TABLE Teams (
Id varchar(5) NOT NULL PRIMARY KEY,
TeamName varchar(50) NOT NULL
);
And I keep team members in TeamMembers table that looks like this:
CREATE TABLE TeamMembers (
Id int NOT NULL PRIMARY KEY,
TeamId VARCHAR(5) FOREIGN KEY REFERENCES Teams(Id),
LastName varchar(50) NOT NULL,
FirstName varchar(50) NOT NULL,
PositionId int NOT NULL
);
Positions are in another table with INT ID's. For example, Guard: 1, Center: 2 and Power Forward: 3 in this exercise.
I want to get a list of basketball teams with NO power forward.
Something like:
select *
from Teams
where Id not in
(
select TeamId
from TeamMembers
where PositionID = 3 -- 3: Power Forward
)
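When checking that a row doesn't exist, use NOT EXISTS. Unlike NOT IN, it still returns the expected rows when the subquery yields a NULL, which can happen here because TeamMembers.TeamId is not declared NOT NULL.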
SELECT
T.*
FROM
Teams AS T
WHERE
NOT EXISTS (
SELECT
'team has no power forward member'
FROM
TeamMembers AS M
WHERE
M.TeamID = T.ID AND
M.PositionID = 3) -- 3: Power Forward
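The difference between the two forms shows up as soon as the subquery returns a NULL. The sketch below assumes SQLite through Python's built-in sqlite3 module as a stand-in for SQL Server; the team names and rows are invented, and the member with no team is a hypothetical case that the nullable TeamId column permits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Teams (Id TEXT PRIMARY KEY, TeamName TEXT NOT NULL);
CREATE TABLE TeamMembers (
    Id INTEGER PRIMARY KEY,
    TeamId TEXT REFERENCES Teams (Id),  -- nullable, as in the question
    PositionId INTEGER NOT NULL
);
INSERT INTO Teams VALUES ('A', 'Hawks'), ('B', 'Owls');
INSERT INTO TeamMembers VALUES
    (1, 'A', 3),    -- team A has a power forward
    (2, 'B', 1),    -- team B has only a guard
    (3, NULL, 3);   -- a power forward not assigned to any team
""")

# NOT IN: the NULL in the subquery result makes the test unknown for team B.
not_in = conn.execute("""
    SELECT Id FROM Teams
    WHERE Id NOT IN (SELECT TeamId FROM TeamMembers WHERE PositionId = 3)
""").fetchall()

# NOT EXISTS: only checks whether a matching member row is present.
not_exists = conn.execute("""
    SELECT T.Id FROM Teams AS T
    WHERE NOT EXISTS (
        SELECT 1 FROM TeamMembers AS M
        WHERE M.TeamId = T.Id AND M.PositionId = 3)
""").fetchall()

print(not_in)      # no teams at all
print(not_exists)  # team B
```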

aggregate / count rows grouped by geography (or geometry)

I have a table such as:
Create Table SalesTable
( StuffID int identity not null,
County geography not null,
SaleAmount decimal(12,8) not null,
SaleTime datetime not null )
It has a recording of every sale with amount, time, and a geography of the county that the sale happened in.
I want to run a query like this:
Select sum(SaleAmount), County from SalesTable group by County
But if I try to do that, I get:
The type "geography" is not comparable. It cannot be used in the GROUP BY clause.
But I'd like to know how many sales happened per county. Annoyingly, if I had the counties abbreviated (SDC,LAC,SIC, etc) then I could group them because it would simply be a varchar. But then I use the geography datatype for other reasons.
The geography type has a method, STAsText(), that returns the value as text (WKT), and text can be used in GROUP BY.
Try this:
Select sum(SaleAmount), County.STAsText() from SalesTable
group by County.STAsText()
I would propose a slightly different structure:
create table dbo.County (
CountyID int identity not null
constraint [PK_County] primary key clustered (CountyID),
Name varchar(200) not null,
Abbreviation varchar(10) not null,
geo geography not null
);
Create Table SalesTable
(
StuffID int identity not null,
CountyID int not null
constraint FK_Sales_County foreign key (CountyID)
references dbo.County (CountyID),
SaleAmount decimal(12,8) not null,
SaleTime datetime not null
);
From there, your aggregate looks something like:
Select c.Abbreviation, sum(SaleAmount)
from SalesTable as s
join dbo.County as c
on s.CountyID = c.CountyID
group by c.Abbreviation;
If you really need the geography column in the aggregate, you're a sub-query or a common table expression away:
with totals as (
Select c.CountyID, c.Abbreviation,
sum(s.SaleAmount) as [TotalSalesAmount]
from SalesTable as s
join dbo.County as c
on s.CountyID = c.CountyID
group by c.CountyID, c.Abbreviation
)
select t.Abbreviation, c.geo, t.TotalSalesAmount
from totals as t
join dbo.County as c
on t.CountyID = c.CountyID;
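The "aggregate by key, then join back for the geography" pattern can be tried without a spatial type. In this sketch SQLite (through Python's built-in sqlite3 module) stands in for SQL Server, the geo column holds plain WKT text as a placeholder for geography, and the counties and amounts are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE County (
    CountyID INTEGER PRIMARY KEY,
    Abbreviation TEXT NOT NULL,
    geo TEXT NOT NULL  -- WKT text standing in for the geography column
);
CREATE TABLE SalesTable (
    StuffID INTEGER PRIMARY KEY,
    CountyID INTEGER NOT NULL REFERENCES County (CountyID),
    SaleAmount NUMERIC NOT NULL
);
INSERT INTO County VALUES (1, 'SDC', 'POINT(0 0)'), (2, 'LAC', 'POINT(1 1)');
INSERT INTO SalesTable (CountyID, SaleAmount) VALUES (1, 10), (1, 5), (2, 7);
""")

# Aggregate on the comparable key first, then join to pick up the geo column.
rows = conn.execute("""
    WITH totals AS (
        SELECT CountyID,
               SUM(SaleAmount) AS TotalSalesAmount,
               COUNT(*) AS SaleCount
        FROM SalesTable
        GROUP BY CountyID
    )
    SELECT c.Abbreviation, c.geo, t.TotalSalesAmount, t.SaleCount
    FROM totals AS t
    JOIN County AS c ON c.CountyID = t.CountyID
    ORDER BY c.Abbreviation
""").fetchall()

for row in rows:
    print(row)
```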

Select 3 tables with multiple "on clause" affecting 9 lines with only one register?

I have three tables: Register, Brand and Clothing. Register stores int ids chosen from a Brand dropdown list and a Clothing dropdown list, but in the select I don't want the int values; I want the names they represent in the respective tables (IdBrand 1 = Nike, IdBrand 2 = Adidas). It is "working", but I don't know if I am doing something wrong: when I run the code in a SQL Server "New Query" window I get 9 rows even though there is just one record in the Register table. It is more of a logic question: is it normal to get so many rows in the result?
The Code
select Register.*, Clothing.ClothingName, Brand.BrandName
from Register
inner join Clothing
on RegisterClothingId1 = ClothingId
or RegisterClothingId2 = ClothingId
or RegisterClothingId3 = ClothingId
inner join Brand
on RegisterBrandId1 = BrandId
or RegisterBrandId2 = BrandId
or RegisterBrandId3 = BrandId
I also tried "and" instead of "or", but that returns zero rows.
Again, this code "is working". I just do not know if it is normal to get so many rows from just one record. If 1 record gives 9 rows, I wonder whether 100 records will give 900 rows, for example.
Thank you.
The table has only 3 Brand columns and 3 Clothing columns, which reference the other 2 tables.
Register Table
CREATE TABLE [dbo].[Register] (
[RegisterId] INT IDENTITY (1, 1) NOT NULL,
[RegisterPersonId] INT NOT NULL,
[RegisterPersonNote] NCHAR (60) NULL,
[RegisterCareerId] INT NOT NULL,
[RegisterEvent] NCHAR (60) NOT NULL,
[RegisterEventYear] INT NOT NULL,
[RegisterImgDressed] VARBINARY (MAX) NOT NULL,
[RegisterClothingId1] INT NOT NULL,
[RegisterBrandId1] INT NOT NULL,
[RegisterClothingName1] NCHAR (60) NOT NULL,
[RegisterClothingImg1] VARBINARY (MAX) NOT NULL,
[RegisterClothingId2] INT NOT NULL,
[RegisterBrandId2] INT NOT NULL,
[RegisterClothingName2] NCHAR (60) NOT NULL,
[RegisterClothingImg2] VARBINARY (MAX) NOT NULL,
[RegisterClothingId3] INT NOT NULL,
[RegisterBrandId3] INT NOT NULL,
[RegisterClothingName3] NCHAR (60) NOT NULL,
[RegisterClothingImg3] VARBINARY (MAX) NOT NULL,
[RegisterYoutube] NCHAR (500) NOT NULL,
[RegisterExternalLink] NCHAR (500) NULL,
[RegisterNote] NCHAR (60) NULL,
[RegisterNote2] NCHAR (60) NULL,
CONSTRAINT [PK_Register] PRIMARY KEY CLUSTERED ([RegisterId] ASC),
CONSTRAINT [FK_Register_Brand1] FOREIGN KEY ([RegisterBrandId1]) REFERENCES [dbo].[Brand] ([BrandId]),
CONSTRAINT [FK_Register_Brand2] FOREIGN KEY ([RegisterBrandId2]) REFERENCES [dbo].[Brand] ([BrandId]),
CONSTRAINT [FK_Register_Brand3] FOREIGN KEY ([RegisterBrandId3]) REFERENCES [dbo].[Brand] ([BrandId]),
CONSTRAINT [FK_Register_Clothing1] FOREIGN KEY ([RegisterClothingId1]) REFERENCES [dbo].[Clothing] ([ClothingId]),
CONSTRAINT [FK_Register_Clothing2] FOREIGN KEY ([RegisterClothingId2]) REFERENCES [dbo].[Clothing] ([ClothingId]),
CONSTRAINT [FK_Register_Clothing3] FOREIGN KEY ([RegisterClothingId3]) REFERENCES [dbo].[Clothing] ([ClothingId])
);
The other two are simple tables: Brand has only BrandId and BrandName, and Clothing has only ClothingId and ClothingName.
I think your Register table needs a redesign. Columns named [RegisterClothingNameX] and [RegisterClothingImgX] should be on the Clothing table unless they do not describe a particular clothing item. The query below should give you the data you require:
select
Register.*,
c1.ClothingName AS Clothing1Name,
c2.ClothingName AS Clothing2Name,
c3.ClothingName AS Clothing3Name,
b1.BrandName AS Brand1Name,
b2.BrandName AS Brand2Name,
b3.BrandName AS Brand3Name
from
Register
INNER JOIN Clothing c1 ON RegisterClothingId1 = c1.ClothingId
INNER JOIN Clothing c2 ON RegisterClothingId2 = c2.ClothingId
INNER JOIN Clothing c3 ON RegisterClothingId3 = c3.ClothingId
INNER JOIN Brand b1 ON RegisterBrandId1 = b1.BrandId
INNER JOIN Brand b2 ON RegisterBrandId2 = b2.BrandId
INNER JOIN Brand b3 ON RegisterBrandId3 = b3.BrandId
Your first join produces 3 rows (one per matching clothing item), and each of those 3 rows carries 3 brand ids, so the second join produces 3 x 3 = 9 rows. The count can change depending on the values stored in the RegisterClothingId and RegisterBrandId columns, for example when two of them hold the same id.
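The row counts can be checked with a small sketch. It uses SQLite through Python's built-in sqlite3 module as a stand-in for SQL Server and keeps only the id columns of Register; the clothing names and the third brand are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Brand (BrandId INTEGER PRIMARY KEY, BrandName TEXT NOT NULL);
CREATE TABLE Clothing (ClothingId INTEGER PRIMARY KEY, ClothingName TEXT NOT NULL);
CREATE TABLE Register (
    RegisterId INTEGER PRIMARY KEY,
    RegisterClothingId1 INTEGER NOT NULL,
    RegisterClothingId2 INTEGER NOT NULL,
    RegisterClothingId3 INTEGER NOT NULL,
    RegisterBrandId1 INTEGER NOT NULL,
    RegisterBrandId2 INTEGER NOT NULL,
    RegisterBrandId3 INTEGER NOT NULL
);
INSERT INTO Brand VALUES (1, 'Nike'), (2, 'Adidas'), (3, 'Puma');
INSERT INTO Clothing VALUES (1, 'Shirt'), (2, 'Shorts'), (3, 'Shoes');
INSERT INTO Register VALUES (1, 1, 2, 3, 1, 2, 3);  -- a single Register row
""")

# OR-joins: every matching clothing row is paired with every matching brand row.
or_rows = conn.execute("""
    SELECT r.RegisterId, c.ClothingName, b.BrandName
    FROM Register AS r
    JOIN Clothing AS c
      ON r.RegisterClothingId1 = c.ClothingId
      OR r.RegisterClothingId2 = c.ClothingId
      OR r.RegisterClothingId3 = c.ClothingId
    JOIN Brand AS b
      ON r.RegisterBrandId1 = b.BrandId
      OR r.RegisterBrandId2 = b.BrandId
      OR r.RegisterBrandId3 = b.BrandId
""").fetchall()

# One aliased join per id column: a single row with six name columns.
one_row = conn.execute("""
    SELECT r.RegisterId,
           c1.ClothingName, c2.ClothingName, c3.ClothingName,
           b1.BrandName, b2.BrandName, b3.BrandName
    FROM Register AS r
    JOIN Clothing AS c1 ON r.RegisterClothingId1 = c1.ClothingId
    JOIN Clothing AS c2 ON r.RegisterClothingId2 = c2.ClothingId
    JOIN Clothing AS c3 ON r.RegisterClothingId3 = c3.ClothingId
    JOIN Brand AS b1 ON r.RegisterBrandId1 = b1.BrandId
    JOIN Brand AS b2 ON r.RegisterBrandId2 = b2.BrandId
    JOIN Brand AS b3 ON r.RegisterBrandId3 = b3.BrandId
""").fetchall()

print(len(or_rows))  # 3 clothing matches x 3 brand matches
print(one_row)
```

So with the OR-joins, 100 Register rows that each hold three distinct clothing ids and three distinct brand ids would indeed produce 900 result rows, while the aliased joins return one row per Register row.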

How to implement tag following in database?

I'm making a micro-blogging website where users can follow tags. As on Twitter, users can follow other users; in my project they can follow tags as well. What should the database design be to implement tag following? User following is easy.
One way is to have like 5 tag ID columns in the table containing posts:
Table: Posts
Columns: PostID, AuthorID, TimeStamp, Content, Tag1,Tag2...Tag5
I will make two comma separated lists: One is for the users the given user is following and the other for the tags the given user is following: $UserFollowingList and $TagFollowingList
and the select query can be like:
select ... from `Posts` where ($Condition1) or ($Condition2) order by `PostID` desc ...
$Condition1 = "`AuthorID` in $UserFollowingList"
$Condition2 = " ( `Tag1` in $TagFollowingList ) or ( `Tag2` in $TagFollowingList ) ... or ( `Tag5` in $TagFollowingList )"
Can you suggest a better way? And what if I don't want to limit posts to 5 tags? I have an idea, but I want to know what experienced developers like you would do.
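You can use one table for who is following whom, like:
CREATE TABLE `followers` (
`targetID` INT(10) UNSIGNED NOT NULL,
`targetType` ENUM('user','tag') NOT NULL,
`userID` INT(10) UNSIGNED NOT NULL,
-- one row per (follower, target); a key on the target alone would allow only one follower per target
PRIMARY KEY (`userID`, `targetType`, `targetID`),
INDEX `target` (`targetType`, `targetID`)
)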
and another one for the tags in each post like
CREATE TABLE `postTags` (
`postID` INT(10) UNSIGNED NOT NULL,
`tag` INT(10) NOT NULL,
PRIMARY KEY (`postID`, `tag`)
)
edit: sorry didn't think about it the 1-st time.
To avoid using a string as targetID there must be a tags table too
CREATE TABLE `tags` (
`tagID` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`tag` VARCHAR(255) NOT NULL DEFAULT '',
PRIMARY KEY (`tagID`)
)
This will give you the posts from the tags and users that $currentUserID follows:
SELECT
p.*
FROM Posts p
JOIN postTags pt ON pt.postID = p.postID
JOIN followers f ON f.targetType = 'tag' AND f.targetID = pt.tag
WHERE
f.userID = $currentUserID
UNION
SELECT
p.*
FROM Posts p
JOIN followers f ON f.targetID = p.authorID AND f.targetType = 'user'
WHERE
f.userID = $currentUserID
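The whole design can be exercised end to end on a few rows. This is a sketch under assumptions: SQLite through Python's built-in sqlite3 module stands in for MySQL (so ENUM becomes a CHECK constraint), the followers key covers follower and target together so that a tag can have many followers, and all ids and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Posts (postID INTEGER PRIMARY KEY, authorID INTEGER NOT NULL);
CREATE TABLE tags (tagID INTEGER PRIMARY KEY, tag TEXT NOT NULL UNIQUE);
CREATE TABLE postTags (
    postID INTEGER NOT NULL,
    tag INTEGER NOT NULL,  -- holds a tags.tagID value
    PRIMARY KEY (postID, tag)
);
CREATE TABLE followers (
    userID INTEGER NOT NULL,
    targetType TEXT NOT NULL CHECK (targetType IN ('user', 'tag')),
    targetID INTEGER NOT NULL,
    PRIMARY KEY (userID, targetType, targetID)
);
INSERT INTO Posts VALUES (1, 20), (2, 21), (3, 21), (4, 22);
INSERT INTO tags VALUES (1, 'sql'), (2, 'python');
INSERT INTO postTags VALUES (1, 1), (3, 1), (4, 2);
-- User 7 follows tag 1 and user 21; user 8 follows tag 1 as well.
INSERT INTO followers VALUES (7, 'tag', 1), (7, 'user', 21), (8, 'tag', 1);
""")


def feed(conn, user_id):
    """Return post ids from followed tags and followed users, newest first."""
    rows = conn.execute(
        """
        SELECT p.postID AS postID
        FROM Posts AS p
        JOIN postTags AS pt ON pt.postID = p.postID
        JOIN followers AS f ON f.targetType = 'tag' AND f.targetID = pt.tag
        WHERE f.userID = :uid
        UNION
        SELECT p.postID AS postID
        FROM Posts AS p
        JOIN followers AS f ON f.targetType = 'user' AND f.targetID = p.authorID
        WHERE f.userID = :uid
        ORDER BY postID DESC
        """,
        {"uid": user_id},
    ).fetchall()
    return [post_id for (post_id,) in rows]


print(feed(conn, 7))
print(feed(conn, 8))
```

Post 3 matches both a followed tag and a followed author, and UNION keeps it only once. Because tags live in postTags rather than in Tag1..Tag5 columns, a post can carry any number of tags.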
