Cost of selection - database

CREATE TABLE `country` (
`Code` CHAR(3) NOT NULL DEFAULT '',
`Name` CHAR(52) NOT NULL DEFAULT '',
`Continent` enum('Asia','Europe','North America',
'Africa','Oceania','Antarctica','South America') NOT NULL
DEFAULT 'Asia',
`Region` CHAR(26) NOT NULL DEFAULT '',
`SurfaceArea` DECIMAL NOT NULL DEFAULT '0.00',
`IndepYear` SMALLINT DEFAULT NULL,
`Population` INT NOT NULL DEFAULT '0',
`LifeExpectancy` DECIMAL DEFAULT NULL,
`GNP` DECIMAL DEFAULT NULL,
`GNPOld` DECIMAL DEFAULT NULL,
`LocalName` CHAR(45) NOT NULL DEFAULT '',
`GovernmentForm` CHAR(45) NOT NULL DEFAULT '',
`HeadOfState` CHAR(60) DEFAULT NULL,
`Capital` INT DEFAULT NULL,
`Code2` CHAR(2) NOT NULL DEFAULT '',
PRIMARY KEY (`Code`));
CREATE TABLE `city` (
`ID` INT NOT NULL AUTO_INCREMENT,
`Name` CHAR(35) NOT NULL DEFAULT '',
`CountryCode` CHAR(3) NOT NULL DEFAULT '',
`District` CHAR(20) NOT NULL DEFAULT '',
`Population` INT NOT NULL DEFAULT '0',
PRIMARY KEY (`ID`),
KEY `CountryCode` (`CountryCode`),
FOREIGN KEY (`CountryCode`) REFERENCES `country` (`Code`)
);
CREATE TABLE `countrylanguage` (
`CountryCode` CHAR(3) NOT NULL DEFAULT '',
`Language` CHAR(30) NOT NULL DEFAULT '',
`IsOfficial` enum('T','F') NOT NULL DEFAULT 'F',
`Percentage` DECIMAL NOT NULL DEFAULT '0.0',
PRIMARY KEY (`CountryCode`,`Language`),
FOREIGN KEY (`CountryCode`) REFERENCES `country` (`Code`)
);
Assume that table city has 10,000 tuples, country has 800 and countrylanguage has 2,500 tuples.
Assume you have 20 pages of memory buffer. Pages are 1K (1024 bytes).
Consider the query:
SELECT c.Name, sum(c.Population), avg(co.GNP)
FROM city c, country co, countrylanguage cl
WHERE c.CountryCode = co.Code and cl.CountryCode = co.Code
and Language = 'English' and Continent = 'Asia' and
c.Population > 100000
GROUP BY c.Name;
Estimate the cost of the selection σ_{Population > 100,000}(city), assuming a selectivity factor of 4%, if
(a) the file for table city is a heap with no indices.
(b) the file for table city is sorted on Population.
(c) the file for table city has a clustering index on Population.
(d) the file for table city has a secondary index on Population, with height 6 (each pointer is 8 bytes).
Estimate the size (in tuples and in blocks/pages) of the join country ⋈ countrylanguage.
I have tried to find an example of how to complete this but I haven't found one that I understand at this time.
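One way to approach the estimates is to turn the usual textbook formulas into a few lines of arithmetic. The sketch below is not a definitive answer: it assumes INT takes 4 bytes, CHAR(n) takes n bytes, pages carry no header overhead, and a secondary-index leaf entry is a 4-byte key plus the 8-byte pointer. The height passed to `cost_clustering_index` is a hypothetical parameter, because the exercise gives a height only for case (d). Cost conventions differ between textbooks, so check them against your course notes.

```python
import math

PAGE_BYTES = 1024
CITY_TUPLES = 10_000
COUNTRYLANGUAGE_TUPLES = 2_500
SELECTIVITY = 0.04
POINTER_BYTES = 8

# Assumption: INT = 4 bytes, CHAR(n) = n bytes, no record or page headers.
# Columns: ID, Name, CountryCode, District, Population.
CITY_TUPLE_BYTES = 4 + 35 + 3 + 20 + 4

tuples_per_page = PAGE_BYTES // CITY_TUPLE_BYTES         # tuples that fit on a page
city_pages = math.ceil(CITY_TUPLES / tuples_per_page)    # pages in the city file
match_tuples = round(CITY_TUPLES * SELECTIVITY)          # tuples satisfying the predicate
match_pages = math.ceil(match_tuples / tuples_per_page)  # pages they fill if contiguous


def cost_heap():
    # (a) No order and no index: every page must be read.
    return city_pages


def cost_sorted():
    # (b) Binary search for the first match, then read the matching pages.
    return math.ceil(math.log2(city_pages)) + match_pages


def cost_clustering_index(height):
    # (c) Descend the index, then read the contiguous matching pages.
    # `height` is an assumed value; the exercise does not state it.
    return height + match_pages


def cost_secondary_index(height):
    # (d) Descend the index, scan the remaining leaf pages of the range,
    # then pay one page read per matching tuple (matches are scattered).
    entries_per_leaf = PAGE_BYTES // (4 + POINTER_BYTES)
    leaf_pages = math.ceil(match_tuples / entries_per_leaf)
    return height + (leaf_pages - 1) + match_tuples


# countrylanguage.CountryCode is a NOT NULL foreign key to country.Code, so
# each countrylanguage tuple joins with exactly one country tuple.
join_tuples = COUNTRYLANGUAGE_TUPLES


def join_pages(joined_tuple_bytes):
    # Page count of the join result for an assumed joined-tuple width.
    return math.ceil(join_tuples / (PAGE_BYTES // joined_tuple_bytes))


print(cost_heap(), cost_sorted(), cost_secondary_index(6), join_tuples)
```

Under these assumptions the script prints 667 37 410 2500: a full scan costs 667 page reads, the sorted file about 37, and the secondary index about 410, while the join has 2,500 tuples. The page count of the join depends on the byte widths you assume for the DECIMAL and enum columns, which is why `join_pages` takes the joined-tuple width as a parameter.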

Related

SQL Server Component Constraint or Constraint for more than 1 column

I want to create a table with operational details like:
CREATED_ON DATETIME NOT NULL DEFAULT GETDATE(),
CREATED_BY VARCHAR(10) NOT NULL,
DELETED_ON DATETIME NULL,
DELETED_BY VARCHAR(10) NULL
I want to add a constraint so that if DELETED_ON is set, DELETED_BY must be provided as well.
Either both columns are NULL or both are NOT NULL. One NULL and the other NOT NULL is not allowed.
This can be accomplished with a table level check constraint, assuming you want the constraint to apply to both inserts and updates:
CREATE TABLE dbo.Example(
CREATED_ON DATETIME NOT NULL DEFAULT GETDATE(),
CREATED_BY VARCHAR(10) NOT NULL,
DELETED_ON DATETIME NULL,
DELETED_BY VARCHAR(10) NULL
,CONSTRAINT IF_DELETED_ON CHECK ((DELETED_ON IS NULL AND DELETED_BY IS NULL) OR (DELETED_ON IS NOT NULL AND DELETED_BY IS NOT NULL))
);
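To see the constraint in action without a SQL Server instance, here is a minimal sketch that runs the same CHECK expression against an in-memory SQLite database through Python's `sqlite3` module. SQLite is a stand-in here, not the asker's engine; the column types, the constraint name `CK_DELETED_PAIR`, and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Example (
        CREATED_BY TEXT NOT NULL,
        DELETED_ON TEXT NULL,
        DELETED_BY TEXT NULL,
        -- Both columns NULL, or both NOT NULL; mixed states are rejected.
        CONSTRAINT CK_DELETED_PAIR CHECK (
            (DELETED_ON IS NULL AND DELETED_BY IS NULL)
            OR (DELETED_ON IS NOT NULL AND DELETED_BY IS NOT NULL)
        )
    )
""")


def try_insert(deleted_on, deleted_by):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO Example VALUES ('alice', ?, ?)",
            (deleted_on, deleted_by),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(None, None),          # neither set: allowed
    try_insert("2024-01-01", "bob"), # both set: allowed
    try_insert("2024-01-01", None),  # only DELETED_ON: rejected
    try_insert(None, "bob"),         # only DELETED_BY: rejected
]
print(results)
```

The same check fires on UPDATE as well as INSERT, which matches the behaviour described for the table-level constraint.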

How can I change the value of a primary key of a table to be a random number?

I have this table:
CREATE TABLE [dbo].[Word] (
[WordId] INT NOT NULL,
[Name] VARCHAR (20) NOT NULL,
[StatusId] INT DEFAULT ((1)) NULL,
[Syllables] VARCHAR (20) NULL,
[Ascii] AS (ascii([Name])) PERSISTED,
[CategoryId] INT DEFAULT ((1)) NOT NULL,
[GroupId] INT DEFAULT ((1)) NOT NULL,
[LessonId] INT DEFAULT ((1)) NOT NULL,
[CreatedBy] INT DEFAULT ((1)) NOT NULL,
[CreatedDate] DATETIME DEFAULT (getdate()) NOT NULL,
[ModifiedBy] INT DEFAULT ((1)) NOT NULL,
[ModifiedDate] DATETIME DEFAULT (getdate()) NOT NULL,
[Version] ROWVERSION NULL,
PRIMARY KEY CLUSTERED ([WordId] ASC),
CONSTRAINT [FK_WordLesson] FOREIGN KEY ([LessonId]) REFERENCES [dbo].[Lesson] ([LessonId]),
CONSTRAINT [FK_WordWordCategory] FOREIGN KEY ([CategoryId]) REFERENCES [dbo].[WordCategory] ([WordCategoryId]),
CONSTRAINT [FK_WordWordGroup] FOREIGN KEY ([GroupId]) REFERENCES [dbo].[WordGroup] ([WordGroupId])
);
GO
CREATE NONCLUSTERED INDEX [Word_Category_IX]
ON [dbo].[Word]([CategoryId] ASC);
GO
CREATE NONCLUSTERED INDEX [Word_Group_IX]
ON [dbo].[Word]([GroupId] ASC);
GO
CREATE NONCLUSTERED INDEX [Word_Lesson_IX]
ON [dbo].[Word]([LessonId] ASC);
How can I change the value of WordId to be a random number that is between 1 and the maximum value of the INT column?
Note that I understand there's a possibility of the random number being used twice but it's test data so I am not too concerned about that.
I'm going to suggest a different approach from the route you are currently going down.
Just use an identity column. It's built in, and because this field is the primary key you need to ensure you don't get duplicates. The code would look like this:
CREATE TABLE [dbo].[Word] (
[WordId] INT IDENTITY(1,1) NOT NULL,
It will give you a value starting with 1 that increments by 1 each time a new row of data is entered.
Also, looking at your code, you've got the field StatusId as nullable but with a default value, are you sure that you don't want this as a NOT NULL field?
For information, you can use this calculation to get a random number no greater than a given int value:
DECLARE @RandomInt int; SET @RandomInt = 42
SELECT
@RandomInt Number
,ROUND(RAND()*@RandomInt,0) RandomLessThanInt
You'll get an answer like this:
Number RandomLessThanInt
42 15
It will obviously change every time it's run. You'd have to ensure that the number didn't already exist otherwise you will be attempting to violate the PK constraint and the insert will fail.
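If you already have the table populated with data, then you could do this:
UPDATE TableName
SET FieldName = ROUND(RAND()*FieldName,0)
Be aware that in SQL Server an unseeded RAND() is evaluated once per statement, so every row in this UPDATE is scaled by the same factor. A per-row expression such as ABS(CHECKSUM(NEWID())) is commonly used when each row needs its own value.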
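If the test keys are generated outside the database, duplicates can be avoided up front instead of waiting for the primary key to reject them. The sketch below is one possible approach, not the method from the answer above: it draws distinct values from 1 to the maximum of a 32-bit signed INT with Python's `random.sample`. The function name `random_unique_ids` and the seed are hypothetical.

```python
import random

INT_MAX = 2**31 - 1  # largest value of a 32-bit signed INT column


def random_unique_ids(row_count, seed=None):
    """Draw row_count distinct random integers in [1, INT_MAX]."""
    rng = random.Random(seed)
    # sample() picks without replacement, so no value is repeated.
    return rng.sample(range(1, INT_MAX + 1), row_count)


ids = random_unique_ids(5, seed=42)
print(ids)
```

Each generated value can then be assigned to one existing row, for example through a parameterised UPDATE keyed on the old WordId.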

Figuring out number of dimension table(s) in my case of data warehouse

I am a newbie to data warehousing so go easy on me please.
I am trying to figure out the number of dimensions in this case.
In my transaction database:
I have a table which stores Location Codes. Columns are location_code int not null primary key, short_description varchar(10) not null, long_description varchar(100) not null.
I have a table which stores Region Codes. Columns are region_code int not null primary key, short_description varchar(10) not null, long_description varchar(100) not null.
I have a table which associates Locations and Regions. Columns are assoc_id int not null primary key, location_code int not null, region_code int not null. Each Location belongs to exactly one Region.
In my data warehouse database user may want to lookup data by location or by region.
Now I am looking to create dimension table(s) in this case.
Wondering should I be creating 2 dimension tables (1 for Location and 1 for Region) this way?
Create 1 dimension table for Location which also has Region with these columns: location_code int not null primary key, location_short_description varchar(10) not null, location_long_description varchar(100) not null, region_code int not null, region_short_description varchar(10) not null, region_long_description varchar(100) not null
Create 1 dimension table for Region which also has Location with these columns: region_code int not null primary key, region_short_description varchar(10) not null, region_long_description varchar(100) not null, location_code int not null, location_short_description varchar(10) not null, location_long_description varchar(100) not null
OR should I be creating 4 dimension tables (1 for Location, 1 for Region, 1 for Location Region association, 1 for Region Location association) this way?
Create 1 dimension table for Location with these columns: location_code int not null primary key, short_description varchar(10) not null, long_description varchar(100) not null
Create 1 dimension table for Region with these columns: region_code int not null primary key, short_description varchar(10) not null, long_description varchar(100) not null
Create 1 dimension table for Location Region association with these columns: location_code int not null, region_code int not null
Create 1 dimension table for Region Location association with these columns: region_code int not null, location_code int not null
Or is there another way which makes more sense? If yes please do tell
In the Data Warehousing world, what type of relationship is this called and what is the standard way to handle it?
Thanks
I would model Location and Region in the same dimension (named according to the business usage, for example D_Location or D_Geography).
The hour figure will sit in the fact table, and the fact table F_Hour will be connected to D_Location through a surrogate key (a sequence in Oracle or an identity in SQL Server).
All the descriptive columns for Region and Location can happily live together in D_Location (of course Region will not be normalized, but this is how it is normally done).
I think you don't need to track the association of location and region in the dimension tables. That association can be in the fact table.
I would create 2 dimension tables D_Location & D_Region and 1 fact table F_Hour.
D_Location:
location_code int not null primary key, short_description varchar(10) not null, long_description varchar(100) not null
D_Region:
region_code int not null primary key, short_description varchar(10) not null, long_description varchar(100) not null
F_Hour:
hour_id int not null primary key, location_code int not null, region_code int not null, hours decimal(10,2) not null
F_Hour would have 1 FK to D_Location and 1 FK to D_Region.
To get hours for a particular location_code (@location_code):
select h.location_code, l.short_description, l.long_description, sum(h.hours)
from F_Hour h inner join D_Location l on h.location_code = l.location_code
where h.location_code = @location_code
group by h.location_code, l.short_description, l.long_description
order by h.location_code
To get hours for a particular region_code (@region_code):
select h.region_code, r.short_description, r.long_description, sum(h.hours)
from F_Hour h inner join D_Region r on h.region_code = r.region_code
where h.region_code = @region_code
group by h.region_code, r.short_description, r.long_description
order by h.region_code
Does it make sense?
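For comparison, the single-dimension design suggested in the first answer can be sketched as follows. This is an illustration under assumptions: it uses an in-memory SQLite database via Python's `sqlite3` as a stand-in for the warehouse, a hypothetical surrogate key column `location_key`, a reduced set of description columns, and made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One denormalized dimension: each location row carries its region.
    CREATE TABLE D_Location (
        location_key INTEGER PRIMARY KEY,  -- surrogate key (assumed name)
        location_code INTEGER NOT NULL,
        location_short_description TEXT NOT NULL,
        region_code INTEGER NOT NULL,
        region_short_description TEXT NOT NULL
    );
    CREATE TABLE F_Hour (
        hour_id INTEGER PRIMARY KEY,
        location_key INTEGER NOT NULL REFERENCES D_Location (location_key),
        hours REAL NOT NULL
    );
    INSERT INTO D_Location VALUES
        (1, 10, 'LOC-A', 100, 'NORTH'),
        (2, 20, 'LOC-B', 100, 'NORTH'),
        (3, 30, 'LOC-C', 200, 'SOUTH');
    INSERT INTO F_Hour VALUES (1, 1, 2.5), (2, 2, 4.0), (3, 3, 1.5), (4, 1, 1.0);
""")

# The same join serves both lookups; only the grouping column changes.
by_location = conn.execute("""
    SELECT d.location_code, SUM(f.hours)
    FROM F_Hour f JOIN D_Location d ON f.location_key = d.location_key
    GROUP BY d.location_code ORDER BY d.location_code
""").fetchall()

by_region = conn.execute("""
    SELECT d.region_code, SUM(f.hours)
    FROM F_Hour f JOIN D_Location d ON f.location_key = d.location_key
    GROUP BY d.region_code ORDER BY d.region_code
""").fetchall()

print(by_location)
print(by_region)
```

With the sample rows, the totals are 3.5, 4.0 and 1.5 hours per location and 7.5 and 1.5 hours per region. Because each location belongs to exactly one region, the region attributes repeat per location row, which is the denormalization the first answer describes.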

how to make default null in oracle

create table patient (
p_code number(5) primary key,
p_name varchar2(50) not null,
DOB date(15) not null,
p_phone number(30) default null,
st varchar2(20) not null,
city varchar2(15) not null,
state varchar2(15) default null,
zip_code number(10) not null,
w_code number(5) references ward (w_code)
)
It gives me ORA-00907: missing right parenthesis.
In all databases, the default value for a column is NULL when you leave out not null. So, you can write:
create table patient (
p_code number(5) not null primary key,
p_name varchar2(50) not null,
DOB date not null,
p_phone number(30),
st varchar2(20) not null,
city varchar2(15) not null,
state varchar2(15),
zip_code number(10) not null,
w_code number(5) references ward (w_code)
)
As a note, Oracle also accepts null and default null for this purpose, so these are also acceptable:
p_phone number(30) null,
p_phone number(30) default null,
The problem with your code was the date(15). date doesn't take a length argument.
By the way, you should be storing phone numbers and zip codes using strings and not numbers. They can have leading zeros.
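The leading-zero point is easy to check. A tiny sketch, using a made-up zip code value:

```python
zip_as_text = "02134"             # hypothetical zip code with a leading zero
zip_as_number = int(zip_as_text)  # what a numeric column would effectively store
round_tripped = str(zip_as_number)

print(zip_as_text, zip_as_number, round_tripped)
```

The numeric form comes back as 2134, so the original five-character code cannot be recovered without extra formatting rules.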

SQLite, my references doesn't work

I have a problem.
I want to create a table with three foreign keys:
PRAGMA foreign_keys = ON;
CREATE TABLE "Dipendente" ("idDipendente" INTEGER PRIMARY KEY AUTOINCREMENT,
"nome" VARCHAR NOT NULL,
"cognome" VARCHAR NOT NULL ,
"email" VARCHAR NOT NULL UNIQUE ,
"password" VARCHAR NOT NULL ,
"tipo" VARCHAR NOT NULL );
CREATE TABLE "Prodotto" ("idProdotto" INTEGER PRIMARY KEY AUTOINCREMENT,
"nome" VARCHAR NOT NULL UNIQUE,
"qta" INTEGER NOT NULL,
"prezzoUnita" FLOAT NOT NULL );
CREATE TABLE "Fondo" ("idFondo" INTEGER PRIMARY KEY AUTOINCREMENT,
"nome" VARCHAR NOT NULL UNIQUE,
"fondoDisponibile" FLOAT NOT NULL );
CREATE TABLE IF NOT EXISTS "Acquisto" (
`idAcquisto` INTEGER PRIMARY KEY NOT NULL ,
`idDipendente` INTEGER NOT NULL DEFAULT -1,
`idProdotto` INTEGER NOT NULL DEFAULT -1,
`idFondo` INTEGER NOT NULL DEFAULT -1,
`qta` INTEGER NOT NULL ,
`dataAcquisto` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ,
CONSTRAINT `fk_acquisto_dipendente`
FOREIGN KEY (`idDipendente` )
REFERENCES `Dipendente` (`idDipendente` )
ON DELETE SET DEFAULT
ON UPDATE NO ACTION,
CONSTRAINT `fk_acquisto_prodotto`
FOREIGN KEY (`idProdotto` )
REFERENCES `Prodotto` (`idProdotto` )
ON DELETE SET DEFAULT
ON UPDATE NO ACTION,
CONSTRAINT `fk_acquisto_fondo`
FOREIGN KEY (`idFondo` )
REFERENCES `Fondo` (`idFondo` )
ON DELETE SET DEFAULT
ON UPDATE NO ACTION);
CREATE INDEX 'fk_acquisto_dipendente' ON 'Acquisto' ('idDipendente' ASC);
CREATE INDEX 'fk_acquisto_prodotto' ON 'Acquisto' ('idProdotto' ASC);
CREATE INDEX 'fk_acquisto_fondo' ON 'Acquisto' ('idFondo' ASC);
So, I want Acquisto (idDipendente, idProdotto, idFondo) to be set to -1 by default, but when I delete the row in Dipendente with idDipendente = 1, the field idDipendente in the table Acquisto is still 1. I don't know what the problem is.
In the table "Acquisto", you have three foreign key references. Each column of those foreign key references has a default value of -1, which isn't (I presume) a valid value in any of the referenced tables.
In this specific case, if you first insert a row into "Dipendente", and that row has "idDipendente" of -1, then you can delete the row where "idDipendente" equals 1. When you do that, you'll find the default value of -1 in Acquisto.idDipendente.
The short story is your foreign key reference doesn't prevent you from declaring a default value of -1, but it does prevent you from using it.
To set the values to NULL instead, you need to do something along these lines.
pragma foreign_keys = on;
create table a (id integer primary key);
insert into a values (1);
create table b (
b_id integer primary key,
a_id integer references a(id)
on delete set null);
insert into b values (1, 1);
delete from a;
select * from b;
b_id a_id
---------- ----------
1
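The SET DEFAULT behaviour described above can be reproduced with a reduced version of the schema. The sketch below uses Python's `sqlite3` with an in-memory database; the tables are trimmed to the key columns, and the helper name `delete_parent` is hypothetical. It assumes foreign key enforcement is switched on for the connection, which SQLite requires per connection.

```python
import sqlite3


def delete_parent(with_placeholder_row):
    """Delete Dipendente 1 and return Acquisto.idDipendente afterwards."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # PRAGMA is not issued inside an open transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Dipendente (idDipendente INTEGER PRIMARY KEY)")
    conn.execute("""
        CREATE TABLE Acquisto (
            idAcquisto INTEGER PRIMARY KEY,
            idDipendente INTEGER NOT NULL DEFAULT -1
                REFERENCES Dipendente (idDipendente) ON DELETE SET DEFAULT
        )
    """)
    if with_placeholder_row:
        # Parent row that the default value -1 can legally point at.
        conn.execute("INSERT INTO Dipendente VALUES (-1)")
    conn.execute("INSERT INTO Dipendente VALUES (1)")
    conn.execute("INSERT INTO Acquisto VALUES (1, 1)")
    try:
        conn.execute("DELETE FROM Dipendente WHERE idDipendente = 1")
    except sqlite3.IntegrityError:
        # Without a parent row -1, SET DEFAULT would violate the foreign
        # key, so the whole DELETE is rejected and nothing changes.
        pass
    return conn.execute("SELECT idDipendente FROM Acquisto").fetchone()[0]


print(delete_parent(with_placeholder_row=False))
print(delete_parent(with_placeholder_row=True))
```

Without the placeholder row the child value stays at 1 because the delete is refused; with it, the child value becomes -1. If a delete succeeds and the child value is still 1, a likely cause is that foreign keys were not enabled on the connection that ran the delete.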

Resources