Best way to design Users and Contacts - database

I have two tables (e.g.):
Users (ID, firstName,middleName, lastName)
Contacts (ID, userID, serialNo, phoneNumber, eMail).
I will be communicating with Users (sending messages) via phoneNumber, eMail, or both, and saving each message in the database, e.g.:
Log (ID, userID, contactID, message, onPhoneOrEmail), where the last field stores 'p', 'e' or 'b' for phoneNumber, eMail, or both.
So, when I check the logs, I can see which message was sent to which email/phone number.
Problem:
What to do when Users change their contact details?
If I update the Contacts table, the Log becomes wrong, because the messages were not sent to the new number.
If I store the number or email in Log, it would be too much data to store (at large scale, compared to just one character).
Last: if I add a new Contact with serialNo + 1, is that feasible? What about performance issues? (Uniqueness is not required; Users can change their number or email as many times as they want. These are just for communication.)
I read this and this, but could not get an appropriate answer regarding performance/methodological issues.
Please guide.
SAMPLE DATA:
USERS
| 1 | John | null | Cena |
CONTACTS
| 1 | 1 | 1 | 123456 | abc#xyz.com |
| 2 | 1 | 2 | null | xyz#mnp.com |
| 3 | 1 | 3 | 987654 | null |

If you say that a User can change his contact details, this means that you have inverted the dependency. The User has the Contact, so it is reasonable to store a contactID on the user and not the opposite. A User can change e.g. his phone whenever he wants, while it makes no sense for the same phone number to change its user at some point.
So the schema would be turned around like this:
User(ID, firstName,middleName, lastName, contactID)
Contact(ID,serialNo, phoneNumber, eMail)
Log (ID, userID, message, onPhoneOrEmail).
You don't need both userID and contactID on Log. Remember that one is foreign key for the other (transitive dependency).
EDIT
If you need to store multiple contacts per User, keep your schema but change the Log to
Log (ID, contactID, message, onPhoneOrEmail)
From my point of view, changing a user's contact means removing one contact and adding another. If you have never sent any message to the contact you are removing, you have no reason to keep it. Otherwise, if you need a record, you have to keep the contact row in the database even after it has been replaced (a column marking it as invalid is probably preferable). MySQL already enforces this by default: a foreign key with ON DELETE RESTRICT rejects deleting a contact that Log still references.
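A minimal sketch of that behaviour, using Python's built-in sqlite3 as a stand-in for MySQL (SQLite needs foreign keys switched on explicitly). The isValid column is a hypothetical name for the "invalid" marker suggested above, and the sample values come from the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE Users (ID INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
CREATE TABLE Contacts (
    ID INTEGER PRIMARY KEY,
    userID INTEGER NOT NULL REFERENCES Users(ID),
    phoneNumber TEXT,
    eMail TEXT,
    isValid INTEGER NOT NULL DEFAULT 1   -- hypothetical "still in use" flag
);
CREATE TABLE Log (
    ID INTEGER PRIMARY KEY,
    contactID INTEGER NOT NULL REFERENCES Contacts(ID) ON DELETE RESTRICT,
    message TEXT,
    onPhoneOrEmail TEXT
);
""")
conn.execute("INSERT INTO Users VALUES (1, 'John', 'Cena')")
conn.execute("INSERT INTO Contacts (ID, userID, phoneNumber) VALUES (1, 1, '123456')")
conn.execute("INSERT INTO Log (contactID, message, onPhoneOrEmail) VALUES (1, 'hello', 'p')")

# Deleting a contact that a Log row still references is rejected.
try:
    conn.execute("DELETE FROM Contacts WHERE ID = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# Replace the contact instead: retire the old row and add a new one.
conn.execute("UPDATE Contacts SET isValid = 0 WHERE ID = 1")
conn.execute("INSERT INTO Contacts (ID, userID, phoneNumber) VALUES (2, 1, '987654')")

# The log still resolves to the number the message was actually sent to.
sent_to = conn.execute(
    "SELECT c.phoneNumber FROM Log l JOIN Contacts c ON c.ID = l.contactID"
).fetchone()[0]
current = [row[0] for row in conn.execute(
    "SELECT phoneNumber FROM Contacts WHERE userID = 1 AND isValid = 1")]
```

Here the old number stays attached to the log entry, while queries for the user's current contacts filter on the flag.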

Get rid of your Contact table.
Create a new UserPhone table (PK - ID, FK - User.Id, Phone#, ActiveDate).
Create a new UserEmail table (PK - ID, FK - User.Id, Email, ActiveDate).
It looks like SerialNumber is just an incrementer for one User's Contact data. If it is just an incrementer, ActiveDate should suffice as a replacement.
When phone or email information changes, do not update the existing record; add a new record with today's date instead.
Your Log table will look like (PK - LogID, FK - UserEmail.ID, FK - UserPhone.ID).
No need for the PhoneOrEmail field. That information can be determined by presence of the FKs.
You might have some other design issues but this answer should get you on the right track.
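The design above could be sketched as follows with Python's built-in sqlite3. The exact column names (userPhoneID, userEmailID, activeDate as ISO date text) and the CHECK constraint are assumptions added for illustration; the sample values are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (ID INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
CREATE TABLE UserPhone (
    ID INTEGER PRIMARY KEY,
    userID INTEGER NOT NULL REFERENCES Users(ID),
    phoneNumber TEXT NOT NULL,
    activeDate TEXT NOT NULL
);
CREATE TABLE UserEmail (
    ID INTEGER PRIMARY KEY,
    userID INTEGER NOT NULL REFERENCES Users(ID),
    eMail TEXT NOT NULL,
    activeDate TEXT NOT NULL
);
CREATE TABLE Log (
    ID INTEGER PRIMARY KEY,
    userPhoneID INTEGER REFERENCES UserPhone(ID),
    userEmailID INTEGER REFERENCES UserEmail(ID),
    message TEXT NOT NULL,
    -- assumption: every message went to at least one channel
    CHECK (userPhoneID IS NOT NULL OR userEmailID IS NOT NULL)
);
""")
conn.execute("INSERT INTO Users VALUES (1, 'John', 'Cena')")
conn.execute("INSERT INTO UserPhone VALUES (1, 1, '123456', '2020-01-01')")
conn.execute("INSERT INTO UserEmail VALUES (1, 1, 'abc#xyz.com', '2020-01-01')")
conn.execute("INSERT INTO Log (userPhoneID, userEmailID, message) VALUES (1, 1, 'welcome')")

# The user changes number: add a row, never update the old one.
conn.execute("INSERT INTO UserPhone VALUES (2, 1, '987654', '2020-06-01')")
conn.execute("INSERT INTO Log (userPhoneID, userEmailID, message) VALUES (2, NULL, 'reminder')")

# Current number = most recent activeDate.
current_phone = conn.execute(
    "SELECT phoneNumber FROM UserPhone WHERE userID = 1 "
    "ORDER BY activeDate DESC LIMIT 1").fetchone()[0]

# The channel is derived from which FKs are present, so no flag column is needed.
history = conn.execute("""
SELECT l.message, p.phoneNumber, e.eMail,
       CASE WHEN l.userPhoneID IS NOT NULL AND l.userEmailID IS NOT NULL THEN 'b'
            WHEN l.userPhoneID IS NOT NULL THEN 'p'
            ELSE 'e' END
FROM Log l
LEFT JOIN UserPhone p ON p.ID = l.userPhoneID
LEFT JOIN UserEmail e ON e.ID = l.userEmailID
ORDER BY l.ID
""").fetchall()
```

Each log row stores only two small integer keys, yet still resolves to the exact number or address used at the time of sending.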

Related

Can I perform conditional DynamoDB calls in a single call?

Let's say I have a table that stores user data. It stores 2 types of data: a UserId (partition key) with attributes (JSON blob), and a reference to the UserId based on values within the attributes. For example, here are 3 rows of the table:
pk | attributes | userId
5 | { email: example#example.com, tel: 123456789 } | null
email/example#example.com | null | 5
phone/123456789 | null | 5
This is so I am able to query directly off of values to obtain attributes, without needing to do a scan and filter (a very compute intensive operation on large tables).
My question is: Can I, in a single query, do something like getByPartitionKey(email/example#example.com), obtain the userId, and then use that userID to query for the whole attributes document, without doing 2 individual requests? Something akin to a join in SQL.
Your data model is very wrong, here is how to achieve what you want:
pk | sk | phone | email | other
user123 | user123 | 0293480983 | example#example.com | some map {}
SELECT * FROM mytable WHERE PK = 'user123'
This would allow you to get all of the information for a given userId. If you want the same information but this time by email, you create a GSI on the email attribute:
email | pk | sk | phone | other
example#example.com | user123 | user123 | 0293480983 | some map {}
SELECT * FROM mytable.myindex WHERE email = 'example#example.com'
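To illustrate why the index removes the second request, here is a small in-memory model in Python. It is only a sketch: plain dicts stand in for the table and the GSI, build_index is a hypothetical helper, and it assumes the index projects all attributes. It does not call the DynamoDB API.

```python
# In-memory model only: plain dicts stand in for the table and its index.
table = {
    ("user123", "user123"): {
        "pk": "user123",
        "sk": "user123",
        "phone": "0293480983",
        "email": "example#example.com",
        "other": {},
    }
}


def build_index(items, key_attr):
    """Model a GSI keyed by key_attr that carries every attribute of the item."""
    index = {}
    for item in items.values():
        if key_attr in item:  # items without the attribute are left out of the index
            index.setdefault(item[key_attr], []).append(item)
    return index


email_index = build_index(table, "email")

# Lookup by primary key and lookup by email return the same full item,
# each with a single read.
by_pk = table[("user123", "user123")]
by_email = email_index["example#example.com"]
```

Because the indexed copy already carries the whole item, the lookup by email needs no follow-up read against the base table.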

Database normalization. Which is better, inserting in one row or multiple row?

I'm currently designing my tables. I have three types of user: pyd, ppp and ppk. Which is better: inserting the data in one row or in multiple rows?
Any other suggestion? Thanks.
I would go for 3 tables:
user_type
typeID | typeDescription
Main_table
id_main_table | id_user | id_type
table_bhg_i
id_bhg_i | id_main_table | data1 | data2 | data3
Although I see you are inserting IDs for each user, I don't quite understand how you are going to differentiate between the users. Had I designed this DB, I would have gone for tables like these.
tableName: UserTypes
This table would contain two fields: the first is the ID and the second is the type of user, like
UsertypeID | UserType
UsertypeID is the primary key and can be auto-increment, while UserType would be your user types (pyd, ppk and so on). Designing it this way gives you the flexibility to add types later without changing the schema of the table.
Next, you can add a table for creating multiple users of a particular type. This table refers to the UsertypeID of the previous table, which makes adding new users easy and removes redundancy.
tableName: Users
This table would contain the user's ID, the user's name, and the UserTypeID:
UserId | UserName | UserTypeID
The next thing you can do is make a table to hold the data; let it be called DataTable.
tableName: DataTable
This table will contain the users' data and reference the Users table:
DataTabID | DataFields (any number of them) | UserID (references Users table)
These tables would be more than sufficient. If you have doubts, ask me in the chat box.
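As a rough sketch of those three tables, here is a version using Python's built-in sqlite3. The Data1/Data2 columns and the sample user names are placeholders invented for the example; only the table names and key columns come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE UserTypes (
    UserTypeID INTEGER PRIMARY KEY AUTOINCREMENT,
    UserType TEXT NOT NULL UNIQUE
);
CREATE TABLE Users (
    UserID INTEGER PRIMARY KEY AUTOINCREMENT,
    UserName TEXT NOT NULL,
    UserTypeID INTEGER NOT NULL REFERENCES UserTypes(UserTypeID)
);
CREATE TABLE DataTable (
    DataTabID INTEGER PRIMARY KEY AUTOINCREMENT,
    Data1 TEXT,   -- placeholder data fields
    Data2 TEXT,
    UserID INTEGER NOT NULL REFERENCES Users(UserID)
);
""")
# The three user types from the question get IDs 1, 2, 3 in insertion order.
conn.executemany("INSERT INTO UserTypes (UserType) VALUES (?)",
                 [("pyd",), ("ppp",), ("ppk",)])
conn.executemany("INSERT INTO Users (UserName, UserTypeID) VALUES (?, ?)",
                 [("alice", 1), ("bob", 3)])
conn.executemany("INSERT INTO DataTable (Data1, Data2, UserID) VALUES (?, ?, ?)",
                 [("a", "b", 1), ("c", "d", 1), ("e", "f", 2)])

# Data rows per user type, including types that have no data yet.
rows_per_type = conn.execute("""
SELECT t.UserType, COUNT(d.DataTabID)
FROM UserTypes t
LEFT JOIN Users u ON u.UserTypeID = t.UserTypeID
LEFT JOIN DataTable d ON d.UserID = u.UserID
GROUP BY t.UserType
ORDER BY t.UserType
""").fetchall()

# A new user type is just another row; the schema does not change.
conn.execute("INSERT INTO UserTypes (UserType) VALUES ('newtype')")
type_count = conn.execute("SELECT COUNT(*) FROM UserTypes").fetchone()[0]
```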

Cassandra data model for simple messaging app

I am trying to learn Cassandra and always find the best way is to start with creating a very simple and small application. Hence I am creating a basic messaging application which will use Cassandra as the back-end. I would like to do the following:
User will create an account with a username, email, and password. The
email and the password can be changed at anytime.
The user can add another user as their contact. The user would add a
contact by searching their username or email. The contacts don't need
to be mutual meaning if I add a user they are my contact, I don't
need to wait for them to accept/approve anything like in Facebook.
A message is sent from one user to another user. The sender needs to
be able to see the messages they sent (ordered by time) and the
messages which were sent to them (ordered by time). When a user opens
the app I need to check the database for any new messages for that
user. I also need to mark if the message has been read.
As I come from the world of relational databases my relational database would look something like this:
UsersTable
username (text)
email (text)
password (text)
time_created (timestamp)
last_loggedIn (timestamp)
------------------------------------------------
ContactsTable
user_i_added (text)
user_added_me (text)
------------------------------------------------
MessagesTable
from_user (text)
to_user (text)
msg_body (text)
metadata (text)
has_been_read (boolean)
message_sent_time (timestamp)
Reading through a couple of Cassandra textbooks I have a thought of how to model the database. My main concern is to model the database in a very efficient manner. Hence I am trying to avoid things such as secondary indexes etc. This is my model so far:
CREATE TABLE users_by_username (
username text PRIMARY KEY,
email text,
password text,
timeCreated timestamp,
last_loggedin timestamp
);
CREATE TABLE users_by_email (
email text PRIMARY KEY,
username text,
password text,
timeCreated timestamp,
last_loggedin timestamp
);
To spread data evenly and to read a minimal number of partitions (hopefully just one), I can look up a user by their username or email quickly. The downside is obviously that I am doubling my data, but storage is quite cheap, so I find it a good trade-off compared with using secondary indexes. Last logged in will also need to be written twice, but Cassandra is efficient at writes, so I believe this is a good trade-off as well.
For the contacts I can't think of any other way to model this, so I modelled it very similarly to how I would in a relational database. I believe this is quite a denormalized design, which should be good for performance according to the books I have read?
CREATE TABLE "user_follows" (
follower_username text,
followed_username text,
timeCreated timestamp,
PRIMARY KEY ("follower_username", "followed_username")
);
CREATE TABLE "user_followedBy" (
followed_username text,
follower_username text,
timeCreated timestamp,
PRIMARY KEY ("followed_username", "follower_username")
);
I am stuck on how to create this next part. For messaging I was thinking of this table, as it creates wide rows, which enables ordering of the messages.
I need messaging to answer two questions. It first needs to be able to show the user all the messages they have, and also be able to show the user the messages which are new and unread. This is a basic model, but I am unsure how to make it more efficient.
CREATE TABLE messages (
message_id uuid,
from_user text,
to_user text,
body text,
hasRead boolean,
timeCreated timeuuid,
PRIMARY KEY ((to_user), timeCreated )
) WITH CLUSTERING ORDER BY (timeCreated ASC);
I was also looking at using things such as STATIC columns to 'glue' together the user and messages, as well as SETS to store contact relationships, but from my narrow understanding so far the way I presented is more efficient. Are there any ideas to improve this model's efficiency, better practices for the things I am trying to do, or hidden problems I could face with this design?
In conclusion, I am trying to model around the queries. If I were using relation databases these would be essentially the queries I am looking to answer:
To Login:
SELECT * FROM USERS WHERE (USERNAME = [MY_USERNAME] OR EMAIL = [MY_EMAIL]) AND PASSWORD = [MY_PASSWORD];
------------------------------------------------------------------------------------------------------------------------
Update user info:
UPDATE USERS SET password = [NEW_PASSWORD] WHERE username = [MY_USERNAME];
UPDATE USERS SET email = [NEW_EMAIL] WHERE username = [MY_USERNAME];
------------------------------------------------------------------------------------------------------------------------
To Add contact (If by username):
INSERT INTO followings(following,follower) VALUES([USERNAME_I_WANT_TO_FOLLOW],[MY_USERNAME]);
------------------------------------------------------------------------------------------------------------------------
To Add contact (If by email):
SELECT username FROM users where email = [CONTACTS_EMAIL];
Then application layer sends over another query with the username:
INSERT INTO followings(following,follower) VALUES([USERNAME_I_WANT_TO_FOLLOW],[MY_USERNAME]);
------------------------------------------------------------------------------------------------------------------------
To View contacts:
SELECT following FROM followings WHERE follower = [MY_USERNAME];
------------------------------------------------------------------------------------------------------------------------
To Send Message:
INSERT INTO MESSAGES (MSG_ID, FROM, TO, MSG, IS_MSG_NEW) VALUES (uuid, [FROM_USERNAME], [TO_USERNAME], 'MY MSG', true);
------------------------------------------------------------------------------------------------------------------------
To View All Messages (Some pagination type of technique where shows me the 10 recent messages, yet shows which ones are unread):
SELECT * FROM MESSAGES WHERE TO = [MY_USERNAME] LIMIT 10;
------------------------------------------------------------------------------------------------------------------------
Once Message is read:
UPDATE MESSAGES SET IS_MSG_NEW = false WHERE TO = [MY_USERNAME] AND MSG_ID = [MSG_ID];
Cheers
Yes it's always a struggle to adapt to the limitations of Cassandra when coming from a relational database background. Since we don't yet have the luxury of doing joins in Cassandra, you often want to cram as much as you can into a single table. In your case that would be the users_by_username table.
There are a few features of Cassandra that should allow you to do that.
Since you are new to Cassandra, you could probably use Cassandra 3.0, which is currently in beta release. In 3.0 there is a nice feature called materialized views. This would allow you to have users_by_username as a base table, and create the users_by_email as a materialized view. Then Cassandra will update the view automatically whenever you update the base table.
Another feature that will help you is user defined types (in C* 2.1 and later). Instead of creating separate tables for followers and messages, you can create the structure of those as UDT's, and then in the user table keep lists of those types.
So a simplified view of your schema could be like this (I'm not showing some of the fields like timestamps to keep this simple, but those are easy to add).
First create your UDT's:
CREATE TYPE user_follows (
followed_username text,
street text
);
CREATE TYPE msg (
from_user text,
body text
);
Next we create your base table:
CREATE TABLE users_by_username (
username text PRIMARY KEY,
email text,
password text,
follows list<frozen<user_follows>>,
followed_by list<frozen<user_follows>>,
new_messages list<frozen<msg>>,
old_messages list<frozen<msg>>
);
Now we create a materialized view partitioned by email:
CREATE MATERIALIZED VIEW users_by_email AS
SELECT username, password, follows, new_messages, old_messages FROM users_by_username
WHERE email IS NOT NULL AND password IS NOT NULL AND follows IS NOT NULL AND new_messages IS NOT NULL
PRIMARY KEY (email, username);
Now let's take it for a spin and see what it can do. Let's create a user:
INSERT INTO users_by_username (username , email , password )
VALUES ( 'someuser', 'someemail#abc.com', 'somepassword');
Let the user follow another user:
UPDATE users_by_username SET follows = [{followed_username: 'followme2', street: 'mystreet2'}] + follows
WHERE username = 'someuser';
Let's send the user a message:
UPDATE users_by_username SET new_messages = [{from_user: 'auser', body: 'hi someuser!'}] + new_messages
WHERE username = 'someuser';
Now let's see what's in the table:
SELECT * FROM users_by_username ;
username | email | followed_by | follows | new_messages | old_messages | password
----------+-------------------+-------------+---------------------------------------------------------+----------------------------------------------+--------------+--------------
someuser | someemail#abc.com | null | [{followed_username: 'followme2', street: 'mystreet2'}] | [{from_user: 'auser', body: 'hi someuser!'}] | null | somepassword
Now let's check that our materialized view is working:
SELECT new_messages, old_messages FROM users_by_email WHERE email='someemail#abc.com';
new_messages | old_messages
----------------------------------------------+--------------
[{from_user: 'auser', body: 'hi someuser!'}] | null
Now let's read the message and move it to the old messages:
BEGIN BATCH
DELETE new_messages[0] FROM users_by_username WHERE username='someuser'
UPDATE users_by_username SET old_messages = [{from_user: 'auser', body: 'hi someuser!'}] + old_messages where username = 'someuser'
APPLY BATCH;
SELECT new_messages, old_messages FROM users_by_email WHERE email='someemail#abc.com';
new_messages | old_messages
--------------+----------------------------------------------
null | [{from_user: 'auser', body: 'hi someuser!'}]
So hopefully that gives you some ideas you can use. Have a look at the documentation on collections (i.e. lists, maps, and sets), since those can really help you to keep more information in one table and are sort of like tables within a table.
For beginners in Cassandra or NoSQL data modelling, there is a process for modelling your application's data:
1- Understand your data and design a conceptual diagram
2- List all your queries in detail
3- Map your queries using defined rules and patterns that suit Cassandra best
4- Create a logical design, with tables and fields derived from the queries
5- Now create a schema and test its acceptance.
If we model it well, it is easy to handle issues such as new complex queries, data overload, data consistency, etc.
After taking this free online data modelling training, you will have more clarity:
https://academy.datastax.com/courses/ds220-data-modeling
Good Luck!

Best approach to go: one table for each operation or one table for all operations?

I'm designing the database for a solution. I'm facing the following scenario:
The user can add a product. This product will belong to a specific operation: "SELL", "BUY", etc.
Another user can mark the product as interesting. So, I'll have a table to store the users who are interested in something.
I'm struggling to decide which approach to go:
I can create one table for each operation, something like "ProductSell", "ProductBuy", etc. The same for interested users ("InterestedProductSell", "InterestedProductBuy", etc).
```
User ProductSell ProductBuy InterestBuy InterestSell
____________ ___________ __________ ___________ ____________
Id Id Id ProductId (ProductBuy PK) ProductId (ProductSell PK)
Name Title Title UserId UserId
Username UserId UserId Date Date
```
I can create one table for all operations, with a column named "Operation". Same for interested users.
```
User Operation Product Interest
____________ _________ ___________ __________
Id Id Id ProductId (ProductBuy or ProductSell PK)
Name Name (Buy, sell, etc) Title UserId
Username UserId Date
Operation
```
Can you give me your opinions about these two approaches, or even a third approach that I haven't thought of? Things like performance, optimization, maintenance, coding... I need views other than my own on this.
If it matters, I'm working with SQL Server.
Your second approach, with a separate column for Operation, looks good:
user Table
uid
name
product Table
pid
name
userproduct Table
uid
pid
operation
time
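A minimal sketch of this single-table layout, using Python's built-in sqlite3 rather than SQL Server. The operation values ('SELL', 'BUY', 'INTERESTED'), the CHECK constraint, the composite primary key and the sample rows are assumptions made for the example; "user" is quoted because it is a reserved word in some databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE "user" (uid INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product (pid INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE userproduct (
    uid INTEGER NOT NULL REFERENCES "user"(uid),
    pid INTEGER NOT NULL REFERENCES product(pid),
    -- assumed set of operations; a lookup table would work as well
    operation TEXT NOT NULL CHECK (operation IN ('SELL', 'BUY', 'INTERESTED')),
    time TEXT NOT NULL,
    PRIMARY KEY (uid, pid, operation)
);
""")
conn.executemany('INSERT INTO "user" VALUES (?, ?)', [(1, "seller"), (2, "buyer")])
conn.executemany("INSERT INTO product VALUES (?, ?)", [(10, "bike"), (11, "helmet")])
conn.executemany("INSERT INTO userproduct VALUES (?, ?, ?, ?)", [
    (1, 10, "SELL", "2020-01-01"),
    (1, 11, "BUY", "2020-01-02"),
    (2, 10, "INTERESTED", "2020-01-03"),
])


def rows_for(operation):
    """Return (user name, product name) pairs for one operation, oldest first."""
    return conn.execute("""
        SELECT u.name, p.name
        FROM userproduct up
        JOIN "user" u ON u.uid = up.uid
        JOIN product p ON p.pid = up.pid
        WHERE up.operation = ?
        ORDER BY up.time
    """, (operation,)).fetchall()


selling = rows_for("SELL")
interested = rows_for("INTERESTED")

# An operation outside the allowed set is rejected by the CHECK constraint.
try:
    conn.execute("INSERT INTO userproduct VALUES (2, 11, 'RENT', '2020-01-04')")
    unknown_rejected = False
except sqlite3.IntegrityError:
    unknown_rejected = True
```

One query shape serves every operation, which is the main maintenance advantage over one table per operation.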

How to manage user database when number of choices of user is random

I am creating a small playlist program in VB, which supports adding and deleting users, and lets each user modify their playlist.
My question is: how do I manage a user's playlist? Assuming I am using a database, where should I add a user?
As a new table in the database?
As a new entry in some kind of table which contains userID, Name and an undefined number of choices?
If I select option 2, what kind of datatype handles an integer set of undefined size?
Thank you.
You would create 3 tables:
Users table
-----------
userID
email
password
name
Playlist table
--------------
playlistID
userID
trackID
Tracks table
------------
trackID
trackName
You would then create relationships between the tables:
Users.userID 1-* Playlist.userID (1 to many)
Tracks.trackID 1-* Playlist.trackID (1 to many)
Then you would store the user's choices in the Playlist table.
To see a user's tracks you could do:
SELECT Playlist.trackID, Tracks.trackName
FROM Playlist
JOIN Tracks ON Playlist.trackID=Tracks.trackID
WHERE Playlist.userID = 12
ORDER BY Tracks.trackName
These are the basics of relational database design and data normalisation.
For more information see:
http://www.dreamincode.net/forums/topic/179103-relational-database-design-normalization/
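For a quick way to try the three tables and the query above, here is a small sketch using Python's built-in sqlite3. The column types, user details and track names are made up for illustration; the table and column names follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (
    userID INTEGER PRIMARY KEY,
    email TEXT,
    password TEXT,
    name TEXT
);
CREATE TABLE Tracks (
    trackID INTEGER PRIMARY KEY,
    trackName TEXT NOT NULL
);
CREATE TABLE Playlist (
    playlistID INTEGER PRIMARY KEY,
    userID INTEGER NOT NULL REFERENCES Users(userID),
    trackID INTEGER NOT NULL REFERENCES Tracks(trackID)
);
""")
conn.execute("INSERT INTO Users (userID, name) VALUES (12, 'sample user')")
conn.executemany("INSERT INTO Tracks VALUES (?, ?)",
                 [(1, "Yesterday"), (2, "Imagine"), (3, "Hey Jude")])

# One Playlist row per chosen track, so a user can pick any number of tracks.
conn.executemany("INSERT INTO Playlist (userID, trackID) VALUES (?, ?)",
                 [(12, 1), (12, 2)])

user_tracks = conn.execute("""
SELECT Playlist.trackID, Tracks.trackName
FROM Playlist
JOIN Tracks ON Playlist.trackID = Tracks.trackID
WHERE Playlist.userID = ?
ORDER BY Tracks.trackName
""", (12,)).fetchall()
```

Adding or removing a choice is an INSERT or DELETE on Playlist, so no column ever has to hold a set of undefined size.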

Resources