Database design: merge users from different login systems (Google, Facebook, OpenID ...)

Hello,
I need to somehow merge users from several sources, for example Facebook, Google, Plaxo...
Currently I have this structure in my database:
USERS_MYSITE
mysite_user_id | parameter | value
------------------------------------------
223 | firstname | Tom
223 | lastname | N.
223 | birthdate | 1985-01-30
USERS_FACEBOOK
mysite_user_id | facebook_user_id | parameter | value
-------------------------------------------------------------
223 | 456353453 | fname | Tom
223 | 456353453 | lname | N.
223 | 456353453 | birth | 1985-01-30
USERS_GOOGLE
mysite_user_id | google_user_id | parameter | value
-----------------------------------------------------------
223 | tomtom22 | fn | Tom
223 | tomtom22 | ln | N.
223 | tomtom22 | brt | 1985 JUN 30
USERS_VIEW
mysite_user_id | remote_user_id | site_name | parameter | value
---------------------------------------------------------------------------
223 | 223 | mysite | firstname | Tom
223 | 223 | mysite | lastname | N.
223 | 223 | mysite | birthdate | 1985-01-30
223 | tomtom22 | google | fn | Tom
223 | tomtom22 | google | ln | N.
223 | tomtom22 | google | brt | 1985 JUN 30
223 | 456353453 | facebook | fname | Tom
223 | 456353453 | facebook | lname | N.
223 | 456353453 | facebook | birth | 1985-01-30
Then I run SELECT * FROM USERS_VIEW WHERE mysite_user_id = '223' and get all of the user's information. After that I can use several transformation arrays to translate all remote data into my format:
Array("firstname" => Array("fn", "fname"), "birthdate" => Array("brt", "birth"), ...)
The same goes for values. Then, depending on what the user selected as his primary data, I can show it.
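As a rough illustration of the translation-array idea, here is a minimal Python sketch. The names `FIELD_ALIASES`, `REMOTE_TO_LOCAL` and `normalize_rows`, and the sample rows, are hypothetical; only the parameter names are taken from the tables above.

```python
# Hypothetical alias table: local field name -> remote names that mean the same thing.
FIELD_ALIASES = {
    "firstname": ["fn", "fname"],
    "lastname": ["ln", "lname"],
    "birthdate": ["brt", "birth"],
}

# Invert the mapping once so each remote name resolves to its local name directly.
REMOTE_TO_LOCAL = {
    remote: local
    for local, remotes in FIELD_ALIASES.items()
    for remote in remotes
}


def normalize_rows(rows):
    """Group (site_name, parameter, value) rows by site, using local field names."""
    profiles = {}
    for site_name, parameter, value in rows:
        # Unknown parameters (including local names) are kept unchanged.
        local_name = REMOTE_TO_LOCAL.get(parameter, parameter)
        profiles.setdefault(site_name, {})[local_name] = value
    return profiles


# Sample rows in the shape of USERS_VIEW (site_name, parameter, value).
rows = [
    ("mysite", "firstname", "Tom"),
    ("google", "fn", "Tom"),
    ("google", "ln", "N."),
    ("facebook", "fname", "Tom"),
    ("facebook", "birth", "1985-01-30"),
]
profiles = normalize_rows(rows)
print(profiles["google"])
```

Values (such as the different date formats) would need a similar per-source conversion step, which this sketch does not cover.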
The problem is that I've never done this before, so maybe somebody knows a better way to do it. Please share your ideas.
Thank you.

The idea was to create an engine that could combine many accounts from many different "account holders", with the possibility to add new ones if needed. It should also let the user customize his account data: take the first name from one source, the last name from another, and add the rest from the profile form. I was worried about query speed, because it's quite risky to run such a slow query every time for every user shown on screen. We also get a lot of traffic, about 1 million a day; that's 20 million page views and about 100 000 000 query executions. That's a lot.
Yes, the problem is already solved. I just created another table with duplicated data :( .
Every time a user changes some of his settings, the new table is updated from the structure above. Then we read data only from that new table, and that method works fine. I have already added LinkedIn and Twitter to the list of sources. I'm currently thinking of exporting that engine and making it open source. :)
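The "duplicated" table described here could be rebuilt along these lines whenever a user changes his settings. This is only a sketch under assumptions: `build_merged_profile`, the shape of `profiles` and the `preferred_source` mapping are hypothetical, not the actual engine.

```python
def build_merged_profile(profiles, preferred_source):
    """Pick each field from the source the user chose for it.

    profiles: {site_name: {field: value}} with field names already normalized.
    preferred_source: {field: site_name}, the user's per-field choice.
    Fields whose chosen source has no value are left out.
    """
    merged = {}
    for field, site_name in preferred_source.items():
        value = profiles.get(site_name, {}).get(field)
        if value is not None:
            merged[field] = value
    return merged


profiles = {
    "mysite": {"firstname": "Tom", "lastname": "N.", "birthdate": "1985-01-30"},
    "google": {"firstname": "Tom", "lastname": "N."},
    "facebook": {"firstname": "Tom", "birthdate": "1985-01-30"},
}
# Hypothetical user choice: first name from Facebook, last name from Google, etc.
preferred_source = {"firstname": "facebook", "lastname": "google", "birthdate": "mysite"}

merged = build_merged_profile(profiles, preferred_source)
print(merged)
```

The merged dictionary is what would be written to the single read-optimized table, so page views never touch the per-source tables.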

You've probably already solved your problem by now - but in case you still need help, I'm happy to help. This is the kind of problem I like.
But in order to help, can you tell me what final outcome you actually want? You've got a solution that should work and I'm not sure if you're saying that you want it to produce a different outcome or that you want it to produce the current outcome but more efficiently?
If you can clarify that, we can sort this out.

Related

How to pass different parameters to a scenario outline?

I am using behave for my tests.
I want to run my scenario outline with other parameters; in other words, a scenario outline inside a scenario outline.
I have
Scenario Outline: Test John access
Given John enters
When He could access to <area>
Then the access is <result>
Examples:
| area | result |
| parking | authorized |
| security | refused |
I don't want to copy this test for each employee.
I want to loop over it like this:
Scenario Outline: Test user authorization
Given all my employees :
| name |
| John |
| Jack |
| Lisa |
Scenario Outline: Test user access
Given <employee> enters
When He could access to <area>
Then the access is <result>
Examples:
| area | result |
| parking | authorized |
| security | refused |
How could I do it?
Thanks in advance for your ideas.
The simplest answer would be to also add the employee to the Examples table. You wouldn't have to copy the test, but you would still have to multiply the number of rows if you want to test all areas for each employee:
Examples:
| employee | area | result |
| John | parking | authorized |
| John | security | refused |
| Jack | parking | authorized |
| Jack | security | refused |
| Lisa | parking | authorized |
| Lisa | security | refused |
This, however, also allows you to test different privilege levels per user.
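If writing the multiplied rows by hand becomes tedious, the Examples table can be generated and pasted into the feature file. The sketch below is one possible way to do that in plain Python; `examples_table` and the `expected` rule (everyone may use the parking, nobody may enter security) are assumptions for illustration, not part of behave.

```python
from itertools import product

employees = ["John", "Jack", "Lisa"]
areas = ["parking", "security"]
# Hypothetical rule: the expected result depends only on the area.
expected = {"parking": "authorized", "security": "refused"}


def examples_table(employees, areas, expected):
    """Return a Gherkin-style table with one row per (employee, area) pair."""
    rows = [("employee", "area", "result")]
    rows += [(e, a, expected[a]) for e, a in product(employees, areas)]
    # Pad every column to its widest cell so the pipes line up.
    widths = [max(len(row[i]) for row in rows) for i in range(3)]
    return "\n".join(
        "| " + " | ".join(cell.ljust(w) for cell, w in zip(row, widths)) + " |"
        for row in rows
    )


table = examples_table(employees, areas, expected)
print(table)
```

For per-user privilege levels, `expected` would have to be keyed by the (employee, area) pair instead of the area alone.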

Constraint to prevent overlapping time periods

Temporal tables and time periods are nothing special in the IT world. But somehow it seems my request is rather unique because I cannot find anything useful for it at all.
What I have are two tables, one containing static data and the other one dynamic data for entries of the first table which changes over time. For each row in the second table there are two columns ValidFrom and ValidUntil whereas the latter can be null if there is no planned end of validity.
Simplified, the schema looks like this:
tbStatic
+----+------------+------------+
| Id | Attribute1 | Attribute2 |
+----+------------+------------+
| 1 | foo | bar |
| 2 | baz | foo |
+----+------------+------------+
tbDynamic
+----+------------+---------------------+---------------------+------------+------------+
| Id | tbStaticId | ValidFrom | ValidUntil | Attribute1 | Attribute2 |
+----+------------+---------------------+---------------------+------------+------------+
| 1 | 1 | 2018-01-01 00:00:00 | 2018-01-31 23:59:59 | 1 | 0 |
| 2 | 2 | 2018-04-01 00:00:00 | 2018-04-02 11:59:59 | 2 | 1 |
| 3 | 1 | 2018-02-01 00:00:00 | null | 2 | 1 |
| 4 | 2 | 2018-05-01 00:00:00 | 2018-06-01 00:00:00 | 23 | 15 |
| 5 | 2 | 2018-07-01 01:23:45 | 2018-07-05 23:12:01 | 80 | 12 |
+----+------------+---------------------+---------------------+------------+------------+
As you might have spotted, holes between time periods are possible. What we cannot have, though, is overlapping periods: two rows with the same tbStaticId must never cover the same point in time.
Unfortunately, so far this is only a requirement, and although it is enforced in the application using the database, I would prefer to have a constraint on the table that prevents new rows from being inserted or existing rows from being updated when they violate this time uniqueness.
As stated, my research up to this point was rather disappointing, which is also why I cannot really show any code I've tried. The most promising approach so far was to create a function that takes a record (or the two time period values and the foreign key) as input and determines whether it overlaps with anything else. This function could then be called in a check constraint. But after thinking about the number of cases to check, I gave up because it seemed unreasonable (especially when considering updates as well, which require additional attention).
So my question is whether there is an easy way to constrain time slices in SQL Server without the use of temporal tables (not available in my SQL Server version). And if yes, how?
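The number of cases to check is smaller than it may seem: two periods overlap exactly when each one starts no later than the other one ends. The Python sketch below only illustrates that predicate, it is not a SQL Server constraint. It assumes ValidUntil is inclusive and that a null ValidUntil means "open-ended"; `periods_overlap` and `find_conflicts` are hypothetical names.

```python
from datetime import datetime


def periods_overlap(from_a, until_a, from_b, until_b):
    """Return True if two closed periods share at least one instant.

    An `until` of None means the period is open-ended (no planned end).
    """
    a_starts_in_time = until_b is None or from_a <= until_b
    b_starts_in_time = until_a is None or from_b <= until_a
    return a_starts_in_time and b_starts_in_time


def find_conflicts(existing_rows, static_id, valid_from, valid_until):
    """Return ids of rows for the same tbStaticId that overlap the candidate period."""
    return [
        row_id
        for row_id, row_static_id, row_from, row_until in existing_rows
        if row_static_id == static_id
        and periods_overlap(valid_from, valid_until, row_from, row_until)
    ]


# Rows in the shape of tbDynamic: (Id, tbStaticId, ValidFrom, ValidUntil).
tb_dynamic = [
    (1, 1, datetime(2018, 1, 1), datetime(2018, 1, 31, 23, 59, 59)),
    (2, 2, datetime(2018, 4, 1), datetime(2018, 4, 2, 11, 59, 59)),
    (3, 1, datetime(2018, 2, 1), None),
    (4, 2, datetime(2018, 5, 1), datetime(2018, 6, 1)),
    (5, 2, datetime(2018, 7, 1, 1, 23, 45), datetime(2018, 7, 5, 23, 12, 1)),
]

conflicts = find_conflicts(tb_dynamic, 1, datetime(2018, 1, 15), datetime(2018, 2, 15))
no_conflicts = find_conflicts(tb_dynamic, 2, datetime(2018, 4, 3), datetime(2018, 4, 30))
print(conflicts, no_conflicts)
```

For an update, the row being changed would have to be excluded from `existing_rows` first, otherwise it would conflict with itself.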

How to design a database for types and categories in Laravel?

As the question states, what is the best way to design a database for types and categories?
Scenario:
I have x number of database tables, e.g. users, feedback, facts and countries, and all these tables have a type attribute. What I've found is that a lot of people tend to just create a type table for each and every one of these, e.g. user_types, feedback_types, fact_types and country_types.
I'm currently working on a project where I don't want to create a bunch of extra tables just to handle their individual types. Therefore I'm trying to come up with a database design that fits all tables.
My best solution so far:
At first I thought I might just create a polymorphic table that has id, type_id, typable_id and typable_type, plus a types table. Then I figured that I would have to specify in the types table which type attribute belongs to which table. Then it hit me: I can create a self-referencing table where the parent name is the table name.
E.g.
---------------------------------------------
|id | parent_id | name | description |
---------------------------------------------
| 1 | null | feedback | something |
---------------------------------------------
| 2 | 1 | general | something |
---------------------------------------------
| 3 | 1 | bug | something |
---------------------------------------------
| 4 | 1 | improvement | something |
---------------------------------------------
| 5 | null | countries | something |
---------------------------------------------
| 6 | 5 | europe | something |
---------------------------------------------
| 7 | 5 | asia | something |
---------------------------------------------
| etc... |
---------------------------------------------
Is this an OK design? I'm thinking a lot about the parent names in this table; I haven't seen anyone else use table names as parents.
From a front-end point of view, it makes it easier to get the correct types depending on which types you're looking for.
Please give me feedback on this. I'm struggling to find a good design.
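To make the lookup concrete, here is a minimal sketch of how the types for one table could be fetched from such a self-referencing table. It uses Python with an in-memory SQLite database as a stand-in for the real Laravel setup; the table name `types`, the helper `types_for` and the assumption that ids are unique are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE types (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES types(id),
        name TEXT NOT NULL,
        description TEXT
    );
    INSERT INTO types (id, parent_id, name, description) VALUES
        (1, NULL, 'feedback', 'something'),
        (2, 1, 'general', 'something'),
        (3, 1, 'bug', 'something'),
        (4, 1, 'improvement', 'something'),
        (5, NULL, 'countries', 'something'),
        (6, 5, 'europe', 'something'),
        (7, 5, 'asia', 'something');
""")


def types_for(conn, table_name):
    """Return the type names whose parent row is named after the given table."""
    sql = """
        SELECT child.name
        FROM types AS child
        JOIN types AS parent ON parent.id = child.parent_id
        WHERE parent.parent_id IS NULL AND parent.name = ?
        ORDER BY child.name
    """
    return [name for (name,) in conn.execute(sql, (table_name,))]


feedback_types = types_for(conn, "feedback")
country_types = types_for(conn, "countries")
print(feedback_types, country_types)
```

One thing this sketch makes visible: nothing in the schema stops a feedback row from pointing at a type whose parent is countries, so that check would have to live in the application.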

Reducing amount of http requests by grouping queries

So I have a bunch of chatty HTTP requests in my Angular 1 app which are bottlenecking many of the other requests.
Imagine I have a list of Users from a totally different data source and I make calls to 5 different tables such as:
user.signup
+-----+------------+
| uid | date |
+-----+------------+
| 1 | 2016-12-13 |
| 2 | 2016-12-01 |
+-----+------------+
user.favourite_color
+-----+-------+
| uid | color |
+-----+-------+
| 1 | red |
| 5 | blue |
| 7 | green |
+-----+-------+
user.location
+-----+-----------+
| uid | location |
+-----+-----------+
| 2 | uk |
| 3 | france |
| 9 | greenland |
+-----+-----------+
The reason they are in different tables is that the fields are optional.
The way I see it I have 3 options:
Put them in 1 table
So I could just group them all in 1 table and have a bunch of null columns but that just doesn't sit right with me in terms of DB design.
+-----+------------+-----------+-------+
| uid | date | location | color |
+-----+------------+-----------+-------+
| 1 | 2016-12-13 | null | red |
| 2 | 2016-12-01 | uk | null |
| 3 | null | france | null |
| 5 | null | null | blue |
+-----+------------+-----------+-------+
Join them all with 1 request
So I could just have one query that joins all these tables but the way I see it they would have to be full joins with the expectation that some uid's wouldn't exist in some tables. e.g.
+------+------------+-------+-----------+-------+-------+
| uid | date | l_uid | location | c_uid | color |
+------+------------+-------+-----------+-------+-------+
| 1 | 2016-12-13 | null | null | 1 | red |
| 2 | 2016-12-01 | 2 | uk | null | null |
| null | null | 3 | france | null | null |
| null | null | null | null | 5 | blue |
+------+------------+-------+-----------+-------+-------+
which is probably even worse!
Change the way the requests are made?
Maybe make some clever changes how the requests are made:
function activate() {
    $q.all([requestSignupDate(), requestFaveColor(), requestLocation(), ....])
        .then(function (data) {
            // do a bunch of stuff with the data
        });
}
which I want to change to:
function activate() {
    requestUserData();
}
Any suggestions?
This is a typical ORM problem: this database schema does not provide a good way of storing the user as a single entity.
The entity's properties are spread out over multiple tables and, because they are optional, you are forced to do left joins across multiple tables.
So you essentially have to solve that problem first. You have several options (or non-options; it is hard to say without knowing the requirements).
Put them in one table
If you can use nullable columns and refactor your application, this is preferable. I see that your other tables have just one or two more fields. Heavily normalized tables save some space, but with no data repetition the other benefits of normalization are moot.
Join them all with 1 request
Only if your query stays performant. Can you use a left join?
You would do this if the above option is difficult. Use this only as a quick fix.
Other options to solve the database problem
Use server-side caching if feasible.
If feasible, use a different kind of database, such as a NoSQL one (e.g. MongoDB).
Change the way the requests are made?
Do you really need all the properties up front? The web is asynchronous, so why not keep things async. Use $q.all only if you need all the properties. For example, the user may never navigate or scroll to a certain part of the page, so why make those queries in the first place?
Along with this, you can cluster your server side and the database so that these queries are spread over multiple machines and the load gets distributed. You may get some items retrieved in parallel.
If the columns and tables you mentioned are all that you have, i.e. the supplemental tables have only a few properties, I would go with the "Put them in 1 table" option.
Why doesn't using nullable fields sit right with you? NULL exists because it's useful, and a profile table with optional values for a fixed set of fields is practically the textbook case for invoking it. If fields can be dynamically redefined (eg swapping out "favorite color" for "favorite food"), it's another story, but that's not a requirement in what you've described.
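For the "Join them all with 1 request" option, full joins are not needed when the list of user ids is already known, since the ids can drive a chain of left joins. The sketch below shows the idea in Python with an in-memory SQLite database standing in for the real one; the table names without the `user.` prefix and the helper `fetch_user_data` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE signup (uid INTEGER PRIMARY KEY, date TEXT);
    CREATE TABLE favourite_color (uid INTEGER PRIMARY KEY, color TEXT);
    CREATE TABLE location (uid INTEGER PRIMARY KEY, location TEXT);
    INSERT INTO signup VALUES (1, '2016-12-13'), (2, '2016-12-01');
    INSERT INTO favourite_color VALUES (1, 'red'), (5, 'blue'), (7, 'green');
    INSERT INTO location VALUES (2, 'uk'), (3, 'france'), (9, 'greenland');
""")


def fetch_user_data(conn, uids):
    """Return one (uid, date, location, color) row per requested uid."""
    if not uids:
        return []
    # One "(?)" placeholder per uid; the ids themselves are bound as parameters.
    values = ", ".join("(?)" for _ in uids)
    sql = f"""
        WITH ids(uid) AS (VALUES {values})
        SELECT ids.uid, s.date, l.location, c.color
        FROM ids
        LEFT JOIN signup AS s ON s.uid = ids.uid
        LEFT JOIN location AS l ON l.uid = ids.uid
        LEFT JOIN favourite_color AS c ON c.uid = ids.uid
        ORDER BY ids.uid
    """
    return conn.execute(sql, list(uids)).fetchall()


rows = fetch_user_data(conn, [1, 2, 3, 5])
for row in rows:
    print(row)
```

A single endpoint backed by a query like this would let the client replace the `$q.all` fan-out with one `requestUserData()` call.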

Can I get text to display as the values in a TABLIX

I would like to use a tablix to display text values but I'm at a loss for how to do this
I have a query that produces data like this
PersonGroup | Person | Question | Answer
----------------------------------------
Manager | Bob | lunch | yes
Manager | Bob | break | yes
Supervisor | Tim | lunch | No
Supervisor | Tim | break | No
I would like to use a tablix to break the data out like this
Question | Managers | Supervisors
| Bob | Phil | Tim | Susan
Lunch | yes | yes | No | yes
Break | yes | no | No | no
So person group is a parent grouping of person. I've set up my tablix like this, and when there is only one person per person group the text values (the yeses and nos) are displayed. If there is more than one person per person group, however, the data is blank.
In order to answer your question I've recreated your scenario using this dataset:
You can achieve the desired tablix using a matrix report item with the following data arrangement.
It should preview the following tablix:
Let me know if this can help you.
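Outside the report designer, the grouping a matrix performs can be sketched in a few lines of Python: Question becomes the row group, PersonGroup and Person become nested column groups, and each cell holds the single Answer for that combination. The function `pivot_answers` is hypothetical, and the rows for Phil and Susan are inferred from the desired output in the question.

```python
from collections import defaultdict

rows = [
    ("Manager", "Bob", "lunch", "yes"),
    ("Manager", "Bob", "break", "yes"),
    ("Manager", "Phil", "lunch", "yes"),
    ("Manager", "Phil", "break", "no"),
    ("Supervisor", "Tim", "lunch", "No"),
    ("Supervisor", "Tim", "break", "No"),
    ("Supervisor", "Susan", "lunch", "yes"),
    ("Supervisor", "Susan", "break", "no"),
]


def pivot_answers(rows):
    """Return (columns, cells): columns are (group, person) pairs in first-seen
    order, cells maps question -> {(group, person): answer}."""
    columns = []
    cells = defaultdict(dict)
    for group, person, question, answer in rows:
        if (group, person) not in columns:
            columns.append((group, person))
        cells[question][(group, person)] = answer
    return columns, dict(cells)


columns, cells = pivot_answers(rows)
for question, answers in cells.items():
    print(question, [answers.get(col, "") for col in columns])
```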
