Traversing and Getting Nodes in Graph without Loop

I have a person table that stores some personal information, like the table below.
+----+------+----------+----------+--------+
| ID | name | motherID | fatherID | sex |
+----+------+----------+----------+--------+
| 1 | A | NULL | NULL | male |
| 2 | B | NULL | NULL | female |
| 3 | C | 1 | 2 | male |
| 4 | X | NULL | NULL | male |
| 5 | Y | NULL | NULL | female |
| 6 | Z | 5 | 4 | female |
| 7 | T | NULL | NULL | female |
+----+------+----------+----------+--------+
I also keep the marriage relationships between people, like this:
+-----------+--------+
| HusbandID | WifeID |
+-----------+--------+
| 1 | 2 |
| 4 | 5 |
| 1 | 5 |
| 3 | 6 |
+-----------+--------+
With this information we can picture the relationship graph.
The question is: how can I get all connected people, given the ID of any one of them?
For example:
When I give ID=1, it should return 1,2,3,4,5,6 (order is not important).
Likewise, when I give ID=6, it should return 1,2,3,4,5,6 (order is not important).
Likewise, when I give ID=7, it should return 7.
Please note: the relationships (edges) between person nodes may form loops anywhere in the graph. The example above shows only a small part of my data; the person and marriage tables may contain thousands of rows, and we do not know where loops may occur.
Similar questions were asked in:
PostgreSQL SQL query for traversing an entire undirected graph and returning all edges found
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=118319
But I can't write working SQL for this. Thanks in advance. I am using SQL Server.

From SQL Server 2017 and Azure SQL DB you can use the new graph database capabilities and the new MATCH clause to answer queries like this, e.g.:
SELECT FORMATMESSAGE ( 'Person %s (%i) has mother %s (%i) and father %s (%i).', person.userName, person.personId, mother.userName, mother.personId, father.userName, father.personId ) msg
FROM dbo.persons person, dbo.relationship hasMother, dbo.persons mother, dbo.relationship hasFather, dbo.persons father
WHERE hasMother.relationshipType = 'mother'
AND hasFather.relationshipType = 'father'
AND MATCH ( father-(hasFather)->person<-(hasMother)-mother );
Full script available here.
For your specific questions, the SQL Server 2017 release does not include transitive closure (the ability to traverse the graph an arbitrary number of hops) or polymorphism (find any node in the graph), so answering these queries may involve loops, recursive CTEs or temp tables. I have attempted this in my sample script and it works for your sample data, but it's just an example; I'm not 100% sure it will work with other data.
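As a rough illustration of the loop-based approach mentioned above, here is a minimal Python sketch (it is not the sample script referred to in this answer). It treats parent links and marriages as undirected edges and walks them breadth-first. The function names and the in-memory lists are assumptions for illustration; in practice the rows would be read from the person and marriage tables. The seen set is what stops the walk from revisiting people, so loops in the graph cannot cause an endless traversal.

```python
from collections import defaultdict, deque

# Sample rows copied from the question: (ID, motherID, fatherID).
PERSONS = [
    (1, None, None), (2, None, None), (3, 1, 2), (4, None, None),
    (5, None, None), (6, 5, 4), (7, None, None),
]
# Sample rows copied from the question: (HusbandID, WifeID).
MARRIAGES = [(1, 2), (4, 5), (1, 5), (3, 6)]


def build_adjacency(persons, marriages):
    """Turn parent links and marriages into an undirected adjacency map."""
    adjacency = defaultdict(set)
    for person_id, mother_id, father_id in persons:
        adjacency[person_id]  # make sure isolated people get an entry too
        for parent_id in (mother_id, father_id):
            if parent_id is not None:
                adjacency[person_id].add(parent_id)
                adjacency[parent_id].add(person_id)
    for husband_id, wife_id in marriages:
        adjacency[husband_id].add(wife_id)
        adjacency[wife_id].add(husband_id)
    return adjacency


def connected_people(start_id, adjacency):
    """Breadth-first walk; each person is visited at most once."""
    seen = {start_id}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(seen)


adjacency = build_adjacency(PERSONS, MARRIAGES)
print(connected_people(1, adjacency))
print(connected_people(7, adjacency))
```

With the sample data, starting from ID 1 or ID 6 gives people 1 through 6, and starting from ID 7 gives only 7.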

Related

Postgres query chain structure data

Assume there is a gigantic organization with a crazy management structure. Each employee has one or more managers, and managers are themselves employees who have one or more managers above them.
employee table
| id | name |managers_id|
| -------- | -------------- |-----------|
| 1 | Smith | 5,6 |
| 2 | Matt | 1 |
| 3 | Bob | 1,2 |
| 4 | Adam | 1,3 |
| 5 | Suzi | 6 |
| 6 | Emily | 23,25 |
| ... | ... | ... |
It is a one-way management chain, no loops, meaning it goes A-B-C-D, A-a-b-C-D etc, no such case as A-B-C-D-A
The query is to get the management chains, say C has two management chains on top:
A-B-C
A-a-b-C
C also has one chain below:
C-D
The level of C along the chains does not matter.
In theory, there is no limit on the number of levels; a chain can keep going indefinitely.
I was thinking about 'inheritance', but it is probably not the solution.
Any tips on how to design this Postgres database, please? Thank you.
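To make the chain requirement concrete, here is a small Python sketch of one possible way to model it. It assumes the comma-separated managers_id column is replaced by one (employee, manager) pair per row, represented here as a dictionary; the names A, B, a, b, C, D mirror the chains in the question, and the function names are hypothetical. In Postgres itself, the same walk could probably be expressed with a recursive query (WITH RECURSIVE) over such a pair table.

```python
# Hypothetical data: employee -> list of direct managers,
# arranged to reproduce the chains A-B-C, A-a-b-C and C-D.
MANAGERS = {
    "A": [],
    "B": ["A"],
    "a": ["A"],
    "b": ["a"],
    "C": ["B", "b"],
    "D": ["C"],
}


def chains_above(employee, managers):
    """Every chain from a top-level manager down to `employee`.

    Relies on the question's guarantee that there are no loops.
    """
    direct = managers.get(employee, [])
    if not direct:
        return [[employee]]
    chains = []
    for manager in direct:
        for chain in chains_above(manager, managers):
            chains.append(chain + [employee])
    return chains


def chains_below(employee, managers):
    """Every chain from `employee` down to someone with no reports."""
    reports = [e for e, ms in managers.items() if employee in ms]
    if not reports:
        return [[employee]]
    chains = []
    for report in reports:
        for chain in chains_below(report, managers):
            chains.append([employee] + chain)
    return chains


for chain in chains_above("C", MANAGERS):
    print("-".join(chain))
for chain in chains_below("C", MANAGERS):
    print("-".join(chain))
```

For C this yields the two upward chains A-B-C and A-a-b-C and the one downward chain C-D.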

What is this data referencing anti-pattern called?

I have a question related to a kind of duplication I see in databases from time to time. To ask this question, I need to set the stage a bit:
Let's say I have a database of TV shows. Its primary table Content stores information at various levels of granularity (Show -> Season -> Episode), using a parent column to denote hierarchy:
+----+---------------------------+-------------+----------+
| ID | ContentName | ContentType | ParentId |
+----+---------------------------+-------------+----------+
| 1 | Friends | Show | [null] |
| 2 | Season 1 | Season | 1 |
| 3 | The Pilot | Episode | 2 |
| 4 | The One with the Sonogram | Episode | 2 |
+----+---------------------------+-------------+----------+
Maybe this isn't ideal, but let's say it's good enough to work with and we're not looking to change it.
Now let's say we need to build a table that defines air dates. We can set these at any level, and they must apply down the hierarchy (e.g., if set at the Season level, it applies to all episodes within that season; if set at the Show level, it applies to all seasons and episodes).
So the original air dates might look like this:
+-------+-----------+------------+
| airId | ContentId | AirDate |
+-------+-----------+------------+
| 71 | 3 | 1994-09-22 |
| 72 | 4 | 1994-09-29 |
+-------+-----------+------------+
Whereas the air date for a streaming service might look like:
+-------+-----------+------------+
| airId | ContentId | AirDate |
+-------+-----------+------------+
| 91 | 1 | 2015-01-01 |
+-------+-----------+------------+
Cool. Everything's fine so far; we're adhering to 4NF (I think!) and we can proceed to our business logic.
Now we get to my question. If we implement our business logic in such a way that disregards the referential hierarchy, and instead duplicates the air dates down the hierarchy, what is this anti-pattern called? e.g., Let's say I set an air date at the Show level like above, but the business logic finds all child elements and creates an entry for each one, resulting in:
+-------+-----------+------------+
| airId | ContentId | AirDate |
+-------+-----------+------------+
| 91 | 1 | 2015-01-01 |
| 92 | 2 | 2015-01-01 |
| 93 | 3 | 2015-01-01 |
| 94 | 4 | 2015-01-01 |
+-------+-----------+------------+
There are some pretty clear problems with this, but please note that my question is not how to fix this. Just, is there a specific term for it? I want to call it something like, "disregarding data relationship" or, "ignoring referential context". Maybe it's not strictly a database anti-pattern, since in my example there's an external actor inserting the excess rows.

Create/Update table in MS Access dynamically

EDIT:
Here's what I have: An Access database made up of 3 tables linked from SQL server. I need to create a new table in this database by querying the 3 source tables. Here are examples of the 3 tables I'm using:
PlanTable1
+------+------+------+------+---------+---------+
| Key1 | Key2 | Key3 | Key4 | PName | MainKey |
+------+------+------+------+---------+---------+
| 53 | 1 | 5 | -1 | Bikes | 536681 |
| 53 | 99 | -1 | -1 | Drinks | 536682 |
| 53 | 66 | 68 | -1 | Balls | 536683 |
+------+------+------+------+---------+---------+
SpTable
+----+---------+---------+
| ID | MainKey | SpName |
+----+---------+---------+
| 10 | 536681 | Wing1 |
| 11 | 536682 | Wing2 |
| 12 | 536683 | Wing3 |
+----+---------+---------+
LocTable
+-------+-------------+--------------+
| LocID | CenterState | CenterCity |
+-------+-------------+--------------+
| 10 | IN | Indianapolis |
| 11 | OH | Columbus |
| 12 | IL | Chicago |
+-------+-------------+--------------+
You can see the relationships between the tables. The NewMasterTable I need to create based off of these will look something like this:
NewMasterTable
+-------+--------+-------------+------+--------------+-------+-------+-------+
| LocID | PName | CenterState | Key4 | CenterCity | Wing1 | Wing2 | Wing3 |
+-------+--------+-------------+------+--------------+-------+-------+-------+
| 10 | Bikes | IN | -1 | Indianapolis | 1 | 0 | 0 |
| 11 | Drinks | OH | -1 | Columbus | 0 | 1 | 0 |
| 12 | Balls | IL | -1 | Chicago | 0 | 0 | 1 |
+-------+--------+-------------+------+--------------+-------+-------+-------+
The hard part for me is making this new table dynamic. In the future, rows may be added to the source tables. I need my NewMasterTable to reflect any changes/additions to the source. How do I go about building the NewMasterTable as described? Does this make any sort of sense?

Since an Access table is a necessary requirement, then probably the only way to go about it is to create a set of Update and Insert queries that are executed periodically. There is no built-in "dynamic" feature of Access that will monitor and update the table.
First, create the table. You could either 1) do this manually from scratch by defining the columns and constraints yourself, or 2) create a make-table query (i.e. SELECT... INTO) that generates most of the schema, then add any additional columns, edit necessary details and add appropriate indexes.
Define and save Update and Insert (and optional Delete) queries to keep the table synced. I'm not sharing actual code here, because that goes beyond your primary issue I think and requires specifics that you need to define. Due to some ambiguity with your key values (the field names and sample data still are not sufficient to reveal precise relationships and constraints), it is likely that you'll need multiple Update statements.
In particular, the "Wing" columns will likely require a transform statement.
You may not be able to update all columns appropriately using a single query. I recommend not trying to force such an "artificial" requirement. Multiple queries can actually be easier to understand and maintain.
In the event that you experience "query is not updateable" errors, you may need to define other "temporary" tables with appropriate indexes, into which you do initial inserts from the linked tables, then subsequent queries to update your master table from those.
Finally, and I think this is the key to solving your problem, you need to define some Access form (or other code) that periodically runs your set of "sync" queries. Access forms have a [Timer Interval] property and corresponding Timer event that fires periodically. Add VBA code in the Form_Timer sub that runs all your queries. I would suggest "wrapping" such VBA in a transaction and adding appropriate error handling and error logging, etc.
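The periodic sync described above can be sketched outside Access to show its shape. The following Python example uses an in-memory SQLite database as a stand-in and makes several assumptions that the question does not confirm: that SpTable.ID matches LocTable.LocID, that MainKey links PlanTable1 to SpTable, and that the wing columns are limited to Wing1 through Wing3. For brevity it also rebuilds the whole table inside one transaction rather than running separate Update and Insert queries; sync_master is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PlanTable1 (Key1 INTEGER, Key2 INTEGER, Key3 INTEGER, Key4 INTEGER,
                         PName TEXT, MainKey INTEGER);
CREATE TABLE SpTable (ID INTEGER, MainKey INTEGER, SpName TEXT);
CREATE TABLE LocTable (LocID INTEGER, CenterState TEXT, CenterCity TEXT);
CREATE TABLE NewMasterTable (LocID INTEGER, PName TEXT, CenterState TEXT,
                             Key4 INTEGER, CenterCity TEXT,
                             Wing1 INTEGER, Wing2 INTEGER, Wing3 INTEGER);
INSERT INTO PlanTable1 VALUES (53, 1, 5, -1, 'Bikes', 536681),
                              (53, 99, -1, -1, 'Drinks', 536682),
                              (53, 66, 68, -1, 'Balls', 536683);
INSERT INTO SpTable VALUES (10, 536681, 'Wing1'), (11, 536682, 'Wing2'),
                           (12, 536683, 'Wing3');
INSERT INTO LocTable VALUES (10, 'IN', 'Indianapolis'), (11, 'OH', 'Columbus'),
                            (12, 'IL', 'Chicago');
""")


def sync_master(conn):
    """Rebuild NewMasterTable from the source tables in one transaction."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute("DELETE FROM NewMasterTable")
        conn.execute("""
            INSERT INTO NewMasterTable
            SELECT l.LocID, p.PName, l.CenterState, p.Key4, l.CenterCity,
                   CASE WHEN s.SpName = 'Wing1' THEN 1 ELSE 0 END,
                   CASE WHEN s.SpName = 'Wing2' THEN 1 ELSE 0 END,
                   CASE WHEN s.SpName = 'Wing3' THEN 1 ELSE 0 END
            FROM LocTable l
            JOIN SpTable s ON s.ID = l.LocID          -- assumed relationship
            JOIN PlanTable1 p ON p.MainKey = s.MainKey
        """)


sync_master(conn)
for row in conn.execute("SELECT * FROM NewMasterTable ORDER BY LocID"):
    print(row)
```

In Access, the equivalent of calling sync_master on a schedule would be the Form_Timer code running the saved queries; new wing names would still need new columns, which this sketch does not handle.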

How do you select from a reference table with exclusivity?

I've got two tables (threads and user_threads). Essentially, a thread is an object with a name, and then a user_thread links a user to a thread. This was to illustrate a many-to-many relationship.
Given this setup, I'm trying to figure out how to get the threads shared exclusively between two users.
Threads looks like this
|------------------------|
| id | name |
| 1 | group1 |
| 2 | test group |
|------------------------|
user_threads looks like this
|---------------------------------|
| id | user | thread |
|---------------------------------|
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 1 | 2 |
| 4 | 2 | 2 |
| 5 | 3 | 2 |
|---------------------------------|
So the issue that I'm running into is this - Given user 1 and user 2, I would like to return the mutual thread that is exclusive to them.
Querying with 1 and 2 should return thread 1. I've tried using a self join and mixing exclude, but SQL is not in my primary skill set. Is there any way to do this or do I need to restructure my tables?

One way is to select the threads that contain both users with a JOIN and then exclude all those that also contain other users:
SELECT ut1.thread FROM user_threads ut1
JOIN user_threads ut2 ON ut1.thread=ut2.thread
WHERE ut1."user" = 1 AND ut2."user" = 2
AND NOT EXISTS
(SELECT 1 FROM user_threads WHERE thread=ut1.thread AND "user" NOT IN (ut1."user", ut2."user"))
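To check the behaviour on the sample rows, here is a small Python sketch that runs essentially the same query against an in-memory SQLite database. SQLite is only a stand-in for whichever database is actually in use, and the exclusive_threads helper and the alias other are names introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE user_threads (id INTEGER PRIMARY KEY, "user" INTEGER, thread INTEGER)'
)
# Sample rows from the question: (id, user, thread).
conn.executemany(
    "INSERT INTO user_threads VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 2, 1), (3, 1, 2), (4, 2, 2), (5, 3, 2)],
)

EXCLUSIVE_THREADS_SQL = """
SELECT ut1.thread
FROM user_threads ut1
JOIN user_threads ut2 ON ut1.thread = ut2.thread
WHERE ut1."user" = ? AND ut2."user" = ?
  AND NOT EXISTS (
      SELECT 1 FROM user_threads other
      WHERE other.thread = ut1.thread
        AND other."user" NOT IN (ut1."user", ut2."user")
  )
"""


def exclusive_threads(conn, user_a, user_b):
    """Threads that contain both users and nobody else."""
    rows = conn.execute(EXCLUSIVE_THREADS_SQL, (user_a, user_b)).fetchall()
    return [row[0] for row in rows]


print(exclusive_threads(conn, 1, 2))
print(exclusive_threads(conn, 1, 3))
```

For users 1 and 2 only thread 1 comes back, because thread 2 also contains user 3; users 1 and 3 share no exclusive thread.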

Using Data from One Table in Another Table in Access

Hello StackOverflow users,
I am struggling with transferring values between Access database tables which I will use in a Delphi program to tally election votes and determine the winning candidates. I have a total of six tables. One is my overall table, tblCandidates which identifies each candidate and contains the amount of votes they received from each party, namely the Grade Heads, the Teachers and the Learners. When it comes to the Learners we have four participating grades, namely the grade 8’s, 9’s, 10’s and 11’s, and each grade again has multiple participating classes, namely class A, B, C, etc.
Now, I have set up tables for each grade that contains all the classes in that grade. I named these tables tblGrX with X being the grade represented by 8 through 11. Each one of these tables has two extra fields, namely a field to identify a candidate and a field that will add up all the votes that candidate received from each of the classes in that grade. Lastly I have another table, tblGrTotals with fields Total_GrX (once again with X being the grade), that will contain all the total votes a candidate received from each grade, adding them up in another field for my tblCandidates table to use in its Total_Learners field.
So in short, I want, for example, tblGrTotals to use the value in the field Total of tblGr8 in its Total_Gr8 field, and then tblCandidates to use the value in field Total of tblGrTotals in its Total_Learners field. Is there any way to keep these values updated between tables like cells are updated in Excel the moment a change is made?
Thank you in advance!

You need to rethink your table design. I guess your background is Excel, and your tables are laid out like you would do in Excel sheets, but a relational database works differently.
Think about the objects you are modelling.
Candidates - that's easy. ID, Name, perhaps additional info that belongs to each candidate. But nothing about votes here.
"Groups that are voting" or Parties. Not so trivial, due to the different types of parties. Still I would put them in one table, with Grade and Class only set for Learners, NULL for Heads and Teachers.
e.g.
+----------+------------+-------+-------+
| Party_ID | Party_Type | Grade | Class |
+----------+------------+-------+-------+
| 1 | Head | | |
| 2 | Teacher | | |
| 3 | Learner | 8 | A |
| 4 | Learner | 8 | B |
| 5 | Learner | 8 | C |
| 6 | Learner | 9 | A |
| 7 | Learner | 9 | B |
| 8 | Learner | 10 | A |
+----------+------------+-------+-------+
Votes: they are a Junction Table between Candidates and Parties.
e.g.
+----------+--------------+-----------+
| Party_ID | Candidate_ID | Num_Votes |
+----------+--------------+-----------+
| 1 | 1 | 5 |
| 1 | 2 | 17 |
| 3 | 1 | 2 |
| 3 | 2 | 6 |
| 3 | 3 | 10 |
+----------+--------------+-----------+
Now: if you want to know the votes of Class 8A:
SELECT Candidate_ID, SUM(Num_Votes)
FROM Parties p INNER JOIN Votes v
ON p.Party_ID = v.Party_ID
WHERE p.Party_Type = 'Learner'
AND p.Grade = 8
AND p.Class = 'A'
GROUP BY Candidate_ID
Or of all Grade 8? Simply omit the p.Class criterion.
For the votes per candidate you join Candidates with Votes.
Edit:
Since votes count differently depending on who casts them, the weight is an attribute of Party_Type.
We don't have a table for them yet, so create one:
+------------+---------------+
| Party_Type | Multiplicator |
+------------+---------------+
| Head | 4 |
| Teacher | 3 |
| Learner | 1 |
+------------+---------------+
and to count all votes:
SELECT c.Candidate_ID, c.Candidate_Name, SUM(v.Num_Votes * t.Multiplicator) AS SumVotes
FROM Parties p
INNER JOIN Votes v ON p.Party_ID = v.Party_ID
INNER JOIN Party_Types t ON p.Party_Type = t.Party_Type
INNER JOIN Candidates c ON v.Candidate_ID = c.Candidate_ID
GROUP BY c.Candidate_ID, c.Candidate_Name
With a design like this, you don't need to keep updating data from one table into another - you calculate it when and how you need it, and it's always current.
The magic of databases. :)
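To see the design above working end to end, here is a short Python sketch that loads the example rows into an in-memory SQLite database and runs the weighted query. SQLite stands in for Access here, the candidate names are placeholders (the tables above only show IDs), and weighted_totals is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Candidates (Candidate_ID INTEGER PRIMARY KEY, Candidate_Name TEXT);
CREATE TABLE Party_Types (Party_Type TEXT PRIMARY KEY, Multiplicator INTEGER);
CREATE TABLE Parties (Party_ID INTEGER PRIMARY KEY, Party_Type TEXT,
                      Grade INTEGER, Class TEXT);
CREATE TABLE Votes (Party_ID INTEGER, Candidate_ID INTEGER, Num_Votes INTEGER);

-- Placeholder names; the answer's tables only list candidate IDs.
INSERT INTO Candidates VALUES (1, 'Candidate 1'), (2, 'Candidate 2'),
                              (3, 'Candidate 3');
INSERT INTO Party_Types VALUES ('Head', 4), ('Teacher', 3), ('Learner', 1);
INSERT INTO Parties VALUES (1, 'Head', NULL, NULL), (2, 'Teacher', NULL, NULL),
                           (3, 'Learner', 8, 'A'), (4, 'Learner', 8, 'B'),
                           (5, 'Learner', 8, 'C'), (6, 'Learner', 9, 'A'),
                           (7, 'Learner', 9, 'B'), (8, 'Learner', 10, 'A');
INSERT INTO Votes VALUES (1, 1, 5), (1, 2, 17), (3, 1, 2), (3, 2, 6), (3, 3, 10);
""")


def weighted_totals(conn):
    """Weighted vote total per candidate, computed on demand."""
    return conn.execute("""
        SELECT c.Candidate_ID, c.Candidate_Name,
               SUM(v.Num_Votes * t.Multiplicator) AS SumVotes
        FROM Parties p
        INNER JOIN Votes v ON p.Party_ID = v.Party_ID
        INNER JOIN Party_Types t ON p.Party_Type = t.Party_Type
        INNER JOIN Candidates c ON v.Candidate_ID = c.Candidate_ID
        GROUP BY c.Candidate_ID, c.Candidate_Name
        ORDER BY c.Candidate_ID
    """).fetchall()


for row in weighted_totals(conn):
    print(row)
```

With the example votes, candidate 1 gets 5 x 4 + 2 = 22, candidate 2 gets 17 x 4 + 6 = 74, and candidate 3 gets 10. Adding a row to Votes and calling weighted_totals again reflects the change immediately, with no stored totals to keep in sync.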
