Convert Data in Relational Database to Apache AGE

I have data in structured form: tables (rows and columns), and the tables have relations as well. I want to take that structured data stored in PostgreSQL and convert it to graph data (edges and vertices). Is there any way to convert the data directly to vertices and edges? I was searching for a way to do that conversion directly.

To do this, you'll first have to create a graph model using the apache-age extension in PostgreSQL.
Detailed conversion steps are well documented in the documentation provided by Neo4j.

Consider the following example.
SQL commands for creating the table and inserting rows:
create table person (
    person_uid SERIAL NOT NULL PRIMARY KEY,
    person_name VARCHAR(50) NOT NULL
);
-- INSERT INTO person
insert into person (person_name) values ('Fernanda');
insert into person (person_name) values ('Moontasir');
insert into person (person_name) values ('Sanjida');
insert into person (person_name) values ('Fahmida');
Load AGE
load 'age';
SET search_path TO ag_catalog;
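If the graph itself doesn't exist yet, create it first with AGE's create_graph function (with search_path set as above, no schema prefix is needed):
SELECT create_graph('graph_name');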
Prepared statement
See the docs for more info about prepared statements.
PREPARE create_person AS
SELECT *
FROM cypher('graph_name', $$
    CREATE (:Person {name: $name})
$$, $1) as (n agtype);
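You can then execute the prepared statement, passing the parameters as an agtype map; this is the calling convention shown in the AGE docs:
EXECUTE create_person('{"name": "Fernanda"}');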
Dynamic SQL in a PL/pgSQL function
You could write a PL/pgSQL function that takes the name (and any other properties) as arguments and creates a node, such as:
CREATE OR REPLACE FUNCTION public.create_person(name text)
RETURNS void
LANGUAGE plpgsql
VOLATILE
AS $BODY$
BEGIN
    -- quote_literal() yields a single-quoted literal, which is also valid Cypher
    -- string syntax; quote_ident() would leave all-lowercase names unquoted and
    -- break the query. For names containing quotes, prefer the parameterized
    -- prepared statement above.
    EXECUTE format('SELECT * FROM cypher(''graph_name'', $$CREATE (:Person {name: %s})$$) AS (a agtype);', quote_literal(name));
END
$BODY$;
Now you can call the function
SELECT public.create_person(sql_person.person_name)
FROM person AS sql_person;
This would create a vertex for every row in person.
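The relations can be converted the same way: once the vertices exist, each foreign-key row can become an edge via MATCH ... CREATE. A hypothetical sketch (the City label and LIVES_IN relationship are not part of the example above, just an illustration of the pattern):
SELECT * FROM cypher('graph_name', $$
    MATCH (p:Person {name: 'Fernanda'}), (c:City {name: 'Dhaka'})
    CREATE (p)-[:LIVES_IN]->(c)
$$) AS (e agtype);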
Alternatively
Since the graph relations you need can vary from situation to situation, you can also use the Python driver (or any other AGE driver) to implement your own conversion logic over the relational data.

Related

Creating PL/SQL procedure to fill intermediary table with random data

As part of my classes on relational databases, I have to create procedures (as part of a package) to fill some of the tables of an Oracle database I created with random data, more specifically the tables community, community_account and community_login_info (see the ERD linked below). I succeeded in doing this for the tables community and community_account; however, I'm having problems generating data for the table community_login_info. It serves as an intermediary table for the many-to-many relationship between community and community_account, linking the IDs of both tables.
My latest approach was to create an associative array with the structure of the target table community_login_info. I then cross join community and community_account (there's already random data in there) along with random timestamps, bulk collect that result into the associative array variable, and insert its contents into the target table community_login_info. But it seems I'm doing something wrong, since Oracle returns error ORA-00947 'not enough values'. To me it seems all columns of the target table get a value in the insert; what am I missing here? I added the code from my package body below.
ERD snapshot
PROCEDURE mass_add_rij_koppeling_community_login_info
IS
    TYPE type_rec_communties_accounts IS RECORD
        (type_community_id          community.community_id%type,
         type_account_id            community_account.account_id%type,
         type_start_timestamp_login community_account.start_timestamp_login%type,
         type_eind_timestamp_login  community_account.eind_timestamp_login%type);
    TYPE type_tab_communities_accounts
        IS TABLE OF type_rec_communties_accounts
        INDEX BY pls_integer;
    t_communities_accounts type_tab_communities_accounts;
BEGIN
    SELECT community_id,
           account_id,
           to_timestamp(start_datum_account) AS start_timestamp_login,
           to_timestamp(eind_datum_account)  AS eind_timestamp_login
    BULK COLLECT INTO t_communities_accounts
    FROM community
    CROSS JOIN community_account
    FETCH FIRST 50 ROWS ONLY;
    FORALL i_index IN t_communities_accounts.first .. t_communities_accounts.last
        SAVE EXCEPTIONS
        INSERT INTO community_login_info (community_id, account_id, start_timestamp_login, eind_timestamp_login)
        VALUES (t_communities_accounts(i_index));
END mass_add_rij_koppeling_community_login_info;
Your error refers to the part:
INSERT INTO community_login_info (community_id,account_id,start_timestamp_login,eind_timestamp_login)
values (t_communities_accounts(i_index));
(By the way, the complete error message gives you the line number where the error is located; that can help you pinpoint the problem.)
When you specify the columns to insert, you need to list the record's fields individually in the VALUES part too (note that the fields carry the names declared in the record type, not the column aliases from the SELECT):
INSERT INTO community_login_info (community_id, account_id, start_timestamp_login, eind_timestamp_login)
VALUES (t_communities_accounts(i_index).type_community_id,
        t_communities_accounts(i_index).type_account_id,
        t_communities_accounts(i_index).type_start_timestamp_login,
        t_communities_accounts(i_index).type_eind_timestamp_login);
If the table COMMUNITY_LOGIN_INFO doesn't have any more columns, you could insert the whole record at once (no column list, and no parentheses around the record):
INSERT INTO community_login_info
VALUES t_communities_accounts(i_index);
But I don't like performing inserts without specifying the columns. I could end up inserting the start time into the end time (and vice versa) if the record fields aren't declared in exactly the same order as the table definition. And if the table definition changes over time and new columns are added, you have to modify your procedure for the new column, even if it should just stay NULL because this procedure doesn't fill it.
With that fix applied, the whole procedure becomes:
PROCEDURE mass_add_rij_koppeling_community_login_info
IS
    TYPE type_rec_communties_accounts IS RECORD
        (type_community_id          community.community_id%type,
         type_account_id            community_account.account_id%type,
         type_start_timestamp_login community_account.start_timestamp_login%type,
         type_eind_timestamp_login  community_account.eind_timestamp_login%type);
    TYPE type_tab_communities_accounts
        IS TABLE OF type_rec_communties_accounts
        INDEX BY pls_integer;
    t_communities_accounts type_tab_communities_accounts;
BEGIN
    SELECT community_id,
           account_id,
           to_timestamp(start_datum_account) AS start_timestamp_login,
           to_timestamp(eind_datum_account)  AS eind_timestamp_login
    BULK COLLECT INTO t_communities_accounts
    FROM community
    CROSS JOIN community_account
    FETCH FIRST 50 ROWS ONLY;
    FORALL i_index IN t_communities_accounts.first .. t_communities_accounts.last
        SAVE EXCEPTIONS
        INSERT INTO community_login_info (community_id, account_id, start_timestamp_login, eind_timestamp_login)
        VALUES (t_communities_accounts(i_index).type_community_id,
                t_communities_accounts(i_index).type_account_id,
                t_communities_accounts(i_index).type_start_timestamp_login,
                t_communities_accounts(i_index).type_eind_timestamp_login);
END mass_add_rij_koppeling_community_login_info;
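To run it, you can call the procedure from an anonymous block (the package name pkg_community here is illustrative, not from the original post):
BEGIN
    pkg_community.mass_add_rij_koppeling_community_login_info;
END;
/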

Stored procedure that needs to handle multiple scenarios

I have a rather complicated (at least for me) stored procedure to write that needs to handle multiple scenarios coming from the front end.
The front end passes two parameters with values like this:
@LevelMarker = (1234515-564546-65454,4654342-154658-56767,5465489-546549-65456)
These are comma-separated GUIDs.
@UserNameId = (5797823-65432143-65451213)
This is the GUID of the user who entered the data on the front end.
The values need to go to a table that has the following structure:
CREATE TABLE LevelTable
(
    LevelId uniqueidentifier NOT NULL,
    LevelMarker uniqueidentifier NOT NULL,
    UserName uniqueidentifier NOT NULL
);
I want the value to go into the table like this:
LevelId Levelmarker UserName
--------------------------------------------------------
NEWID() 1234515-564546-65454 5797823-654321-65451
NEWID() 4654342-154658-56767 5797823-654321-65451
NEWID() 5465489-546549-65456 5797823-654321-65451
Here are the scenarios the stored procedure should handle.
Once the levelmarkers are inserted into the table, if the same user comes back and wants to add additional levelmarkers, the front end will pass the old values and the new ones together, like so: (1234515-564546-65454,4654342-154658-56767,5465489-546549-65456,1332245-9852135-7841265).
My stored procedure should recognize that the first three levelmarkers are already in the table and insert only the new one.
If the same user decides to delete values, say two of them, the front end will pass me the remaining values (1234515-564546-65454,4654342-154658-56767). The stored procedure should recognize that the user has deleted two values, delete those same values from the table, and keep the rest.
If the user deletes some values and inserts new ones, the stored procedure should recognize both: delete the removed ones and insert the new ones.
What is the best approach to this problem?
I think you can do this in a single query, using string_split() and a merge statement:
merge leveltable t
using (
    select value as levelmarker, @UserNameId as username
    from string_split(@LevelMarker, ',')
) s
on (s.levelmarker = t.levelmarker and s.username = t.username)
when not matched by target
    then insert (levelid, levelmarker, username)
    values (newid(), s.levelmarker, s.username)
when not matched by source and t.username = @UserNameId
    then delete;
In the using clause, we split the @LevelMarker parameter into rows and pair each with the given @UserNameId. The merge statement then checks whether each combination already exists in the target table and inserts or deletes rows accordingly; the extra condition on the delete branch keeps other users' rows untouched.
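Since the question asks for a stored procedure, here is a minimal sketch of how the merge could be wrapped up; the procedure name and the explicit cast to uniqueidentifier are illustrative additions, not from the original answer:
CREATE PROCEDURE dbo.SyncLevelMarkers
    @LevelMarker nvarchar(max),   -- comma-separated GUIDs from the front end
    @UserNameId uniqueidentifier
AS
BEGIN
    SET NOCOUNT ON;

    MERGE LevelTable AS t
    USING (
        SELECT CAST(value AS uniqueidentifier) AS levelmarker,
               @UserNameId AS username
        FROM string_split(@LevelMarker, ',')
    ) AS s
        ON s.levelmarker = t.LevelMarker AND s.username = t.UserName
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (LevelId, LevelMarker, UserName)
             VALUES (NEWID(), s.levelmarker, s.username)
    -- only this user's rows are candidates for deletion
    WHEN NOT MATCHED BY SOURCE AND t.UserName = @UserNameId
        THEN DELETE;
END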

Get recently created schema in PostgreSQL

I have more than 150 schemas in PostgreSQL. Every time I perform a certain action, a new schema is created with a random name containing numbers, and it is hard to tell which schema is the new one.
I use \dn to list schemas in PostgreSQL, but it doesn't display them in creation order. How do I list either the most recently created schemas, or all schemas sorted by creation date?
Every schema has an oid, a numeric unique identifier assigned from a counter that only increases. So you can use ORDER BY oid DESC:
SELECT * FROM pg_namespace ORDER BY oid DESC LIMIT 10;
Pavel's answer is good and may get you everything you need, but if you want more flexibility, I would recommend using event triggers:
CREATE TABLE schema_creation_history
(schema regnamespace primary key,
created timestamp with time zone not null default now()
);
CREATE FUNCTION log_schema_create() RETURNS event_trigger
LANGUAGE plpgsql AS
$$
BEGIN
INSERT INTO schema_creation_history (schema) SELECT objid::regnamespace FROM pg_event_trigger_ddl_commands();
END;
$$
;
CREATE EVENT TRIGGER schema_create_trigger
ON ddl_command_end
WHEN TAG IN ('CREATE SCHEMA')
EXECUTE FUNCTION log_schema_create();
# create schema test;
CREATE SCHEMA
# select * from schema_creation_history ;
schema | created
--------+------------------------------
test | 2019-09-11 12:32:21.16346+00
(1 row)
You could add another trigger for DROP SCHEMA to make sure the table is cleaned up.
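For instance, a possible sketch of that cleanup (the function and trigger names are illustrative): pg_event_trigger_dropped_objects() reports the dropped schema's oid, which can be matched against the stored regnamespace:
CREATE FUNCTION log_schema_drop() RETURNS event_trigger
    LANGUAGE plpgsql AS
$$
BEGIN
    DELETE FROM schema_creation_history h
    USING pg_event_trigger_dropped_objects() d
    WHERE d.object_type = 'schema'
      AND h.schema::oid = d.objid;
END;
$$;
CREATE EVENT TRIGGER schema_drop_trigger
    ON sql_drop
    WHEN TAG IN ('DROP SCHEMA')
    EXECUTE FUNCTION log_schema_drop();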

Temporary tables in HANA

Is it possible to write a script in HANA that creates a temporary table based on an existing table, with no need to hard-code the columns and column types? For example:
create local temporary table #mytemp (id integer, name varchar(20));
Can I create a temporary table with the same column definitions that contains the same data? If so, I'd be glad to get some examples.
I've been searching the internet for two days and couldn't find anything useful.
Thanks
Creating local temporary tables based on a dynamic structure definition is not supported in SQLScript.
The question would be: what do you want to use it for?
Instead of a local temporary table, you can use a table variable in most cases.
By querying the sys.table_columns view, you can get the list and properties of the source table's columns, build a dynamic CREATE script, and then execute it to create the table, as sketched below.
You can find SQL code for a sample case at Create Table Dynamically on HANA Database.
For table columns read
select * from sys.table_columns where table_name = 'TABLENAME';
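A rough SQLScript sketch of that approach as an anonymous block (untested and illustrative: it only appends a length for VARCHAR-style types, so treat it as an outline rather than finished code):
DO BEGIN
    DECLARE v_ddl NVARCHAR(5000);
    -- build the CREATE statement from the catalog view
    SELECT 'CREATE LOCAL TEMPORARY TABLE #MYTEMP ('
           || STRING_AGG(column_name || ' ' || data_type_name
                || CASE WHEN data_type_name IN ('VARCHAR', 'NVARCHAR')
                        THEN '(' || TO_VARCHAR(length) || ')'
                        ELSE '' END,
              ', ' ORDER BY position)
           || ')'
      INTO v_ddl
      FROM sys.table_columns
     WHERE table_name = 'TABLENAME';
    -- and execute it
    EXECUTE IMMEDIATE :v_ddl;
END;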
This seems to work in the HANA version I have (I'm not sure how to find out which version that is):
PROCEDURE "xxx.yyy.zzz::MY_TEST"(
OUT "OUT_COL" NVARCHAR(200)
)
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
AS
BEGIN
create LOCAL TEMPORARY TABLE #LOCALTEMPTABLE
as
(
SELECT distinct 'Cola' as out_col
FROM "SYNONYMS1"
);
select * from #LOCALTEMPTABLE ;
DROP TABLE #LOCALTEMPTABLE;
END
The newer HANA version (HANA 2 SPS 04 Patch 5, build 4.4.17) supports your request:
create local temporary table #tempTableName like "tableTypeName";
This should inherit the data types and all exact values from whatever query is in the parentheses:
CREATE LOCAL TEMPORARY COLUMN TABLE #mytemp AS (
SELECT
"COLUMN1",
"COLUMN2",
"COLUMN3"
FROM MyTable
);
-- Now you can add the rest of your query here as such:
SELECT * FROM #mytemp
I suppose you can just write:
create local temporary column table #MyTempTable as (select * from MySourceTable);

select a column named by another column in another table

I am implementing a database to back a role-playing game. There are two relevant tables: character and weapon, plus a third table representing a standard many-to-many relationship between them, which also stores the level of each specific instance of a weapon. A character has multiple attributes (strength, agility, magic, etc.), and each weapon has a base damage, a level (defined in the many-to-many association), and receives a bonus from the associated attribute of the character wielding it (strength for clubs, agility for ranged weapons, etc.). The effectiveness of a weapon must be derived from the three tables. The catch is that which column of the character table applies depends on the specific weapon being used.
My current approach is to perform two select queries: one to retrieve the name of the associated attribute (varchar) from the weapon table, and then one, with the previously returned value substituted in, for the value of that attribute on the wielding character. I would like to replace this with a pure SQL solution.
I have searched around the net and found two related questions, Pivot on Multiple Columns using Tablefunc and PostgreSQL Crosstab Query, but neither does quite what I'm looking for. I also found the Postgres internal datatype oid (https://www.postgresql.org/docs/9.1/static/datatype-oid.html) and was able to locate the oid of a specific column, but could not find the syntax for querying the value of the column with that oid.
Table schemata:
create table character (
id int primary key,
agility int,
strength int,
magic int,
...);
create table weapon (
id int primary key,
damage int,
associated_attribute varchar(32), --this can be another type if it'd help
...);
create table weapon_character_m2m (
id int primary key,
weapon int, --foreign key to weapon.id
character int, --foreign key to character.id
level int);
In my mind, this should be query-able with something like the following (ideally resulting in the effective damage of each weapon currently in the player's possession):
select m2m.level as level,
       weapon.associated_attribute as base_attr_name,
       character.??? as base_attr,
       weapon.damage as base_damage,
       base_damage * base_attr * level as effective_attr -- this is the column I care about; others are for clarity via alias
from weapon_character_m2m as m2m
join weapon on weapon.id=m2m.weapon
join character on character.id=m2m.character
where m2m.character=$d; -- stored proc parameter or the like
Most online resources I've found end up suggesting that the database be redesigned. This is an option, but I really don't want a separate table for each attribute a weapon might associate with (in practice there are nearly 20 attributes that can be associated with weapon classes).
I have heard this is possible in MS SQL by foreign-keying into an internal system table, but I have no experience with MS SQL, let alone enough to attempt something like that (and I couldn't find a working sample online). I would consider migrating to MS SQL (or any other SQL engine) if anyone can provide a working example.
It sounds like you can just use a CASE expression. I know it's in MS SQL, and since CASE is standard SQL, PostgreSQL has it too.
Something like (on my phone, so just estimating the code 🙂):
select other_fields,
       case weapon.associated_attribute
           when 'agility' then character.agility
           when 'strength' then character.strength
           when ...
           else 0 -- unhandled associated_attribute
       end as base_attr
from ...
The caveat here is that you will want your character attributes to be the same type which it looks like you do.
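For reference, a minimal sketch of the same idea against the question's PostgreSQL schema (only the three shown attributes are handled; extending to the ~20 real attributes just means more WHEN arms):
select m2m.level,
       w.associated_attribute as base_attr_name,
       w.damage as base_damage,
       w.damage * m2m.level *
           case w.associated_attribute
               when 'agility' then c.agility
               when 'strength' then c.strength
               when 'magic' then c.magic
               else 0 -- unhandled attribute
           end as effective_attr
from weapon_character_m2m as m2m
join weapon w on w.id = m2m.weapon
join character c on c.id = m2m.character
where m2m.character = 1; -- example character id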
EDIT
I worked toward a view based on your feedback and realized that a view would use an unpivot rather than the CASE expression above, but you could still use a function built around that CASE structure. There are many flavours :-) MS SQL also has table-valued functions that you could use to return one attribute type for all characters. Here is the code I was playing with; it contains both a view and a function, so you can choose whichever seems more appropriate.
create table character (
id int primary key,
agility int,
strength int,
magic int,
--...
);
insert character values (1,10,15,20),(2,11,12,13);
create table attribute_type (
attribute_id int primary key,
attribute_name varchar(20)
);
insert attribute_type values (1,'Agility'),(2,'Strength'),(3,'Magic');
create table weapon (
id int primary key,
damage int,
--associated_attribute varchar(32), --this can be another type if it'd help
attribute_id int
--...
);
insert weapon values (1,20,1),(2,30,2);
create table weapon_character_m2m (
id int primary key,
weapon int, --foreign key to weapon.id
character int, --foreign key to character.id
level int);
insert weapon_character_m2m values (1,1,1,4),(2,2,2,5);
go
create view vw_character_attributes
as
select c.id, a.attribute_id, c.attribute_value
from (
select id, attribute_name, attribute_value
from (select id, agility, strength, magic from character) p --pivoted data
unpivot (attribute_value for attribute_name in (agility, strength, magic)) u --unpivoted data
) c
join attribute_type a on a.attribute_name = c.attribute_name
;
go
create function fn_get_character_attribute (@character_id int, @attribute_id int)
returns int
as
begin
    declare @attr int;
    select @attr =
        case @attribute_id
            when 1 then c.agility
            when 2 then c.strength
            when 3 then c.magic
            --when ...
            else 0 --unhandled associated_attribute
        end
    from character c
    where c.id = @character_id;
    return @attr;
end
go
select * from vw_character_attributes;
select m2m.level as level,
at.attribute_name as base_attr_name,
ca.attribute_value as base_attr,
dbo.fn_get_character_attribute(m2m.character, weapon.attribute_id) function_value,
weapon.damage as base_damage,
weapon.damage * ca.attribute_value * level as effective_attr -- this is the column I care about, others are for clarity via alias
from weapon_character_m2m as m2m
join weapon on weapon.id=m2m.weapon
join vw_character_attributes ca on ca.id=m2m.character and ca.attribute_id = weapon.attribute_id
join attribute_type at on at.attribute_id = weapon.attribute_id
--where m2m.character=$d; -- stored proc parameter or the like
