Creating relationship Neo4J during import if value is 1 - database

Hello all you helpful folks!
I have been tasked with converting our RDBMS into a graph database for testing purposes. I am using Neo4J and have successfully imported various tables into their appropriate nodes. However, I have run into a slight hiccup when it comes to the department node. Certain departments are partnered with a particular department. Within the RDBMS model, this is simply a column named Is_Partner, because the database was originally set up with one partner in mind (hence the whole "moving to a graph database" thing).
What I need to do is match all departments that have an Is_Partner value of 1 and create a relationship from each of them to one specific partner (Edge: ABBR, Value: HR). I have written the script; it tells me it's successful, but 0 edits are made...
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///Department.csv" AS row
MATCH (partner:Department {DepartmentID: row.DepartmentID})
WHERE row.IS_PARTNER = "1"
MERGE (partner)-[:IS_PARTNER_OF]->(Department{ABBR: 'HR'});
I'm pretty new to Graph Databases, but I know Relational Databases quite well. Any help would be appreciated.
Thank you for your time,
Jim Perry

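There are a few problems with your query. If you want to filter on a CSV value, use a WITH clause with a WHERE filter directly after LOAD CSV. Your pattern (Department{ABBR: 'HR'}) also has no colon before Department, so Cypher reads it as a variable name rather than a label. MERGE the HR department node on its own first, and then MERGE the relationship separately.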
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///Department.csv" AS row
WITH row WHERE row.IS_PARTNER = "1"
MATCH (partner:Department {DepartmentID: row.DepartmentID})
MERGE (dept:Department {ABBR: 'HR'})
MERGE (partner)-[:IS_PARTNER_OF]->(dept);
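If it still returns no results or changes, check whether your MATCH statement returns anything, as this is usually the problem. Keep in mind that LOAD CSV reads every field as a string and that header names are case-sensitive, so a DepartmentID stored as an integer, or a header spelled Is_Partner rather than IS_PARTNER, will make the query match nothing.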
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///Department.csv" AS row
WITH row WHERE row.IS_PARTNER = "1"
MATCH (partner:Department {DepartmentID: row.DepartmentID})
RETURN partner

Get audit history records of any entity record as per CRM view

I want to display all audit history data as per MS CRM format.
I have imported all records from AuditBase table from CRM to another Database server table.
I want to query this table's records with SQL and present them in the Dynamics CRM format (as per above image).
I have done so far
select
    AB.CreatedOn as [Created On],
    SUB.FullName as [Changed By],
    Value as Event,
    AB.AttributeMask as [Changed Field],
    AB.changeData as [Old Value],
    '' as [New Value]
from Auditbase AB
inner join StringMap SM on SM.AttributeValue = AB.Action and SM.AttributeName = 'action'
inner join SystemUserBase SUB on SUB.SystemUserId = AB.UserId
--inner join MetadataSchema.Attribute ar on ab.AttributeMask = ar.ColumnNumber
--INNER JOIN MetadataSchema.Entity en ON ar.EntityId = en.EntityId and en.ObjectTypeCode=AB.ObjectTypeCode
--inner join Contact C on C.ContactId=AB.ObjectId
where objectid = '00000000-0000-0000-000-000000000000'
order by AB.CreatedOn desc
My problem is that AttributeMask is a comma-separated value that I need to compare with the ColumnNumber field of the MetadataSchema.Attribute table. I also need to know how to get the New Value for that entity.
I have already checked this link: Sql query to get data from audit history for opportunity entity, but it's not giving me the [New Value].
NOTE: I cannot use "RetrieveRecordChangeHistoryResponse", because I need to show this data in an external webpage from a SQL table (not the CRM database).
Well, basically Dynamics CRM does not create this Audit View (the way you see it in CRM) using SQL query, so if you succeed in doing it, Microsoft will probably buy it from you as it would be much faster than the way it's currently handled :)
But really - the way it works currently, SQL is used only for obtaining all relevant Audit view records (without any matching with attributes metadata or whatever) and then, all the parsing and matching with metadata is done in .NET application. The logic is quite complex and there are so many different cases to handle, that I believe that recreating this in SQL would require not just some simple "select" query, but in fact some really complex procedure (and still that might be not enough, because not everything in CRM is kept in database, some things are simply compiled into the libraries of application) and weeks or maybe even months for one person to accomplish (of course that's my opinion, maybe some T-SQL guru will prove me wrong).
So, I would do it differently - use RetrieveRecordChangeHistoryRequest (which was already mentioned in some answers) to get all the Audit Details (already parsed and ready to use) using some kind of .NET application (probably running periodically, or maybe triggered by a plugin in CRM etc.) and put them in some Database in user-friendly format. You can then consume this database with whatever external application you want.
Also I don't understand your comment:
I can not use "RetrieveRecordChangeHistoryResponse", because i need to
show these data in external webpage from sql table(Not CRM database)
What kind of application cannot call external service (you can create a custom service, don't have to use CRM service) to get some data, but can access external database? You should not read from the db directly, better approach would be to prepare a web service returning the audit you want (using CRM SDK under the hood) and calling this service by external application. Unless of course your external app is only capable of reading databases, not running any custom web services...
It is not possible to reconstruct a complete audit history from the AuditBase tables alone. For the current values you still need the tables that are being audited.
The queries you would need to construct are complex, and you can avoid writing them if RetrieveRecordChangeHistoryRequest is a suitable option for you.
(See also How to get audit record details using FetchXML on SO.)
NOTE
This answer was submitted before the original question was extended stating that the RetrieveRecordChangeHistoryRequest cannot be used.
As I said in comments, Audit table will have old value & new value, but not current value. Current value will be pushed as new value when next update happens.
In your OP query, AB.AttributeMask returns comma (",") separated values and AB.changeData returns tilde ("~") separated values.
I assume you are fine with "~" separated values in the Old Value column and want to show the current field values in the New Value column. This is not going to work when multiple fields are enabled for auditing. You have to split the AttributeMask value into CRM fields from AttributeView using ColumnNumber to get the required result.
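As a rough illustration of the splitting step described above, here is a minimal Python sketch. It assumes the mask and the change data list their entries in the same order, that no old value contains a "~" itself, and that the mask may carry leading or trailing commas. The function name pair_audit_changes and the column_names dictionary are hypothetical stand-ins for a lookup against AttributeView by ColumnNumber; they are not part of CRM.

```python
def pair_audit_changes(attribute_mask, change_data, column_names):
    """Pair each audited column number with its old value.

    attribute_mask: comma-separated column numbers, e.g. ",10,25,"
    change_data:    tilde-separated old values, assumed to be in the same order
    column_names:   hypothetical lookup {ColumnNumber: attribute name}
    """
    # Drop empty tokens produced by leading or trailing commas.
    columns = [int(tok) for tok in attribute_mask.split(",") if tok.strip()]
    old_values = change_data.split("~")
    if len(columns) != len(old_values):
        raise ValueError("attribute mask and change data have different lengths")
    # Fall back to a generic label when the column number is unknown.
    return [
        (column_names.get(col, f"column {col}"), old)
        for col, old in zip(columns, old_values)
    ]


# Made-up sample values, for illustration only.
pairs = pair_audit_changes(
    ",10,25,", "Smith~555-0100", {10: "lastname", 25: "telephone1"}
)
```

Each resulting pair can then be shown as one row of the audit view, with the new value looked up separately.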
I would recommend the below reference blog to start with, once you get the expected result, you can pull the current field value using extra query either in SQL or using C# in front end. But you should concatenate again with "~" for values to maintain the format.
https://marcuscrast.wordpress.com/2012/01/14/dynamics-crm-2011-audit-report-in-ssrs/
Update:
From the above blog, you can tweak the SP query with your fields, then convert the last select statement to 'select into' to create a new table for your storage.
Modify the Stored procedure to fetch the delta based on last run. Configure the sql job & schedule to run every day or so, to populate the table.
Then select & display the data as the way you want. I did the same in PowerBI under 3 days.
Pros/Cons: Obviously this requirement is for reporting purpose. Globally reporting requirements will be mirroring database by replication or other means and won't be interrupting Prod users & Async server by injecting plugins or any On demand Adhoc service calls. Moreover you have access to database & not CRM online. Better not to reinvent the wheel & take forward the available solution. This is my humble opinion & based on a Microsoft internal project implementation.

SSIS - Lookup Failing

I have a process whereby I need to provide 2 files, 1 for New Business and 1 for Adjustments.
Originally I did this all in an SP that outputted the information to temp tables that I then output to a File but this has its drawbacks.
The problem I have is that I use a Lookup to compare yesterday's 'Live' table with today's 'Live' table. Any unmatched rows are considered New Policies, so I output them to the New File. This works fine.
Adjustments are then done based on the Make/Model/Car Reg and Address1/2/3/4 & PCode of any given policy being different.
What I need to do is use a Lookup to test every case from my current live table (Table A) against yesterday's live cases (Table B) and return ONLY those where the above details are different.
The problem I am having at the moment is the Lookup is also pulling in cases where the Value is 'NULL' as in the New Business cases, which I don't want to be included as they are already in another file.
Can anyone point me in the right direction?

How to transfer only new records between two different databases (ie. Oracle and MSSQL) using SSIS?

Do you know how to transfer only new records between two different databases (e.g. Oracle and MSSQL) using SSIS? There is no problem transferring only new data between two tables in the same database and server, but is it possible to do such an operation between completely different servers and databases?
P.S. I know about the solution using Lookup, but it is not very efficient if you need to check and add a lot of records (50k and more) several times per day. I would like to operate on new data only.
You have several options:
Timestamp based solution
If you have a column which stores the insertion time in the source system, you can select only the records created since the last load. With the same logic you can transfer modified records too: just stamp each record with the current timestamp whenever it changes.
Sequence based solution
If there is a sequence in the source table, you can load the new records based on that sequence. Query the last value from the destination system, then load everything which is larger than that value.
CDC based solution
If you have CDC (Change Data Capture) in your source system, you can track the changes and you can load them based on the CDC entries.
Full load
This is the most resource hungry solution: you have to copy all data from the source to the destination. If you do not have any column which marks the new records, you should use this solution.
You have several options to achieve this:
TRUNCATE the destination table and reload it from source
Use a Lookup component to determine which records are missing
Load all data from source to a temporary table and write a query which retrieves the new/changed records.
Summary
If you have at least one column, which marks the new/modified records, you can use it to implement a differential/incremental load with SSIS. If you do not have any clue, which columns/rows are changed, you have to load (or at least query) all of them.
There is no solution which enables a one-query (INSERT .. SELECT) load across multiple servers without transferring all data. (Please note that a multi-server query using Linked Servers still transfers the data from the source system.)
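The sequence-based option can be sketched in a few lines of Python. This is only an illustration of the idea, not an SSIS package: two in-memory SQLite connections stand in for the two servers, and the orders table, its columns, and the load_new_rows function are made up for the example. The point is that the last id is read from the destination first and then passed to the source query as a parameter.

```python
import sqlite3


def load_new_rows(source, destination):
    """Copy rows whose id is greater than the highest id already in the destination."""
    # Step 1: ask the destination for the highest id it already holds.
    last_id = destination.execute(
        "SELECT COALESCE(MAX(id), 0) FROM orders"
    ).fetchone()[0]
    # Step 2: pass that value to the source as a query parameter.
    new_rows = source.execute(
        "SELECT id, amount FROM orders WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    # Step 3: insert only the rows that were missing.
    destination.executemany(
        "INSERT INTO orders (id, amount) VALUES (?, ?)", new_rows
    )
    destination.commit()
    return len(new_rows)


# Two separate in-memory databases stand in for the two servers.
source = sqlite3.connect(":memory:")
destination = sqlite3.connect(":memory:")
for conn in (source, destination):
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0), (3, 30.0)]
)
destination.execute("INSERT INTO orders VALUES (1, 10.0)")

copied = load_new_rows(source, destination)
```

Only the rows above the stored maximum cross the wire, so repeated runs stay cheap as long as the id column is indexed on the source.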
What about variables? Is it possible to use the same variable between different databases and servers in SSIS?
I would like to read the last id number from the destination table and pass it to the query on the source table (different server!).
I can set a variable in a database scope like this:
DECLARE @Last int
SET @Last = (SELECT TOP 1 Id FROM dbo.Table_1 ORDER BY Id DESC)
SELECT *
FROM dbo.Table_2
WHERE ID > @Last;
However, this only works between two tables in the same database (as a SQL command). I can create a variable for an entire SSIS package under Variables --> Add variable, but I don't know whether it is possible to use that variable in a similar way: to hold the last id from the destination table and pass it to the table on the source server as a lower bound for the data.

Recommended way of adding daily data to database

I receive new data files every day. Right now, I'm building the database with all the required tables to import the data and perform the required calculations.
Should I just append each new day's data to my current tables? Each file contains a date column, which would allow for a "WHERE" query in the future if I need to analyze data for one particular day. Or should I be creating a new set of tables for every day?
I'm new to database design (coming from Excel). I will be using SQL Server for this.
Assuming that the structure of the data being received is the same, you should only need one set of tables rather than creating new tables each day.
I'd recommend storing the value of the date column from your incoming data in your database, and also having a 'CreateDate' column in your tables, with a default value of 'GetDate()' so that it automatically gets populated with the current date when the row is inserted.
You may also want to have another column to store the data filename that the row was imported from, but if you're already storing the value of the date column and the date that the row was inserted, this shouldn't really be necessary.
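The single-table layout described above might look roughly like this. The sketch uses Python with an in-memory SQLite database purely so that it is self-contained; in SQL Server the default would be GETDATE() rather than CURRENT_TIMESTAMP. The table name daily_data, its columns, the file names, and the append_file helper are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE daily_data (
        id          INTEGER PRIMARY KEY,
        data_date   TEXT NOT NULL,   -- date column taken from the incoming file
        value       REAL NOT NULL,
        source_file TEXT,            -- optional: file the row came from
        create_date TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP  -- set on insert
    )
    """
)


def append_file(conn, rows, source_file):
    """Append one day's rows to the single shared table."""
    conn.executemany(
        "INSERT INTO daily_data (data_date, value, source_file) VALUES (?, ?, ?)",
        [(data_date, value, source_file) for data_date, value in rows],
    )
    conn.commit()


append_file(conn, [("2024-01-01", 1.5), ("2024-01-01", 2.5)], "day1.csv")
append_file(conn, [("2024-01-02", 4.0)], "day2.csv")

# Analysing a single day is just a WHERE clause on the date column.
one_day = conn.execute(
    "SELECT COUNT(*), SUM(value) FROM daily_data WHERE data_date = ?",
    ("2024-01-01",),
).fetchone()
```

The insert never mentions create_date, so the database fills it in by itself for every appended row.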
In the past, when doing this type of activity using a custom data loader application, I've also found it useful to create log files to log success/error/warning messages, including some type of unique key of the source data and target database - ie. if coming from an Excel file and going into a database column, you could store the row index from Excel and the primary key of the inserted row. This helps tracking down any problems later on.
You might want to consider having a look at SSIS (SQL Server Integration Services). It's the SQL Server tool for doing ETL activities.
yes, append each day's data to the tables; 1 set of tables for all data.
yes, use a date column to identify the day that the data was loaded.
maybe have another table with a date column and a clob column. The date to contain the load date and the clob to contain the file that you imported.
Good question. You most definitely should have a single set of tables and append the data daily. Consider this: if you create a new set of tables each day, what would, say, a monthly report query look like? A quarterly report query? It would be a mess, with UNIONs and JOINs all over the place.
A single set of tables with a WHERE clause makes the querying and reporting manageable.
You might do a little reading on relational database theory. Wikipedia is a good place to start. The basics are pretty straightforward if you have the knack for it.
I would load the data into a stage table regardless and append it to the main tables afterwards. Once a week I would then refresh all data in the main table to ensure that it remains correct as per the source.
Marcus
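A minimal sketch of the stage-then-append pattern from the last answer, again in Python with in-memory SQLite so it runs on its own. The tables stage_data and main_data, the record_key column, and the load_via_stage function are hypothetical, and the sketch assumes each row has some unique key; the weekly full refresh is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stage_data (record_key TEXT, data_date TEXT, value REAL);
    CREATE TABLE main_data  (record_key TEXT PRIMARY KEY, data_date TEXT, value REAL);
    """
)


def load_via_stage(conn, rows):
    """Load rows into the stage table, then append only unseen keys to main."""
    before = conn.execute("SELECT COUNT(*) FROM main_data").fetchone()[0]
    # The stage table only ever holds the current file.
    conn.execute("DELETE FROM stage_data")
    conn.executemany("INSERT INTO stage_data VALUES (?, ?, ?)", rows)
    # Append rows whose key is not yet present in the main table.
    conn.execute(
        """
        INSERT INTO main_data (record_key, data_date, value)
        SELECT s.record_key, s.data_date, s.value
        FROM stage_data s
        WHERE NOT EXISTS (
            SELECT 1 FROM main_data m WHERE m.record_key = s.record_key
        )
        """
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM main_data").fetchone()[0]
    return after - before


first = load_via_stage(conn, [("a", "2024-01-01", 1.0), ("b", "2024-01-01", 2.0)])
second = load_via_stage(conn, [("b", "2024-01-01", 2.0), ("c", "2024-01-02", 3.0)])
```

Because the file lands in the stage table first, it can be validated or cleaned there without touching the main table.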

Dynamic SQL statement return value using the current target connection

I'm currently creating my first real life project in Pervasive. The task is to map a certain XML structure containing orders (as in shops and products) to 3 tables I created myself. These tables rest inside a MS-SQL-Server instance.
All of the tables have a unique key called "id", an automatically incremented column. I've dropped this column from all mappings so that Pervasive will not try to fill it itself.
For certain calculations, for a split key in one of the tables, and for references to the created records in other tables, I will need the id that the database has just created. I googled the answer: I can use "select @@identity;" as a statement, and this returns the id that has most recently been created for the current connection. This means that in Pervasive, I will have to execute this statement using the already existing target connection object.
But how do I do that? I am quite sure that I will need a DJImport or DJExport object, but how do I get one associated with the connection that Pervasive uses to insert the records?
Or is there any other way to handle this auto increment when I need to reference the id in other tables?
Not sure how things work in Pervasive, but you may run into issues with @@IDENTITY. SCOPE_IDENTITY() would probably be safer, but may still not work in Pervasive.
Hopefully your tables have a natural key in addition to the generated id, in which case you can select your id based on the natural key. This will avoid any issues you may have with disparate sessions and scope.
If anyone looks this post up and wonders about the answer, it's "You can't". Pervasive does not allow access to its own connection object, the one it uses to query the database. Without access to it, you cannot be sure of fetching the right id. The solution for us was this: we used a stored procedure, called in the Before-Transformation event, that creates the header record and returns the id and an optional error message as a table. We execute it, and it returns the id, which we then save and use throughout our mapping.
