Single Datascript Query Challenge in LogSeq - datomic

I'm trying to learn Datascript in the context of LogSeq, and I've stumbled into something I'm not sure how to solve.
The Fundamental Problem
I'm trying to query for a subset of entities that have NOT been referenced by attributes on a different group of filtered entities.
The Background
General LogSeq schema: https://github.com/logseq/logseq/blob/master/deps/db/src/logseq/db/schema.cljs
LogSeq Documentation for Datascript: https://docs.logseq.com/#/page/advanced%20queries
I've got a set of entities, with :block/properties like so:
tags:: contact
list:: C
Other entities have :block/refs to these pages.
I'm trying to create a query that shows me the contacts in a given list (A|B|C) that have NO notes within the past two weeks.
In SQL this would be a straightforward left join, but I'm having trouble translating that to Datalog since the information is in two different entity groups (instead of attributes all on the same entity). I assume there's some sort of not-join to filter out the contacts that have recent refs, but since that data is in other entities, I'm not sure how to structure the query since my implicit joins knock out either one group or the other.
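For comparison, here is a minimal sketch of the left join described above, written in Python with an in-memory SQLite database. The `contacts` and `notes` tables, their columns, and the sample rows are assumptions made up for illustration; they are not LogSeq's schema. The point is only the shape of the anti-join: the recency condition sits in the join, and contacts with no matching note are the ones kept.

```python
import sqlite3

# Hypothetical tables standing in for "contact pages" and "blocks that reference them".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, list TEXT);
    CREATE TABLE notes (id INTEGER PRIMARY KEY, contact_id INTEGER, noted_on TEXT);
    INSERT INTO contacts VALUES (1, 'Ann', 'C'), (2, 'Bob', 'C'), (3, 'Cy', 'A');
    INSERT INTO notes VALUES
        (1, 1, '2024-03-10'),  -- recent note for Ann
        (2, 2, '2024-01-05');  -- stale note for Bob
""")

def contacts_without_recent_notes(conn, list_name, cutoff):
    """Return names of contacts in `list_name` with no note on or after `cutoff`."""
    rows = conn.execute("""
        SELECT c.name
        FROM contacts AS c
        LEFT JOIN notes AS n
               ON n.contact_id = c.id AND n.noted_on >= ?
        WHERE c.list = ? AND n.id IS NULL   -- keep only contacts with no recent match
        ORDER BY c.name
    """, (cutoff, list_name)).fetchall()
    return [name for (name,) in rows]

quiet = contacts_without_recent_notes(conn, "C", "2024-03-01")
print(quiet)
```

The Datalog counterpart of `n.id IS NULL` is a negation clause that binds the note entity only inside the negation, which is what a not-join is for.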
I should add, because this is in LogSeq, I can't do two separate queries and join them in code. It has to be in one go.
Thank you!

Related

Is there a pattern to avoid ever-multiplying link tables in database design?

Currently scoping out a new system. Like many systems, it will be required to store documents and link them to other kinds of item. In this instance a Document object can belong to a Job or it can belong to an Item (which in turn belongs to a Job).
We could do this by having a JobId and an ItemId against a Document and leaving one or the other blank if necessary, but that's going to mean annoying conditional logic in the handling code. So, two link tables seem a better idea.
However, it is likely that we will need to link Documents to other items in the system at some point in the future. There are Company and User objects, for example, and we might want to record Documents against those. There may be more.
That would entail a proliferation of link tables which, while effective, is messy and hard to follow.
This solution is in SQL Server and will be handled in code via Entity Framework.
Are there any design principles that can allow us to hook up Document objects with a variety of other system objects as required in a neater and more flexible way?
You could store two values: the id, and the type of object to which the document is attached. It doesn't allow the use of foreign keys, but is compatible with many application development frameworks.
If you have the partitioning option then you could dedicate different partitions to different object types.
You could also have multiple tables, one for job documents, one for item documents, and get an overview of all of them with a view that UNION ALL's them together. If you need uniqueness in that result set then you could use UUIDs for the primary key, or add an extra column to the view to express from which table the row was read.
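A minimal sketch of the last suggestion, using Python with an in-memory SQLite database rather than SQL Server. The table names `job_documents` and `item_documents`, the view name `all_documents`, and the `owner_type` column are assumptions chosen for illustration. The extra column records which table each row came from, so rows stay distinguishable even when their ids collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One document table per owner type, so each can carry a real foreign key.
    CREATE TABLE job_documents  (id INTEGER PRIMARY KEY, job_id  INTEGER, title TEXT);
    CREATE TABLE item_documents (id INTEGER PRIMARY KEY, item_id INTEGER, title TEXT);

    -- The view gives one overview; owner_type says which table a row was read from.
    CREATE VIEW all_documents AS
        SELECT 'job'  AS owner_type, job_id  AS owner_id, id AS document_id, title
        FROM job_documents
        UNION ALL
        SELECT 'item' AS owner_type, item_id AS owner_id, id AS document_id, title
        FROM item_documents;

    INSERT INTO job_documents  VALUES (1, 10, 'Quote');
    INSERT INTO item_documents VALUES (1, 77, 'Spec sheet');
""")

overview = conn.execute(
    "SELECT owner_type, owner_id, document_id, title FROM all_documents "
    "ORDER BY owner_type, document_id"
).fetchall()
print(overview)
```

Adding a new owner type later means one new table and one more branch in the view; code that reads from the view does not change.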

Store multiple values in one database field in Access (hear me out)

So I've done extensive searching on this and I can't seem to find a good solution that actually applies to my situation.
I have a list of projects in a table, then a list of people. I want to assign multiple people to one project. Seems pretty common. Obviously, I can't make multiple columns on my projects table for each person, as the people will change fairly frequently.
I need to display this information very quickly in a continuous list of projects (the ideal would be a multiple-select combobox, as a listbox is too tall, but those don't exist outside of the dreaded lookup fields).
I can think of two ways:
- Store multiple employee IDs delimited by commas in one field in my projects table (I know this goes against good database design). Would require some code to store and retrieve the data.
- Have a separate table for employees assigned to projects (ID, ProjectID, EmployeeID). One to many relationship between projects table and this new table. One to many relationship between employees table and this new table. If a project has 3 employees assigned, it would store 3 records in this table. It seems a bit odd joining both tables in this way, and it would also require code to store and retrieve the data in a control like the one mentioned above.
Does anyone know if there is a better way (including displaying in an easy control) or how you usually tackle this problem?
The usual way to tackle this problem would be with a Junction Table. This is what you describe where you have a separate table maybe called EmployeeProject which has an EmployeeProjectID(PK), EmployeeID(FK) and ProjectID(FK).
In this way you model a Many-to-Many relationship where each project can have many employees involved and each employee can be involved in many projects. It's not actually all that difficult to do the SQL etc. required to pull the information back together again for display.
I would definitely stay away from storing comma-delimited values as this becomes significantly more complicated when you want to display or manipulate the data.
There's a good guide here: http://en.tekstenuitleg.net/articles/software/create-a-many-to-many-relationship-in-access but if you google "many to many junction table" or similar, there are thousands of pages/articles about implementation.
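As a rough illustration of the junction table, here is a sketch in Python with an in-memory SQLite database (not Access). The `EmployeeProject` columns follow the answer above; the `Project` and `Employee` tables, their columns, and the sample rows are assumptions. The helper collapses the joined rows into one comma-separated string per project, which is the kind of value a continuous list could display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Project  (ProjectID  INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, Name  TEXT);
    -- Junction table: one row per (employee, project) assignment.
    CREATE TABLE EmployeeProject (
        EmployeeProjectID INTEGER PRIMARY KEY,
        EmployeeID INTEGER REFERENCES Employee(EmployeeID),
        ProjectID  INTEGER REFERENCES Project(ProjectID)
    );
    INSERT INTO Project  VALUES (1, 'Website'), (2, 'Audit');
    INSERT INTO Employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO EmployeeProject (EmployeeID, ProjectID)
        VALUES (1, 1), (2, 1), (3, 1), (1, 2);
""")

def people_by_project(conn):
    """Map each project title to a comma-separated list of assigned employees."""
    rows = conn.execute("""
        SELECT p.Title, e.Name
        FROM Project AS p
        JOIN EmployeeProject AS ep ON ep.ProjectID = p.ProjectID
        JOIN Employee AS e ON e.EmployeeID = ep.EmployeeID
        ORDER BY p.Title, e.Name
    """).fetchall()
    grouped = {}
    for title, name in rows:
        grouped.setdefault(title, []).append(name)
    return {title: ", ".join(names) for title, names in grouped.items()}

assignments = people_by_project(conn)
print(assignments)
```

The comma-separated string exists only at display time; the stored data stays one row per assignment, so adding or removing a person is a single insert or delete.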

Avoid complex joins in SSRS 2008-r2

I need to create a report that has a severe performance problem.
I need to create a catalogue of all courses in our database.
Here is the simplified data model:
Organizational Unit --> contains multiple Courses --> which contains Multiple Activities;
Each activity contains the following:
A list of attached links
A list of prerequisite activities
A list of additional property - value pairs (Cataloging information)
A list of required resource types and quantity for each resource type
A list of Training Objectives
and I wish to create a report that will group everything to look something like that:
After writing the straightforward query that joins all the tables together, I got almost 6 million rows, because joining the activities table to every one-to-many table (attached links, resources, and so on) produces Cartesian products.
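The row explosion can be reproduced on a small scale. The sketch below uses Python with an in-memory SQLite database; the table and column names and the sample counts are assumptions, not the real data model. One activity with 3 links, 2 prerequisites, and 4 objectives yields 3 × 2 × 4 rows when everything is joined at once, but only 3 + 2 + 4 rows when each list is fetched by its own query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE link         (activity_id INTEGER, url TEXT);
    CREATE TABLE prerequisite (activity_id INTEGER, required_id INTEGER);
    CREATE TABLE objective    (activity_id INTEGER, body TEXT);
    INSERT INTO activity VALUES (1, 'Intro');
    INSERT INTO link VALUES (1, 'a'), (1, 'b'), (1, 'c');
    INSERT INTO prerequisite VALUES (1, 7), (1, 8);
    INSERT INTO objective VALUES (1, 'w'), (1, 'x'), (1, 'y'), (1, 'z');
""")

# Joining every one-to-many table at once multiplies the child row counts.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM activity AS a
    JOIN link         AS l ON l.activity_id = a.id
    JOIN prerequisite AS p ON p.activity_id = a.id
    JOIN objective    AS o ON o.activity_id = a.id
""").fetchone()[0]

# Querying each child table on its own only adds the counts.
separate_rows = sum(
    conn.execute(f"SELECT COUNT(*) FROM {table} WHERE activity_id = 1").fetchone()[0]
    for table in ("link", "prerequisite", "objective")
)

print(joined_rows, separate_rows)
```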
I was thinking of avoiding that in several ways:
- Sub-reports that list the different item lists for each activity.
- An XML field for each of the lists described above, parsed with VB in the report.
- Multiple datasets in the report, combined with lookup functions to list the different values.
Results so far:
- The sub-reports proved very inefficient: the report took 50% longer than it did with the original 6-million-row query.
- The XML fields are very efficient on the database side, but formatting the data with VB will be troublesome, and I would very much like to avoid that if possible.
- I cannot find the right way to use LookupSet to get a list of attachment names with their links next to them.
So my questions are:
- What is the best practice for displaying an entity with many one-to-many relations when dealing with a lot of data in SSRS 2008 R2?
- Is there a way to join data using lookup functions and somehow create "nested tables" that list the one-to-many relations?
Any other suggestions would be much appreciated.
Can you create a drill-down report? That looks like a good option: list the activities with the courses summarized, and make the detail available by drilling down, or something similar.
Or you could cache the report, if it doesn't matter that the data changes in the meantime.
6 million records is a lot for an SSRS report!

Query two custom objects joining on the Name field

I want to create a join on two custom objects joining on the Name field. Normally joins require a lookup or master-detail relationship between the two objects, but I just want to do a text match.
I think this is a Salesforce limitation but I couldn't find any docs on whether this was so. Can anyone confirm this?
Yes, you can make a join (with dot notation or as subquery) only if there's a relationship present. And relationships (lookup or master-detail) can be made only by Id. There are several "mutant fields" (like Task.WhoId) but generally speaking you can't write a JOIN in SOQL and certainly can't use a text column as a foreign key.
http://www.salesforce.com/us/developer/docs/soql_sosl/Content/sforce_api_calls_soql_relationships.htm#relate_query_limits
"Relationship queries are not the same as SQL joins. You must have a relationship between objects to create a join in SOQL."
There are some workarounds though. Why exactly do you need the join?
Apex / SOQL - have a look at SOQL in apex - Getting unmatched results from two object types for example. Not the prettiest thing in the world but it works. If you want to try something really crazy - SOSL that would search your 2 objects at the same time?
Reports - you should have no problem grouping by text field - that means a joined report might give you results you're after. Since Winter'13 joined reports allow charts and exporting, that was quite a limiting factor...
Easy building of links between data - use external ids and upsert operation, especially if you plan to load data from outside SF easily. Check my answer for Can I insert deserialized JSON SObjects from another Salesforce org into my org?
Uniqueness constraints - you can still mark fields as required & unique.
Check against "dictionary" of allowed values - Validation rule with VLOOKUP might do what you're after.
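To make the first workaround concrete: the idea is to run two separate queries and match the results on Name in code. The sketch below shows that matching step in Python rather than Apex, with plain dictionaries standing in for the two query results. The record contents and the function name `join_on_name` are hypothetical.

```python
from collections import defaultdict

def join_on_name(left_records, right_records):
    """Pair up records from two result sets whose "Name" fields are equal."""
    # Index one side by Name so each left record is matched with a dictionary lookup.
    by_name = defaultdict(list)
    for record in right_records:
        by_name[record["Name"]].append(record)
    return [
        (left, right)
        for left in left_records
        for right in by_name.get(left["Name"], [])
    ]

# Hypothetical results of two separate queries, one per custom object.
first_results = [{"Id": "a01", "Name": "ACME"}, {"Id": "a02", "Name": "Globex"}]
second_results = [
    {"Id": "b01", "Name": "ACME"},
    {"Id": "b02", "Name": "ACME"},
    {"Id": "b03", "Name": "Initech"},
]

pairs = [(left["Id"], right["Id"]) for left, right in join_on_name(first_results, second_results)]
print(pairs)
```

Records without a partner on the other side ("Globex" and "Initech" here) simply drop out, as they would in an inner join.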

Database design rules to follow for a programmer

We are working on a mapping application that uses Google Maps API to display points on a map. All points are currently fetched from a MySQL database (holding some 5M + records). Currently all entities are stored in separate tables with attributes representing individual properties.
This presents the following problems:
1. Every time there's a new property we have to make changes in the database, application code and the front-end. This is all fine, but some properties have to be added for all entities, and that's when it becomes a nightmare to go through 50+ different tables and add new properties.
2. There's no way to find all entities which share any given property, e.g. no way to find all schools, colleges or universities that have a geography dept (without querying schools, universities and colleges separately).
3. Removing a property is equally painful.
4. No standards for defining properties in individual tables. The same property can exist with a different name or data type in another table.
5. No way to link or group points based on their properties (somehow related to point 2).
We are thinking of redesigning the whole database, but without a DBA's help and with no professional DB design experience we are really struggling.
Another problem we're facing with the new design is that there are a lot of shared attributes/properties between entities.
For example:
An entity called "university" has 100+ attributes. Other entities (e.g. hospitals, banks, etc.) share quite a few attributes with universities, for example ATMs, parking, cafeteria and so on.
We don't really want to keep properties in a separate table [and then link them back to entities with foreign keys], as that would require us to add and remove them manually. Also, generalizing properties will result in groups containing 50+ attributes. Not all records (i.e. entities) require those properties.
So with keeping that in mind here's what we are thinking about the new design:
- Have separate tables for each entity containing some basic info, e.g. id, name, etc.
- Have 2 tables, attribute type and attribute, to store property information.
- Link each entity (or a table, if you like) to attribute using a many-to-many relation.
- Store addresses in a separate table called addresses and link entities to it via foreign keys.
We think this will allow us to be more flexible when adding, removing or querying on attributes.
This design, however, will result in an increased number of joins when fetching data, e.g. to display all "attributes" for a given university we might have a query with 20+ joins to fetch all related attributes in a single row.
We desperately need to know some opinions or possible flaws in this design approach.
Thanks for your time.
In trying to generalize your question without more specific examples, it's hard to truly critique your approach. If you'd like some more in depth analysis, try whipping up an ER diagram.
If your data model is changing so much that you're constantly adding/removing properties and many of these properties overlap, you might be better off using EAV.
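A minimal sketch of what an EAV layout could look like, using Python with an in-memory SQLite database instead of MySQL. The single `entity` table, the table and column names, and the sample rows are all assumptions for illustration; the design in the question keeps a separate table per entity type. The sketch shows the query that was impossible before: finding every entity, of any kind, that has a given property.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity    (id INTEGER PRIMARY KEY, kind TEXT, name TEXT);
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- One row per (entity, attribute) pair; adding a property needs no schema change.
    CREATE TABLE entity_attribute (
        entity_id    INTEGER REFERENCES entity(id),
        attribute_id INTEGER REFERENCES attribute(id),
        value        TEXT,
        PRIMARY KEY (entity_id, attribute_id)
    );
    INSERT INTO entity VALUES
        (1, 'university', 'Northfield'),
        (2, 'college',    'Eastgate'),
        (3, 'bank',       'First Trust');
    INSERT INTO attribute VALUES (1, 'geography_dept'), (2, 'parking');
    INSERT INTO entity_attribute VALUES
        (1, 1, 'yes'), (2, 1, 'yes'), (1, 2, 'yes'), (3, 2, 'yes');
""")

def entities_with(conn, attribute_name):
    """Return (kind, name) for every entity that carries the named attribute."""
    return conn.execute("""
        SELECT e.kind, e.name
        FROM entity AS e
        JOIN entity_attribute AS ea ON ea.entity_id = e.id
        JOIN attribute AS a ON a.id = ea.attribute_id
        WHERE a.name = ?
        ORDER BY e.name
    """, (attribute_name,)).fetchall()

with_geography = entities_with(conn, "geography_dept")
print(with_geography)
```

The trade-off is that every value is stored as text in one column, so data types and constraints have to be enforced in application code or in the attribute metadata.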
Otherwise, if you want to maintain a relational approach but are finding a lot of overlap with properties, you can analyze the entities and look for abstractions that link to them.
Ex) My Db has Puppies, Kittens, and Walruses all with a hasFur and furColor attribute. Remove those attributes from the 3 tables and create a FurryAnimal table that links to each of those 3.
Of course, the simplest answer is to not touch the data model. Instead, create views on the underlying tables that you can use to address (5), (4) and (2).
1 cannot be an issue. There is one place where your objects are defined. Everything else is generated/derived from that. Just refactor your code until this is the case.
2 is solved by having a metamodel, where you describe which properties are where. This is probably needed for 1 too.
You might want to totally avoid the problem by programming this in Smalltalk with Seaside on a Gemstone object oriented database. Then you can just have objects with collections and don't need so many joins.
