How can I inform Spring Data REST about my desire for link relations of an embedded document? - spring-data-mongodb

In MongoDB, I have a document collection named Customer which embeds customer-defined labels. My domain objects look like this (including Lombok annotations):
@Document(collection = "Customer")
@Getter
@Setter
public class Customer {
    @Id
    long id;
    @Field("name")
    String name;
    @Field("labels")
    List<CustomerLabel> labels;
}
@Getter
@Setter
public class CustomerLabel {
    @Id
    @Field("label_id")
    long labelId;
    @Field("label_name")
    String labelName;
}
Right now, the response to GET /customers looks like this:
{
"_links": {
"self": {
"href": "http://localhost:8080/app/customers?page=&size=&sort="
}
},
"_embedded": {
"customers": [
{
"name": "Smith, Jones, and White",
"labels": [
{
"labelName": "General label for Smith, Jones, and White"
}
],
"_links": {
"self": {
"href": "http://localhost:8080/app/customers/285001"
}
}
}
]
},
"page": {
"size": 20,
"totalElements": 1,
"totalPages": 1,
"number": 0
}
}
I would like to "call out" the embedded labels document as a separate link relation, so that the response to GET /customers looks more like this:
{
"_links": {
"self": {
"href": "http://localhost:8080/app/customers?page=&size=&sort="
}
},
"_embedded": {
"customers": [
{
"name": "Smith, Jones, and White",
"_links": {
"labels": {
"href": "http://localhost:8080/app/customers/285001/labels"
},
"self": {
"href": "http://localhost:8080/app/customers/285001"
}
}
}
]
},
"page": {
"size": 20,
"totalElements": 1,
"totalPages": 1,
"number": 0
}
}
How can I do this in my application?

I'd argue this is a somewhat sub-optimal design. When working with MongoDB, the documents you model are basically aggregates in the terms of Domain-Driven Design, as the concepts align nicely (accepting eventual consistency between related aggregates / documents, etc.).
Repositories simulate collections of aggregates (read: documents, in the MongoDB case). So having a repository for embedded documents (CustomerLabel in your case) doesn't actually make much sense. Also, persisting both Customers and CustomerLabels into the same collection while also letting the former embed the latter looks suspicious to me.
So this seems to be more of a MongoDB schema design question than one about how to expose the documents through Spring Data. That said, I am not quite sure you're really going to get a satisfying answer because, as I've indicated, the question you raise seems to mask a more fundamental challenge in your codebase.

I ended up implementing a ResourceProcessor which removes the labels from the Resource object returned from the Repository REST controller. It looks like this:
@Controller
@AllArgsConstructor(onConstructor = @__(@Inject))
class CustomerResourceProcessor implements ResourceProcessor<Resource<Customer>> {
    private final @NonNull CustomerLinks customerLinks;
    @Override
    public Resource<Customer> process(Resource<Customer> resource) {
        Customer customer = resource.getContent();
        if (customer.getLabels() != null && !customer.getLabels().isEmpty()) {
            resource.add(customerLinks.getLabelCollectionLink(resource));
            customer.setLabels(null);
        }
        return resource;
    }
}
Then I wrote a LabelController in the manner of the RESTBucks examples I've seen:
@ResponseBody
@RequestMapping(value = "/customers/{customerId}/labels", method = RequestMethod.GET)
Resources<LabelResource> labels(@PathVariable Long customerId) {
    List<CustomerLabel> customerLabels = customerRepository.findLabelsByCustomerId(customerId);
    return new Resources<>(resourceAssembler.toResources(customerLabels), linkTo(
            methodOn(this.getClass()).labels(customerId)).withSelfRel());
}
And the findLabelsByCustomerId method on the CustomerRepository is a custom repository method implementation which only returns the labels field from Mongo.
It all works quite well with a minimal amount of code. The trick was figuring out exactly which code I needed to write. :)

Related

Removing many-to-many relationship via update in Vuex-ORM

I'm wondering whether it is possible to remove relationships between two models simply by updating the model on one side of the relationship. Adding new relationships works fine, but removing them does not seem to be an option; maybe I am just missing something.
It seems logical that insertOrUpdate() does not "delete" nested relationships, but maybe there is another function or a property to set to get the desired behavior. Unfortunately, searching the docs was not successful.
In my case I configured a belongsToMany (m:n) relationship between the model "Process" and the model "Dependency". The m:n model in between is "MapProcessDependency".
The Models
// Dependency.js
export default class Dependency extends Model {
static entity = 'dependencies'
static fields () {
return {
id: this.attr(null),
name: this.string('')
}
}
}
// Process.js
export default class Process extends Model {
static entity = 'processes'
static fields () {
return {
id: this.attr(null),
name: this.string(''),
dependencies: this.belongsToMany(
Dependency, MapProcessDependency, 'processId', 'dependencyId'
)
}
}
}
// MapProcessDependency
export default class MapProcessDependency extends Model {
static entity = 'mapProcessesDependencies'
static primaryKey = ['processId', 'dependencyId']
static fields () {
return {
processId: this.attr(null),
dependencyId: this.attr(null)
}
}
}
The Vuex-ORM database
dependencies: [
{
"id": 1,
"name": "Dep1",
"$id": "1"
},
{
"id": 2,
"name": "Dep2",
"$id": "2"
}
],
processes: [
{
"id": 99,
"name": "MyProc",
"dependencies": [ /* managed via vuex-orm in mapProcessesDependencies */],
"$id": "99"
}
],
mapProcessesDependencies: {
"[99,1]": {
"processId": 99,
"dependencyId": 1,
"$id": "[99,1]"
},
"[99,2]": {
"processId": 99,
"dependencyId": 2,
"$id": "[99,2]"
}
}
What I want to achieve
// ...by calling this:
Process.insertOrUpdate({ where: 99, data: {
"id": 99,
"name": "MyProc",
"dependencies": [ 1 ],
} })
...is the following result without manually calling MapProcessDependency.delete([99,2]):
// [...]
mapProcessesDependencies: {
"[99,1]": {
"processId": 99,
"dependencyId": 1,
"$id": "[99,1]"
}
// relationship [99,2] removed
}
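Until a built-in way turns up, the manual cleanup can at least be reduced to a small helper. This is only a sketch of the fallback, not a Vuex-ORM feature: stalePivotKeys is a hypothetical function name, and it assumes you can read the current pivot records as plain objects with processId and dependencyId. It computes which composite keys are no longer wanted for a given process.

```javascript
// Hypothetical helper (not part of Vuex-ORM): given the current pivot records
// and the dependency ids that should remain, return the composite keys
// [processId, dependencyId] of the pivot records that are now stale.
function stalePivotKeys(processId, pivotRecords, keepDependencyIds) {
  const keep = new Set(keepDependencyIds);
  return pivotRecords
    .filter((r) => r.processId === processId && !keep.has(r.dependencyId))
    .map((r) => [r.processId, r.dependencyId]);
}

// Pivot records as shown in the database dump above.
const pivots = [
  { processId: 99, dependencyId: 1 },
  { processId: 99, dependencyId: 2 },
];

const stale = stalePivotKeys(99, pivots, [1]);
console.log(JSON.stringify(stale)); // [[99,2]]
```

Each returned key could then be handed to MapProcessDependency.delete(...) in the same way as the manual call mentioned in the question.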

Find similar documents/records in database

I have quite a large number of records currently stored in MongoDB, each of which looks something like this:
{
"_id" : ObjectId("5c38d267b87d0a05d8cd4dc2"),
"tech" : "NodeJs",
"packagename" : "package-name",
"packageversion" : "0.0.1",
"total_loc" : 474,
"total_files" : 7,
"tecloc" : {
"JavaScript" : 316,
"Markdown" : 116,
"JSON" : 42
}
}
What I want to do is find similar records, e.g. records whose total_loc is about the same (+/-10%) or which use some of the same technologies (tecloc).
Can I somehow do this with a query against MongoDB, or is there a technology that fits better for what I want to do? I am fine with regenerating the data and storing it in, e.g., Elasticsearch or some graph database.
Thank you
One possible way to solve this problem is to use Elasticsearch. I'm not claiming that it's the only solution you have.
At a high level, you would need to set up Elasticsearch and index your data. There are various ways to achieve this: mongo-connector, Logstash with the JDBC input plugin, or even just dumping the data from MongoDB and loading it manually. There is no single required way to do this job.
The change I would propose initially is to make tecloc a multi-valued field, by replacing { with [ and giving each entry explicit fields for the technology name and its lines of code, e.g.:
{
"tech": "NodeJs",
"packagename": "package-name",
"packageversion": "0.0.1",
"total_loc": 474,
"total_files": 7,
"tecloc": [
{
"name": "JavaScript",
"loc": 316
},
{
"name": "Markdown",
"loc": 116
},
{
"name": "JSON",
"loc": 42
}
]
}
This data model is very simple and obviously has some limitations, but it is already something to start with, and you can see how well it fits your other use cases. Later you should look into the nested type as one possible way to model your data more accurately.
Regarding your exact search scenario, you could search for such documents with a query like this:
{
"query": {
"bool": {
"should": [
{
"term": {
"tecloc.name.keyword": {
"value": "Java"
}
}
},
{
"term": {
"tecloc.name.keyword": {
"value": "Markdown"
}
}
}
],
"must": [
{"range": {
"total_loc": {
"gte": 426,
"lte": 521
}
}}
]
}
}
}
Unfortunately, there is no support for a +/-10% syntax, so the bounds have to be calculated on the client.
On the other hand, I specified that we are searching for documents which should have Java or Markdown, which returns the example document as well. In this case, a document with both Java and Markdown would get a higher score.
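The client-side calculation of the +/-10% bounds could look roughly like this. It is a minimal sketch: percentRange and buildSimilarityQuery are hypothetical helper names, and truncating both bounds to integers is an assumption chosen because it reproduces the 426 and 521 used in the query above.

```javascript
// Hypothetical helper: integer bounds of +/- `percent` around `value`.
// Multiplying before dividing keeps the arithmetic on integers until the end.
function percentRange(value, percent) {
  const gte = Math.floor((value * (100 - percent)) / 100);
  const lte = Math.floor((value * (100 + percent)) / 100);
  return { gte, lte };
}

// Hypothetical helper: builds a query body shaped like the one above.
// One "should" term clause per technology, one "must" range on total_loc.
function buildSimilarityQuery(totalLoc, techNames, percent = 10) {
  return {
    query: {
      bool: {
        should: techNames.map((name) => ({
          term: { "tecloc.name.keyword": { value: name } },
        })),
        must: [{ range: { total_loc: percentRange(totalLoc, percent) } }],
      },
    },
  };
}

// percentRange(474, 10) gives { gte: 426, lte: 521 }
const similarityQuery = buildSimilarityQuery(474, ["Java", "Markdown"]);
console.log(JSON.stringify(similarityQuery, null, 2));
```

The resulting object is just the request body; how it is sent to Elasticsearch depends on the client you use.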

Add new object inside array of objects, inside array of objects in mongodb

Consider the model below, which is probably a bad one, as I am totally new to this.
{
"uid": "some-id",
"database": {
"name": "nameOfDatabase",
"collection": [
{
"name": "nameOfCollection",
"fields": {
"0": "field_1",
"1": "field_2"
}
},
{
"name": "nameOfAnotherCollection",
"fields": {
"0": "field_1"
}
}
]
}
}
I have the collection name (i.e. database.collection.name) and a few fields to add to it or delete from it (some already exist under database.collection.fields; I want to add new ones or delete existing ones).
In short, how do I update/delete "fields" when I have the database name and the collection name?
I cannot figure out how to use the positional operator $ in this context.
Using mongoose update as
Model.update(conditions, updates, options, callback);
I don't know what the correct conditions and updates parameters are.
So far I have unsuccessfully used the following for Model.update:
conditions = {
"uid": req.body.uid,
"database.name": "test",
"database.collection":{ $elemMatch:{"name":req.body.collection.name}}
};
updates = {
$set: {
"fields": req.body.collection.fields
}
};
---------------------------------------------------------
conditions = {
"uid": req.body.uid,
"database.name": "test",
"database.collection.$.name":req.body.collection.name
};
updates = {
$addToSet: {
"fields": req.body.collection.fields
}
};
I tried a lot more, but nothing worked, as I am totally new.
I am getting confused between $push, $set and $addToSet: which one should I use, which not, and how?
The original schema is supposed to be as shown below, but running queries on it is getting harder and harder.
{
"uid": "some-id",
"database": [
{ //array of database objects
"name": "nameOfDatabase",
"collection": [ //array of collection objects inside respective databases
{
"name": "nameOfCollection",
"fields": { //fields inside a this particular collection
"0": "field_1",
"1": "field_2"
}
}
]
}
]
}
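For the simplified model at the top of this question (a single database object containing one collection array), one way the positional operator is commonly used is sketched below. This is an untested sketch: buildFieldUpdate is a hypothetical helper that only assembles the conditions and updates objects. The key points are that the array is matched in the conditions via "database.collection.name", and that the update path goes through "database.collection.$" rather than a bare "fields". The doubly nested original schema is not covered here, because a single positional $ only addresses one array level.

```javascript
// Hypothetical helper: builds the two arguments for an update on the simplified model.
// toSet:    object mapping keys inside "fields" to their new values
// toRemove: list of keys inside "fields" to delete
function buildFieldUpdate(uid, dbName, collectionName, toSet = {}, toRemove = []) {
  // Matching on the array element's name is what lets "$" resolve to that element.
  const conditions = {
    uid,
    "database.name": dbName,
    "database.collection.name": collectionName,
  };

  const prefix = "database.collection.$.fields.";
  const updates = {};

  const setEntries = Object.entries(toSet);
  if (setEntries.length > 0) {
    updates.$set = {};
    for (const [key, value] of setEntries) updates.$set[prefix + key] = value;
  }
  if (toRemove.length > 0) {
    updates.$unset = {};
    for (const key of toRemove) updates.$unset[prefix + key] = "";
  }
  return { conditions, updates };
}

// Add field "2" and delete field "1" in "nameOfCollection".
const built = buildFieldUpdate("some-id", "nameOfDatabase", "nameOfCollection", { 2: "field_3" }, ["1"]);
console.log(JSON.stringify(built, null, 2));
```

The two objects could then be passed as the conditions and updates arguments of the Model.update call shown in the question.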

Update array of subdocuments in MongoDB

I have a collection of students that have a name and an array of email addresses. A student document looks something like this:
{
"_id": {"$oid": "56d06bb6d9f75035956fa7ba"},
"name": "John Doe",
"emails": [
{
"label": "private",
"value": "private@johndoe.com"
},
{
"label": "work",
"value": "work@johndoe.com"
}
]
}
The label in the email subdocument is set to be unique per document, so there can't be two entries with the same label.
My problem is that, when updating a student document, I want to achieve the following:
Adding an email with a new label should simply add a new subdocument with the given label and value to the array.
Adding an email with a label that already exists should set the value of the existing entry to the value from the update.
For example, when updating with the following data:
{
"_id": {"$oid": "56d06bb6d9f75035956fa7ba"},
"emails": [
{
"label": "private",
"value": "me@johndoe.com"
},
{
"label": "school",
"value": "school@johndoe.com"
}
]
}
I would like the result of the emails array to be:
"emails": [
{
"label": "private",
"value": "me@johndoe.com"
},
{
"label": "work",
"value": "work@johndoe.com"
},
{
"label": "school",
"value": "school@johndoe.com"
}
]
How can I achieve this in MongoDB (optionally using mongoose)? Is this at all possible or do I have to check the array myself in the application code?
You could try this update, but it is only efficient for small datasets:
mongo shell:
var data = {
"_id": ObjectId("56d06bb6d9f75035956fa7ba"),
"emails": [
{
"label": "private",
"value": "me@johndoe.com"
},
{
"label": "school",
"value": "school@johndoe.com"
}
]
};
data.emails.forEach(function(email) {
    // First try to overwrite the value of an existing entry with the same label.
    var result = db.students.update(
        { "_id": data._id, "emails.label": email.label },
        { "$set": { "emails.$.value": email.value } }
    );
    // Nothing matched, so no entry has this label yet: append the new subdocument.
    if (result.nMatched === 0) {
        db.students.update(
            { "_id": data._id, "emails.label": { "$ne": email.label } },
            { "$push": { "emails": email } }
        );
    }
});
Suggestion: refactor your data to use the "label" as an actual field name.
There is one straightforward way in which MongoDB can guarantee unique values for a given email label - by making the label a single separate field in itself, in an email sub-document. Your data needs to exist in this structure:
{
"_id": ObjectId("56d06bb6d9f75035956fa7ba"),
"name": "John Doe",
"emails": {
"private": "private@johndoe.com",
"work" : "work@johndoe.com"
}
}
Now, when you want to update a student's emails you can do an update like this:
db.students.update(
{"_id": ObjectId("56d06bb6d9f75035956fa7ba")},
{$set: {
"emails.private" : "me@johndoe.com",
"emails.school" : "school@johndoe.com"
}}
);
And that will change the data to this:
{
"_id": ObjectId("56d06bb6d9f75035956fa7ba"),
"name": "John Doe",
"emails": {
"private": "me@johndoe.com",
"work" : "work@johndoe.com",
"school" : "school@johndoe.com"
}
}
Admittedly there is a disadvantage to this approach: you will need to change the structure of the input data, from the emails being in an array of sub-documents to the emails being a single sub-document of single fields. But the advantage is that your data requirements are automatically met by the way that JSON objects work.
After investigating the different options posted, I decided to go with my own approach of doing the update manually in the code using lodash's unionBy() function. Using express and mongoose's findById() that basically looks like this:
Student.findById(req.params.id, function(err, student) {
if(req.body.name) student.name = req.body.name;
if(req.body.emails && req.body.emails.length > 0) {
student.emails = _.unionBy(req.body.emails, student.emails, 'label');
}
student.save(function(err, result) {
if(err) return next(err);
res.status(200).json(result);
});
});
This way I get the full flexibility of partial updates for all fields. Of course you could also use findByIdAndUpdate() or other options.
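If you would rather not pull in lodash for this, the same "update or append by label" merge can be sketched in plain JavaScript. This is an illustration only: mergeByLabel is a hypothetical helper, not the code I actually run. Unlike the unionBy() call above, it keeps existing entries in their original position and appends new labels at the end.

```javascript
// Hypothetical helper: merge incoming emails into existing ones, keyed by label.
// Existing labels keep their position and take the incoming value;
// unknown labels are appended in the order they arrive.
function mergeByLabel(existing, incoming) {
  const byLabel = new Map();
  for (const email of existing) byLabel.set(email.label, { ...email });
  for (const email of incoming) {
    byLabel.set(email.label, { ...(byLabel.get(email.label) || {}), ...email });
  }
  return Array.from(byLabel.values());
}

const stored = [
  { label: "private", value: "private@johndoe.com" },
  { label: "work", value: "work@johndoe.com" },
];
const posted = [
  { label: "private", value: "me@johndoe.com" },
  { label: "school", value: "school@johndoe.com" },
];

const merged = mergeByLabel(stored, posted);
console.log(merged.map((e) => e.label + "=" + e.value).join(", "));
// private=me@johndoe.com, work=work@johndoe.com, school=school@johndoe.com
```

In the handler above, the result would be assigned to student.emails before calling save().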
Alternate approach:
However, changing the schema as Vince Bowdren suggested, making the label a single separate field in an email subdocument, is also a viable option. In the end it just depends on your personal preferences and on whether you need strict validation of your data or not.
If you are using mongoose like I do, you would have to define a separate schema like so:
var EmailSchema = new mongoose.Schema({
work: { type: String, validate: validateEmail },
private: { type: String, validate: validateEmail }
}, {
strict: false,
_id: false
});
In the schema you can define properties for the labels you already want to support and add validation. By setting the strict: false option, you would allow the user to also post emails with custom labels. Note however, that these would not be validated. You would have to apply the validation manually in your application similar to the way I did it in my approach above for the merging.

When do the Facebook graph api endpoints return an array wrapped in a { data: ... } object?

Some Facebook graph api endpoints return arrays like this:
"likes": {
"data": [
{
"id": "000000",
"name": "Somebody"
}
],
"paging": {
"cursors": {
"after": ".....",
"before": "....."
}
}
}
While others return arrays like this:
"actions": [
{
"name": "Comment",
"link": "https://www.facebook.com/000000/posts/00000"
},
{
"name": "Like",
"link": "https://www.facebook.com/00000/posts/00000"
}
]
Does anyone know where in the documentation Facebook explains when an array is going to be returned wrapped in a { data: [...] } object? As far as I know, Facebook just lists everything that is an array as array and doesn't explain when a data object will be returned.
I guess I can assume that if something can be "paged" that it will be in a data structure...
Am I missing some documentation about Facebook data types somewhere?
You are right, they seem to list everything that is an array as array.
For instance, in the documentation for the Post Endpoint, the return type for both "likes" and "actions" is listed as an array.
Likes
...
Array of objects containing the id and name fields.
Requesting with summary=1 will also return a summary object
containing the total_count of likes.
Actions
...
Array of objects containing the name and link
I think that's why you have to query the actual endpoint, inspect the JSON (like you are doing) and figure out what is actually coming back.
No wonder the Facebook API is the proud winner of the "Worst API" Award!
Any feed (feeds/posts), that is, every array you retrieve, will be returned in a data object, with the array inside it. The friend list is also wrapped in a data object containing an array of objects. Something like this:
{"data": [
{
"id": "..",
"name": ".."
}
],
"paging": {
"cursors": {
"after": ".....",
"before": "....."
}
}
}
All other API requests, which do not return an array, have a structure like this:
{
"id": "..",
"name": ".."
}
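Since the documentation does not say which shape you will get, client code can simply accept both. The following is a minimal sketch under that assumption: unwrapList is a hypothetical helper name, and it only relies on the two shapes shown in this thread (a bare array, or an object with a data array and an optional paging object).

```javascript
// Hypothetical helper: normalise a Graph API field that may be either
// a bare array or a { data: [...], paging: {...} } wrapper.
// Returns null when the value is not list-like at all.
function unwrapList(value) {
  if (Array.isArray(value)) {
    return { items: value, paging: null };
  }
  if (value !== null && typeof value === "object" && Array.isArray(value.data)) {
    return { items: value.data, paging: value.paging || null };
  }
  return null;
}

const likes = {
  data: [{ id: "000000", name: "Somebody" }],
  paging: { cursors: { after: ".....", before: "....." } },
};
const actions = [
  { name: "Comment", link: "https://www.facebook.com/000000/posts/00000" },
];

console.log(unwrapList(likes).items.length);   // 1
console.log(unwrapList(actions).items.length); // 1
console.log(unwrapList({ id: "..", name: ".." })); // null
```

Callers then always work with items, and only look at paging when it is present.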
