Spring Data MongoDB Bulk Operation Example

Can someone please point me to a complete example of a Spring Data MongoDB bulk operation?
I am trying to switch to bulk updates using Spring Data MongoDB but cannot find a good example.
Thank you.

BulkOperations in Spring Data MongoDB uses bulkWrite() from the MongoDB driver under the hood.
So when you want to update many entities, each with a different update, in a single round trip, you can do that via bulkOps.
Let us look at an example, even though it may not be a perfect one. Consider an Employee collection holding the employees of a company. After an appraisal cycle each employee's salary changes, and each change is different (assume there is no uniform percentage hike). If you want to apply all the changes in one go, you can use bulkOps.
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.BulkOperations;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.data.util.Pair;

public class Example {

    @Autowired
    MongoTemplate mongoTemplate;

    public int bulkUpdateEmployee(List<Pair<Query, Update>> updates) {
        return mongoTemplate.bulkOps(BulkOperations.BulkMode.UNORDERED, Employee.class, "employees")
                .updateMulti(updates)
                .execute()
                .getModifiedCount();
    }
}
Here we can prepare the Query/Update pairs, one per employee:
- "query" - match the employee by its id
- "update" - set the salary to the new value

Sharing the code for bulk operations that worked for me:
BulkOperations bulkOps = mongoTemplate.bulkOps(BulkMode.UNORDERED, Person.class);
for (Person person : personList) {
    Query query = new Query().addCriteria(new Criteria("id").is(person.getId()));
    Update update = new Update().set("address", "new Address as per requirement");
    bulkOps.updateOne(query, update);
}
BulkWriteResult results = bulkOps.execute();

I think the following code is a simple example that anybody can understand.
Note: ensure that the custom Mongo repository is correctly configured.
@Autowired
MongoTemplate mongoTemplate;

public int bulkUpdate(String member) {
    Query query = new Query();
    Criteria criteria = Criteria.where("column name").is(member);
    query.addCriteria(criteria);
    Update update = new Update();
    update.set("column name", true);
    return mongoTemplate.bulkOps(BulkOperations.BulkMode.UNORDERED, YourModelClass.class, "name of collection")
            .updateMulti(query, update)
            .execute()
            .getModifiedCount();
}

There are some elegant ways to perform bulk operations in Spring Data MongoDB. An excerpt from the MongoDB reference:
Starting in version 2.6, MongoDB servers support bulk write commands for insert, update, and delete in a way that allows the driver to implement the correct semantics for BulkWriteResult and BulkWriteException.
There are two types of bulk operations, ordered and unordered bulk operations.
Ordered bulk operations execute all the operations in order and error out on the first write error.
Unordered bulk operations execute all the operations and report any errors. Unordered bulk operations do not guarantee the order of execution.
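For reference, the same ordered/unordered distinction is exposed in Spring Data MongoDB through BulkOperations.BulkMode. A minimal sketch follows, assuming an injected MongoTemplate; the Post entity and its fields are illustrative only:

// ORDERED stops at the first write error; UNORDERED attempts every operation and reports all errors.
BulkOperations bulkOps = mongoTemplate.bulkOps(BulkOperations.BulkMode.UNORDERED, Post.class);
bulkOps.insert(new Post(1, "id labore ex et quam laborum"));
bulkOps.updateOne(Query.query(Criteria.where("postId").is(3)),
        new Update().set("postDate", new Date()));
bulkOps.remove(Query.query(Criteria.where("postId").is(2)));
BulkWriteResult result = bulkOps.execute();
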
A sample bulk operation using the MongoDB driver API directly, covering most of the features:
import com.mongodb.BasicDBObject;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.DeleteOneModel;
import com.mongodb.client.model.InsertOneModel;
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.UpdateOneModel;
import com.mongodb.client.model.Updates;
import com.mongodb.client.model.WriteModel;
import org.bson.Document;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.data.mongodb.core.aggregation.Fields;
import org.springframework.stereotype.Repository;

import java.util.Arrays;
import java.util.Date;
import java.util.List;

import static org.springframework.data.mongodb.core.aggregation.Fields.UNDERSCORE_ID;

@Repository
public class BulkUpdateDemoRepository {

    @Autowired
    private MongoOperations mongoOperations;

    public void bulkUpdate() {
        MongoCollection<Document> postsCollection = mongoOperations.getCollection("posts");

        // If no top-level _id field is specified in the documents, the Java driver
        // automatically adds the _id field to the inserted documents.
        Document document = new Document("postId", 1)
                .append("name", "id labore ex et quam laborum")
                .append("email", "Eliseo@gardner.biz")
                .append("secondary-email", "Eliseo@gardner.biz")
                .append("body", "laudantium enim quasi est");

        List<WriteModel<Document>> list = Arrays.asList(
                // Inserting a document
                new InsertOneModel<>(document),
                // Adding a field to a document
                new UpdateOneModel<>(new Document(UNDERSCORE_ID, 3),
                        new Document().append("$set",
                                new BasicDBObject("postDate", new Date()))),
                // Removing a field from a document
                new UpdateOneModel<>(new Document(Fields.UNDERSCORE_ID, 4),
                        Updates.unset("secondary-email")),
                // Deleting a document
                new DeleteOneModel<>(new Document(Fields.UNDERSCORE_ID, 2)),
                // Replacing a document
                new ReplaceOneModel<>(new Document(Fields.UNDERSCORE_ID, 3),
                        new Document(Fields.UNDERSCORE_ID, 3)
                                .append("secondary-email", "Eliseo-updated@gardner.biz")));

        // By default bulk write is ordered because the BulkWriteOptions' ordered flag
        // is true by default; by disabling that flag we can perform an unordered bulk operation.

        // 1. Ordered execution
        postsCollection.bulkWrite(list);

        // 2. Unordered bulk operation - no guarantee of the order of operations -
        //    disable the BulkWriteOptions' ordered flag
        postsCollection.bulkWrite(list, new BulkWriteOptions().ordered(false));
    }
}

Related

Create array from multiple responses

I have web APIs which can create and delete objects; however, to delete an object I need to use its Id, which is generated when I create the object (I get the new object back as a JSON response).
The URL of the delete method is .../delete/{id}.
My question is: how can I put this Id into an array (I know how to put the id into a variable using regEx) and then use the values in the array in the URL of the delete method, so that I can create multiple objects in a row and then delete them?
Let's say you have an extractor that extracts the id into an id variable.
Add after it a JSR223 PostProcessor with the following code:
import java.util.List;
import java.util.ArrayList;

def id = vars["id"];
List<String> listIds = (List<String>) vars.getObject("listIds");
if (listIds == null) {
    listIds = new ArrayList<String>();
    vars.putObject("listIds", listIds);
}
listIds.add(id);
Then, at the place where you want to make the calls over the array, add a:
Flow Control Action
Add to it as a child a JSR223 PreProcessor with the following code:
import java.util.List;

// The ids are already Strings, so they can be stored directly;
// JMeter variables must be String values, hence the Integer.toString for the count.
List<String> listIds = (List<String>) vars.getObject("listIds");
vars.put("ids_matchNr", Integer.toString(listIds.size()));
listIds.eachWithIndex { it, index ->
    vars.put("ids_" + (index + 1), it);
}
After the Flow Control Action, add a ForEach Controller with the following configuration:
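The configuration screenshot from the original answer is not reproduced here, but given the variable names the pre-processor writes (ids_1, ids_2, ..., ids_matchNr), the ForEach Controller would typically be set up with Input variable prefix ids, the "Add '_' before number" option checked, and an output variable name of your choosing, e.g. currentId. The delete request placed under the ForEach Controller can then use .../delete/${currentId} as its path.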

How do I access the explain() method and executionStats when using Spring Data MongoDb v2.x?

It's time to ask the community. I cannot find the answer anywhere.
I want to create a generic method that can trace all my repository queries and warn me if a query is not optimized (aka missing an index).
With Spring Data MongoDb v2.x and higher and with the introduction of the Document API, I cannot figure out how to access DBCursor and the explain() method.
The old way was to do it like this:
https://enesaltinkaya.com/java/how-to-explain-a-mongodb-query-in-spring/
Any advice on this is appreciated.
I know this is an old question, but I wanted to give input from a similar requirement I had when doing capacity planning for a Cosmos DB project using the Java Mongo API driver v2.x.
Summarizing Enes Altınkaya's blog post: with an @Autowired MongoTemplate we use runCommand to execute server-side db queries by passing a Document object. To get an explain output, we parse a Query or Aggregation object into a new Document object and add the entry {"executionStats": true} (or {"executionStatistics": true} for Cosmos DB), then wrap it in another Document using "explain" as the property.
For Example:
Query:
public static Document documentRequestStatsQuery(MongoTemplate mongoTemplate,
                                                 Query query, String collectionName) {
    Document queryDocument = new Document();
    queryDocument.put("find", collectionName);
    queryDocument.put("filter", query.getQueryObject());
    queryDocument.put("sort", query.getSortObject());
    queryDocument.put("skip", query.getSkip());
    queryDocument.put("limit", query.getLimit());
    queryDocument.put("executionStatistics", true);

    Document command = new Document();
    command.put("explain", queryDocument);
    Document explainResult = mongoTemplate.getDb().runCommand(command);
    return explainResult;
}
Aggregate:
public static Document documentRequestStatsAggregate(MongoTemplate mongoTemplate,
                                                     Aggregation aggregate, String collection) {
    Document explainAggDocument = Document.parse(aggregate.toString());
    explainAggDocument.put("aggregate", collection);
    explainAggDocument.put("executionStatistics", true);

    Document command = new Document();
    command.put("explain", explainAggDocument);
    Document explainResult = mongoTemplate.getDb().runCommand(command);
    return explainResult;
}
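A hypothetical call site for the query variant could look like this; the collection and field names are illustrative only:

// Illustrative usage: run explain with execution statistics for a simple query.
Query query = new Query(Criteria.where("status").is("ACTIVE"));
Document stats = documentRequestStatsQuery(mongoTemplate, query, "orders");
System.out.println(stats.toJson());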
For the actual monitoring, since Service and Repository classes are abstractions over MongoTemplate, we can use Aspects to capture the query/aggregation execution details while the application is running.
For Example:
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.After;
import org.aspectj.lang.annotation.Aspect;
import org.bson.Document;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.stereotype.Component;

import lombok.extern.slf4j.Slf4j;

@Aspect
@Component
@Slf4j
public class RequestStats {

    @Autowired
    MongoTemplate mongoTemplate;

    @After("execution(* org.springframework.data.mongodb.core.MongoTemplate.aggregate(..))")
    public void logTemplateAggregate(JoinPoint joinPoint) {
        Object[] signatureArgs = joinPoint.getArgs();
        Aggregation aggregate = (Aggregation) signatureArgs[0];
        String collectionName = (String) signatureArgs[1];

        Document explainAggDocument = Document.parse(aggregate.toString());
        explainAggDocument.put("aggregate", collectionName);
        explainAggDocument.put("executionStatistics", true);

        Document dbCommand = new Document();
        dbCommand.put("explain", explainAggDocument);
        Document explainResult = mongoTemplate.getDb().runCommand(dbCommand);
        log.info(explainResult.toJson());
    }
}
This outputs something like the following after each execution:
{
  "queryMetrics": {
    "retrievedDocumentCount": 101,
    "retrievedDocumentSizeBytes": 202214,
    "outputDocumentCount": 101,
    "outputDocumentSizeBytes": 27800,
    "indexHitRatio": 1.0,
    "totalQueryExecutionTimeMS": 15.85,
    "queryPreparationTimes": {
      "queryCompilationTimeMS": 0.21,
      "logicalPlanBuildTimeMS": 0.5,
      "physicalPlanBuildTimeMS": 0.58,
      "queryOptimizationTimeMS": 0.1
    },
    "indexLookupTimeMS": 10.43,
    "documentLoadTimeMS": 0.93,
    "vmExecutionTimeMS": 13.6,
    "runtimeExecutionTimes": {
      "queryEngineExecutionTimeMS": 1.56,
      "systemFunctionExecutionTimeMS": 1.36,
      "userDefinedFunctionExecutionTimeMS": 0
    },
    "documentWriteTimeMS": 0.68
  }
  // ...
I usually log this out into another collection or write it to a file.
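
If you go the separate-collection route, a minimal sketch might be the following; the query_execution_stats collection name is an assumption:

// Hypothetical: persist each explain output into a dedicated stats collection.
mongoTemplate.getCollection("query_execution_stats")
        .insertOne(new Document("capturedAt", new Date()).append("explain", explainResult));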

How to pass a DataProvider to a test in TestNG when the dataset has data not specific to that test case

I am trying to build a Selenium hybrid framework using TestNG, in which I get data from my Excel datasheet. I am trying to use TestNG's DataProvider, but the problem is that my datasheet contains data belonging to different test cases (e.g. 2 rows for add user, 1 row for modify user, some rows for searching a user, etc.).
Since my DataProvider returns all the data from the datasheet, passing it to any particular test case will run that test for every row of the DataProvider, which will cause problems (e.g. create user needs 5 parameters, but the data for edit user will not be sufficient for it).
How can we handle this problem?
Here's how you do this:
Within your .xls file, create a sheet which represents a particular functionality. (For e.g, login, compose, address-book etc., if I were to be taking the example of an emailing application)
Now each sheet would have test data for various test cases, that test out that particular functionality.
For your @Test method, you can create a new custom annotation (this would be a marker annotation) which indicates the "sheet" name from which the data provider should retrieve data. If you are not keen on creating a new custom annotation, you can make use of the "description" attribute of the @Test annotation to capture this information.
TestNG can natively inject a Method object into your @DataProvider-annotated method. The injected Method object represents the @Test method for which the data provider is about to be invoked. So you can retrieve the sheet name, either from the new custom annotation or from the description attribute of the @Test annotation, to figure out which sheet to query for data.
That should solve your issue.
Here's a sample that demonstrates the overall idea. You would need to enrich the data provider, such that it uses the sheet name to query data from the excel spreadsheet. My sample just excludes all of that, for the sake of demonstration.
import java.lang.annotation.Retention;
import java.lang.annotation.Target;

import static java.lang.annotation.ElementType.METHOD;

@Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
@Target({METHOD})
public @interface SheetName {
    String value() default "";
}

import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;

import java.lang.reflect.Method;

public class TestClass {

    @Test(dataProvider = "dp")
    @SheetName("one")
    public void test1(String name) {
        System.err.println("Name is " + name);
    }

    @Test(dataProvider = "dp")
    @SheetName("two")
    public void test2(int age) {
        System.err.println("Age is " + age);
    }

    @DataProvider(name = "dp")
    public Object[][] getData(Method method) {
        String sheetName = getSheetName(method);
        if (sheetName == null) {
            // Handle the case wherein our custom annotation is missing. That means the test
            // perhaps expects all of the data, or it could be an error case.
            return new Object[][] {{}};
        }
        if ("one".equalsIgnoreCase(sheetName)) {
            return new Object[][] {{"Cedric"}, {"Beust"}};
        }
        if ("two".equalsIgnoreCase(sheetName)) {
            return new Object[][] {{1}, {2}};
        }
        // Handle the case wherein we had a valid sheet name, but it represents a sheet
        // that can't be found in our excel spreadsheet.
        return new Object[][] {{}};
    }

    private String getSheetName(Method method) {
        SheetName sheetName = method.getAnnotation(SheetName.class);
        if (sheetName == null || sheetName.value().trim().isEmpty()) {
            return null;
        }
        return sheetName.value();
    }
}
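To actually enrich the data provider so it reads from the spreadsheet, one option is Apache POI (classes from org.apache.poi.ss.usermodel). Below is a rough sketch under the assumptions that the workbook lives at src/test/resources/testdata.xls and that every row of a sheet is one test invocation with one parameter per cell; neither assumption comes from the original answer:

// Rough sketch: turn one sheet into an Object[][] for the data provider.
private Object[][] readSheet(String sheetName) throws Exception {
    try (Workbook workbook = WorkbookFactory.create(new File("src/test/resources/testdata.xls"))) {
        Sheet sheet = workbook.getSheet(sheetName);
        DataFormatter formatter = new DataFormatter();
        List<Object[]> rows = new ArrayList<>();
        for (Row row : sheet) {
            List<Object> cells = new ArrayList<>();
            for (Cell cell : row) {
                // formatCellValue returns the cell content as displayed, regardless of cell type.
                cells.add(formatter.formatCellValue(cell));
            }
            rows.add(cells.toArray());
        }
        return rows.toArray(new Object[0][]);
    }
}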

How do I set up a streamed set of SQL inserts in Apache Camel

I have a file with over 3 million pipe-delimited rows that I want to insert into a database. It's a simple table (no normalisation required).
Setting up the route to watch for the file, read it in streaming mode and split the lines is easy. Inserting the rows into the table will also be a simple wiring job.
The question is: how can I do this using batched inserts? Let's say that 1000 rows is optimal. Given that the file is streamed, how would the SQL component know that the stream had finished? Say the file had 3,000,001 records; how can I set Camel up to insert the last stray record?
Inserting the lines one at a time can be done, but this will be horribly slow.
I would recommend something like this:
from("file:....")
.split("\n").streaming()
.to("any work for individual level")
.aggregate(body(), new MyAggregationStrategy().completionSize(1000).completionTimeout(50)
.to(sql:......);
I didn't validate all the syntax, but the plan is to grab the file, split it with streaming, then aggregate groups of 1000 and use a timeout to catch that last, smaller group. Those aggregated groups could simply make the body a list of strings, or whatever format you need for your batch SQL insert.
Here is a more accurate example:
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// On Camel 2.x the interface lives in org.apache.camel.processor.aggregate.AggregationStrategy.
import org.apache.camel.AggregationStrategy;
import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import lombok.extern.slf4j.Slf4j;

@Component
@Slf4j
public class SQLRoute extends RouteBuilder {

    @Autowired
    ListAggregationStrategy aggregationStrategy;

    @Override
    public void configure() throws Exception {
        from("timer://runOnce?repeatCount=1&delay=0")
            .to("sql:classpath:sql/orders.sql?outputType=StreamList")
            .split(body()).streaming()
                .aggregate(constant(1), aggregationStrategy).completionSize(1000).completionTimeout(500)
                    .to("log:batch")
                    .to("google-bigquery:google_project:import:orders")
                .end()
            .end();
    }

    @Component
    static class ListAggregationStrategy implements AggregationStrategy {

        public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
            List rows;
            if (oldExchange == null) {
                // First row of a new batch
                rows = new LinkedList();
                rows.add(newExchange.getMessage().getBody());
                newExchange.getMessage().setBody(rows);
                return newExchange;
            }
            rows = oldExchange.getIn().getBody(List.class);
            Map newRow = newExchange.getIn().getBody(Map.class);
            log.debug("Current rows count: {}", rows.size());
            log.debug("Adding new row: {}", newRow);
            rows.add(newRow);
            oldExchange.getIn().setBody(rows);
            return oldExchange;
        }
    }
}
This can also be done using the Camel Spring Batch component: http://camel.apache.org/springbatch.html. The volume of commits per step can be defined by the commitInterval, and the orchestration of the job is defined in a Spring config. It works quite well for use cases similar to your requirement.
Here's a nice example on GitHub: https://github.com/hekonsek/fuse-pocs/tree/master/fuse-pocs-springdm-springbatch/fuse-pocs-springdm-springbatch-bundle/src/main

Spring Data MongoDB support for bulk insert/save

I have been googling for a while and am not sure whether Spring Data MongoDB supports bulk save.
I need to save a collection of documents into Mongo atomically: either all are saved or none are.
Can anyone share a link or some sample code for this?
When you do a save through the MongoDB Java driver, you can only pass a single document to MongoDB.
When you do an insert, you can pass a single element or an array of elements. The latter is what results in a "bulk insert" (i.e. a single insert command from the client results in multiple documents being inserted on the server).
However, since MongoDB does not support the notion of a transaction, if one of the inserts fails there is no way to indicate that previously inserted documents should be deleted or rolled back.
For the purposes of atomicity, each document insert is a separate operation and there is no supported way to make MongoDB insert either all or none.
If this is something that your application requires, there may be other ways to achieve it:
- change your schema so that these are subdocuments of a single parent document (then there is technically only one "insert" of the parent document)
- write the transaction semantics into your application code
- use a database which natively supports two-phase commit transactions
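
For the bulk-insert part itself (with no atomicity guarantee), Spring Data's MongoTemplate can send a whole collection of documents in a single insert command. A minimal sketch, with the Employee document class and its constructor being assumptions:

// Minimal sketch: one insert command carrying many documents (not transactional).
List<Employee> employees = Arrays.asList(
        new Employee("emp-1", "Alice"),
        new Employee("emp-2", "Bob"));
mongoTemplate.insert(employees, Employee.class);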
We have used Spring Data and Mongo Driver to achieve copying data from one database server to another.
import com.mongodb.MongoClient;
import com.mongodb.bulk.BulkWriteResult;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.InsertOneModel;
import com.mongodb.client.model.WriteModel;
import org.bson.Document;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.stereotype.Component;

import java.util.List;
import java.util.stream.Collectors;

import lombok.extern.slf4j.Slf4j;

@Component
@Slf4j
public class DataCopy {

    public void copyData(MongoTemplate sourceMongo, MongoTemplate destinationMongo) {
        Class<EmployeeEntity> cls = EmployeeEntity.class;
        String collectionName = sourceMongo.getCollectionName(cls);
        MongoCollection<Document> collection = destinationMongo.getCollection(collectionName);

        Query findQuery = new Query();
        Criteria criteria = new Criteria();
        criteria.andOperator(Criteria.where("firstName").is("someName"),
                Criteria.where("lastName").is("surname"));
        findQuery.addCriteria(criteria);

        Pageable pageable = PageRequest.of(0, 10000);
        findQuery.with(pageable);
        List<?> pagedResult = sourceMongo.find(findQuery, cls);
        while (!pagedResult.isEmpty()) {
            try {
                BulkWriteResult result = collection.bulkWrite(
                        pagedResult.stream()
                                .map(d -> mapWriteModel(d, destinationMongo))
                                .collect(Collectors.toList()),
                        new BulkWriteOptions().ordered(false));
            } catch (Exception e) {
                log.error("failed to copy", e);
            }
            pageable = pageable.next();
            findQuery.with(pageable);
            pagedResult = sourceMongo.find(findQuery, cls);
        }
    }

    private WriteModel<? extends Document> mapWriteModel(Object obj, MongoTemplate mongoTemplate) {
        Document document = new Document();
        mongoTemplate.getConverter().write(obj, document);
        return new InsertOneModel<>(document);
    }
}

// Code example to create mongo templates for the source and target databases
MongoClient targetClient = new MongoClient("databaseUri");
MongoTemplate destinationMongo = new MongoTemplate(targetClient, "databaseName");
Hope this would be helpful to you.
