Does Spring Data MongoDB support bulk insert/save?

I have been googling for a while, but I'm not sure whether Spring Data MongoDB supports bulk save.
I need to save a collection of documents into MongoDB atomically: either all are saved or none are.
Can anyone share a link or some sample code for this?

When you do a save through the MongoDB Java driver, you can only pass a single document to MongoDB.
When you do an insert, you can pass a single element or an array of elements. The latter is what results in a "bulk insert" (i.e. a single insert command from the client inserts multiple documents on the server).
However, since MongoDB does not support a notion of transactions, if one of the inserts fails there is no way to indicate that the previously inserted documents should be deleted or rolled back.
For the purposes of atomicity, each document insert is a separate operation, and there is no supported way to make MongoDB insert either all or none.
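For illustration, here is a minimal sketch of such a batched insert through Spring Data's MongoTemplate (Person is a hypothetical mapped entity; the whole batch goes to the server as one insert command, but, as explained above, not atomically):

import java.util.List;
import org.springframework.data.mongodb.core.MongoTemplate;

public class BatchInsertSketch {

    // Sends the whole batch in a single insert command.
    // Not atomic: a failure mid-batch leaves the earlier documents in place.
    public static void insertBatch(MongoTemplate mongoTemplate, List<Person> people) {
        mongoTemplate.insert(people, Person.class);
    }
}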
If this is something that your application requires, there may be other ways to achieve it:
- change your schema so that these are subdocuments of a single parent document (then there is technically only one "insert", of the parent document)
- write the transaction semantics into your application code
- use a database that natively supports two-phase commit transactions.

We have used Spring Data and the MongoDB driver to copy data from one database server to another:
import com.mongodb.MongoClient;
import com.mongodb.bulk.BulkWriteResult;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.InsertOneModel;
import com.mongodb.client.model.WriteModel;
import org.bson.Document;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.stereotype.Component;

import java.util.List;
import java.util.stream.Collectors;

@Component
public class DataCopy {

    private static final Logger log = LoggerFactory.getLogger(DataCopy.class);

    public void copyData(MongoTemplate sourceMongo, MongoTemplate destinationMongo) {
        Class<EmployeeEntity> cls = EmployeeEntity.class;
        String collectionName = sourceMongo.getCollectionName(cls);
        MongoCollection<Document> collection = destinationMongo.getCollection(collectionName);

        Query findQuery = new Query();
        Criteria criteria = new Criteria();
        criteria.andOperator(Criteria.where("firstName").is("someName"),
                Criteria.where("lastName").is("surname"));
        findQuery.addCriteria(criteria);

        Pageable pageable = PageRequest.of(0, 10000);
        findQuery.with(pageable);
        List<EmployeeEntity> pagedResult = sourceMongo.find(findQuery, cls);
        while (!pagedResult.isEmpty()) {
            try {
                // Unordered bulk write: keeps going past individual failures.
                BulkWriteResult result = collection.bulkWrite(
                        pagedResult.stream()
                                .map(d -> mapWriteModel(d, destinationMongo))
                                .collect(Collectors.toList()),
                        new BulkWriteOptions().ordered(false));
            } catch (Exception e) {
                log.error("failed to copy", e);
            }
            pageable = pageable.next();
            findQuery.with(pageable);
            pagedResult = sourceMongo.find(findQuery, cls);
        }
    }

    private WriteModel<? extends Document> mapWriteModel(Object obj, MongoTemplate mongoTemplate) {
        // Convert the mapped entity into a raw Document for the driver's bulk API.
        Document document = new Document();
        mongoTemplate.getConverter().write(obj, document);
        return new InsertOneModel<>(document);
    }
}

// Code example to create mongo templates for source and target databases
MongoClient targetClient = new MongoClient("databaseUri");
MongoTemplate destinationMongo = new MongoTemplate(targetClient, "databaseName");
Hope this is helpful to you.

Related

Create array from multiple responses

I have web APIs that can create and delete objects; however, to delete an object I need its Id, which is generated when I create the object (I get the new object back as a JSON response).
The URL of the delete method is .../delete/{id}.
My question is: how can I put this Id into an array (I know how to put it into a variable using regEx) and then use the values in the array in the URL of the delete method, so that I can create multiple objects in a row and then delete them?
Let's say you have an extractor that extracts the id into an id variable.
Add after it a JSR223 PostProcessor with the following code:
import java.util.List;
import java.util.ArrayList;
def id = vars.get("id");
List<String> listIds = (List<String>) vars.getObject("listIds");
if (listIds == null) {
    listIds = new ArrayList<String>();
    vars.putObject("listIds", listIds);
}
listIds.add(id);
Then, at the place where you want to make the calls over the array, add a:
Flow Control Action
Add to it as a child a JSR223 PreProcessor with the following code:
import java.util.List;
List<String> listIds = (List<String>) vars.getObject("listIds");
vars.put("ids_matchNr", String.valueOf(listIds.size()));
listIds.eachWithIndex { value, index ->
    vars.put("ids_" + (index + 1), value);
}
After the Flow Control Action, add a ForEach Controller configured to loop over the ids_ variables: input variable prefix ids, the "Add '_' before number" option checked, and an output variable (e.g. a hypothetical returnedId) that the delete request then references in its URL as .../delete/${returnedId}.

Spring Data MongoDB Bulk Operation Example

Can someone please point me to a complete example of a Spring Data MongoDB bulk operation?
I am trying to switch to bulk updates using Spring Data MongoDB, but I am not able to find a good example.
Thank you.
BulkOperations in Spring Data MongoDB uses MongoDB's bulkWrite() underneath (see the excerpt from the MongoDB documentation further below).
So when you want to update many entities with different updates in one round trip, you can do that via bulkOps.
Let us see an example, even though it may not be a perfect one. Let's say you have an Employee collection for the employees of a company. After an appraisal there will be a change in salary for all the employees, and each employee's change will be different; let's pretend there is no percentage-wise hike involved. If you want to apply the changes in one go, you can use bulkOps.
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.BulkOperations;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.data.util.Pair;
public class Example {

    @Autowired
    MongoTemplate mongoTemplate;

    public int bulkUpdateEmployee(List<Pair<Query, Update>> updates) {
        return mongoTemplate
                .bulkOps(BulkOperations.BulkMode.UNORDERED, "employees", Employee.class)
                .updateMulti(updates)
                .execute()
                .getModifiedCount();
    }
}
Here we can prepare the pairs of query and update: for each employee, the query matches the employee's id and the update sets the salary to the new value.
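A minimal sketch of building those pairs (the newSalaries map of employee id to new salary is a hypothetical input):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.data.util.Pair;

public class SalaryUpdates {

    // newSalaries: employee _id -> new salary.
    public static List<Pair<Query, Update>> toPairs(Map<String, Double> newSalaries) {
        return newSalaries.entrySet().stream()
                .map(e -> Pair.of(
                        new Query(Criteria.where("_id").is(e.getKey())),
                        new Update().set("salary", e.getValue())))
                .collect(Collectors.toList());
    }
}

These pairs can then be passed straight to bulkUpdateEmployee above.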
Sharing the code for bulk operations that worked for me:

BulkOperations bulkOps = mongoTemplate.bulkOps(BulkMode.UNORDERED, Person.class);
for (Person person : personList) {
    Query query = new Query().addCriteria(new Criteria("id").is(person.getId()));
    Update update = new Update().set("address", "new Address as per requirement");
    bulkOps.updateOne(query, update);
}
BulkWriteResult results = bulkOps.execute();
I think the following code is a simple example that anybody can understand.
Note: ensure that your custom Mongo repository is correctly configured.

@Autowired
MongoTemplate mongoTemplate;

public int bulkUpdate(String member) {
    Query query = new Query();
    Criteria criteria = Criteria.where("column name").is(member);
    query.addCriteria(criteria);
    Update update = new Update();
    update.set("column name", true);
    return mongoTemplate
            .bulkOps(BulkOperations.BulkMode.UNORDERED, YourModelClass.class, "name of collection")
            .updateMulti(query, update)
            .execute()
            .getModifiedCount();
}
There are some elegant ways to perform bulk operations in Spring Data MongoDB; refer to the MongoDB bulk-write documentation.
An excerpt from the reference:
Starting in version 2.6, MongoDB servers support bulk write commands for insert, update, and delete in a way that allows the driver to implement the correct semantics for BulkWriteResult and BulkWriteException.
There are two types of bulk operations, ordered and unordered bulk operations.
Ordered bulk operations execute all the operations in order and error out on the first write error.
Unordered bulk operations execute all the operations and report any errors. Unordered bulk operations do not guarantee the order of execution.
Here is a sample bulk operation covering most of the features:
import com.mongodb.BasicDBObject;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.DeleteOneModel;
import com.mongodb.client.model.InsertOneModel;
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.UpdateOneModel;
import com.mongodb.client.model.Updates;
import com.mongodb.client.model.WriteModel;
import org.bson.Document;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.stereotype.Repository;

import java.util.Arrays;
import java.util.Date;
import java.util.List;

import static org.springframework.data.mongodb.core.aggregation.Fields.UNDERSCORE_ID;

@Repository
public class BulkUpdateDemoRepository {

    @Autowired
    private MongoOperations mongoOperations;

    public void bulkUpdate() {
        MongoCollection<Document> postsCollection = mongoOperations.getCollection("posts");

        // If no top-level _id field is specified in the documents, the Java driver
        // automatically adds the _id field to the inserted documents.
        Document document = new Document("postId", 1)
                .append("name", "id labore ex et quam laborum")
                .append("email", "Eliseo@gardner.biz")
                .append("secondary-email", "Eliseo@gardner.biz")
                .append("body", "laudantium enim quasi est");

        List<WriteModel<Document>> list = Arrays.asList(
                // Inserting a document
                new InsertOneModel<>(document),
                // Adding a field to a document
                new UpdateOneModel<>(new Document(UNDERSCORE_ID, 3),
                        new Document().append("$set",
                                new BasicDBObject("postDate", new Date()))),
                // Removing a field from a document
                new UpdateOneModel<>(new Document(UNDERSCORE_ID, 4),
                        Updates.unset("secondary-email")),
                // Deleting a document
                new DeleteOneModel<>(new Document(UNDERSCORE_ID, 2)),
                // Replacing a document
                new ReplaceOneModel<>(new Document(UNDERSCORE_ID, 3),
                        new Document(UNDERSCORE_ID, 3)
                                .append("secondary-email", "Eliseo-updated@gardner.biz")));

        // Bulk writes are ordered by default (the BulkWriteOptions' ordered flag is
        // true by default); disabling that flag performs an unordered bulk operation.

        // 1. Ordered execution
        postsCollection.bulkWrite(list);

        // 2. Unordered bulk operation - no guarantee of the order of operations
        postsCollection.bulkWrite(list, new BulkWriteOptions().ordered(false));
    }
}

How do I access the explain() method and executionStats when using Spring Data MongoDb v2.x?

It's time to ask the community; I cannot find the answer anywhere.
I want to create a generic method that can trace all my repository queries and warn me if a query is not optimized (i.e. missing an index).
With Spring Data MongoDB v2.x and higher, and the introduction of the Document API, I cannot figure out how to access DBCursor and the explain() method.
The old way was to do it like this:
https://enesaltinkaya.com/java/how-to-explain-a-mongodb-query-in-spring/
Any advice on this is appreciated.
I know this is an old question, but I wanted to give input from a similar requirement I had when capacity planning for a Cosmos DB project using the Java Mongo API driver v2.X.
To summarize Enes Altınkaya's blog post: with an @Autowired MongoTemplate, we use runCommand to execute server-side db queries by passing a Document object. To get an explain output, we parse a Query or Aggregation object into a new Document, add the entry {"executionStats": true} (or {"executionStatistics": true} for Cosmos DB), and then wrap it in another Document with "explain" as the property.
For example:
Query:
public static Document documentRequestStatsQuery(MongoTemplate mongoTemplate,
                                                 Query query, String collectionName) {
    Document queryDocument = new Document();
    queryDocument.put("find", collectionName);
    queryDocument.put("filter", query.getQueryObject());
    queryDocument.put("sort", query.getSortObject());
    queryDocument.put("skip", query.getSkip());
    queryDocument.put("limit", query.getLimit());
    queryDocument.put("executionStatistics", true);

    Document command = new Document();
    command.put("explain", queryDocument);
    Document explainResult = mongoTemplate.getDb().runCommand(command);
    return explainResult;
}
Aggregate:
public static Document documentRequestStatsAggregate(MongoTemplate mongoTemplate,
                                                     Aggregation aggregate, String collection) {
    Document explainAggDocument = Document.parse(aggregate.toString());
    explainAggDocument.put("aggregate", collection);
    explainAggDocument.put("executionStatistics", true);

    Document command = new Document();
    command.put("explain", explainAggDocument);
    Document explainResult = mongoTemplate.getDb().runCommand(command);
    return explainResult;
}
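A minimal usage sketch for the query helper above (the mongoTemplate instance, the "employees" collection, and the lastName filter are all illustrative):

// Illustrative usage: explain a simple find and print the stats.
Query q = new Query(Criteria.where("lastName").is("Smith"));
Document stats = documentRequestStatsQuery(mongoTemplate, q, "employees");
System.out.println(stats.toJson());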
For the actual monitoring, since Service and Repository classes are abstractions over MongoTemplate, we can use aspects to capture the query/aggregation execution details as the application is running.
For example:
import lombok.extern.slf4j.Slf4j;
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.After;
import org.aspectj.lang.annotation.Aspect;
import org.bson.Document;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.stereotype.Component;

@Aspect
@Component
@Slf4j
public class RequestStats {

    @Autowired
    MongoTemplate mongoTemplate;

    @After("execution(* org.springframework.data.mongodb.core.MongoTemplate.aggregate(..))")
    public void logTemplateAggregate(JoinPoint joinPoint) {
        Object[] signatureArgs = joinPoint.getArgs();
        Aggregation aggregate = (Aggregation) signatureArgs[0];
        String collectionName = (String) signatureArgs[1];

        Document explainAggDocument = Document.parse(aggregate.toString());
        explainAggDocument.put("aggregate", collectionName);
        explainAggDocument.put("executionStatistics", true);

        Document dbCommand = new Document();
        dbCommand.put("explain", explainAggDocument);
        Document explainResult = mongoTemplate.getDb().runCommand(dbCommand);
        log.info(explainResult.toJson());
    }
}
Outputs something like below after each execution:
{
  "queryMetrics": {
    "retrievedDocumentCount": 101,
    "retrievedDocumentSizeBytes": 202214,
    "outputDocumentCount": 101,
    "outputDocumentSizeBytes": 27800,
    "indexHitRatio": 1.0,
    "totalQueryExecutionTimeMS": 15.85,
    "queryPreparationTimes": {
      "queryCompilationTimeMS": 0.21,
      "logicalPlanBuildTimeMS": 0.5,
      "physicalPlanBuildTimeMS": 0.58,
      "queryOptimizationTimeMS": 0.1
    },
    "indexLookupTimeMS": 10.43,
    "documentLoadTimeMS": 0.93,
    "vmExecutionTimeMS": 13.6,
    "runtimeExecutionTimes": {
      "queryEngineExecutionTimeMS": 1.56,
      "systemFunctionExecutionTimeMS": 1.36,
      "userDefinedFunctionExecutionTimeMS": 0
    },
    "documentWriteTimeMS": 0.68
  }
  // ...
I usually log this out into another collection or write it to a file.
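For the collection option, a one-line sketch (the "query_stats" collection name is illustrative):

// Persist each explain output for later analysis.
mongoTemplate.getDb().getCollection("query_stats").insertOne(explainResult);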

How to pass a DataProvider to a test in TestNG when the dataset has data not specific to that test case

I am trying to build a Selenium hybrid framework using TestNG, wherein I get data from my Excel datasheet. I am trying to use TestNG's DataProvider, but the problem is that my datasheet contains data belonging to different test cases (e.g. 2 rows for add user, 1 row for modify user, some rows for searching users, etc.).
Since my DataProvider returns all the data from the datasheet, passing it to a particular test case, which then runs for every row of the DataProvider, will cause problems (e.g. create user needs 5 parameters, but the data for edit user will not be sufficient for it).
How can we handle this problem?
Here's how you do this:
Within your .xls file, create a sheet that represents a particular functionality (e.g. login, compose, address-book, if I were to take the example of an emailing application).
Now each sheet would have test data for the various test cases that test out that particular functionality.
For your @Test method, you can create a new custom annotation (a marker annotation) that indicates the "sheet" name from which the data provider should retrieve data. If you are not keen on creating a new custom annotation, you can use the "description" attribute of the @Test annotation to capture this information.
TestNG can natively inject a Method object into your @DataProvider-annotated method. Here the injected Method object represents the @Test method for which the data provider is about to be invoked. So you can retrieve the sheet name, either from the new custom annotation or from the description attribute of the @Test annotation, to figure out which sheet to query for data.
That should solve your issue.
Here's a sample that demonstrates the overall idea. You would need to enrich the data provider so that it uses the sheet name to query data from the Excel spreadsheet; my sample excludes all of that for the sake of demonstration.
import java.lang.annotation.Retention;
import java.lang.annotation.Target;

import static java.lang.annotation.ElementType.METHOD;

@Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
@Target({METHOD})
public @interface SheetName {
    String value() default "";
}

import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;

import java.lang.reflect.Method;

public class TestClass {

    @Test(dataProvider = "dp")
    @SheetName("one")
    public void test1(String name) {
        System.err.println("Name is " + name);
    }

    @Test(dataProvider = "dp")
    @SheetName("two")
    public void test2(int age) {
        System.err.println("Age is " + age);
    }

    @DataProvider(name = "dp")
    public Object[][] getData(Method method) {
        String sheetName = getSheetName(method);
        if (sheetName == null) {
            // Handle the case wherein our custom annotation is missing. That means
            // the test perhaps expects all of the data, or it could be an error case.
            return new Object[][] {{}};
        }
        if ("one".equalsIgnoreCase(sheetName)) {
            return new Object[][] {{"Cedric"}, {"Beust"}};
        }
        if ("two".equalsIgnoreCase(sheetName)) {
            return new Object[][] {{1}, {2}};
        }
        // Handle the case wherein we had a valid sheet name, but it represents a
        // sheet that can't be found in our excel spreadsheet.
        return new Object[][] {{}};
    }

    private String getSheetName(Method method) {
        SheetName sheetName = method.getAnnotation(SheetName.class);
        if (sheetName == null || sheetName.value().trim().isEmpty()) {
            return null;
        }
        return sheetName.value();
    }
}

Add field to App Engine-hosted database

I'm currently developing a mobile application that uses a Google App Engine-hosted web service.
But I'm facing an issue: I just want to add a field to one of my database's tables.
App Engine doesn't use classic SQL syntax but GQL, so I cannot use the ALTER TABLE statement. How can I do this with GQL? I looked for a solution on the web, but there's not a lot of help.
public MyEntity() {
}

@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Key idStation;

private String name;
private double longitude;
private double latitude;
private java.util.Date dateRefresh = new Date(); // the field I want to add in the DB
So now, when I create a "MyEntity" object, it should add the "dateRefresh" field to the database. I create my object like this:
MyEntity station = new MyEntity();
station.setName("test");
station.setLatitude(0);
station.setLongitude(0);
station.setDateRefresh(new Date("01/01/1980"));
DaoFactory.getStationDao().addStation(station);
addStation method:
@Override
public MyEntity addStation(MyEntity station) {
    EntityManager em = PersistenceManager.getEntityManagerFactory().createEntityManager();
    try {
        em.getTransaction().begin();
        em.persist(station);
        em.getTransaction().commit();
    } finally {
        if (em.getTransaction().isActive()) {
            em.getTransaction().rollback();
        }
        em.close();
    }
    return station;
}
The field "dateRefresh" is never created in my DB.
Can someone help me, please?
Thanks in advance.
Just add another field to your data structure, maybe providing a default clause, and that's all. For example, if you have a UserAccount:
class UserAccount(db.Model):
    user = db.UserProperty()
    user_id = db.StringProperty()
you may easily add:
class UserAccount(db.Model):
    user = db.UserProperty()
    user_id = db.StringProperty()
    extra_info = db.IntegerProperty(default=0)
    timezone = db.StringProperty(default="UTC")
and let it go.
While the datastore kind of mimics tables, data is stored on a per-entity basis. There is no schema or table.
All you need to do is update your model class, and new entities will be saved with the new structure (fields).
Old entities and indexes, however, are not automatically updated; they still have the same fields they had when they were originally written to the datastore.
There are two ways to handle this. One is to make sure your code can handle situations where your new properties are missing, i.e. make sure no exceptions are thrown, or handle the exceptions properly when the properties are missing.
The second way is to write a little function (usually a MapReduce function) to update every entity with appropriate or null values for the new properties.
Note that indexes are not updated unless the entity is written, so if you add a new indexed property, old entities won't show up when you query on it. In that case, you must use the second method and update all the entities in the datastore so that they are indexed.
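A small sketch of that second approach, using the JPA classes from the question (fine for small datasets; at scale a real MapReduce job is the better fit; the new Date(0L) default is purely illustrative):

import java.util.Date;
import java.util.List;
import javax.persistence.EntityManager;

public class DateRefreshBackfill {

    // Rewrites every MyEntity so that old entities get a dateRefresh value
    // (and are re-indexed with the new property).
    public static void backfill() {
        EntityManager em = PersistenceManager.getEntityManagerFactory().createEntityManager();
        try {
            em.getTransaction().begin();
            List<MyEntity> stations =
                    em.createQuery("select s from MyEntity s", MyEntity.class).getResultList();
            for (MyEntity station : stations) {
                if (station.getDateRefresh() == null) {
                    station.setDateRefresh(new Date(0L)); // illustrative default
                }
            }
            em.getTransaction().commit();
        } finally {
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
            em.close();
        }
    }
}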
