How do I access the explain() method and executionStats when using Spring Data MongoDb v2.x? - spring-data-mongodb

It's time to ask the community. I cannot find the answer anywhere.
I want to create a generic method that can trace all my repository queries and warn me if a query is not optimized (aka missing an index).
With Spring Data MongoDb v2.x and higher and with the introduction of the Document API, I cannot figure out how to access DBCursor and the explain() method.
The old way was to do it like this:
https://enesaltinkaya.com/java/how-to-explain-a-mongodb-query-in-spring/
Any advice on this is appreciated.

I know this is an old question, but I wanted to give input from a similar requirement I had in capacity planning for a Cosmos DB project using the Java Mongo API driver v2.x.
To summarize Enes Altınkaya's blog post: with an @Autowired MongoTemplate, we use runCommand to execute server-side database commands by passing a Document object. To get explain output, we convert a Query or Aggregation object into a new Document and add the entry {"executionStats": true} (or {"executionStatistics": true} for Cosmos DB). We then wrap it in another Document using "explain" as the property.
For example:
Query:
public static Document documentRequestStatsQuery(MongoTemplate mongoTemplate,
Query query, String collectionName) {
Document queryDocument = new Document();
queryDocument.put("find", collectionName);
queryDocument.put("filter", query.getQueryObject());
queryDocument.put("sort", query.getSortObject());
queryDocument.put("skip", query.getSkip());
queryDocument.put("limit", query.getLimit());
queryDocument.put("executionStatistics", true);
Document command = new Document();
command.put("explain", queryDocument);
Document explainResult = mongoTemplate.getDb().runCommand(command);
return explainResult;
}
Aggregate:
public static Document documentRequestStatsAggregate(MongoTemplate mongoTemplate,
Aggregation aggregate, String collection) {
Document explainAggDocument = Document.parse(aggregate.toString());
explainAggDocument.put("aggregate", collection);
explainAggDocument.put("executionStatistics", true);
Document command = new Document();
command.put("explain", explainAggDocument);
Document explainResult = mongoTemplate.getDb().runCommand(command);
return explainResult;
}
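The helpers above set the Cosmos DB flag. The explain command documented for the MongoDB server itself takes a top-level verbosity field instead ("queryPlanner", "executionStats" or "allPlansExecution"). Here is a minimal sketch of that command shape. It uses plain java.util maps so that it runs without the driver on the classpath. The class and method names are illustrative, not part of Spring Data or the driver. The resulting map could be wrapped with new Document(map) before being passed to runCommand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical name, not a Spring Data or driver class).
final class ExplainCommands {

    private ExplainCommands() {}

    /**
     * Builds { explain: { find: ..., filter: ... }, verbosity: "executionStats" }.
     * Insertion order is preserved because the server treats the first key
     * of a command document as the command name.
     */
    static Map<String, Object> explainFind(String collection, Map<String, Object> filter) {
        Map<String, Object> find = new LinkedHashMap<>();
        find.put("find", collection);
        find.put("filter", filter);

        Map<String, Object> command = new LinkedHashMap<>();
        command.put("explain", find);
        command.put("verbosity", "executionStats");
        return command;
    }
}
```

LinkedHashMap is used rather than HashMap only to keep the key order stable; org.bson.Document preserves insertion order in the same way.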
For the actual monitoring, since Service and Repository classes are abstractions over MongoTemplate, we can use aspects to capture the query/aggregate execution details while the application is running.
For example:
@Aspect
@Component
@Slf4j
public class RequestStats {
@Autowired
MongoTemplate mongoTemplate;
@After("execution(* org.springframework.data.mongodb.core.MongoTemplate.aggregate(..))")
public void logTemplateAggregate(JoinPoint joinPoint) {
Object[] signatureArgs = joinPoint.getArgs();
Aggregation aggregate = (Aggregation) signatureArgs[0];
String collectionName = (String) signatureArgs[1];
Document explainAggDocument = Document.parse(aggregate.toString());
explainAggDocument.put("aggregate", collectionName);
explainAggDocument.put("executionStatistics", true);
Document dbCommand = new Document();
dbCommand.put("explain", explainAggDocument);
Document explainResult = mongoTemplate.getDb().runCommand(dbCommand);
log.info(explainResult.toJson());
}
}
This outputs something like the following after each execution:
{
"queryMetrics": {
"retrievedDocumentCount": 101,
"retrievedDocumentSizeBytes": 202214,
"outputDocumentCount": 101,
"outputDocumentSizeBytes": 27800,
"indexHitRatio": 1.0,
"totalQueryExecutionTimeMS": 15.85,
"queryPreparationTimes": {
"queryCompilationTimeMS": 0.21,
"logicalPlanBuildTimeMS": 0.5,
"physicalPlanBuildTimeMS": 0.58,
"queryOptimizationTimeMS": 0.1
},
"indexLookupTimeMS": 10.43,
"documentLoadTimeMS": 0.93,
"vmExecutionTimeMS": 13.6,
"runtimeExecutionTimes": {
"queryEngineExecutionTimeMS": 1.56,
"systemFunctionExecutionTimeMS": 1.36,
"userDefinedFunctionExecutionTimeMS": 0
},
"documentWriteTimeMS": 0.68
}
// ...
I usually log this out into another collection or write to file.
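To turn explain output into the warning the question asks for, one option on a stock MongoDB server is to look for a COLLSCAN stage in queryPlanner.winningPlan, which indicates a full collection scan. The sketch below assumes the standard MongoDB explain layout. The Cosmos DB queryMetrics output shown above has a different shape, where a low indexHitRatio would be the analogous signal. The class name is hypothetical. The code works on plain maps, which org.bson.Document implements.

```java
import java.util.List;
import java.util.Map;

// Illustrative helper (hypothetical name); assumes MongoDB's standard explain layout.
final class PlanInspector {

    private PlanInspector() {}

    /** Returns true if the winning plan contains a full collection scan. */
    static boolean usesCollectionScan(Map<String, ?> explainResult) {
        Object planner = explainResult.get("queryPlanner");
        if (!(planner instanceof Map)) {
            return false; // not a standard explain document
        }
        // Only the winning plan is inspected; rejected plans are ignored.
        return containsCollScan(((Map<?, ?>) planner).get("winningPlan"));
    }

    // Walks nested maps and lists looking for { stage: "COLLSCAN" }.
    private static boolean containsCollScan(Object node) {
        if (node instanceof Map) {
            Map<?, ?> map = (Map<?, ?>) node;
            if ("COLLSCAN".equals(map.get("stage"))) {
                return true;
            }
            for (Object value : map.values()) {
                if (containsCollScan(value)) {
                    return true;
                }
            }
        } else if (node instanceof List) {
            for (Object item : (List<?>) node) {
                if (containsCollScan(item)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Inside the aspect, the result of runCommand could be passed to usesCollectionScan and logged at warn level when it returns true.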


EF Core 3.1 Fail to query on Json Serialized Object

I used JSON serialization to store a list of ids in a field.
Model:
public class Video
{
public int Id { get; set; }
public string Name { get; set; }
public virtual IList<int> AllRelatedIds { get; set; }
}
Context:
modelBuilder.Entity<Video>(entity =>
{
entity.Property(p => p.AllRelatedIds).HasConversion(
v => JsonConvert.SerializeObject(v, new JsonSerializerSettings { NullValueHandling = NullValueHandling.Ignore }),
v => JsonConvert.DeserializeObject<IList<int>>(v, new JsonSerializerSettings { NullValueHandling = NullValueHandling.Ignore })
);
});
It works fine: adding, editing, and deleting items is easy, and in the SQL database the value is stored as JSON like
[11000,12000,13000]
Everything is fine, but as soon as I want to query on this list I get weird responses.
Where:
_context.Set<Video>().Where(t => t.AllRelatedIds.Contains(11000)) returns null; however, if I return all AllRelatedIds items, some records do contain the value 11000.
Count:
_context.Set<Video>().Count(t => t.AllRelatedIds.Contains(11000)) fails with: could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to either AsEnumerable(), AsAsyncEnumerable(), ToList(), or ToListAsync().
What's the matter with EF Core? I even tested t => t.AllRelatedIds.ToList().Contains(11000), but it made no difference.
What should I do? I don't want to have more tables. I have used this method hundreds of times, but it seems I never queried on these fields.
The JSON serialization/deserialization happens at the application level. EF Core serializes the IList<int> object to the value [11000,12000,13000] before sending it to the database for storage, and deserializes the value [11000,12000,13000] back to an IList<int> object after retrieving it from the database. Nothing happens inside the database. Your database cannot operate on [11000,12000,13000] as a collection of numbers. To the database, it's a single piece of data.
If you try the following queries -
var videos = _context.Set<Video>().ToList();
var video = _context.Set<Video>().FirstOrDefault(p=> p.Id == 2);
you'll get the expected result; EF Core is doing its job perfectly.
The problem is, when you query something like -
_context.Set<Video>().Where(t=> t.AllRelatedIds.Contains(11000))
EF Core will fail to translate the t.AllRelatedIds.Contains(11000) part to SQL. EF Core can only serialize/deserialize the value because you told it to (and how). But as I said above, your database cannot operate on [11000,12000,13000] as a collection of integers, so EF Core cannot translate t.AllRelatedIds.Contains(11000) into anything meaningful to the database.
One solution is to fetch the list of all videos, so that EF Core can deserialize AllRelatedIds to IList<int>, and then apply LINQ to it in memory -
var allVideos = _context.Set<Video>().ToList();
var selectedVideos = allVideos.Where(t=> t.AllRelatedIds.Contains(11000)).ToList();
But isn't fetching ALL videos each time unnecessary/overkill or inefficient from performance perspective? Yes, of course. But as the comments implied, your database design/usage approach has some flaws.

Retrieve Max value from a field using Spring Data and MongoDB

I want to obtain the maximum value of the field code within my User entity, using Spring Data and MongoDB.
I have seen similar examples, such as the one below:
".find({}).sort({"updateTime" : -1}).limit(1)"
But I have no idea how to integrate it into my own repository using the @Query annotation.
Any alternative solution that returns the maximum value of said field is also welcome.
Thank you.
You can write a custom method for your repository.
For example you have:
public interface UserRepository extends MongoRepository<User, String>, UserRepositoryCustom {
...
}
Additional methods for repository:
public interface UserRepositoryCustom {
User maxUser();
}
And then implementation of it:
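public class UserRepositoryImpl implements UserRepositoryCustom {
    @Autowired
    private MongoTemplate mongoTemplate;

    @Override
    public User maxUser() {
        // Sort descending and take the first document, i.e. the one with the highest value.
        // On Spring Data 2.x and later, use Sort.by(Sort.Direction.DESC, "updateTime") instead.
        final Query query = new Query()
            .limit(1)
            .with(new Sort(Sort.Direction.DESC, "updateTime"));
        return mongoTemplate.findOne(query, User.class);
    }
}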
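You can use the Spring Data derived query method syntax. Ordering descending and taking the top result returns the entity with the highest value:
public User findTopByOrderByUpdateTimeDesc();
A reference can be found here: https://www.baeldung.com/jpa-limit-query-results#1first-ortop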
Use this code in Spring (with mongoTemplate) to get posts from MongoDB sorted by most recent upload time; add query.limit(1) if only the latest one is needed:
public List<Post> getTopPosts() {
    Query query = new Query();
    query.with(Sort.by(Sort.Direction.DESC, "postUploadedTime"));
    return mongoTemplate.find(query, Post.class);
}

Is there a way to query solr "leader" directly using solrj?

I'm having a single shard and 1 leader & 1 replica architecture. When using "CloudSolrClient", queries are being distributed to both leader and replica. But is there a way to point it only to leader(using zookeeper) other than finding the leader manually and building the query?
It's possible to get the shard leaders in SolrJ, and there are several scenarios where this is useful, for instance when you need to perform a backup programmatically (see the example in the Solr in Action book).
Here is the relevant code I use:
private final String COLLECTION_NAME = "myCollection";
private final int ZOOKEPER_CLIENT_TIMEOUT_MS = 1000000; // getLeaderUrl expects an int timeout in milliseconds
private Map<String, String> getShardLeaders(CloudSolrServer cloudSolrServer) throws InterruptedException, KeeperException {
Map<String, String> shardleaders = new TreeMap<String, String>();
ZkStateReader zkStateReader = cloudSolrServer.getZkStateReader();
for (Slice slice : zkStateReader.getClusterState().getSlices(COLLECTION_NAME)) {
shardleaders.put(slice.getName(), zkStateReader.getLeaderUrl(COLLECTION_NAME, slice.getName(), ZOOKEPER_CLIENT_TIMEOUT_MS));
}
return shardleaders;
}

Objectify doesn't always return results

I am using Objectify to store data on Google App Engine's datastore. I have been trying to implement a one-to-many relationship between two classes, but by storing a list of parameterised keys. The method below works perfectly some of the time, but returns an empty array other times - does anyone know why this may be?
It will either return the correct list of CourseYears, or
{
"items": [
]
}
Here is the method:
@ApiMethod(name = "getCourseYears") @ApiResourceProperty(ignored = AnnotationBoolean.TRUE)
public ArrayList<CourseYear> getCourseYears(@Named("name") String name){
Course course = ofy().load().type(Course.class).filter("name", name).first().now();
System.out.println(course.getName());
ArrayList<CourseYear> courseYears = new ArrayList<CourseYear>();
for(Key<CourseYear> courseYearKey: course.getCourseYears()){
courseYears.add(ofy().load().type(CourseYear.class).id(courseYearKey.getId()).now());
}
return courseYears;
}
The Course class which stores many CourseYear keys
@Entity
public class Course {
    @Id
    @Index
    private Long courseId;
    private String code;
    @Index
    private String name;

    @ApiResourceProperty(ignored = AnnotationBoolean.TRUE)
    public List<Key<CourseYear>> getCourseYears() {
        return courseYears;
    }

    @ApiResourceProperty(ignored = AnnotationBoolean.TRUE)
    public void setCourseYears(List<Key<CourseYear>> courseYears) {
        this.courseYears = courseYears;
    }

    @ApiResourceProperty(ignored = AnnotationBoolean.TRUE)
    public void addCourseYear(Key<CourseYear> courseYearRef){
        courseYears.add(courseYearRef);
    }

    @Load
    @ApiResourceProperty(ignored = AnnotationBoolean.TRUE)
    List<Key<CourseYear>> courseYears = new ArrayList<Key<CourseYear>>();
    ...
}
I am debugging this on the debug server using the API explorer. I have found that it will generally work at the start for a few times but if I leave and return to the API and try and run it again, it will not start working again after that.
Does anyone have any idea what might be going wrong?
Many thanks.
You might want to reduce the number of queries you send to the datastore. Try something like this:
Course course = ofy().load().type(Course.class).filter("name", name).first().now();
ArrayList<CourseYear> courseYears = new ArrayList<CourseYear>();
List<Long> courseIds = new ArrayList<>();
for(Key<CourseYear> courseYearKey: course.getCourseYears()){
    courseIds.add(courseYearKey.getId());
}
// ids(...) performs one batch get and returns a map keyed by id
Map<Long, CourseYear> loaded = ofy().load().type(CourseYear.class).ids(courseIds);
courseYears.addAll(loaded.values());
I also strongly recommend a change in your data structure / entities:
In your CourseYear class, add a property Ref<Course> courseRef that points to the parent Course and make it indexed (@Index). Then query by
ofy().load().type(CourseYear.class).filter("courseRef", yourCourseRef).list();
This way you'll only require a single query.
The two most likely candidates are:
Eventual consistency behavior of the high replication datastore. Queries (i.e. your filter() operation) always run a little behind because indexes propagate through GAE asynchronously. See the GAE docs.
You haven't installed the ObjectifyFilter. Read the setup guide. Recent versions of Objectify throw an error if you haven't installed it, so if you're on the latest version, this isn't it.

SQL Server Session Serialization in ASP.Net MVC

I am new to ASP.NET MVC. Any help in resolving my problem is greatly appreciated.
I am using a LINQ to SQL database in my MVC application. For one of the auto-generated partial classes (for example MyClass, for the table MyClass), I created another partial class named MyClass and added DataAnnotations as follows:
namespace NP
{
    [MetadataType(typeof(myData))]
    [Serializable()]
    public partial class MyClass
    {
    }
    public class myData
    {
        [Required]
        public string ID { get; set; }
        // Other properties are listed here
    }
}
In my controller class (for example MyHomeController), I have the following code:
List<MyClass> list = new List<MyClass>();
list = dbContext.StoredProcedure(null).ToList<MyClass>();
Session["data"] = list;
The above code works fine if I use InProc session state. But if I use SQLServer mode, I get the following error:
"Unable to serialize the session state. In 'StateServer' and
'SQLServer' mode, ASP.NET will serialize the session state objects,
and as a result non-serializable objects or MarshalByRef objects are
not permitted. The same restriction applies if similar serialization
is done by the custom session state store in 'Custom' mode. "
Can anyone tell me what I am doing wrong here? I can see the data is getting populated in the ASPState database tables, but the application throws the error above.
Just mark as Serializable all classes whose instances you want to store in Session.
Finally I was able to resolve the issue.
Solution:
Add the statement below before querying the database. In my case I was calling the LINQ to SQL context (dbContext).
dbContext.ObjectTrackingEnabled = false;
Sample Code:
List empList = new List();
dbContext.ObjectTrackingEnabled = false;
empList = dbContext.SomeStoredProcedure().ToList();
Session["employee"] = empList;
