I have a TTL index on collection fct_in_ussd, created as follows:
db.fct_in_ussd.createIndex(
{"xdr_date":1},
{ "background": true, "expireAfterSeconds": 259200}
)
{
"v" : 2,
"key" : {
"xdr_date" : 1
},
"name" : "xdr_date_1",
"ns" : "appdb.fct_in_ussd",
"background" : true,
"expireAfterSeconds" : 259200
}
i.e. with an expiry of 3 days. A sample document in the collection:
{
"_id" : ObjectId("5f4808c9b32ewa2f8escb16b"),
"edr_seq_num" : "2043019_10405",
"served_imsi" : "",
"ussd_action_code" : "1",
"event_start_time" : ISODate("2020-08-27T19:06:51Z"),
"event_start_time_slot_key" : ISODate("2020-08-27T18:30:00Z"),
"basic_service_key" : "TopSim",
"rate_event_type" : "",
"event_type_key" : "22",
"event_dir_key" : "-99",
"srv_type_key" : "2",
"population_time" : ISODate("2020-08-27T19:26:00Z"),
"xdr_date" : ISODate("2020-08-27T19:06:51Z"),
"event_date" : "20200827"
}
Problem statement: documents are not being removed from the collection. The collection still contains 15-day-old documents.
MongoDB server version: 4.2.3
The block compression strategy is zstd:
storage.wiredTiger.collectionConfig.blockCompressor: zstd
The field xdr_date is also part of another compound index.
Observations as of Sep 24:
I have 5 collections with TTL indexes.
It turns out that data is being removed from one of the collections, while the rest remain unaffected.
The daily insertion rate is ~500M records (across the 5 collections).
This observation left me confused.
The TTL expiration runs on a single thread. Is it too much data for TTL to expire?
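One way to see whether the single TTL thread is keeping up at all is to check the server-side TTL counters in the mongo shell (a diagnostic fragment; it needs a live connection):

```javascript
// Cumulative TTL activity since the server started:
db.serverStatus().metrics.ttl
// { "deletedDocuments" : NumberLong(...), "passes" : NumberLong(...) }

// `passes` should grow roughly once a minute (the TTL monitor's default
// sleep interval); also confirm the monitor is enabled at all:
db.adminCommand({ getParameter: 1, ttlMonitorEnabled: 1 })
```

If deletedDocuments grows far more slowly than the insert rate, the single-threaded monitor is falling behind rather than broken.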
I started using Elasticsearch in my ReactJS project but I'm having some trouble with it.
When I search, I'd like to have my results ordered based on this table:
                Full-hit   Start-hit   Sub-hit   Fuzzy-hit
category            1          3          5         10
trade name          2          4          6         11
official name       7          8          9         12
The definitions (the way I see them, unless I'm wrong) are like this:
Full-hit
examples:
Term "John" has a full-hit on "John doe"
Term "John" doesn't have a full-hit on "JohnDoe"
Start-hit
examples:
Term "John" has a start-hit on "Johndoe"
Term "Doe" doesn't have a start-hit on "JohnDoe"
sub-hit
examples:
Term "baker" has a sub-hit on "breadbakeries"
Term "baker" doesn't have a sub-hit on "de backer"
fuzzy-hit
From my understanding, a fuzzy-hit is when the queried word has 1 mistake or 1 letter missing.
examples:
Term "bakker" has a fuzzy-hit on "baker"
Term "bakker" doesn't have a fuzzy-hit on "bakers"
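This fuzzy-hit definition ("1 mistake or 1 letter missing") is Levenshtein edit distance 1, closely related to what Elasticsearch's fuzziness parameter measures (Elasticsearch actually uses Damerau-Levenshtein, which also counts transpositions). A small sketch to check the examples above:

```javascript
// Levenshtein edit distance: minimum number of single-character
// insertions, deletions, or substitutions turning a into b.
function levenshtein(a, b) {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                   // deletion
        dp[i][j - 1] + 1,                                   // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

console.log(levenshtein("bakker", "baker"));  // 1 → fuzzy-hit
console.log(levenshtein("bakker", "bakers")); // 2 → no fuzzy-hit
```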
I found out that we can boost fields like this
fields = [
`category^3`,
`name^2`,
`official name^1`,
];
But that boosting is not based on the full-, start-, sub-, or fuzzy-hit.
Is this doable in ReactJS with Elasticsearch?
Let me make sure I understand your problem. In a nutshell:
1. "If a full-hit is found in the category field, then we should boost it by 1."
2. "If a full-hit is found in the official_name field, we should boost by 7."
...and so on for all 12 possibilities?
If this is what you want, you are going to need 12 separate queries, all covered under one giant bool -> should clause.
I won't write out the query for you, but I will give you some pointers on how to structure the 4 subtypes of queries.
Full Hit
{
"term" : { "category/trade_name/official_name" : { "value" : "the_search_term" } }
}
Start-hit
{
"match_phrase_prefix" : {"category/trade_name/official_name" : "the search term"}
}
Sub-hit
{
"regexp" : {
"category/official/trade" : {"value" : ".*term.*"}
}
}
Fuzzy
{
"fuzzy" : {
"category/trade/official" : {"value" : "term"}
}
}
You will need one giant bool
{
"query" : {
"bool" : {
"should" : [
// category field queries, 4 total clauses.
{
}
// official field queries, 4 clauses, to each clauses assign the boost as per your table. that's it.
]
}
}
}
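As a concrete sketch, here are the four clauses for the category field with the boosts taken from the table (the literal field name, the example term, and the .* regexp form are assumptions; repeat the same pattern for trade_name and official_name with their own boosts):

```json
{
  "query": {
    "bool": {
      "should": [
        { "term":                { "category": { "value": "john", "boost": 1 } } },
        { "match_phrase_prefix": { "category": { "query": "john", "boost": 3 } } },
        { "regexp":              { "category": { "value": ".*john.*", "boost": 5 } } },
        { "fuzzy":               { "category": { "value": "john", "fuzziness": 1, "boost": 10 } } }
      ]
    }
  }
}
```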
To each clause, assign a boost as per your table.
That's it.
HTH.
I'm a student just starting out with NoSQL and it's just not clicking with me. I'm a little confused on a few points, and any help would be greatly appreciated.
1. Can documents belong to multiple collections?
2. Have I the correct syntax here for creating the collection?
The pic is the collection ER and is just a snippet of the full ER diagram.
db.Animal.insert( {
"animal_ID" : "XXXXXXX",
"common_name" : "Red Squirrel",
"IUCN" : "Least Concern (declining)",
"photo" : "qs451xkx6qf4j",
"extinct" : {
"when" : "null",
"reason" : "null"
},
"invasive" : {
"threat_level" : "null",
"threat" : "null",
"how_to_help" : "null"
},
"native" : {
"endangerment" : "population declining",
"how_to_help" : "providing a little extra food, planting some red squirrel-friendly shrubs and reporting any red or grey squirrel activity"
},
"Fact_sheet" : {
"fact_id" : "",
"animal_id" : "XXXXXXX",
"order" : "Rodentia",
"family" : "Sciuridae",
"species" : "Sciurus vulgaris",
"size" : "body length 19 to 23 cm, tail length 15 to 20 cm",
"weight" : "250 to 340 g",
"lifespan" : "3 years, 7 to 10 in captivity",
"extra" : "In Norse mythology, Ratatoskr is a red squirrel who runs up and down with messages in the world tree, Yggdrasil, and spreads gossip",
"habitat" : [ {
"name" : "woodland",
"description" : "a low-density forest forming open habitats with plenty of sunlight and limited shade"
} ]
}
} );
1. Can documents belong to multiple collections?
In MongoDB, no. In other databases, I don't know.
2. Have I the correct syntax here for creating the collection?
To create a collection explicitly you would use db.createCollection() (see https://docs.mongodb.com/manual/reference/method/db.createCollection/). This call also permits you to pass various collection options.
You are inserting a document. In MongoDB when a document is inserted, if the destination collection doesn't exist, it is created automatically by the server.
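For example, both of these leave you with an Animal collection (mongo shell fragment; the document shown is just a placeholder):

```javascript
// Explicit creation (this is where collection options would go):
db.createCollection("Animal")

// Implicit creation: inserting into a non-existent collection creates it.
db.Animal.insert({ common_name: "Red Squirrel" })
```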
Below is the Firebase database child node of a particular user under the "users" node:
"L1Bczun2d5UTZC8g2LXchLJVXsh1" : {
"email" : "orabbz#yahoo.com",
"fullname" : "orabueze yea",
"teamname" : "orabbz team",
"total" : 0,
"userName" : "orabbz#yahoo.com",
"week1" : 0,
"week10" : 0,
"week11" : 0,
"week12" : 0,
"week2" : 0,
"week3" : 17,
"week4" : 0,
"week5" : 20,
"week6" : 0,
"week7" : 0,
"week8" : 0,
"week9" : 10
},
Is there a way to add up all the values of week1 through week12 and store the resulting sum in the total key?
I am currently thinking of bringing all the values of week1 - week12 into the AngularJS scope, adding them up, and then posting the total back to the total key in the Firebase database. But this sounds too long-winded. Is there a shorter solution?
As far as I know, Firebase DB doesn't have server-side functions as you'd have in SQL. So the options you have are to fetch the data and calculate the sum in Angular, as you say, or to update a counter whenever the week values are written for the user, and then just read the counter later.
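The "calculate it in Angular" option boils down to summing the twelve week fields client-side and writing the result back to total. A minimal sketch in plain JavaScript (the userRef.update call at the end is an assumption about your Firebase reference, not code from the question):

```javascript
// Sample user data, copied from the question's node.
const user = {
  week1: 0, week2: 0, week3: 17, week4: 0, week5: 20, week6: 0,
  week7: 0, week8: 0, week9: 10, week10: 0, week11: 0, week12: 0
};

// Sum week1..week12, treating missing keys as 0.
function sumWeeks(u) {
  let total = 0;
  for (let i = 1; i <= 12; i++) {
    total += u["week" + i] || 0;
  }
  return total;
}

console.log(sumWeeks(user)); // 47
// Write it back (hypothetical ref to the users/<uid> node):
// userRef.update({ total: sumWeeks(user) });
```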
Is there a Juttle program I can run to view all unique fields within a given query? I'm trying to comb through a list of events of the same type that have a ton of different fields.
I know I could just use the #table sink and scroll right, but I'd like to view the unique fields in a list if possible.
Hacky but works:
events -from :5 minutes ago: -to :now: | head 1 | #logger -display.style 'pretty'
You get:
{
"bytes" : 7745,
"status" : "200",
"user_agent" : "Mozilla/5.0 (iPhone; CPU iPhone OS 511 like Mac OS X) AppleWebKit/534.46 (KHTML like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3",
"version" : "1.1",
"ctry" : "US",
"ident" : "-",
"message" : "194.97.17.121 - - [2014-02-25T09:00:00-08:00] \"GET /black\" 200 7745 \"http://google.co.in/\" \"Mozilla/5.0 (iPhone; CPU iPhone OS 511 like Mac OS X) AppleWebKit/534.46 (KHTML like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3\"",
"auth" : "-",
"verb" : "GET",
"url" : "/black",
"source_host" : "www.jut.io",
"referer" : "http://google.co.in/",
"space" : "default",
"type" : "event",
"time" : "2014-12-11T23:46:21.905Z",
"mock_type" : "demo",
"event_type" : "web"
}
You can use the split proc in combination with reduce by to get this list.
emit -limit 1
|(
put field1 = 1, field2 = 2;
put field2 = 2, field3 = 3;
)| split // break each point into one point for each field, assigning each field name into the point's name field
| reduce by name // get unique list of name field values
| sort name
| #logger
{"name":"field3"}
{"name":"field2"}
{"name":"field1"}
I'm querying a database containing entries as displayed in the example below. All entries contain the following values:
_id: unique id of the overallitem and placed_items
name: the name of the overallitem
loc: location of the overallitem and placed_items
time_id: time the overallitem was stored
placed_items: array containing placed_items (can range from zero, placed_items : [], to an unlimited amount)
category_id: the category of the placed_items
full_id: the full id of the placed_items
I want to extract the name, full_id and category_id at the placed_items level, given a time_id and loc constraint.
Example data:
{
"_id" : "5040",
"name" : "entry1",
"loc" : 1,
"time_id" : 20121001,
"placed_items" : [],
}
{
"_id" : "5041",
"name" : "entry2",
"loc" : 1,
"time_id" : 20121001,
"placed_items" : [
{
"_id" : "5043",
"category_id" : 101,
"full_id" : 901,
},
{
"_id" : "5044",
"category_id" : 102,
"full_id" : 902,
}
],
}
{
"_id" : "5042",
"name" : "entry3",
"loc" : 1,
"time_id" : 20121001,
"placed_items" : [
{
"_id" : "5045",
"category_id" : 101,
"full_id" : 903,
},
],
}
The expected outcome for this example would be:
"name" "full_id" "category_id"
"entry2" 901 101
"entry2" 902 102
"entry3" 903 101
So if placed_items is empty, the entry is not put in the dataframe, and if placed_items contains n entries, n rows are put in the dataframe.
I tried to work from an RBlogger example to create the desired dataframe.
#Set up database
mongo <- mongo.create()
#Set up condition
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "loc", 1)
mongo.bson.buffer.start.object(buf, "time_id")
mongo.bson.buffer.append(buf, "$gte", 20120930)
mongo.bson.buffer.append(buf, "$lte", 20121002)
mongo.bson.buffer.finish.object(buf)
query <- mongo.bson.from.buffer(buf)
#Count
count <- mongo.count(mongo, "items_test.overallitem", query)
#Note that these counts don't work, since the count should be based on
#the number of placed_items in the array, and not the number of entries.
#Setup Cursor
cursor <- mongo.find(mongo, "items_test.overallitem", query)
#Create vectors, which will be filled by the while loop
name <- vector("character", count)
full_id<- vector("character", count)
category_id<- vector("character", count)
i <- 1
#Fill vectors
while (mongo.cursor.next(cursor)) {
b <- mongo.cursor.value(cursor)
name[i] <- mongo.bson.value(b, "name")
full_id[i] <- mongo.bson.value(b, "placed_items.full_id")
category_id[i] <- mongo.bson.value(b, "placed_items.category_id")
i <- i + 1
}
#Convert to dataframe
results <- as.data.frame(list(name=name, full_id=full_id, category_id=category_id))
The conditions work, and the code works if I want to extract values at the overallitem level (i.e. _id or name), but it fails to gather the information at the placed_items level. Furthermore, the dotted path for extracting full_id and category_id does not seem to work. Can anyone help?
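For what it's worth, the per-placed_items flattening described above maps directly onto MongoDB's aggregation framework; a sketch of the pipeline in mongo shell syntax, using the collection and field names from the question (the same pipeline can be run from R via a driver that supports aggregation, e.g. mongolite):

```javascript
// One output document per element of placed_items. Documents with an
// empty placed_items array are dropped by $unwind, matching the
// expected outcome above.
db.overallitem.aggregate([
  { $match: { loc: 1, time_id: { $gte: 20120930, $lte: 20121002 } } },
  { $unwind: "$placed_items" },
  { $project: {
      _id: 0,
      name: 1,
      full_id: "$placed_items.full_id",
      category_id: "$placed_items.category_id"
  } }
])
```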