Which is better, list function or query? - database

I am designing a local database for a Flutter app. I find that Hive and ObjectBox are the best fit for my app.
But Hive doesn't have a built-in query function; you need to use list functions like where, find, etc., while ObjectBox can query and return a filtered list. So which one is better in terms of performance?

ObjectBox is known for its improved performance over Hive, as well as over Firebase, sqflite and Moor.
You will have to get some very large datasets and big queries in your app to easily notice the difference, though.
If ObjectBox is the easiest to use for you, it's definitely the right one to choose.
Edit:
In light of @SoulCRYSIS's further questioning, converting lists to JSON in Flutter is quite easy:
// jsonEncode comes from dart:convert; each element is expected to provide a toJson() method.
var json = jsonEncode(myList.map((e) => e.toJson()).toList());

Related

Are there any NoSQL databases that can do search (like Lucene) on map/reduce views?

I'm using Cloudant, where I can use map/reduce to project views of the data, and it can also search documents with Lucene.
But these two features are separate and cannot be used together.
Suppose I make a game with user data like this:
{
  "name": "",
  "items": []
}
Each user has items. Then I want to let a user find all swords with quality +10. With Cloudant I might project type and quality as the key and query with key=["sword",10].
But it cannot handle queries more complex than that, the way Lucene could. To use Lucene I would need to normalize all items into their own documents and reference each one by its owner.
I really wish I could do a Lucene search on a key of a data projection. I mean, instead of normalizing, I could store nested documents as I want and use map/reduce to project the data inside each document, so I could search for items directly.
P.S. If that database also supported partial updates via scripting and had built-in transactional updates, that would be best.
I'd suggest trying out Elasticsearch.
It seems like your use case should be covered by the search API.
If you need to do more complex analytics, Elasticsearch supports aggregations.
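As a rough illustration (not part of the original answer), here is how the sword/quality example could look with the Python Elasticsearch client; it assumes an elasticsearch-py 8.x-style API, an index called "users", and that the items array is mapped as a nested field:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Find users owning a sword with quality >= 10; the nested query keeps both
# conditions scoped to the same element of the items array.
result = es.search(
    index="users",
    query={
        "nested": {
            "path": "items",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"items.type": "sword"}},
                        {"range": {"items.quality": {"gte": 10}}},
                    ]
                }
            },
        }
    },
)
for hit in result["hits"]["hits"]:
    print(hit["_source"]["name"])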
I am not at all sure that I got the question correctly, but you may want to take a look at Riak. It offers Solr-based search, which is quite well documented. I have used it in the past for distributed search over a distributed key-value index and it was quite fast.
If you use this, you will also need to look at the syntax of Solr queries, so I add it here to save you some time. However, keep in mind that not all of those Solr query features were available in Riak (at least that was the case when I used it).
There are several solutions that would do the job. I can give my 2 cents by proposing the well-established MongoDB. With MongoDB you can create a text index on a given field and then do a full-text search as explained here. The feature has been in MongoDB since version 2.4 and the syntax is well documented in the MongoDB docs.
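As a rough sketch of the MongoDB route (the database, collection, and field names below are assumptions made up for the example, not taken from the question), a pymongo version could look like this:

from pymongo import MongoClient, TEXT

client = MongoClient("mongodb://localhost:27017")
items = client["game"]["items"]

# A text index on the item name (available since MongoDB 2.4).
items.create_index([("name", TEXT)])

# Full-text search combined with an ordinary field filter.
swords = items.find({"$text": {"$search": "sword"}, "quality": {"$gte": 10}})
for doc in swords:
    print(doc["name"], doc["owner"])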

How to integrate Elasticsearch into my search

We have an ad search website, and all the searches are done through Entity Framework directly querying the SQL Server database.
It was working very well when the database had around 1,000 ads, but now it is reaching 300k, with lots of users searching. The searches are now very slow (using raw SQL didn't help much), and I was instructed to consider Elasticsearch.
I've been through some tutorials and I get the idea of how it works now, but what I don't know is:
Should I stop using SQL Server to store the ads and start using Elasticsearch instead? What about all the other related data? Is Elasticsearch an alternative to SQL Server?
Each ad has some related data stored in different tables; how would I load it into Elasticsearch? As a single JSON element?
I read a lot about "billions of records" being handled by Elasticsearch, so I don't think I would have performance problems with 300k rows in it, correct?
Would anybody explain these questions to me?
1- You could still use it; you don't want to search over the complete database, right? Just over the ads. Elasticsearch works with a NoSQL format, so it is very scalable. It also works with JSON, so you have an easy way to access it.
2- When indexing data, you should try to put all the necessary data into the same document (the equivalent of a SQL row), which is a single JSON object, but within reason. Storage is cheap, but computing time isn't.
To index your data, you could either use Filebeat, a program a bit similar to Logstash, or create your own solution, e.g. a program that reads data from your db and then passes it to Elasticsearch in bulk (see the sketch below).
3- Correct, 300k rows is a small quantity, but it also depends on the memory of the machine hosting Elasticsearch.
Hope this helps.
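To make point 2 concrete, here is a hedged sketch of such a custom indexer in Python; the table layout, connection strings, and the "ads" index name are invented for the example, not taken from the question:

import pyodbc
from elasticsearch import Elasticsearch, helpers

sql = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                     "SERVER=localhost;DATABASE=AdsDb;Trusted_Connection=yes")
es = Elasticsearch("http://localhost:9200")

def ad_documents():
    cursor = sql.cursor()
    # Denormalize the related tables into one JSON document per ad.
    cursor.execute("""
        SELECT a.Id, a.Title, a.Description, c.Name AS Category
        FROM Ads a JOIN Categories c ON c.Id = a.CategoryId
    """)
    for row in cursor:
        yield {
            "_index": "ads",
            "_id": row.Id,
            "_source": {
                "title": row.Title,
                "description": row.Description,
                "category": row.Category,
            },
        }

# helpers.bulk sends the documents to Elasticsearch in batches.
helpers.bulk(es, ad_documents())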

Which NoSql for visualization extensive app

I am about to work on an app that will show a lot of visualizations. It is a read-only data application; there will be negligible write operations. We have a lot of data (JSON, CSV), and depending on the use case we will have to filter it down to a subset and send it to the UI for visualization.
What kind of NoSQL would you recommend, and could you please specify the reasons? Thanks!
P.S.: Some of the devs are recommending Elasticsearch. I am not sure if we should go for a document store or a key-value store in the first place.
If you're visualizing log data, I'd use Logstash in combination with Elasticsearch and Kibana. There are also commercial ways to protect your data, and more are coming. I'm working on k3bana, which will visualize data with X3DOM and D3.js. Good luck!
I used Redis (with Jedis) to store key-value pairs in one case.
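For what it's worth, the same key-value idea looks like this with the Python redis client instead of Jedis; the key name and the cached payload are made up for the example:

import json
import redis

r = redis.Redis(host="localhost", port=6379)

# Cache a precomputed, filtered subset under a descriptive key so the UI can
# fetch it directly instead of re-filtering the raw JSON/CSV each time.
subset = [{"region": "EU", "sales": 1250}, {"region": "US", "sales": 3400}]
r.set("viz:sales-by-region", json.dumps(subset))

cached = json.loads(r.get("viz:sales-by-region"))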

Which database should I use for inhomogenous objects with large blobs and redundancy in data?

I am looking into moving data that I have, up to now, been storing in single files into a database. I am looking for advice on which database, or type of database, I should use. Here is a list of requirements:
I need to be able to store data in a hierarchical/object-oriented fashion, i.e. have keys like car.chassis.color = red with arbitrary depth on the keys.
The structure of these objects is not homogeneous; some have fields that others don't.
I would like to be able to perform queries on the keys, but I do not need the ability to do joins, i.e. no foreign keys necessary.
Some of the values are large binary blobs (several tens to hundreds of MB, never > 1 GB), and there is significant redundancy, so built-in compression would be good.
Looking for an open source solution.
I want to use it from python.
I am completely unfamiliar with anything but the most basic MySQL databases, so any pointers would be highly appreciated.
Sounds like a document database would fit your needs. The two I would look into are MongoDB and CouchDB (maybe also Couchbase).
Both Couch and Mongo allow for storing data as JSON, which meets your requirement for data with arbitrarily deep keys.
Both databases will also allow you to insert heterogeneous documents. Mongo specifically has an operator called $exists to check to see if a field exists in a given document.
I would give the nod to Mongo over Couch for ad-hoc querying. I just find it easier.
Neither supports joins well. It's possible with both with map/reduce functionality, but otherwise it's assumed you won't be doing joins.
Both support adding files. Mongo uses gridfs (http://docs.mongodb.org/manual/applications/gridfs/) and Couch uses attachments (http://wiki.apache.org/couchdb/HTTP_Document_API#Attachments).
Mongo has a Python driver (http://docs.mongodb.org/ecosystem/drivers/python/) and Couch works via HTTP, so you only need something like curl in Python.
CouchDB is getting a lot of attention lately, but Mongo has more momentum right now.
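Putting those points together, here is a small pymongo/GridFS sketch; the car.chassis.color key comes from the question, while the database name, the extra field, and the blob file are illustrative assumptions:

import gridfs
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["measurements"]

# Arbitrarily nested, heterogeneous documents.
db.objects.insert_one({"car": {"chassis": {"color": "red"}, "doors": 4}})
db.objects.insert_one({"car": {"chassis": {"color": "blue"}}})  # no "doors" field

# Query on a nested key, and filter on whether an optional field exists.
red_cars = db.objects.find({"car.chassis.color": "red"})
with_doors = db.objects.find({"car.doors": {"$exists": True}})

# Large binary blobs go into GridFS, which chunks them transparently.
fs = gridfs.GridFS(db)
with open("scan_0001.bin", "rb") as f:
    blob_id = fs.put(f, filename="scan_0001.bin")
data = fs.get(blob_id).read()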

Storing and processing high data volume

Good day!
I have 350 GB of unstructured data, disaggregated into 50-80 columns.
I need to store this data in a NoSQL database and run a variety of selection and map/reduce queries filtered by 40 of the columns.
I would like to use MongoDB, so I have one question: can this database cope with this task, and what architecture do I need to set up within my existing provider, hetzner.de?
Yes, large datasets are easy.
Perhaps Apache Hadoop is also worth looking at. It is aimed at handling/analyzing large/huge amounts of data.
MongoDB is a very scalable and flexible database, if used properly. It can store as much data as you need, but the bottom line is whether you can query your data efficiently.
A few comments:
You will need to make sure you have the proper indexes in place and that a fair amount of them can fit in RAM.
In order to achieve that, you may need to use sharding to split the working set.
The current map/reduce is easy to use and can iterate over all your data, but it is rather slow to process. It should become faster in the next MongoDB release, and there will also be a new aggregation framework to complement map/reduce.
The bottom line is that you should not treat MongoDB as a magical store that will be perfect out of the box; make sure you read the good docs and materials :)
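As a small illustration of the indexing and aggregation advice above (the collection and field names are assumptions, not from the question), a pymongo sketch might look like this:

from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
records = client["warehouse"]["records"]

# A compound index on the columns you filter by most often; the frequently
# used indexes should fit in RAM.
records.create_index([("region", ASCENDING), ("category", ASCENDING)])

# An aggregation pipeline as an alternative to map/reduce for simple rollups.
pipeline = [
    {"$match": {"region": "EU"}},
    {"$group": {"_id": "$category", "total": {"$sum": "$amount"}}},
]
for row in records.aggregate(pipeline):
    print(row["_id"], row["total"])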
