Cloud based NoSQL database service for sensor data [closed] - database

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
We are new to NoSQL and are starting a project that aims to record sensor data from many different sensors, each producing timestamp-value pairs, into a cloud-based database. The number of sensors should scale, so the solution must be able to handle hundreds of millions, or possibly even billions, of writes a year.
Each sensor has its own table, keyed by timestamp with the value as the payload, and sensor metadata is kept in a separate table.
The system should support queries such as the most recent values of certain sensor types (fast data retrieval) and values within a given time frame for sensors in certain areas (selected via the metadata).
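The data model described above can be sketched in plain Python; this is only an illustration of the table layout and the two query patterns, not any particular database's API, and all names are made up:

```python
from bisect import bisect_left, bisect_right

readings = {}   # sensor_id -> time-ordered list of (timestamp, value)
metadata = {}   # sensor_id -> {"type": ..., "area": ...}

def record(sensor_id, timestamp, value):
    """Append a reading; assumes timestamps arrive in order per sensor."""
    readings.setdefault(sensor_id, []).append((timestamp, value))

def latest(sensor_type):
    """Most recent reading for every sensor of the given type."""
    return {
        sid: readings[sid][-1]
        for sid, meta in metadata.items()
        if meta["type"] == sensor_type and readings.get(sid)
    }

def range_query(area, t_start, t_end):
    """All readings in [t_start, t_end] for sensors in a given area."""
    out = {}
    for sid, meta in metadata.items():
        if meta["area"] != area:
            continue
        series = readings.get(sid, [])
        # Binary search works because each series is sorted by timestamp.
        lo = bisect_left(series, (t_start,))
        hi = bisect_right(series, (t_end, float("inf")))
        out[sid] = series[lo:hi]
    return out
```

Column-family stores like Cassandra map naturally onto this shape: sensor id as partition key, timestamp as clustering column.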
So the question is which cloud database service would be most suited to our needs?
Thanks in advance.

Couchbase is a great option for this type of use case.

Try Apache Cassandra. DataStax provides easy-to-install packages that include some useful extras.

I wholeheartedly agree with @Ben that this isn't an answerable question; nevertheless, I would at least consider the reasons for choosing a simple key/value store over a typical RDBMS. It sounds like this data will likely be aggregated and counted; an RDBMS will typically answer those questions very quickly with correct indexing. 1B writes/yr (or even 30B/yr) is really not that high.
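The RDBMS approach suggested above can be sketched with SQLite: one readings table with a composite index on (sensor_id, ts) serves both "latest value" lookups and time-window aggregates. The schema and data here are illustrative only:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE readings (sensor_id TEXT, ts INTEGER, value REAL);
    CREATE INDEX idx_sensor_ts ON readings (sensor_id, ts);
""")

db.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("s1", 1, 20.0), ("s1", 2, 21.5), ("s2", 1, 19.0)],
)

# Most recent reading per sensor (the index makes the MAX lookup cheap).
latest = db.execute("""
    SELECT sensor_id, ts, value FROM readings r
    WHERE ts = (SELECT MAX(ts) FROM readings WHERE sensor_id = r.sensor_id)
""").fetchall()

# Aggregation over a time window, the kind of query mentioned above.
avg = db.execute(
    "SELECT AVG(value) FROM readings WHERE sensor_id = ? AND ts BETWEEN ? AND ?",
    ("s1", 1, 2),
).fetchone()[0]
```

At a billion rows a year you would want a server-grade RDBMS and partitioning rather than SQLite, but the query shapes stay the same.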

Related

How can I geocode (get latitudes and longitudes) millions of addresses? [closed]

Closed 10 years ago.
I have a dataset with millions of U.S. addresses that I would like to geocode. Yahoo had an API with the most generous rate limit (50K per day, still too low for my purposes), but it is now defunct. I don't think any API will suit my needs unless I can do over 100K requests per day.
Is there any simple-to-configure software I can download to do this from my own computer?
In particular, to those who have experience with it, will
http://www.datasciencetoolkit.org/developerdocs#setup
suit my needs?
Would an API that supports millions of requests per day suit your needs?
There are a few services which do this. In particular, LiveAddress by SmartyStreets can handle that kind of load and is actually built for it. You can upload files (Excel, CSV, etc., zipped up, especially if you have that many) or query the API (each request can carry up to 100 addresses).
So while the program doesn't get downloaded to your computer, it will actually be faster than a localized, in-house solution, because it scales up when the load is high. LiveAddress is geo-distributed and powered by RAM-drive servers that spin up more nodes when there's lots of work to do. LiveAddress is known for handling millions of addresses quickly (within a few hours).
I work at SmartyStreets. We kind of dare you to see how fast you can legitimately query the API or upload and process all your lists. There's plenty of sample code on GitHub for the API or you can (programmatically or manually) upload your list files for batch geocoding.
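Splitting a large address list into 100-per-request batches, the limit mentioned above, is straightforward; this sketch uses a caller-supplied `send` function in place of any real HTTP client, and all names are hypothetical:

```python
def chunks(items, size=100):
    """Split a list into consecutive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def geocode_all(addresses, send):
    """Geocode a full list, one API call per chunk of 100 addresses.

    `send` is whatever callable performs the actual request and
    returns one result per address in the batch.
    """
    results = []
    for batch in chunks(addresses, 100):
        results.extend(send(batch))
    return results
```

A real client would also want retry logic and a modest level of concurrency, but the batching itself is the part that keeps the request count (and hence rate-limit pressure) down by two orders of magnitude.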

Designing a database without knowing the details of the data? [closed]

Closed 10 years ago.
Are there cases where database designers are not allowed to know the details of the data? I am looking for real-world examples to learn from, please.
I can't help but tell a story about database nightmares. One of the worst was when Amazon was first growing. Initially they only sold books, then expanded to music, and then to many other things.
For a period of about two years, Amazon would announce a new market every two or three months -- children's clothing, housewares, garden supplies, food, and so on. The database folks were tasked with developing and supporting the systems for the product lines. However, Amazon considered the new product announcements to be highly, highly secret.
In particular, the data warehouse people would be kept furthest out of the loop. Sometimes they would find out about a new line of business by reading the news -- and then have to support it in the data warehouse.
So, they had to develop a flexible database to meet unannounced business needs.
In any business environment, there are new needs that arise. I would suggest a book such as Ralph Kimball's "Data Warehouse Toolkit" for more background on how to develop a fairly robust system.
I am currently working at a company that stores very private personal information. I am not allowed access to the production database. For our development and test environments, we replace all names, addresses, and other personal information with randomly generated information.
Yes, I've often seen databases that allow custom data to be defined by the user. The basic approach is to design a metadata system for your database, then allow entities to be associated with custom fields. You wouldn't want to do this for all your data, otherwise you'll just end up with a database inside a database, but for dynamically adding a number of custom fields this approach works well.
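The custom-fields approach described above is essentially an entity/attribute/value side table next to the fixed schema. A minimal SQLite sketch (table and column names are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Fixed columns for the data you do know about up front.
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    -- One row per user-defined field, attached to an entity.
    CREATE TABLE custom_fields (
        product_id INTEGER REFERENCES products(id),
        field_name TEXT,
        field_value TEXT
    );
""")

db.execute("INSERT INTO products VALUES (1, 'widget')")
db.executemany(
    "INSERT INTO custom_fields VALUES (?, ?, ?)",
    [(1, "color", "red"), (1, "weight_kg", "0.4")],
)

# Read back an entity together with its dynamically defined fields.
fields = dict(db.execute(
    "SELECT field_name, field_value FROM custom_fields WHERE product_id = ?",
    (1,),
).fetchall())
```

The trade-off is exactly the one the answer warns about: values land in a single text column, so typing, constraints, and query performance all suffer if you push core data into it.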

In what ways can data be stored? [closed]

Closed 11 years ago.
I was asked to write a report on different data storage types.
Data can be stored in:
Text files.
Databases: Oracle DB, Microsoft SQL Server, DB2, MySQL, PostgreSQL, SQLite.
Excel sheets.
Microsoft Access.
Proprietary databases.
I was only able to gather a little information on this; any help would be appreciated.
In what ways can data be stored so that it can be queried and extracted using a programming language?
The real answer is: any structure that persists between application sessions. This includes flat files (text, CSV, XML, etc.) and RDBMSs (Relational Database Management Systems).
MySQL, DB2, Oracle, and SQL Server are all RDBMSs; Excel sheets, text files, and the like are flat files.
Each has its own place. For high performance and heavy Online Transaction Processing (OLTP), you'll want to go with a full-blown RDBMS. For small data that isn't often written to, something like an XML file would suffice.
What you're asking about is a gigantic topic to which many people devote a large portion of their professional careers. It's impossible to give you an all-encompassing lesson on it here.
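The contrast between the two storage families discussed above can be shown with the same records kept in a flat file (CSV, held in memory here for self-containment) and in an RDBMS (SQLite); all data is made up:

```python
import csv
import io
import sqlite3

rows = [("alice", 30), ("bob", 25)]

# Flat file: trivial to write, but any filter means scanning every row.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
buf.seek(0)
adults = [name for name, age in csv.reader(buf) if int(age) >= 30]

# RDBMS: declarative queries, with indexes and transactions available.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (name TEXT, age INTEGER)")
db.executemany("INSERT INTO people VALUES (?, ?)", rows)
same = [n for (n,) in db.execute("SELECT name FROM people WHERE age >= 30")]
```

Both routes answer the same question programmatically, which is the point of the original question; the difference is who does the work of filtering, typing, and indexing.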

Do you know some good resources for learning NoSQL databases? [closed]

Closed 10 years ago.
I am considering using a NoSQL database in one of my projects. Do you know any good starting points for a newbie on this topic?
Pick your particular "NoSQL" database first -- or at least the type of "NoSQL" -- I'm going with the assumption that there is a reason why you want "NoSQL". Do you need object graph traversal? Explicit distributed clustering? Fast write/append? Dumb key/value associations? The selection should be based on more than "I want something NoSQL", as different approaches can offer significant advantages (along with significant drawbacks) :-)
And, as so often, Google and Wikipedia are a good place to start:
http://en.wikipedia.org/wiki/NoSQL
http://nosql-database.org/ has a long list of alternative databases, grouped in categories by type of technology. It has links to each product's website, lists of books, and forums, news, etc. about NoSQL.
Also see http://nosqlsummer.org/city/krakow. This is the Kraków chapter of a reading club for studying NoSQL concepts. I see from your profile that you live in Kraków.

What BASE database development applications are available? [closed]

Closed 9 years ago.
What applications/IDEs are out there to develop BASE database systems from?
BASE systems (Basically Available, Soft state, Eventually consistent) are an alternative to RDBMS that work well with simple data models holding vast volumes of data. Google's BigTable, Dojo's Persevere, Amazon's Dynamo, and Facebook's Cassandra are some examples.
It seems like you are looking into the recently popular "NoSQL" moniker for databases, which also includes MongoDB, Voldemort (he who must not be named), HBase, Tokyo Cabinet, and CouchDB. There are a lot of them. I am not sure what your question is.
Each one has its own advantages, implementation difficulties, and performance differences, although they are all designed to scale. There are some good articles on highscalability.com: http://highscalability.com/blog/tag/nosql
Then there are the systems designed to enhance and scale searching on top of traditional databases (e.g. MySQL). For example, Solr, based on Lucene, is geared towards full-text searching. It falls under "eventual consistency", since it synchronises with the database periodically.
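The "eventual consistency" of a periodically synchronised search index can be shown with a toy model: a replica that only reflects the primary store after a sync, so reads in between are stale. This is a deliberately simplified sketch, not any real Solr mechanism, and all names are illustrative:

```python
class Primary:
    """The authoritative store (stands in for the RDBMS)."""
    def __init__(self):
        self.data = {}

    def write(self, key, value):
        self.data[key] = value

class Replica:
    """A search-style replica that catches up only on sync()."""
    def __init__(self, primary):
        self.primary = primary
        self.data = {}

    def sync(self):
        # In a Solr-style setup this would run on a schedule.
        self.data = dict(self.primary.data)

    def read(self, key):
        return self.data.get(key)

primary = Primary()
replica = Replica(primary)

primary.write("doc1", "hello")
stale = replica.read("doc1")   # still missing: sync hasn't run yet
replica.sync()
fresh = replica.read("doc1")   # visible after synchronisation
```

The window between the write and the sync is exactly the "soft state" that BASE systems accept in exchange for availability and scale.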
