Why is MongoDB a database server? - database

I've recently begun learning MongoDB and came across the phrase "MongoDB is a database server."
There was no explanation, and I'm a little confused, because to me servers simply host resources and have an IP address assigned to them.
I'm pretty sure I'm confused because I don't know enough about any of this yet.
In summary: can someone explain what is meant by "MongoDB is a database server"?
I would greatly appreciate it. I've been looking all morning and haven't found a suitable answer.

The term "server" is ambiguous.
On one hand a "server" can be a (typically big) computer, which runs some software. Typical examples are database or web server. See Server computing
On the other hand a "server" can be a software. Usually this software does not do anything, until a client software connects to it and ask for some activities. Typical example is a web-server and a browser as client. Server and client software can run on the same computer, this is no problem at all. See Client–server model
For better distinction some people use "server" for computer hardware and "service" for the software item. But in general these term are not used consistently and mixed with each other.
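To make the second meaning concrete with MongoDB: mongod is the server software. It runs as its own process, listens on a port (27017 by default), and your application is the client that connects to it. Below is a minimal sketch using the Python driver (pymongo); the host, port, database and collection names are just placeholders:

    # Sketch of the client/server split: mongod is the server process,
    # this script is a client connecting to it over the network.
    from pymongo import MongoClient

    # Open a network connection to the mongod server process.
    client = MongoClient("mongodb://localhost:27017")

    db = client["homework"]            # a database living inside the server
    db.tasks.insert_one({"title": "read chapter 3", "done": False})

    for task in db.tasks.find({"done": False}):
        print(task)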

Related

What is the role of 'server' in an embedded server dbms? - Confusion over the many 'server'-related terms that pop up when looking for a suited rdbms

I'll give you a bit of background because I don't think my question is clear without it. Aside from that, I don't know much about servers but I think it'll become clear what I'm actually asking because of the background information.
I was/am building a small C++ program to be used by just me (a homework manager, which needs to keep track of tasks and subtasks and therefore needs multiple tables, etc.), so I figured I needed a database. I quickly stumbled upon SQLite, which was perfect for my case in many ways: it's free; it only uses .db files, which can be read by other software; it can be embedded; it's simple (in terms of documentation and libraries); and most importantly, it is what SQLite.org describes as 'serverless'.
However, I found SQLite's dynamic type system extremely annoying (the 'why' is beside the point; I might make separate posts asking questions about this), and I decided to look for an RDBMS that has all the pros I mentioned above but also has static typing.
While going down this rabbit hole of looking for an RDBMS to fit my needs, I came across many terms related to how the RDBMS is implemented with respect to the term 'server' and the like. All the terms are very vague, and the same word does not mean the same thing in every instance.
I noticed all of these keywords and contrasts popping up during my search:
Stand-alone vs server/client
Embedded vs... 'not embedded', I guess(?)
Classic serverless vs neo-serverless
Serverless... but in reality cloud-based (I thought clouds were servers(?))
Server vs service
Service vs application
User vs client
I'm as far as knowing that a server is a process that executes in the background, not to be used by the user directly. But other than that, all these server-related terms are throwing me off.
I want an RDBMS that has this 'serverless'-ness SQLite.org speaks of. I saw many professional free SQL RDBMS providers that spoke of the ability to have 'embedded servers', which does contain the word 'embedded' but also contains 'server'. So my question is: when a provider speaks of these 'embedded servers', what does it really mean?
Does it mean that there is one application, and when it runs it opens another application which functions as a server? And if so, is that server a service or just another normal application-like process? Or does it work exactly the same as the serverlessness SQLite mentions, that being: the libraries inside the compiled project already handle everything needed to work with only .db files? Does it need any files other than the database and the executable? Does the communication between the application and the database file come directly from the code, or is another process used?
(PS: as a side-question: could you help me clear up what all the terms in the list above precisely mean?)
I realise my question might be all over the place, but so is the vocabulary I've come across in this journey. I hope you can understand where my confusion is coming from and can help me clear these points up. Thanks in advance.
By "serverless" SQLLite just means that it's a library, and doesn't run in a seperate process. In this it's like Access/Jet and other older DMBS programs that read and write files directly. The more common term for this is "embedded database".
The more common meaning of "serverless" these days is a cloud-based capability that doesn't require you to install or manage a "server" or VM. As in "We use Azure Functions for serverless compute".
The other DBMS systems are typically called "client-server DBMS", where the DBMS runs in a separate process and the client program communicates with it over a network or some RPC mechanism. Client-server RDBMS systems can be "bundled" or "embedded" with an application, and may not be running on a separate computer, but would still be running in a separate process.
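By contrast, a client-server DBMS looks like this from the application's side: only a driver runs in your process, and the SQL is executed by a separate server process reached over a connection. A hedged sketch using Python and psycopg2 against a hypothetical PostgreSQL server; the host, credentials and table are assumptions for illustration:

    # Client-server use: this process only runs the driver; the SQL is
    # executed by a separate server process listening on localhost:5432.
    import psycopg2

    conn = psycopg2.connect(host="localhost", port=5432,
                            dbname="homework", user="app", password="secret")
    with conn, conn.cursor() as cur:
        cur.execute("SELECT title FROM tasks WHERE done = false")
        print(cur.fetchall())
    conn.close()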

Building an Erlang Client for Couchbase Server 1.8 and 2.0

We have used Couchbase Server in our product. It's an intranet application whose front end is pure JavaScript. We use Erlang/OTP for the business logic, authentication (Mnesia), the Yaws web server and a bunch of other Erlang libraries. Now, we are still using Couchbase Single Server, whose download has been removed from the Couchbase site. We have found it very stable: in 5 months of running live it has never gone down. We are running it on top of Ubuntu Server. So, our interest in NoSQL is just beginning. However, as I asked in a question and another here about Erlang client support for Couchbase Server, I discovered that they say:
Couchbase Server is memcached compatible. This means many existing memcached client libraries and in many cases, the applications already using these libraries, may be used directly with Couchbase Server
So I then started looking around for these memcached-compatible libraries and found a bunch of them: at Google Code, Erlang Mc, erlmc, mcache, memcached-client and finally OneCached by ProcessOne (the makers of the ejabberd XMPP server). With my great aim (if possible) of implementing my own client for Couchbase Server 1.8 and 2.0, the question follows:
1. Which of the above memcached Erlang client libraries is appropriate for use with Couchbase 1.8 and 2.0?
2. If it is compatible, can I use it directly, or do I have to make some changes first? Please explain the changes.
3. Is anyone else out there feeling the need for Erlang client support for Couchbase Server 1.8 and 2.0 as we do? How are they working around this problem?
I would appreciate it if a Couchbase insider with membership here on Stack Overflow would tell us whether the Couchbase team has plans to build an Erlang client in the near future, so that we do not waste our time attempting it ourselves, as they are in a position to build a much better and more efficient client to their own server than we can. Thanks to all.
Couchbase doesn't have any plans to release an erlang client in the near term. We use Erlang in our product and really like Erlang, but don't have time to put together an Erlang client at the moment. If you are interested in developing an Erlang client we are certainly happy to help and will answer any questions you might have. If you send me an email (see my profile) I will help get you in contact with someone at Couchbase who can help answer questions and get you started on development.
Also, I am not an Erlang user so I am unable to answer any of your questions related to memcached Erlang libraries. Hopefully someone who has used them can help you.
I have tried erlmc. I make heavy use of it for storing 32K binaries and it has worked great so far.
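Not Erlang, but purely to illustrate what "memcached compatible" means at the protocol level: any client that speaks the memcached protocol should be able to talk to Couchbase's memcached-compatible port. A sketch in Python with the python-memcached library; the host, port (11211) and key are assumptions for the example:

    # Illustrative only: any memcached-protocol client can talk to a
    # memcached-compatible Couchbase port. Host/port/bucket are assumptions.
    import memcache

    mc = memcache.Client(["127.0.0.1:11211"])
    mc.set("user:42", '{"name": "alice"}')
    print(mc.get("user:42"))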

a system design question

I was asked the following question while interviewing at a company working on cloud computing, and I did not answer it well. Any suggestions on how to analyze this question would be greatly appreciated.
Our company has hundreds of millions of users and we expect zero down time in production, explain techniques and programming practices that help improve redundancy and fail-over capabilities for front-end, middle-tier and back-end services including database services.
This question is very much along the lines of the "Impossible Question" from Joel. There is no right answer to this question.
I would start breaking this down into a list of all possible failure points:
Database Server
Database
Middle Tier
Middle Tier Server
Application
Web Server
Then for each one of them, I would identify reasons for breakage, and how to recover from it without having downtime. For the ones I do not know the answers to, I would admit as much.
For example, let's build a list of reasons a database server goes down. Since we are looking for 100% uptime, we ignore nothing, no matter how far-fetched:
Hardware goes bad
Power goes down
Network card goes bad
Operating System unexpectedly crashes
O.S. Upgrades break system
Dumb System Admin or DBA
Dumb Janitor
Some Possible solutions (considering SQL Server on Windows back-end)
Lock on door
Database Mirroring (with regular failover testing)
Multiple NICS
Clustering (with regular failover testing)
Get better people
You can basically keep answering this question until the interviewer throws in the towel because there really isn't the One-Right-Answer to this question.
That's a pretty broad question. If they expect zero downtime, tell them to forget about it or turn all of their profits over to building redundancy. Now, if they just want "five 9's, or 99.999% uptime" then we can talk. :)
You can usually answer these kinds of questions with the usual canned blather about building a sustainable, automatic build environment that includes extensive unit testing. Using design patterns like MVC or similar can help with testability. Perform regular security audits. This is much bigger than just a development question; it is a question about network and server architecture, maintaining secondary and tertiary data centers, etc. These kinds of questions really give you a chance to make the interviewer feel important.

Architecture for a machine database [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
This might be more of a serverfault.com question but a) it doesn't exist yet and b) I need more rep for when it does :~)
My employer has a few hundred servers (all *NIX) spread across several locations. As I suspect is common we don't really know how many servers we have: more than once I've been surprised to find a server that's been up for 5 years, apparently doing nothing but elevating the earth's temperature slightly. We have a number of databases that store bits of server information -- Puppet, Cobbler, Nagios, Cacti, our load balancers, DNS, various internal spreadsheets and so on but it's all very disparate, incomplete and overlapping. Maintaining this mess costs time and money.
So, I'd like to come up with a single database which holds details of what each server is (hardware specs, role, etc.) and replaces (or at least supplies data for) the databases mentioned above. The database and web interface are likely to be a Rails app, as this is what I have most experience with. I'm more of a sysadmin than a coder.
Has this problem already been solved? I can't find any open source software that really fits the bill and I'm generally not too keen on bloaty, GUI vendor-supplied solutions.
How should I implement the device information collection bit? For instance, it'd be great if the database updated device records when disks are added or removed, or when the server serial number changes because HP replaces the board. This information comes from many different sources: dmidecode, command-line disk tools, SNMP against the server or its onboard lights-out card, and so on. I could expose all this through custom scripts and net-snmp, or I could run a local poller that reports the information back to the central DB (maybe via a RESTful interface or something). It must be easily extensible.
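For concreteness, a rough sketch of the "local poller reporting back via a RESTful interface" option; the inventory URL, the JSON shape and the choice of dmidecode fields are made up for illustration (dmidecode needs root):

    # Hypothetical local poller: gather a few facts and POST them to a
    # (made-up) central inventory API.
    import json
    import socket
    import subprocess
    import urllib.request

    def collect():
        """Gather a few hardware facts from dmidecode."""
        facts = {"hostname": socket.getfqdn()}
        for keyword in ("system-serial-number", "system-product-name"):
            out = subprocess.run(["dmidecode", "-s", keyword],
                                 capture_output=True, text=True, check=True)
            facts[keyword] = out.stdout.strip()
        return facts

    def report(facts, url="http://inventory.example.com/api/servers"):
        """POST the collected facts to the hypothetical central inventory API."""
        req = urllib.request.Request(url, data=json.dumps(facts).encode(),
                                     headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)

    if __name__ == "__main__":
        report(collect())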
Have you done this? How? Tell me your experiences, discoveries, mistakes and recommendations!
This sounds like a great LDAP problem looking for a solution. LDAP is designed for this kind of thing: a catalog of items that is optimized for data searches and retrieval (but not necessarily writes). There are many LDAP servers to choose from (OpenLDAP, Sun's OpenDS, Microsoft Active Directory, just to name a few ...), and I've seen LDAP used to catalog servers before. LDAP is very standardized, and a "database" of information that is usually searched or read, but not frequently updated, is the strong suit of LDAP.
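As a hedged sketch of what querying such a directory might look like (Python with the ldap3 library; the host, base DN, object class and attributes are invented, and a real deployment would define its own schema):

    # Illustrative only: each server is a directory entry, and an inventory
    # question becomes an LDAP search filter.
    from ldap3 import Server, Connection, ALL

    conn = Connection(Server("ldap://directory.example.com", get_info=ALL),
                      user="cn=inventory,dc=example,dc=com", password="secret",
                      auto_bind=True)
    conn.search(search_base="ou=servers,dc=example,dc=com",
                search_filter="(objectClass=device)",
                attributes=["cn", "serialNumber", "description"])
    for entry in conn.entries:
        print(entry.entry_dn, entry.serialNumber)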
My team has been dumping all our systems into RDF for a month or two now. We have the systems implementation people create the initial data in Excel, which is then transformed to N3 (RDF) using Perl.
We view the data in Gruff (http://www.franz.com/downloads.lhtml) and keep the resulting RDF in Allegro (a triple store from the same guys that do Gruff)
It's incredibly simple and flexible: no schema means we simply augment the data on the fly, and with a wide variety of RDF viewers and reasoning engines the presentation options are endless.
The best part for me? No coding: just create triples and throw them in the store, then view them as graphs.
The collection of detailed machine information is a very frustrating problem (many vendors want to keep it this way). Even if you can spend a large amount of money, you probably will not find a simple solution to this problem. IBM and HP offer products that achieve what you are seeking, but they are very, very expensive, and will leave a bad taste in your mouth once you realize that probably all you needed was 40-50% of the functionality they offer.
You say that you need to monitor *NIX servers... most (if not all) Unixes support RFC 1514 (Windows also supports this RFC as of Windows 2000). The Host MIB support defined by RFC 1514 has its drawbacks, however. Since it is SNMP based, it requires that SNMP be enabled on the machine, which is typically not the default for Unix and Windows machines. The reason for this is that SNMP was created before the entire world was using the Internet, and thus the old, crusty nature of its security is of concern. In many environments this may not be acceptable for security reasons. However, if you are only dealing with machines behind the firewall, this might not be an issue (I suspect this is true in your case).
Several years ago, I was working on a product that monitored hundreds of Unix and Windows machines. At the time, I did extensive research into the mechanics of how to acquire detailed information from each machine, such as disk info, running processes, installed software, up-time, memory pressure, CPU and I/O load (including network), without running a custom client on each machine. This info can be collected in a centralized fashion. As of three or four years ago, the RFC 1514 Host MIB spec was the only "standard" for acquiring detailed real-time machine info without resorting to OS-specific software. Sun and Microsoft announced a WebService-based initiative many years ago to address some of this, but I suspect it never received any traction, since I cannot at the moment even remember its marketing name.
I should mention that RFC 1514 is certainly no panacea. You are at the mercy of the OS-provided SNMP service, unless you have the luxury of deploying a custom info-collecting client to each machine. The RFC-1514 spec dictates that several parameters are optional, and if your target OS does not implement it, then you are back to custom code to provide the information.
I'm contemplating how to go about this myself, and I think this is one of the key pieces of infrastructure that not having around keeps us in the dark ages. Hopefully this will be a popular question on serverfault.com. :)
It's not just a matter of installing a single tool to collect this data (that isn't possible cheaply); ideally you want everything from the hardware up to the applications on the network feeding into this thing.
I think the only approach that makes sense is a modular one. The range of devices and types of information is too disparate to come under a single tool. Also the collection of data needs to be as passive and asynchronous as possible - the reality of running infrastructure means that there will be interruptions and you can't rely on being able to get the data at all times.
I think the tools you've pointed out form something of an ecosystem that could work together - Cobbler can install from bare-metal and hand over to Puppet, which has support for generating Nagios configs, and storing configs in a database; for me only Cacti is a bit opaque in terms of programmatically inserting new devices, templates etc. but I know this is possible.
Ultimately you have to sit down and work out which pieces of information are important for the business you work for, and design a db schema around that. Then, work out how to get the information you need into the db, whether it's from Facter, Nagios, Cacti, or direct snmp calls.
Since you asked about collection of data: if you have quite disparate kit (Dell, HP, etc.) then it makes sense to create a library to abstract away as much as possible of the differences between them, so your scripts just make standard calls such as "checkdiskhealth". When you add new hardware you can add to the library rather than having to write a completely new script.
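A hypothetical sketch of that abstraction layer: calling scripts only know check_disk_health(), and per-vendor backends hide whether that means an HP tool, a Dell tool or plain SNMP. The vendor commands below are illustrative, not tested recipes:

    # Illustrative vendor-abstraction layer; commands and vendor keys are
    # placeholders, not a tested recipe.
    import subprocess

    def _hp_disk_health(host):
        # e.g. query HP Smart Array status over ssh (details omitted)
        return subprocess.run(["ssh", host, "hpssacli", "ctrl", "all", "show", "status"],
                              capture_output=True, text=True).stdout

    def _dell_disk_health(host):
        # e.g. Dell OpenManage report over ssh
        return subprocess.run(["ssh", host, "omreport", "storage", "vdisk"],
                              capture_output=True, text=True).stdout

    _BACKENDS = {"hp": _hp_disk_health, "dell": _dell_disk_health}

    def check_disk_health(host, vendor):
        """Single entry point; supporting new hardware means adding one backend."""
        return _BACKENDS[vendor](host)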
Sounds like a common problem that larger organizations would have. I know our (50-person company) sysadmin has a little Access database with information about every server, license, and piece of hardware installed. He's very meticulous, but when it comes time to replace or repair hardware, he knows everything about it from his little DB.
You and your organization could sponsor an open source project to get you what you need, and give back to the community so that additional features (that you may not need now) can be developed at no cost to you.
Maybe a simple web service? Just something that accepts a machine name or IP address. When the service gets input, it sticks it in a queue and kicks off a task to collect the data from the machine that notified it. The nature of the task (SNMP interrogation, remote call to a Perl script, whatever) could be stored as part of the machine information in the database. If the task fails, the machine ID stays in the queue and the machine is periodically re-polled until the information is collected. Of course, you also have to have some kind of monitor running on your servers to notice that something has changed and send the notification; hopefully this is easily accomplished with whatever server monitoring software you've already got in place.
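A rough sketch of that notify-then-collect loop, assuming Flask for the web service; the route, payload shape and collection task are placeholders:

    # Sketch: machines (or a monitor) POST their name to a tiny web service,
    # which queues them; a worker pulls from the queue and collects the data.
    import queue
    import threading
    import time
    from flask import Flask, request

    app = Flask(__name__)
    pending = queue.Queue()

    @app.route("/notify", methods=["POST"])
    def notify():
        pending.put(request.json["machine"])    # e.g. {"machine": "web01"}
        return "", 202

    def collect_from(machine):
        """Placeholder for the per-machine task (SNMP walk, remote script, ...)."""
        print(f"collecting from {machine}")

    def worker():
        while True:
            machine = pending.get()
            try:
                collect_from(machine)
            except Exception:
                time.sleep(30)                   # crude delay before re-polling
                pending.put(machine)
            finally:
                pending.task_done()

    if __name__ == "__main__":
        threading.Thread(target=worker, daemon=True).start()
        app.run(port=8080)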
There are some solutions from the big vendors for managing monstrous sets of machines - such as some of the Tivoli stuff from IBM. That is probably, however, overkill for mere hundreds of machines.
There are some free software server database solutions, but I do not know if they provide hooks to update information automatically from the machines with dmidecode or SNMP. One I heard about (but have no personal experience with, sorry) is GLPI.
I believe you are looking for Zabbix. It's open source, easy to install and use.
I installed it for a client a few years ago, and if I remember right it has a client application that connects to the Zabbix server to update it with the requested information.
I really recommend it: http://www.zabbix.com
Check out Machdb. It's an open source solution to the problem you are describing.

Is there any way to impersonate a specific database engine while running another one?

This is something I would like to see in my day-to-day programming work, but I've never seen such an application yet. Your input is highly appreciated.
Let's say we have an application that needs MSSQL Server as its DBMS, and suppose you just need to install it and do something with it (i.e. you are not going to deploy it to production servers etc.).
In such a case it might be overhead to install MSSQL first. I am suggesting something like a software bridge that can use another DBMS to store the data. In other words, the application "sees" an MSSQL instance, but underneath it might be Access. The bridge should do some sort of conversion.
Another example: you have MSSQL but a certain application needs Oracle, so you would have to purchase Oracle. But with something like a bridge, you can put the information into your MSSQL DBMS. The bridge listens on port 1521 like Oracle, so the application "thinks" there is an Oracle installation.
Is it an idea that cannot be implemented?
Are there any such applications?
If so what are they?
Thanks... :)
Adding a clarification: the application might be from a third party. You don't have any knowledge of its internal architecture; you just know it uses a certain DBMS. I am trying to use a DBMS other than the one the third-party software needs.
Applications usually either don't depend on a specific database server, or they depend on it for a reason.
If an application asks for Oracle, or SQL Server, or whatever, it's because it relies on the implementation details of this specific vendor to run its SQL, its stored procedures, etc. There's no way you could emulate that with an Access database, for example...
If your application just needs to run some very simple SQL (i.e. basic insert/select statements), it probably uses a standard driver (ODBC, ADO, etc.), and those drivers can accommodate every major SQL database engine. In my experience, "simple applications" don't ask for a specific database vendor.
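As a sketch of what "uses a standard driver" means in practice: the application code is written against ODBC, and only the DSN/connection string changes when the backend does. Python with pyodbc; the DSN name and table are assumptions configured elsewhere (odbc.ini or the ODBC Data Source Administrator):

    # Illustrative ODBC use: the same code path works against whichever
    # engine the DSN points at; only the DSN/connection string changes.
    import pyodbc

    conn = pyodbc.connect("DSN=AppDatabase;UID=app;PWD=secret")
    cursor = conn.cursor()
    cursor.execute("SELECT id, name FROM customers")
    for row in cursor.fetchall():
        print(row.id, row.name)
    conn.close()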
This is the problem that ODBC was supposed to solve :-) .
But in response to your questions:
Is it an idea that cannot be implemented?
It can be implemented.
It would be tedious and thankless work, and you would have a very limited audience. In my opinion it's not worth doing.
Are there any such applications?
None that I know of.
If so what are they?
None that I know of.
......
Bringing in Chandrasekar's note in the comments section:
Look at it from a superuser's perspective... He has a nice application but he can't use it without some DBMS, and he is not a programmer who can do something about it. So such people need a product like this.
I agree it has applications, but it has a very limited audience :) .
What you're proposing is something like the Firefox plugin 'IE Tab', only you wouldn't have IE installed... so instead of embedding IE, you would need to entirely re-implement IE using Firefox's rendering window.
Just my opinion: that's too much effort... It's simpler to just install a second database.
If this application uses ADO to connect to SQL Server and you can modify the connection string, then it's quite easy to use a different database: change the connection string! However, the other database must be able to support all the features of SQL Server. Besides, the software was never tested on another database, so the application might crash and burn.
If you can't change the connection string, or the application doesn't use ADO, things are more complicated and very close to impossible.
I've worked in the past on a project that needed to be reasonably database-independent. The database had to support stored procedures, but there weren't any other restrictions. By default we tried to support both SQL Server and Oracle. (We also supported InterBase but never advertised this.) While we did manage to keep it mostly database-independent, we did have to work around quite a few minor issues. Especially the joins in our queries had some nasty problems, which we solved by adding more logic to stored procedures.
"This is the problem that ODBC was supposed to solve :-) ."
And it is the very same problem that SQL was intended to solve too.
It seems to me that the reason why this problem exists is that the world seemingly fails to agree sufficiently on what the data manipulation language/interface ought to look like.
I suspect that if this were solvable, it would already have been done.
The closest I've heard is EnterpriseDB where they have built a layer on top of Postgres so it looks more like Oracle.
But remember these databases have features covered by patents and copyright so there's a limit on how closely a competitor product can imitate the real thing.
It would probably be easier to imitate 'down' than up. For example, MS-Access wouldn't be able to imitate much of the functionality for Oracle or SQL Server, whereas there's a much better chance of SQL Server imitating a simpler DB like Access.
Applications usually DO depend on a specific database server. Every database implements things slightly differently - even MSSQL and Sybase, which have a common ancestor.
Any bridge, however well it attempts to abstract the differences, would leave some exposed. These would be likely to create subtle bugs in the application, which might appear initially to work, but then fail, or worse, corrupt data.
Moreover, the application vendor would not support you in such a case - they'd simply say they don't support that use case, and you should remove the bridge and install a proper instance of whatever database it was intended for.
In short, I don't think it's worth the risk of the application malfunctioning subtly, and being left without support, even if the application isn't especially important. If you dislike the underlying database the application uses, choose a different application.
