I was recently introduced to MongoDB, and while going through the installation guide I learned that we first have to run the daemon (mongod) and then connect to it through mongo.exe (on Windows) to actually run commands. I have noticed this is the general structure for most DBMSs: we have to start a server and then connect to it to run commands.
Why can't we run a DBMS in a single program instance, as we do with Python or Node.js? Specifically, why do we need a client-server architecture for a DBMS?
Yes, you can. It is not strictly needed.
What you describe is how SQLite works. Other "embedded" database engines do the same.
The main advantage of a separate server is that multiple clients can connect to it at the same time. Simultaneously reading from and writing to shared data files without a central coordinating process (a server) becomes more and more complex and unreliable, to the point where a separate, dedicated process is easily justified.
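Python even ships with such an embedded engine in its standard library, so a minimal sketch of the whole "DBMS" living inside a single process looks like this:

    import sqlite3

    # The database engine runs inside this very process; there is no server to start.
    conn = sqlite3.connect('app.db')  # the whole database is just this file
    conn.execute('CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)')
    conn.execute('INSERT INTO users (name) VALUES (?)', ('alice',))
    conn.commit()
    print(conn.execute('SELECT name FROM users').fetchall())
    conn.close()

The moment a second process wants to write to app.db at the same time, you need coordination, and that is exactly the job the server process does for you.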
Related
I'm in charge of fixing part of an application that syncs data entities from a DB2 database running on an iSeries to a SQL Server database on Azure. The application runs this synchronization just fine when the iSeries and the SQL Server host are on the same local network. However, add in the increased latency when the database host is in Azure, and the process slows to an unacceptable level.
I'm looking for ideas on how to proceed. Some additional information:
Some of our clients might have millions of records in some tables.
An iSeries platform is not virtualizable/containerizable as far as I know.
Internet connection speed will vary wildly. Highly dependent on the client.
The current solution uses an ODBC driver to get data from the iSeries. I've started looking at different approaches to retrieving the data, including creating APIs (which doesn't seem like the best idea for transferring lots of data) or using Azure Data Services, but those are really pieces of a bigger puzzle.
Edit: I'm trying to determine what the bottleneck is and how to fix it. I forgot to add that our IT manager (we only have one person in IT... we are a small company) has set up a test environment using a local iSeries and an instance of our software in Azure. He analyzed the network traffic using Wireshark and told me we aren't using much bandwidth, but there seems to be a lot of TCP "chatter". His conclusion was that we need to consider another way to get the data. I'm not convinced, and that is part of the reason I'm asking this question here.
More details:
Our application uses data from an ERP system that runs on the iSeries.
Part of our application is the sync process in question.
The process has two components: a full data sync and a delta/changes data sync.
When these processes run while both the iSeries and our application are on site, the performance isn't great, but it is acceptable.
The sync process is written in C# and connects to DB2 using an ODBC driver.
We do some data manipulation and verification as part of the sync process, e.g., making sure parent records exist, foreign keys are appropriate, etc.
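To put the "chatter" theory in concrete terms: row-by-row fetching pays the network latency once per row, while batched fetching pays it once per batch. A minimal sketch of a batched read, assuming pyodbc, a hypothetical ITEMS table and a hypothetical process() handler; whether this actually cuts wire round trips also depends on the driver's prefetch settings:

    import pyodbc

    # Hypothetical DSN and table; substitute your iSeries ODBC settings.
    conn = pyodbc.connect('DSN=ISERIES;UID=user;PWD=secret')
    cur = conn.cursor()
    cur.execute('SELECT ID, NAME, PRICE FROM ITEMS')

    # fetchmany() pulls rows in batches, so high per-round-trip latency
    # is paid once per 5000 rows instead of once per row.
    while True:
        rows = cur.fetchmany(5000)
        if not rows:
            break
        process(rows)  # hypothetical downstream handler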
I have a SQLite database on my local machine and my web services running on the same machine access it using SQLAlchemy like this:
engine = create_engine('sqlite:///{}'.format('mydatabase.db'), echo=True)
We are planning to host our web services on a separate machine from the one where the database is hosted. How can we make this 'mydatabase.db' accessible remotely to our web services? Thanks.
From SQLite's When To Use docs:
Situations Where A Client/Server RDBMS May Work Better
Client/Server Applications
If there are many client programs sending SQL to the same database over a network, then use a client/server database engine instead of SQLite. SQLite will work over a network filesystem, but because of the latency associated with most network filesystems, performance will not be great. Also, file locking logic is buggy in many network filesystem implementations (on both Unix and Windows). If file locking does not work correctly, two or more clients might try to modify the same part of the same database at the same time, resulting in corruption. Because this problem results from bugs in the underlying filesystem implementation, there is nothing SQLite can do to prevent it.
A good rule of thumb is to avoid using SQLite in situations where the same database will be accessed directly (without an intervening application server) and simultaneously from many computers over a network.
SQLite works well for embedded systems, or at least when you use it on the same computer. IMHO you'll have to migrate to one of the larger SQL solutions like PostgreSQL, MariaDB or MySQL. If you've generated all your queries through the ORM (SQLAlchemy), then there will be no problem migrating to another RDBMS. Even if you wrote raw SQL queries too, there should not be many problems, because all these RDBMSs use very similar dialects (unlike Microsoft's T-SQL). And since SQLite is "lite", it supports only a subset of what the other RDBMSs support, so there should not be a problem.
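For instance, moving from embedded SQLite to a networked PostgreSQL under SQLAlchemy is often just a change of connection URL; the host, port, credentials and database name below are placeholders:

    from sqlalchemy import create_engine

    # Embedded SQLite: the database lives in-process, in a local file.
    engine = create_engine('sqlite:///mydatabase.db', echo=True)

    # Networked PostgreSQL: same SQLAlchemy API, different URL.
    # Host, port, credentials and database name are placeholders.
    engine = create_engine('postgresql://user:secret@dbhost.example.com:5432/mydb', echo=True)

Everything downstream of create_engine (sessions, models, queries) stays the same, which is exactly why going through the ORM makes this migration painless.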
In a server with a single postgres database, is it possible to migrate the whole database onto a different server (running the same OS, etc.) without going through the usual time-consuming route of dumping and importing (pg_dump)?
After all, everything must still be in the filesystem?
Assume the postgres service is not running and that the servers are running Ubuntu.
Yes: with the postgres service stopped, you can simply copy the whole data directory to the new server, as long as the PostgreSQL major version and architecture match. Also, if you want, you can use pg_basebackup, which will connect over the network and request a copy of all the files. This is preferable where the architecture, OS, etc. are not changing. For more complex cases, see barman, which will manage this process for you.
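For example (the paths below are the Ubuntu defaults and are assumptions for your particular setup and version):

    # With the postgres service stopped on both servers, a straight file copy of the
    # data directory works, provided major version and architecture match:
    rsync -a /var/lib/postgresql/ newserver:/var/lib/postgresql/

    # Or, with the source server still running, pull a consistent copy over the network:
    pg_basebackup -h oldserver -U replication_user -D /var/lib/postgresql/14/main -P -X stream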
We use SQL Express 2008 R2 for simple reporting and data storage. Data is typically written at a rate of 600 to 3000 records per minute (one client, one connection, same machine). We need the optimal performing protocol (good performance without hogging memory). I read articles online and they are quite confusing when it comes to deciding between TCP/IP, shared memory and named pipes. To summarize the MSDN documentation:
1) Shared memory has no configurable properties. Shared memory is always tried first, and cannot be moved from the top position of the Enabled Protocols list in the Client Protocols Properties list. Does this mean shared memory is preferred and fastest?
2) For TCP/IP sockets, data transmissions are more streamlined and have less overhead. Data transmissions can also take advantage of TCP/IP performance-enhancement mechanisms such as windowing, delayed acknowledgements, and so on. OK, but is it faster than shared memory?
3) If the server application is running locally on the computer running an instance of Microsoft® SQL Server™, the local named pipes protocol is an option. Local named pipes runs in kernel mode and is extremely fast. When I read this I could not take the confusion anymore and decided to post my question here.
So, SQL gurus, please help me decide. Thank you.
Shared memory is the fastest protocol; however, if performance is your goal, the framework you use to access the database, the way you transfer the data, and the SQL queries you write will all have a much bigger impact on performance.
If you use ADO/ADO.NET or another heavyweight data-access layer, most of the performance will be lost there. If extreme performance is what you are after, you'll need to investigate how to communicate with the database engine at a lower level.
See http://www.devart.com/sdac/ as a start.
If the following quote from your question is actually correct, then your question already contains the answer:
Shared memory is always tried first, and cannot be moved from the top position of the Enabled Protocols list in the Client Protocols Properties list.
Given that your client and database are running on the same machine, this means that shared memory will always work (without drastic measures, that is). It doesn't matter which protocol you choose - shared memory is the one that will end up being used! :-)
For what it's worth, the performance depends entirely on the implementation used within SQL Server (I'm basing this on my knowledge of the communication methods rather than intimate knowledge of SQL Server), but the order almost certainly goes like this (fastest first):
Shared memory (backed up by the fact that SQL Server always uses this when it can)
Named pipes
TCP/IP
I can't find any citation that shared memory actually is faster than named pipes, but TBH I don't think it matters that much - communication between a client and database on the same local machine will be incredibly efficient anyway, and I doubt you would be able to notice any performance difference.
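If you want to verify this for yourself, SQL Server connection strings accept a protocol prefix on the server name (lpc: for shared memory, np: for named pipes, tcp: for TCP/IP), so you can benchmark each one. A sketch using Python's pyodbc purely for illustration - the driver and instance names are assumptions, and the same prefixes work from ADO.NET connection strings:

    import pyodbc

    # 'lpc:' forces shared memory, 'np:' named pipes, 'tcp:' TCP/IP.
    # The driver and instance name ('.\SQLEXPRESS') are assumptions; adjust to your setup.
    for prefix in ('lpc:', 'np:', 'tcp:'):
        cn = pyodbc.connect('DRIVER={SQL Server Native Client 10.0};'
                            'SERVER=' + prefix + r'.\SQLEXPRESS;'
                            'DATABASE=master;Trusted_Connection=yes;')
        print(prefix, cn.execute('SELECT @@VERSION').fetchone()[0][:50])
        cn.close()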
Is there any way to access Redis data from relational databases, such as Oracle or SQL Server?
One use case I have in mind is ETL to a data warehouse.
I am trying to understand the question: you have data in a traditional RDBMS and you want to extract information from it and load it into Redis? Or is it the other way around?
Either way, since I am not competent to talk about RDBMSs, I would expect to create a program (Java in my case) which would extract information from Redis and upload it to Oracle. There are options for interacting with Redis using a Java client library (JDBC-Redis and JRedis are examples).
You may get a better answer from the community, if you can elaborate on your question.
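In the meantime, here is a minimal sketch of that extract-and-load idea, shown with Python's redis-py and SQLAlchemy rather than Java; the key pattern ('user:*'), the staging table (users_stg) and the connection URLs are all made up for illustration:

    import redis
    from sqlalchemy import create_engine, text

    r = redis.Redis(host='localhost', port=6379)
    engine = create_engine('oracle://user:secret@dwhost:1521/DWH')  # placeholder URL

    # Extract: scan Redis hashes (assumed to live under 'user:*' keys).
    # Load: insert each one as a row in a hypothetical users_stg staging table.
    with engine.begin() as conn:
        for key in r.scan_iter('user:*'):
            row = r.hgetall(key)
            conn.execute(
                text('INSERT INTO users_stg (id, name) VALUES (:id, :name)'),
                {'id': key.decode().split(':')[1], 'name': row[b'name'].decode()})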
Well, if you use server-side Java objects in your Oracle database (and they can make REST calls at the very least, if not raw socket I/O - I don't know), then you can call Redis from your stored procedures in Oracle.
[edit]
I should add that if you can make socket connections, then just include the JRedis jar in your Oracle server's lib so the server-side objects can create clients.
Should that not be possible -- and I would seriously question a DB that lets SProcs and triggers open generic TCP connections -- then you are left with consuming web services.
JRedis doesn't support web services, but nothing is stopping you from wrapping JRedis and exposing whatever commands you need as RESTful resources. So here you would run Redis on server R, and a Java web server (Jetty/Jettison would do fine) running JRedis on server R or R`. Since Redis is single-threaded, it is perfectly fine to run it on the same multi-core box as a JVM; it's just a matter of resources, and if they are sufficient, the connection between Redis and JRedis uses the loopback, which is guaranteed to be faster than traversing network boundaries! But if the loads you require preclude colocating Redis and JRedis (the proxy), then use a second server.
And of course you are running your DB on server D. So D <=> R` <=> R, and you will pay the second hop's latency costs.
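For comparison, the same wrapper idea in Python (Flask standing in for Jetty and redis-py for JRedis; the route and key scheme are made up):

    from flask import Flask, jsonify
    import redis

    app = Flask(__name__)
    r = redis.Redis(host='localhost', port=6379)

    # Expose a single Redis GET as a RESTful resource so the database
    # can consume it as a plain web service call.
    @app.route('/keys/<name>')
    def get_key(name):
        value = r.get(name)
        return jsonify({'key': name, 'value': value.decode() if value else None})

    if __name__ == '__main__':
        app.run(port=8080)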