I understand that some databases have native support in R (e.g. MySQL) but you can connect to other DBs like MS SQL Server using RODBC. How much speed improvement does one gain for reading/writing with the native drivers vs. RODBC? What other DBs have native drivers in R? Is reading faster or slower than writing generally?
If you're specifically interested in SQL Server, the reference below is a little bit out of date but I imagine it probably still holds.
Using ODBC with Microsoft SQL Server
Performance of ODBC as a Native API
One of the persistent rumors about ODBC is that it is inherently slower than a native DBMS API. This reasoning is based on the assumption that ODBC drivers must be implemented as an extra layer over a native DBMS API, translating the ODBC statements coming from the application into the native DBMS API functions and SQL syntax. This translation effort adds extra processing compared with having the application call directly to the native API. This assumption is true for some ODBC drivers implemented over a native DBMS API, but the Microsoft SQL Server ODBC driver is not implemented this way.
The Microsoft SQL Server ODBC driver is a functional replacement of DB-Library. The SQL Server ODBC driver works with the underlying Net-Libraries in exactly the same manner as the DB-Library DLL. The Microsoft SQL Server ODBC driver has no dependence on the DB-Library DLL, and the driver will function correctly if DB-Library is not even present on the client.
Microsoft's testing has shown that the performance of ODBC-based and DB-Library–based SQL Server applications is roughly equal.
It's an empirical question, so why not measure it for the combination you are interested in?
Public code is not hidden, so why don't you count what other DB interfaces CRAN has? For DBI alone, we have SQLite, MySQL, PostgreSQL, Oracle; for custom DB backends there are things like Vhayu.
Specialised forums exist, so why don't you ask on r-sig-db?
Lastly, as soon as there is an API and a need people tend to combine the two. I have written two different (at-work and hence unreleased) packages to two highly specialised and fast backends.
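As a rough illustration of the "measure it" advice above, here is a minimal timing sketch. It is written in Python rather than R, and it uses the standard-library sqlite3 module purely as a stand-in for whichever driver you care about. The helper names `time_call` and `benchmark` are hypothetical, and the `?` placeholder style is an assumption that holds for sqlite3 and pyodbc but not for every driver.

```python
import sqlite3
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def benchmark(connect, n_rows=10_000):
    """Time a bulk write and a full read through one connection factory."""
    conn = connect()
    cur = conn.cursor()
    cur.execute("CREATE TABLE bench (id INTEGER, val REAL)")
    rows = [(i, i * 0.5) for i in range(n_rows)]

    def write():
        # Empty the table first so every repeat inserts the same amount of data.
        cur.execute("DELETE FROM bench")
        cur.executemany("INSERT INTO bench VALUES (?, ?)", rows)
        conn.commit()

    def read():
        cur.execute("SELECT id, val FROM bench")
        return cur.fetchall()

    result = {"write_s": time_call(write), "read_s": time_call(read)}
    result["rows_read"] = len(read())
    conn.close()
    return result


if __name__ == "__main__":
    # Swap this factory for the driver under test; keep everything else fixed.
    print(benchmark(lambda: sqlite3.connect(":memory:")))
```

To compare two drivers you would call `benchmark` once per connection factory against the same server and table shape. The timings from an in-memory SQLite database say nothing about SQL Server or MySQL; only the shape of the harness carries over.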
I have an SQL Server database, and I need to push data into it through vbscript, as well as pull data into Excel. I have found multiple connection strings, but no repository for the benefits of performance and functionality comparing them. The driver options (Provider=) I have found so far are:
{SQL Server} (ODBC)
SQLOLEDB (newer than ODBC, but being deprecated?)
SQLOLEDB.1 (what Excel 2016 uses when clicking 'Get External Data', but not even mentioned on connectionstrings.com... I assume a newer version of the above, but still the deprecated technology?)
SQLNCLI11 (native client, OLE DB)
{SQL Server Native Client 11.0} (native client, ODBC)
Different things I read say that ODBC is better because it has been around longer. And that OLE DB has been around long enough to have the same advantages. And OLE DB was made to work with a certain company's applications. And ODBC was made by the same company. And OLE DB can connect to and from different kinds of applications better. And ODBC works better with databases. And Native is...Native, so must be better... because of the name?
I find multiple questions here on SO floating around with no or partial answers, or having multiple comments claiming the answers are out of date. So, as of now, what are the specific differences between these different drivers? Do they have different performance in different circumstances? Do they have different features? Do I need to do profiling to determine the best performance and reliability for my particular use case, or is there a standard "best practice" recommended by Microsoft or some recognized expert? Or are they all basically doing the same thing and as long as it's installed on the target system it doesn't really matter?
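For reference, the options listed above fall into two keyword families: ODBC drivers are selected with `Driver=`, while OLE DB providers are selected with `Provider=`. The sketch below, in Python only for illustration, builds one Windows-authentication connection string per option. The function name `connection_strings` and the server and database values are hypothetical; the keywords follow the commonly documented forms, so check them against your installed driver versions before relying on them.

```python
def connection_strings(server, database):
    """Build one trusted-connection string per driver/provider in the list."""
    # Keyword set commonly used with the ODBC drivers (Driver=...).
    odbc_tail = f"Server={server};Database={database};Trusted_Connection=yes;"
    # Keyword set commonly used with the OLE DB providers (Provider=...).
    oledb_tail = (
        f"Data Source={server};Initial Catalog={database};"
        "Integrated Security=SSPI;"
    )
    return {
        "{SQL Server}": "Driver={SQL Server};" + odbc_tail,
        "SQLOLEDB": "Provider=SQLOLEDB;" + oledb_tail,
        "SQLNCLI11": "Provider=SQLNCLI11;" + oledb_tail,
        "{SQL Server Native Client 11.0}":
            "Driver={SQL Server Native Client 11.0};" + odbc_tail,
    }


if __name__ == "__main__":
    # "myServer" and "myDb" are placeholder values, not real hosts.
    for name, conn_str in connection_strings("myServer", "myDb").items():
        print(f"{name}: {conn_str}")
```

Generating the strings from one place makes it easier to run the same test against each option and change nothing but the driver.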
ODBC is designed for connecting to relational databases.
However, OLE DB can access relational databases as well as nonrelational databases.
There is data in your mail servers, directory services, spreadsheets, and text files. OLE DB allows SQL Server to link to these nonrelational database systems. For instance, if you want to query, through SQL Server, the Active Directory on the domain controller, you couldn't do this with ODBC, because it's not a relational database. However, you could use an OLE DB provider to accomplish that.
http://www.sqlservercentral.com/Forums/Topic537592-338-1.aspx
Does using the ADO.NET Provider that a DB vendor wrote eliminate the need to have any database drivers installed on the machine?
I'm a bit confused on how ADO.NET actually works.
An ADO.Net provider is a database driver.
However, ADO.Net providers are (hopefully) purely managed, so they don't need any installation.
It depends on how they wrote the provider. The provider can be written to include any driver, but it could also be written to expect to talk to a driver that is installed on the machine separately.
For example, Microsoft's own SQL Server provider still expects you to have the "native client" installed on each machine. But System.Data.SQLite includes all that as part of the provider for the SQLite database.
There are ADO.NET providers specific to a database, which are tailored versions of DB drivers.
E.g.: SqlClient, a tailored version for the SQL Server family
iAnywhere, a tailored version for Sybase DBs.
ADO.NET also includes an ODBC provider, which is not database-specific but rather a generic driver available out of the box.
We're comparing the JTDS and Microsoft SQL Server JDBC drivers for a Java EE application running on JBoss, and we're finding that JTDS is 30% to 50% faster when benchmarking the application in a high-concurrency scenario, keeping exactly the same HW/SW and changing only the driver in the datasource configuration.
While we've seen a lot of favorable opinions of JTDS, and so we're thinking of going with it, I'm still curious:
Why is the JTDS driver so much faster?
Why has Microsoft never updated its driver to be as fast as JTDS?
The comparison was made using the latest JDBC 3.0 version and the latest JTDS version, against SQL Server 2008 running on a 16-core installation with a dedicated SAN.
I've done similar performance comparisons, with similar results.
There are many potential reasons for performance differences. Some of them are visible in the T-SQL generated by the driver, which you can see with SQL Profiler. Other aspects are more subtle, such as connection management and how the underlying protocol (TDS) is implemented.
I can't say for sure why MS has never updated their driver, but I suspect that part of it is because Java is considered to be a competitive product/platform.
What is the difference between SQL Server Native Client connection and ODBC connection? What are the pros and cons of these two?
Huh? ODBC is officially dead? Someone might want to let Microsoft know that:
Microsoft is Aligning with ODBC for Native Relational Data Access
From the above link:
ODBC is the de-facto industry standard for native relational data access...
and
The commercial release of Microsoft SQL Server, codename 'Denali' will be the last release to support OLE DB.
and finally,
"We encourage you to adopt ODBC in the development of your new and future versions of your application. You don’t need to change your existing applications using OLE DB, as they will continue to be supported on Denali throughout its lifecycle. While this gives you a large window of opportunity for changing your applications before the deprecation goes into effect, you may want to consider migrating those applications to ODBC as a part of your future roadmap. Microsoft is fully committed to making this transition as smooth and easy as possible."
(emphasis added)
ODBC is useful when the underlying database might change but you don't want your code to (assuming the SQL stays the same across technologies). You could connect to an Oracle database one day and switch to a SQL Server database the next. The disadvantage is that you don't get the optimizations that having specific drivers affords you. The SQL Server Native Client driver has been proven to be much faster than just using a standard ODBC driver.
What is the difference between SQL Server Native Client connection and ODBC connection?
ODBC is a standardized API.
ODBC drivers are shared libraries that use native protocols (like SQL Server shared memory, or SQL Server TCP/IP) to implement the ODBC interface.
In other words, ODBC is an abstraction that enables code to work against multiple database technologies.
It's similar to Java's JDBC, or Python's DB-API, or Go's database/sql, except ODBC drivers use C functions. Also, they are more frequently installed at a system level.
ODBC has the usual pros and cons of any abstraction.
Pros: Makes code more flexible/portable.
Cons: Adds performance overhead and has fewer features.
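To make the portability point concrete, here is a minimal sketch using Python's DB-API, which is mentioned above as an analogue of ODBC. The function `count_rows` is a hypothetical helper that touches only calls defined by the DB-API specification, so in principle it should work with any compliant driver; the standard-library sqlite3 module is used here only so the example runs without a server.

```python
import sqlite3


def count_rows(conn, table):
    """Count rows using only cursor(), execute(), fetchone() and close()."""
    cur = conn.cursor()
    try:
        # The table name is interpolated directly, so it must be trusted input.
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        return cur.fetchone()[0]
    finally:
        cur.close()


# sqlite3 stands in for any DB-API driver; only this setup is driver-specific.
conn = sqlite3.connect(":memory:")
setup = conn.cursor()
setup.execute("CREATE TABLE t (x INTEGER)")
setup.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
conn.commit()

print(count_rows(conn, "t"))  # prints 3
```

The flexibility comes at the cost named in the list: `count_rows` cannot use anything a specific driver offers beyond the common interface.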
It sounds like you know that you will use SQL Server and will always use SQL Server.
In that case, I'd use a native client library if it's available.
SQL Server Native Client is a single dynamic-link library (DLL) containing both the SQL OLE DB provider and SQL ODBC driver for Windows.
SNAC 11 is a single dynamic-link library (DLL) containing both the SQL
OLE DB provider and SQL ODBC driver for Windows. It contains run-time
support for applications using native-code APIs (ODBC, OLE DB and ADO)
to connect to Microsoft SQL Server 2005, 2008, 2008 R2, and SQL Server
2012. A separate SQL ODBC-only driver is available for Linux.
https://blogs.msdn.microsoft.com/sqlreleaseservices/snac-lifecycle-explained/
I usually use Delphi-targeted databases for most of my work (NexusDB typically, lately), but still have bad memories of how painfully slow connecting (and posting) to MS Access was via ADO. I have a new project that may need to target MS SQL Server. For D2007 Pro, what is the best way to connect to MS SQL Server? (Third party components = fine, if that's the best route).
The TADOConnection really isn't that bad. Access was never intended to be a production RDBMS. ADO works much faster with SQL Server than with Access. See http://support.microsoft.com/kb/225048 for some of the reasons why.
AnyDAC offers a great feature set and performance, and simplifies the development of database applications. AnyDAC supports MS SQL Server, MS Access, and many more.
The UniDAC component from Devart / Corelab is your best option.
It offers fast performance and you can talk to a number of different databases.
I have always recommended Devart DB components for their performance and reliability.
You can choose between SDAC (for direct access to SQL Server) or UniDAC (direct access to SQL Server, Oracle, MySQL, PostgreSQL and InterBase/Firebird).
If you don't require the advanced components that access specific features of SQL Server, like TMSChangeNotification, TMSTransaction or TMSServiceBroker, then you can go with UniDAC so your application will be designed to work with multiple databases.
Devart offers components and dbExpress drivers for accessing SQL Server databases. They also have UniDAC, which supports other databases as well.
Da-soft AnyDac supports SQL Server and other databases.
Bob Swart has published Delphi for Win32 VCL Database Development on Lulu, if you need any help.
I have used ADO to connect to SQL Server since Delphi 7, and it has always worked great.