SQL Server CLR vs Python vs R - sql-server

Can somebody please explain the various scenarios that would make one choose SQLCLR vs Python vs R.
I understand that R is a language and a library specifically designed for statistical analysis and data mining so I understand leveraging that capability when appropriate, but can R (on SQL Server) do more and call additional external libraries like CLR assemblies can?
Is Python meant as an eventual replacement for C# SQLCLR? It seems to me, from what I've read, that Python can simply be embedded inside a stored procedure and then interpreted upon execution, as opposed to the compiled nature of CLR assemblies, but otherwise the capabilities are the same. Are they?

I'll try to answer your questions:
SQLCLR was introduced in SQL Server 2005 as a way to embed the CLR (.NET) in the SQL Server engine. That is, with SQLCLR your .NET code runs in the same memory and process space as SQL Server itself. The way it works (simplified) is that you create an assembly and register it with SQL Server (CREATE ASSEMBLY). You then create "wrapper" T-SQL stored procedures/functions/triggers etc. against your .NET methods, and it is these procs that you execute at runtime.
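As a rough illustration of the two registration steps described above, here is a minimal Python sketch that only assembles the T-SQL text a deployment script might send. The helper name, the assembly name MyClrLib, the DLL path, the class MyClrLib.Procedures and the method SayHello are all hypothetical; only the CREATE ASSEMBLY and CREATE PROCEDURE ... AS EXTERNAL NAME statement shapes are standard T-SQL.

```python
def sqlclr_registration(assembly, dll_path, clr_class, method, proc_name):
    """Build the two T-SQL statements that register a SQLCLR procedure.

    All argument values are illustrative; nothing here talks to a server.
    """
    # Step 1: load the compiled .NET assembly into the database.
    create_assembly = (
        f"CREATE ASSEMBLY [{assembly}] "
        f"FROM '{dll_path}' WITH PERMISSION_SET = SAFE;"
    )
    # Step 2: create the T-SQL "wrapper" that points at the .NET method.
    create_wrapper = (
        f"CREATE PROCEDURE {proc_name} "
        f"AS EXTERNAL NAME [{assembly}].[{clr_class}].[{method}];"
    )
    return [create_assembly, create_wrapper]


# Hypothetical names, for illustration only.
statements = sqlclr_registration(
    assembly="MyClrLib",
    dll_path=r"C:\clr\MyClrLib.dll",
    clr_class="MyClrLib.Procedures",
    method="SayHello",
    proc_name="dbo.SayHello",
)
for statement in statements:
    print(statement)
```

At runtime you would then call the wrapper (here dbo.SayHello) like any other stored procedure.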
R was introduced in SQL Server 2016, and Python in SQL Server 2017, in order to give SQL Server machine learning capabilities. As opposed to .NET, neither R nor Python runs embedded in SQL Server; when you call R/Python code inside SQL Server, calls are made out to the R/Python engine sitting outside SQL Server's memory/process space. This is an important distinction between SQLCLR and R/Python:
SQLCLR code executes in-memory/in-process with SQL Server
R/Python executes outside of SQL Server.
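To make the R/Python side concrete: external scripts are submitted through the sp_execute_external_script system procedure, with the script passed as a string. The sketch below only builds such a call as text, assuming the default InputDataSet/OutputDataSet variable names; the helper names and the sample query are hypothetical, and the instance would need external scripts enabled for the call to run.

```python
def tsql_literal(text):
    """Return text as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + text.replace("'", "''") + "'"


def build_external_script_call(script, input_query):
    """Assemble an EXEC sp_execute_external_script call for a Python script."""
    return (
        "EXEC sp_execute_external_script\n"
        "    @language = N'Python',\n"
        f"    @script = {tsql_literal(script)},\n"
        f"    @input_data_1 = {tsql_literal(input_query)};"
    )


# The script body is executed by the external Python runtime, not by the
# SQL Server process; here it simply echoes the input rows back.
PYTHON_BODY = "OutputDataSet = InputDataSet"

call = build_external_script_call(PYTHON_BODY, "SELECT 1 AS n")
print(call)
```

The script travels to the external runtime as plain text, which is why the capabilities match those of standalone Python.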
As a side note: I have a series of blog posts discussing the internals of SQL Server R Services (even though the posts talk about R, everything in them is applicable to Python as well).
As for capabilities: R/Python in SQL Server can do no more and no less than what "standalone" R/Python can do, because, as mentioned above, the actual execution of R/Python happens outside of SQL Server.
Personally I do not think Python is a replacement for .NET in SQL Server; I see it as an additional tool in your toolbox. Where I work we use both SQLCLR and R/Python (in SQL Server). We have hundreds of SQLCLR assemblies in our production databases, doing weird and wonderful things (sending messages to RabbitMQ etc.), and IMHO it'd be very hard to replace that with Python, especially seeing that you'd immediately get a perf degradation: compiled code (SQLCLR) vs. interpreted code (R/Python).
Hope this helps.
Niels

Related

Automate migration of stored procedures from SQL Server to Postgres

We have around 75 tables and about 100 stored procedures. We have created a custom NodeJS app with Sequelize to migrate the tables and their data, but we want to migrate the stored procedures too.
The only option we have found so far is to convert every stored procedure manually.
Manually converting each stored procedure is a tedious task. Is there any way other than manually converting the code? I hope someone can guide/help me with this.
FYI:
SQL Server version: 16+
Postgres version: 12+
There soon will be: Amazon is launching an open-source tool under the Apache license that acts as a translation layer between existing SQL Server applications and a Postgres database. This translation layer lets your code keep running with its current SQL setup while it is translated for the Postgres DB. It's called Babelfish for PostgreSQL. It's slated for 2021, but it is not currently available. https://babelfish-for-postgresql.github.io/babelfish-for-postgresql/
There is no way to automatically convert Transact-SQL procedures to PL/pgSQL functions, because PG lacks or differs in many features:
PG does not use the pessimistic locking that SQL Server uses by default.
PG does not support the nested transactions that SQL Server supports, so the behaviour will be different and the results will not be the same.
String data uses a CI/AS collation by default in SQL Server, which PG does not support completely (for example, ICU collations are not supported for LIKE and raise an error).
PG does not conform to the SQL standard regarding string datatypes. PG uses only CHAR/VARCHAR; there is no NCHAR/NVARCHAR, but strings in PG behave like NCHAR/NVARCHAR.
PG supports function overloading, which SQL Server does not. Functions that use generic code with the sql_variant datatype must be translated into overloaded functions.
PG does not distinguish between functions and procedures (which is a security weakness). SQL Server does.
Many other features are completely different, and I am writing a series of papers about the differences between PG and SQL Server. The first is about the performance of DBA queries, the second about COUNT performance, and the third is a complete panorama of the functional differences...

Install SQL Server (Express, compact, other?) in standalone pc

Help needed here. I'm a bit lost checking all possible editions and configurations of SQL Server.
What I'd like seems straightforward: a version of SQL Server (ideally 2008 or higher) on a single PC (client + server), with a small footprint. I just want to train myself in basic database administration (user creation, schemas, scripts, copying databases, stored procedures).
These databases won't be used with websites, other users, etc. Just myself, at most with an Access front end linked to the SQL Server DB.
My questions are:
Which is better: SQL Server Express 2008, Compact Edition (CE), SQLite, or something else?
I would prefer SQLite (it seems the simplest), but my concern is how similar SQLite is to a full SQL Server for things like schemas, permissions, script management and file names (multiple servers are not a concern).
I'd just like to get familiar with the basics on my PC so that when confronted with a real SSIS I can learn it quickly.
Thanks in advance, p.
I'd go with SQL Express if you're planning to learn SQL Server. Although SQLite has a small footprint it is completely different from SQL Server. Queries to get and manipulate data are similar (but not identical in every manner), but everything related to metadata (schemas etc.) is completely different.
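A small sketch of the metadata difference, using Python's built-in sqlite3 module with an in-memory database (the table name customers and the schema name sales are made up for the example). Two statements that are routine in SQL Server, CREATE SCHEMA and a query against INFORMATION_SCHEMA.TABLES, are both rejected by SQLite, which keeps its catalog in the sqlite_master table instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

# SQLite exposes its catalog through a single table, sqlite_master.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]


def fails_in_sqlite(statement):
    """Return True if SQLite rejects the statement with an OperationalError."""
    try:
        conn.execute(statement)
    except sqlite3.OperationalError:
        return True
    return False


# Both statements are valid in SQL Server but not in SQLite.
schema_rejected = fails_in_sqlite("CREATE SCHEMA sales")
catalog_view_missing = fails_in_sqlite("SELECT * FROM INFORMATION_SCHEMA.TABLES")

print(tables, schema_rejected, catalog_view_missing)
```

So anything you learn about schemas, permissions or catalog views in SQLite will not carry over, which is the practical reason to practise on SQL Server Express.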

What is an SSMS analogue for Oracle?

What is (if there is) the analogue of SQL Server Management Studio for Oracle databases?
Question seems easy but Google didn't know a thing.
I've never worked with an Oracle database, but I will have my first encounter soon, so I'd like to be a little prepared.
It depends. There are a number of applications that could be used depending on what specifically you are trying to do.
If you are trying to administer the database, you would probably want to use Enterprise Manager. This is a web-based application that lets you monitor and administer the database. It can either be configured to run just on a particular database server (Enterprise Manager Database Control) or it can be configured to allow you to access all the databases and a variety of non-database products running in the organization (Enterprise Manager Grid Control). Grid Control is, obviously, a much more involved install. The Database Control should be installed when you install the database (though some DBAs will turn it off because they don't want to run a HTTP server on their database server).
If you are trying to write and debug PL/SQL code (packages, procedures, functions, triggers, etc.) or to just run some ad-hoc SQL statements, Oracle provides a free tool, SQL Developer, that can be used. There are a variety of other PL/SQL IDEs out there as well; Toad from Quest and PL/SQL Developer from Allround Automations are two of the more common ones.
Oracle also has a basic command-line SQL tool SQL*Plus that will exist pretty much wherever you are (in much the same way that vi will be available on just about any Unix machine you log in to). There are lots of DBAs (and a decent number of developers) that prefer to use SQL*Plus rather than using the various GUIs. At a minimum, in most large organizations, DBAs will use SQL*Plus to execute scripts built by the developers as part of the code promotion process so you'll want to have a basic familiarity with SQL*Plus.
You can use Oracle SQL Developer, which is free from Oracle.
Toad is also available.

Alternatives to SMO, i.e. DML APIs for SQL Server?

I just learned I can't use SMO in ASP.NET without either SQL or a full SMO install on the application server. I'm not allowed to do that, so I can't use SMO.
What other libraries do the same thing as SMO but don't require an MSI installer or COM registrations?
I know I could send DDL to the server inside ADO.NET commands, but that is exactly what I was trying to avoid by using SMO.
What was nice about SMO:
Object oriented API for querying meta-data (columns, data types) that didn't rely on inconsistent COBOL-like DDL.
Didn't require querying undocumented stored procedures, system stored procedures or tables which break every few versions.
Off the top of my head I can think of ADOX and DMO, but both were COM based APIs.
SMO is running T-SQL under the covers. You could prototype in SMO and then watch in profiler to get the T-SQL.
It is probably a EULA violation, but you could redistribute the SMO assemblies side by side with your app; there is nothing to install in that case. I don't think their installer touches the registry. It's pretty easy to bust open SQLServerManagementObjects.msi and find out.

Sql Server x64 and x86 Linked Server

I have a Visual FoxPro table that I need to access from Sql Server. In Sql Server x86, I would just create a linked server. Unfortunately, there is no x64 driver for VFP - so Sql Server x64 can't create a linked server to it.
So far, I've come up with following options - none of which I'm particularly fond of:
Set up an x86 Sql Server to be used as a relay, so that queries go from x64 -> x86 -> VFP.
I don't really care for this, as in addition to being dev, I'm also sysadmin. So, this means I need to patch, maintain, and monitor yet another Sql Server - and possibly yet another server (assuming I don't just use a separate instance).
Also, since the VFP provider doesn't work with 4 part syntax, I have to use OPENQUERY. Thinking of all the single quote escaping that'd need to happen to have an OPENQUERY statement embedded into another OPENQUERY statement makes my head spin....
Create a CLR Table Valued Function, though the assembly would (presumably?) also be x64 - so I'd have to go out of proc (IPC? Webservice?) to actually run queries
Turns out that TVFs require a schema, so this option isn't as clean as I initially thought. I did a spike to get a WCF client into MSSQL, which returns a single column of XML that can then be parsed with the Sql XML datatype functions. It works, and is actually a little bit nicer to query than OPENQUERY since it actually takes variables as parameters. That saves me most of the single quote and EXEC dance.
Of course, WCF inside Sql is wholly unsupported, and smells like a pretty big hack. I have pretty serious reservations on performance and reliability.
Stop making queries from Sql Server to VFP, and rewrite a good bit of client code
Obviously, this is the "right" answer. But, there is a good deal of client code that relies on joins between Sql Server tables and VFP tables. Rewriting this stuff to populate a temp table or do client side joins seems like a rather large burden.
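To show why the quote escaping in the relay option gets unwieldy, here is a small Python sketch that nests one OPENQUERY inside another. The linked server names VFP and RELAY_X86 and the sample query are hypothetical; the only rule applied is the standard T-SQL one that a single quote inside a string literal is written as two.

```python
def openquery(linked_server, inner_sql):
    """Wrap inner_sql in OPENQUERY, doubling every single quote it contains."""
    escaped = inner_sql.replace("'", "''")
    return f"SELECT * FROM OPENQUERY({linked_server}, '{escaped}')"


# Query that should finally run against the VFP data.
vfp_query = "SELECT name FROM customers WHERE state = 'CA'"

# One hop: x86 SQL Server -> VFP.
one_hop = openquery("VFP", vfp_query)

# Two hops: x64 SQL Server -> x86 relay -> VFP. Quotes double again.
two_hops = openquery("RELAY_X86", one_hop)

print(one_hop)
print(two_hops)
```

Each extra hop doubles every quote already present, so the literal 'CA' ends up wrapped in four quotes on its opening side after two hops.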
Here's hoping someone can suggest a better alternative, or some similar experiences.
It's a nasty problem, I agree.
Another option, if you can stand the delay, is to run SSIS in 32-bit mode to import the data into a native SQL Server table on a regular basis (or perhaps on demand, in a job triggered by the same SP). Whether that works depends on how often the data changes and on how much slightly out-of-date data would hurt.
I think I found an alternative. Microsoft has released an updated driver for Access, which comes in both 32-bit and 64-bit flavors. Like the original Jet OLE DB driver, this will allow you to access dBase file formats from SQL Server x64.
The only restriction is that the DBF must be in one of the dBASE formats supported by ISAM. I have done a few tests using a dBASE IV format and it seems to work, using the following connection string.
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\folder;Extended Properties=dBASE IV;User ID=Admin;Password=;
