I will have a Postgres database in production but want to use MS SQL (whatever edition) for reporting. So, I would like to have replication set up where MS SQL subscribes from postgres. Is this possible?
All heterogeneous replication scenarios are deprecated by Microsoft, and they now recommend building solutions using SSIS and CDC instead.
We load data from PostgreSQL into our SQL Server reporting database using SSIS and it works well, although we had to use a commercial OLE DB provider because of limitations (at that time) in the open-source one.
Actually copying the data is usually the easy part; most of the work comes in gathering requirements, understanding the data, transforming it, implementing logging and error handling etc. SSIS can do some things for you right away (e.g. logging) but my general advice would be to use it primarily as a workflow tool and for simple data copying with minimal transformation logic (e.g. data type conversion). If something seems seems too difficult or clumsy in SSIS then you can put it into a stored procedure or script and call that from SSIS instead.
I've been using and following PostgreSQL for several years and not aware of such a solution. If one exists, I'm concerned that might be complex or fragile. I would recommend regular export/imports via cron. In the between the export and import, you would need to take care of the translation step of the formats.
If you reporting actually happens in MS Excel or MS Access, I recommend looking into connecting them directly to PostgreSQL via ODBC.
Related
I have a scenario where I get queries on a webservice that need to be executed on a database.
The source for these queries is from a physical device so I cant really change the input to my queries.
I get the queries from the device in MSSQL. Earlier the backend was in SQL Server, so things were pretty straight forward. Queries would come in and get executed as is on the DB.
Now we have migrated to Postgres and we don't have to the option to modify the input data (SQL queries).
What I want to know is. Is there any library that will do this SQL Server/T-SQL translation for me so I can run the SQL Server queries through this and execute the resulting Postgres query on the database. I searched a lot but couldn't find much that would do this. (There are libraries that convert schema from one to another but what I need is to be able to translate SQL Server queries to Postgres on the fly)
I understand there are quite a bit of nuances that will be different between SQL and postgres so a translator will be needed in between. I am open to libraries in any language(that preferably runs on linux : ) ) or if you have any other suggestions on how to go about this would also be welcome.
Thanks!
If I were in your position I would have a look on upgrading your SQL Sever to 2019 ASAP (as of today, you can find on Twitter that the officially supported production ready version is available on request). Then have a look on the Polybase feature they (re)introduced in this version. In short words it allows you to connect your MSSQL instance to other data source (like Postgres) and query the data in as they would be "normal" SQL Server DB (via T-SQL) then in the background your queries will be transformed into the native pgsql and consumed from your real source.
There is not much resources on this product (as 2019 version) yet, but it seems to be one of the most powerful features coming with this release.
This is what BOL is saying about it (unfortunately, it mostly covers the old 2016 version).
There is an excellent, yet very short presentation by Bob Ward (
Principal Architect # Microsoft) he did during SQL Bits 2019 on this topic.
The only thing I can think of that might be worth trying is SQL::Translator. It's a set of Perl modules that have been around for ages but seem to be still maintained. Whether it does what you want will depend on how detailed those queries are.
The no-brainer solution is to keep a SQL Server Express in place and introduce Triggers that call out to the Postgres database.
If this is too heavy, you can look at creating a Tabular Data Stream (TDS is SQL Server network transport) gateway with limited functionality and map each possible incoming query with any parameters to a static Postgres query. This limits any testing to a finite, small, number of cases.
This way, there is no SQL Server, and you have more control than with the trigger option.
If your terminals have a limited dialect demand then this may be practical. Attempting a general translation is very likely to be worth more than the devices cost to replace (unless you have zillions already deployed).
There is an open implementation FreeTDS that you could use if you are happy with C or Java.
Is there seriously no way of using a shared access non-server driven database file format without having to use an SQL Server? The Entity Framework is great, and it's not until I've completely finished designing my database model, getting SQL Server Compact Edition 4.0 to work with Visual Studio that I find out that it basically cannot be run off a network drive and be used by multiple users. I appreciate I should have done some research!
The only other way as far as I can tell is to have to set up an SQL server, something which I doubt I would be able to do. I'm searching for possible ways to use it with Access databases (which can be shared on a network drive) but this seems either difficult or impossible.
Would I have to go back to typed DataSets or even manually coding the SQL code?
Another alternative is to try using SQL
Install SQL Server express. Access is not supported by EF at all and my experience with file based databases (Access, SQL Server CE) is mostly:
If you need some very small mostly readonly data to persist in database you can use them (good for code tables but in the same time such data can be simply stored in XML).
If you expect some concurrent traffic and often writing into DB + larger data sets their performance and usability drops quickly. They are mostly useful for local storage for single user.
I'm not sure how this relates for example to SQLite. To generate database from model for SQLite you need special T4 template (using correct SQL syntax).
Have you tried SQLite? It has a SQL provider, and as far as I know EF supports any provider. Since it's file-based, that might be a plausible solution. It's also free.
I don't know between ADO, DAO and DLookUps and such. Does anyone know?
I find that the application bottleneck is usually the actual SELECT/INSERT/UPDATE on the DB and not how the application calls it. if you are worrying about speed make sure you have well designed tables and indexes.
Regarding DAO vs ADO, I'm not sure about a difference in performance but there is a difference in available functionality.
There is a microsoft article showing the differences in a nice table. Choosing ADO or DAO.
It also states:
"In particular, ADO is a good choice if you are developing an Access database solution that will later be upgraded to SQL Server — you can write your ADO code to minimize the number of changes that will be required to work against a SQL Server database. In addition, ADO is a good choice for developing new data access components that work with SQL Server, multidimensional data, and Web applications."
Seems like ADO might be the way to go.
I don't know anything about DLookUps though.
I am developing a data driven website and quite a lot of programming logic resides in database stored procedures and database functions. I found myself changing the stored proc/functions quite a lot in order to fix bugs or add new functionality. The data (tables) have remained mostly untouched.
The issue I am having is keeping track of versions of stored proc/functions. Currently I am incrementing version of whole database when I do a set of changes. As data is huge (10 Gb) I get issues having to run development version and release versions of databases in parallel.
I wish to put all the stored procs and functions in one database and keep data in one database, so that I can better manage the changes.
I am sure others would have encountered similar suggest and request suggestions on how to best handle this situation.
I would also recommend using source control keyword expansion in your stored procedures ($Version:$)
That way you can eyeball, grep, search syscomments, etc to see what version you have on your deployed database.
You can version just the schema dumps. In combination with source control keword expansion (as suggested by Rawheiser), you just take a look at what version you have in the database, generate a diff and apply it.
Also, there are several excellent tools to compare databases and their schemas, generate DDL scripts etc.: SQL Workbench, Power Architect, DDLUtils and Redgate SQL Compare, to name a few. SQL Compare is likely to work best with SQL Server, although all the others are FOSS and provide a higher ROI (in terms of time spent learning and what you can do with them) as they are platoform and RDBMS independent.
Finally, I have to say...I understand that the immediate results you get with logic in the DB are tempting, but if you've gone beyond more than a couple of procedures in the database, you're setting your self up for quite a lot of pain, sifting through what easily turns into spaghetti code and locking your application to a single database vendor. You might have your reasons, but I've been there and didn't like it very much. Logic can live very nicely in a different layer.
For source control you have several options:
Use a Visual Studio Database project.
Use SQL Server 2005's built-in support for source control
Use a third part tool such as SQL Compare
IMO Option 1. is preferable.
I've looked around and can't seem to find anything that answers this specific question.
What is the simplest way to move data from an MS SQL Server 2005 DB to a Postgres install (8.x)?
I've looked into several utilities like "Full Convert Enterprise", etc, and they all fail for one reason or another, ranging from strange errors that make it blow up to inserting nulls rather than actual data (wth?).
I'm looking at a DB with all table except for a single view, no stored procs, functions, etc.
At this point I'm about to write a small utility to do it for me, I just can't believe that's necessary. Surely there's something somewhere that can do this? I'm not even too worried about cost, although free is preferable :)
I don't know why nobody has mentioned the simplest and easiest way using robust MS SQL Server Management Studio.
Simply you just need to use the built-in SSIS Import/export feature. You can follow these steps:
Firstly, you need to install the PostgreSQL ODBC Driver for Windows. It's very important to install the correct version in terms of CPU arch (x86/x64).
Inside Management Studio, Right click on your database: Tasks -> Export Data
Choose SQL Server Native Client as the data source.
Choose .Net Framework Data Provider for ODBC as the destination driver.
Set the Connection String to your database in the following form:
Driver={PostgreSQL ODBC Driver(UNICODE)};Server=;Port=;Database=;UID=;PWD=
In the next page, you just need to select which tables you want to export. SQL Server will generate a default mapping and you are free to edit it. Probably you`ll encounter some Type Mismatch problems which take some time to solve. For example, if you have a boolean column in SQL Server you should export it as int4.
Microsoft Docs hosts a detailed description of connecting to PostgreSQL through ODBC.
PS: if you want to see your installed ODBC Driver, you need to check it via ODBC Data Source Administrator.
Take a look at the Software Catalogue. Under Administration/development tools I see DBConvert for MS SQL & PostgreSQL. Probably there are other similar tools listed.
You can use the MS DTS functionality (renamed to SSIS in the latest version I think). One issue with the DTS is that I've been unable to make it do a commit after each row when loading the data into pg. Which is fine if you only have a couple of 100k rows or so, but it's really very slow.
I usually end up writing a small script that dumps the data out of SQLServer in CSV format, and then use COPY WITH CSV on the PostgreSQL side.
Both those only take care of the data though. Taking care of the schema is a bit harder, since datatypes don't necessarily map straight over. But it can easily be scripted together with a static load of the schema. If the schema is simple (just varchar/int datatypes for example), that part can also easily be scripted off the data in INFORMATION_SCHEMA.
Well there are .NET bindings for MS SQL Server 2005 (obviously) and also for PostgreSQL. So it would only take a few lines of code to code up a program that could transfer data safely from one to the other. The view would probably have to be done manually as Postgres doesn't use the same language for views as SQL Server.
This answer is to help summarize current connection string because someone may overlooked the comment.
Current version of ODBC connection string is:
For 32-bit system
Driver={PostgreSQL UNICODE};Server=192.168.1.xxx;Port=5432;Database=yourDBname;Uid=postgres;Pwd=admin;
For 64-bit system
Driver={PostgreSQL UNICODE(x64)};Server=192.168.1.xxx;Port=5432;Database=yourDBname;Uid=postgres;Pwd=admin;
You can check the driver name by typing ODBC in windows search.
And open ODBC Data Source Administrator