I am using SQlite in my windows application to handle a custom kind of data set.
The structure of data is obtained from another vendor and cannot be changed.
In some cases the Data needs to have a huge number of columns in the table.
as the SQlite is having a column limit of 2000, tried of making a custom build from SQlite source code (as mentioned here.)
For confirming the custom build SQLite code, we have just build the code (without any change) downloaded from here
Build the project SQLite.Interop.2015 and used the DLL instead of the one we were using in our projects.
But we could find that the DB operations became exponentially slow.
Is there anything to be taken care, while building the source code of our own?
Related
I'm working on a legacy SqlServer database with no version control. I've tried importing it into a VS 2017 database project, but it takes more than an hour to load ("your project will be ready after 1200000 operations are completed"), and usually crashes out in less time than it took to load.
Does anyone have any suggestions for a version control system I can try that will cope with real-life databases?
Baseline your Database and call this Version 0.1.0
As you need to make changes to it, like add columns, data etc. Script this and add this to the source control of your choice. Call this file something like:
Version-0.1.1.sql
As you make more and more changes the amounts of files will be added to.
Version-0.1.2.sql
Version-0.1.3.sql
Version-0.1.4.sql
Of course you will test these before you deploy to live. As you are working with what is a legacy system I would probably shy away from investment in expensive tools for what is a legacy system in the first place.
To bring a database to a particular version you would run the scripts in the order. Obviously in each script you have failover etc that handles anything that may go wrong within the scripts.
It is a manual process but the best points are it's cheap, easily to understand, does not require much expense and it's a methodical system to manage change.
Note: Obviously deploy scripts to a UAT version before directly on Live.
I have had a lot of success using flyway with both sql server and postgres. It allows you to create numbered versions as betelgeuce described in his answer, but also offers additional protection of ensuring your earlier versions haven't been changed before deploying any new changes
Twist to the standard “SQL database change workflow best practices”
Background
ASP.NET/C# Web App
MS SQL
Environments
Production
UAT
Test
Dev
We create patch scripts (XML and sql) that are source controlled in Mercurial. We have cmd line utility that installs patches to DB (utitlity.exe install –patch) from a Release folder the build packages. Patches have meta data that helps with when patch should run and we log patches installed in a table in the target DB. All these were covered in the 3 year old question:
SQL Server database change workflow best practices
Our Problem/Twist
I think this works well for tables, views, functions and stored procedures. We struggle with application configuration data. Here are some touch points on application configurations.
New client. BA performs system study and fit analysis. Out of this comes a configuration word document of what application configurations need to be setup. Note some of these may also come in phases over time. We need to get these new configurations into the system for the developer and client UAT.
Developer works on feature request or bug fix. A new configuration change comes out of that change. The configuration needs to make it into the system for testing and promotion to UAT and up.
QA finds that the developer missed an associated configuration change. That configuration needs to make it into the system for promotion to UAT and up.
Build goes to UAT. Client performs acceptance testing but find they really want to change another unassociated configuration and have it promoted with the changes. In other words they found they want to change a business process by a configuration. The configuration needs to make it into the system for promotion to PRD.
As the client operates in PRD they may tweak application settings. These configurations need to make it into the system for future development and testing.
The general issue is making sure we are accounting for all the configurations and accidently not miss any during promotions which causes grief.
Our Attempts At A Process
a. We have had member of the QA team to write patches (xml and sql) and check those in. This requires a build to make sure those get into the package. With this approach it really just took care of item 1 above and we fell apart on the other items. The nice thing is for the items that made it into the patches it was just an install with the utility.
b. A developer threw together a Config page on the application. All the configurations could be uploaded and downloaded via XML document but it requires the app to be running. For item 1, member of QA team would manually setup configurations in the application and then would download the Config.xml file. This XML file would be used to upload configurations in other environments. We would use text diff tool to look at differences between config.xml files from different environments. This addressed item 1 and the others items but had problems. Problems were not all configurations made it into the XML document (just needs to be fixed by developer), some of the configurations didn’t have a UI in the application so you still had to manually go to the database on some, comparing the XML document with text diff was difficult at time (looked mostly due to sorting but I’m sure there are other issues), XML was not very human readable and finally the XML document did not allow for deleting existing incorrect or outdated configs.
c. Recently we went with option B, but over time for a new client we just started manually tracking configs and promoting them manually by hand (UI and DB) through the promotions. Needless to say lots of human errors.
So we have been looking at solutions. Eventually it would be great to get as much automation in as possible. I’m looking at going with the scripting approach and just focusing on process, documentation and looking at using Redgate data compare in addition to what we had been doing with compare on config.xml. With Redgate we have to create views though and there is no way to create update scripts from that approach except to manually update the scripts. It does at least allow a comparison without the app running. I’m also looking at pulling out the configs from our normal patches and making it a system independent of the build (utility.exe –patch –config). When I say focus on process it will be things like if we compare and find a config change either reported by client or not, we still script it, just means we have to have a process in place to quickly revalidate config install before promoting to the next level. As for documentation looking at making the original QA document a living document instead of just an upfront document. The goal is to try and enhance clarity and reduce missing configurations during promotion. Unfortunately it doesn’t improve speed of delivery.
Does anyone have any recommendations or best practices to pass along. Thanks.
Can I ask exactly what you mean by application configuration. I'm interpreting that as both:
Config files in the web application
Static reference data inside the database
Full disclosure I work for Red Gate. You might be interested in taking a look at Deployment Manager, it's a deployment tool that deploys applications, databases and configuration. It's free for up to 5 projects and target servers.
The approach it uses is to package application code and the database state into packages. These packages can be deployed into dev, test, staging and production environments. The same package is deployed to each environment.
Any application configuration that needs to change between environments is handled in one of the ways below:
Variable substitution in web.config. The tool allows you to specify override values for variables in these files, and set these per environment/server
Substituting the web.config file per environment.
Custom powershell scripts that are run pre/post deploy. You could use these to execute custom SQL based on the environment or server.
Static data within the database, using SQL Source Control's static
data feature. I've written a blog post about how to supply
different sets of static data to different environments/customers.
This allows you to source control the application configurations and deploy them to different environments.
I'm trying to implement automated integration tests for my application. It's a very complex monster. You could say that its database and part of the filesystem are part of its state, because it saves image files in the hard drive, and references to those in the DB. The software needs all those, in a coherent state, to work properly.
Back to writing tests: To run any relevant test, I need some image files in the filesystem, and certain records filled in the database. I thought of putting all of these in a separate folder called TestEnvironmentData in the repository, and retrieving them from the Continuous Integration Server (Team City), but a colleague said the repo is quite full as it is, and that I should set up a special directory, and databases, only in the Continuous Integration server. I don't like that because the tests success depend on me manually mantaining stuff in the server, and restoring initial state before every test becomes cumbersome.
What do you guys do when you need to write integration tests for an app like this? The main goal is having an automated test harness to approach a large scale refactoring. There's lots of spaghetti code and the app's current architecture is hardly unit testable, that's why I decided on integration tests first.
Any alternative approach is welcome.
Developer Repeatability is key when setting up a Continous Integrations Server. I have set one up for my last three employers and I have found the key to success is the developers being able to run the same tests from their dev system in order to get the same results as the CI Server.
The easiest way to do this would be to check in the test artifacts into source control but you could also use dropbox or a Network Share that you copy them from in one of the build steps.
For a .Net solution I have always used MsBuild as you can most easily replicate the build process of Visual Studio and get the same binaries/deployables. As for keeping your database in sync so that tests can be repeatable in the past I used the MbUnit test framework and the [Rollback] attribute as it would roll back any changes to Sql Server that happened in the test. I believe that Nunit now has this attribute as well.
The CI server is great for finding code that breaks existing functionality but unless developers can reproduce the error on their machine they won't trust the CI server for some time.
First of all, we use Maven to build our code. It's like ant, but it relies on convention instead of configuration for many things, like Ruby On Rails does. One of those conventions is a standardized directory structure:
(project)----src----main----(language)
| | \--resources
| \--test----(language)
| \--resources
\--target---...
Using a directory structure like this makes it easy to keep your application resources and testing resources near each other, yet still be able to build for test or build for production, or just build both but just package up the application parts after running the tests.
As far as resetting the database between tests, how you do that is greatly dependent on the DBMS you're using. For instance, if you're using MySQL it's very easy to get the test data the way you want and do a mysqldump to a file you then load before the test. With other DBMSs you may have to drop and recreate the tables and reload the data, or make separate tables for the starting point and use a CREATE/SELECT sql statement to duplicate it each time.
There really is no reliable way around the "reset the database between tests" step.
I'm using the GDR release of VSTS Database edition source control the DB and generate deployment scripts. It works pretty well but the problem is that it only seems to handle scripting and deploying the schema. It stops short of handling scripting and deployment of the actual data itself (i.e. the lookup and standing data which also deployed with the DB).
I know it's easy enough to write the deployment scripts by hand, but is this what every one does? Is there a recommended way of deploying data with the VSTS deployment engine? Is there some tooling that help with this - I don't mean a full product like SQLCompare, just something that fills the gap with VSTS DB.
Thanks in advance.
Kaneda
The VSTS: DB best practices blog advocates using post-deployment scripts to insert reference data into temporary tables, then update the target tables based on the delta (ie update x inner join temp where x.something <> temp.something)
There's some suggestions floating around that this might make a powertool, and at least one MVP has written a tool to generate those scripts.
(NB: I haven't tried this - I only just found out about it myself)
Personally I would still stick with RedGate if I had any choice in the matter.
GDR comes with a data comparison engine, but as far as I've been able to tell so far a data comparison can't even be stored in a project (let alone be properly supported by it) - so it's pretty ad-hoc. Unlike a Schema Compare, there is no File \ Save As.
The comparison engine can be automated via DDE but that's automation within the Visual Studio IDE, and not really suitable for some kind of scripted installation process. As much as anything there's no way I could see to specify which tables to include in the comparison (since all you get to do via DDE is open the wizard for the user to select)
Alternatively all the functionality appears to reside in Microsoft.VisualStudio.TeamSystem.DataPackage.dll , but since the API documentation hasn't been written yet (the help doco that comes with GDR is full of errors as it is) it's going to be a bit of a hit-and-miss adventure to work out where to start.
As someone who's used RedGate's SqlCompare, SqlDataCompare and their respective APIs to do this before, much of the GDR functionality seems a bit half-baked to me.
What I will probably do this time round is sync the data with a SSIS package (export to CSV at build time / import from CSV at install time), but I'd far rather be using the SqlDataCompare API (or SqlPackager) right now.
Let's say we have a continuous integration server. When I check in, the post-hook pulls the latest code, runs the tests, packages everything. What is the best way to also automate the database changes?
Ideally, I'd build an installer that could either build a database from scratch or update an existing one using some automated syncing method.
I've recently bumped into an article, that might be of use.
The author explained some of the best continuous integration practices including testing, processing and automation.
Here are some of the key takeaways:
In many shops code is unit tested at the point of commit. For databases, it is preferred running all unit tests at once and in sequence against a QA database, vs development, as a part of the Test step
The test step is a critical part of any CI/CD process. Test scripts, including unit tests themselves, should also be versioned in source control, extracted at the point of the Build step and executed
Pulling data from production is appealing as a quick expedient, but is never a good idea
The best approach is using a tool or script to quickly, repeatedly and reliably create synthetic test data for your transactional tables
Running unit tests to produce manual summary results for human consumption defeats the purpose of automation. We need machine readable results, that can allow an automated process to abort, branch and/or continue.
Running a CI process, which requires 100% of all tests to pass, is akin to not having CI at all, if the workflow pipeline is set up atomically to stop on failure, which it should. To thread the needle, tests should have built in thresholds, that will raise an error based on either the % of tests failing or in some cases, if certain high priority tests fail.
All processes should ultimately produce a Boolean result of pass or fail, but some non-automated processes can easily find their way into your CI workflow pipeline (e.g. unit testing). Software should be plug-n-play into any workflow pipeline, taking known inputs and producing expected outputs – like pass, fail.
CI/CD process should be aborted on failure and a notification email should be immediately sent vs continuing to cycle the pipeline.
The CI process should not cycle again until any errors in the last build are fixed. On failure, the entire team should get the failure notification, including as many details as to what failed as possible.
If a pipeline takes 1 hour, from start to finish, to complete, including all the testing, then all the build intervals should be set to no less than one hour and all new commits should be queued, and applied to the next build.
No plain text passwords should exist in automation scripts
If you have the opportunity to define and control the whole database management and db creation process, have a serious look at DB Ghost - it's more than just a tool - it's a process.
If you like it and can implement it, you'll get great returns on it - but it's a bit of a "all-or-nothing" kind of approach. Recommended.
I would caution against using a db backup as a development artifact, most CI best practices suggest that you manage the schema, procedures, triggers, and views as first class development artifacts. The side effects is that you can take this one step further and use them to build a new database whenever you want, ideally you also have some data that can be pushed into the database.
Here is a cliff notes version to get your feet wet, but there is lots out there in this space:
http://www.infoq.com/news/2008/02/versioning_databases_series
I like some of the ideas that Scott Ambler has here as well, the site is good but the book is surprisingly deep for such a difficult set of problems.
http://www.agiledata.org/
http://www.amazon.com/exec/obidos/ASIN/0321293533/ambysoftinc
Red Gate is a quite robust solution and it works out of the box.
But the best thing is that you can integrate it with your continuous integration process. I use it with Msbuild and Hudson.
quickly explaining how it works:
http://blog.vincentbrouillet.com/post/2011/02/10/Database-schema-synchronisation-with-RedGate
if you need to know more about this, feel free to ask
The Red Gate approach using SQL Source Control and the SQL Compare Pro command line is detailed with code samples here:
http://downloads.red-gate.com/HelpPDF/ContinuousIntegrationForDatabasesUsingRedGateSQLTools.pdf
Troy Hunt wrote an article on Simple Talk entitled "Continuous Integration for SQL Server Databases":
http://www.simple-talk.com/content/article.aspx?article=1247
Have you looked at FluentMigrator? The default download includes Nant scripts that would be easy to add in to a CI. Free, open source and easy to use. Works for a wide variety of databases.
The latest version (5.0) of DB Ghost doesn't suffer from the "non ASCII character" problem (it just means that the file is UTF8 encoded) and it should be able to do exactly what you need.
Also, the tools can actually be used standalone to perform the various functions (scripting, building, comparing, upgrading and packaging) if you want, it's just that using them all together provides a full end-to-end process thus making the overall value greater than the sum of it's parts.
In essence, to make changes to the schema you update individual object creation scripts and per-table insert scripts (for reference data) that are held under source control just like you were developing a “day one” greenfield database. The DB Ghost tools are used to enable the whole thing by building these scripts into a brand new database (using continuous integration if required) and then comparing and upgrading a target database, which can be a copy of the production database. This process produces a delta script which can be used on the real production database during go-live.
You can even produce a Visual Studio database project and add it into any solutions you currently have.
Malc
I know this post is old, but we have a new solution that takes the following approach:
Developers script individual SQL changes and commit them to source
control.
Our program (OneScript) pulls the change script files from
source control, filters and sorts them, and generates a single
release script file.
That release script file is then applied to a
database to do a release.
Our home page here explains this process in more detail and has a link to an example that does these steps automatically from a Subversion hook. So soon after a commit, the developer receives an email saying if the release was successful or had errors. The PowerScript code is included.
Disclaimer -I'm working at the company that makes OneScript.