SSIS: Finding Table Used in Other Package(s)/ Integration - sql-server

I did see some other posts on this, but they were rather old and there does not appear to be any solutions at this point.
I'm trying to determine where a particular table(s) that SSIS is loading during a monthly job is being used in other packages. The package that loads these tables have in the past several months been taking much longer than before, and I'm trying to see if I can eliminate this load all together.
I just happened to check the Allocation packages in our database to see how the tables were being used, and discovered that I can't find anywhere when/where those tables are being used. Is there a function or query I can run in SSMS or elsewhere to determine how to find this information?
Thx in advance - please let me know if I need to clarify something.

The packages are just XML files. If you have the packages somewhere on your file system you can use any program that searches through text files.
I'm not sure about older SSIS projects but with an SSIS project in Data Tools for SQL Server 2012 you can just use the build in search function to search through your entire solution. It will also search in the XML of all the packages.

If you don't have this particular information saved anywhere already in your documentation then I think you are going to have some difficulty in finding an accurate way to retrieve this information. However, there are a few automated data collection options that might help you get most of the way there.
The first option is that because all SSIS Packages are essentially glorified XML that is being fed into an engine you can perform a patterned search on the packages like GREP to look for that particular table name. Any packages that dynamically retrieve and build the table name though would not be found through this method.
Another option would be to run a server side SQL trace with a pattern match based on the table name(s) and limited to the host or application name of SSIS. Run over the course of a month or so would make for a fairly accurate list.

I haven't used it myself, but the DOC xPress tool from PragmaticWorks might be what you're looking for.

Related

Is there a way to export a Informatica maplet 'graphical' data to a simple csv/Excel file?

The firm I work in has a lot of data sources entering the firm database using the Informatica ETL tool, stored in maplets and other data models (sorry If I'm not using the exact terminology).
The problem is that all the business logic is stored in the 'graphical interface' and nowhere else - Every time I want to see what field goes into the target field I have to trace the inputs through the maplet and that takes a very long time.
The Question is: Is there a tool that can takes all the relationships in the Informatica maplet and somehow export them to a excel table (so I can see it all without tracing)? that way I could try to make proper documentation....
Thanks in Advance.
It's possible to export mappings or whole workflows to XML. Next, you can use this tool - it will create tables with source to target dependency for every mapping.
Keep in mind it will only map input to output, it won't extract the full logic and transformations done along the way - that would've been to complex for simple visualization.
Informatica supports exporting mapping information to Excel - just search the documentation which tells you how to do it.
However, for anything other than the simplest of mappings, what ends up in Excel is not that easy to understand. If your Informatica installation supports it, then using the lineage capabilities is a much better bet.

Generating several similar SSIS packages (file data source to DB)

Is there a way to automatically generate SSIS packages? I need to create a lot of SSIS packages that just erase data from one table and import data from a text file. The file name matches table name and the column headers are in the first line of the file.
For more detailed information:
I am working on a project in which I have to separate two systems that are currently coupled (one system has direct access to the other's database). After the modifications, one system will provide data through txt files to be loaded in the other database.
We have to use SSIS to load data into the database from the text files.
The text files will be provided in CSV format with column headers in the first line.
The tables from both databases have matching column names, and all we need to do is clear the table and load data from the files.
I have more than one hundred tables with different number of columns. Do I need to create each package manually?
I'm familiar with 2 free options.
EzAPI might be a good place if you're a .NET heavy shop or just really want to geek out with the API. This approach allows you to control the pretty much the entire package generation but at the cost of coding time. I find EzAPI generally easier than working with the base COM/.NET libraries for SSIS.
Biml is an interesting beast. Varigence will be happy to sell you a license to Mist but it's not needed. All you would need is BIDSHelper and then browse through BimlScript and look for a recipe that approximates your needs. Once you have that, click the context sensitive menu button in BIDSHelper and whoosh, it generates packages.
I did this just using vb, I passed in the table names as a command parameter and used vb to generate the insert and clear, worked a charm... I can try and dig it out tomorrow when I'm back in the office but it was pretty simple. There didn't seem to be any other way to say "just get x and export it", "just take y and import it into z" so vb it had to be. In fact come to think of it I think I actually used a small xml file to pass the table info for export and then determined the table name for import from the csv file name. To be clear, this was only one package but it could dynamically choose the number of imports/exports it did. Further clarification this was vb within ssis as a processing step

Are most LDAP administrators creating LDIFs by hand?

Are there tools that make the job easier? If command-line only tools exist, then can anyone speculate if there is a market for a GUI tool? For example, you can create a relational database by modeling visually. Should the same notion exist for LDAP?
Apache Directory Studio includes an ldif-Editor. It is still a text editor but with syntax highlighting, autocompletion and group collapsing for ldif files:
http://directory.apache.org/studio/
I don't know if there are any tools but it isn't that hard to create them by hand.
If you are using IPlanet LDAP then they had a nice interface for creating and modifying schemas though. :)
I don't know if you would consider that to be by hand otherwise that is one tool to use.
I've done some LDIF handling using Perl and the Net::LDAP::LDIF module and it made scripting custom LDAP conversions very easy.
Have you looked at the command-line tool, LDIFDE.exe? Should be on your domain controller.
Business people give me Excel spreadsheets with inconsistent formatting of user and group data and want it loaded right away (then they come back with a new version and tell me they've only added some new users, but some are missing, some data is now invalid, there's a missing column etc.) They want unique passwords assigned, group memberships set up based on department id fields, and so forth.
Then they come back two weeks later and want to know about the differences between that spreadsheet and one from six months ago. Sigh.
I generally just do it all with a few hand-crafted Python scripts.
A lot of times you may be copying objects from one tree to another. Or backing them up. In that case, most LDAP tools have some way of exporting as LDIF. Then you can easily modify the files as needed.
Or copy examples to reuse.
I have seen a number of tools that will do tasks and output the results as LDIF, which can be handy, but they are basically point usage tools.

What are the best practices for database scripts under code control

We are currently reviewing how we store our database scripts (tables, procs, functions, views, data fixes) in subversion and I was wondering if there is any consensus as to what is the best approach?
Some of the factors we'd need to consider include:
Should we checkin 'Create' scripts or checkin incremental changes with 'Alter' scripts
How do we keep track of the state of the database for a given release
It should be easy to build a database from scratch for any given release version
Should a table exist in the database listing the scripts that have run against it, or the version of the database etc.
Obviously it's a pretty open ended question, so I'm keen to hear what people's experience has taught them.
After a few iterations, the approach we took was roughly like this:
One file per table and per stored procedure. Also separate files for other things like setting up database users, populating look-up tables with their data.
The file for a table starts with the CREATE command and a succession of ALTER commands added as the schema evolves. Each of these commands is bracketed in tests for whether the table or column already exists. This means each script can be run in an up-to-date database and won't change anything. It also means that for any old database, the script updates it to the latest schema. And for an empty database the CREATE script creates the table and the ALTER scripts are all skipped.
We also have a program (written in Python) that scans the directory full of scripts and assembles them in to one big script. It parses the SQL just enough to deduce dependencies between tables (based on foreign-key references) and order them appropriately. The result is a monster SQL script that gets the database up to spec in one go. The script-assembling program also calculates the MD5 hash of the input files, and uses that to update a version number that is written in to a special table in the last script in the list.
Barring accidents, the result is that the database script for a give version of the source code creates the schema this code was designed to interoperate with. It also means that there is a single (somewhat large) SQL script to give to the customer to build new databases or update existing ones. (This was important in this case because there would be many instances of the database, one for each of their customers.)
There is an interesting article at this link:
https://blog.codinghorror.com/get-your-database-under-version-control/
It advocates a baseline 'create' script followed by checking in 'alter' scripts and keeping a version table in the database.
The upgrade script option
Store each change in the database as a separate sql script. Store each group of changes in a numbered folder. Use a script to apply changes a folder at a time and record in the database which folders have been applied.
Pros:
Fully automated, testable upgrade path
Cons:
Hard to see full history of each individual element
Have to build a new database from scratch, going through all the versions
I tend to check in the initial create script. I then have a DbVersion table in my database and my code uses that to upgrade the database on initial connection if necessary. For example, if my database is at version 1 and my code is at version 3, my code will apply the ALTER statements to bring it to version 2, then to version 3. I use a simple fallthrough switch statement for this.
This has the advantage that when you deploy a new version of your application, it will automatically upgrade old databases and you never have to worry about the database being out of sync with the software. It also maintains a very visible change history.
This isn't a good idea for all software, but variations can be applied.
You could get some hints by reading how this is done with Ruby On Rails' migrations.
The best way to understand this is probably to just try it out yourself, and then inspecting the database manually.
Answers to each of your factors:
Store CREATE scripts. If you want to checkout version x.y.z then it'd be nice to simply run your create script to setup the database immediately. You could add ALTER scripts as well to go from the previous version to the next (e.g., you commit version 3 which contains a version 3 CREATE script and a version 2 → 3 alter script).
See the Rails migration solution. Basically they keep the table version number in the database, so you always know.
Use CREATE scripts.
Using version numbers would probably be the most generic solution — script names and paths can change over time.
My two cents!
We create a branch in Subversion and all of the database changes for the next release are scripted out and checked in. All scripts are repeatable so you can run them multiple times without error.
We also link the change scripts to issue items or bug ids so we can hold back a change set if needed. We then have an automated build process that looks at the issue items we are releasing and pulls the change scripts from Subversion and creates a single SQL script file with all of the changes sorted appropriately.
This single file is then used to promote the changes to the Test, QA and Production environments. The automated build process also creates database entries documenting the version (branch plus build id.) We think this is the best approach with enterprise developers. More details on how we do this can be found HERE
The create script option:
Use create scripts that will build you the latest version of the database from scratch, which is empty except the default lookup data.
Use standard version control techniques to store,branch,tag versions and view histories of your objects.
When upgrading a live database (where you don't want to loose data), create a blank second copy of the database at the new version and use a tool like red-gate's link text
Pros:
Changes to files are tracked in a standard source-code like manner
Cons:
Reliance on manual use of a 3rd party tool to do actual upgrades (no/little automation)
Our company checks them in simply because someone decided to put it in some SOX document that we do. It makes no sense to me at all, except possible as a reference document. I can't see a time we'd pull them out and try and use them again, and if we did we'd have to know which one ran first and which one to run after which. Backing up the database is much more important then keeping the Alter scripts.
for every release we need to give one update.sql file which contains all the new table scripts, alter statements, new/modified packages,roles,etc. This file is used to upgrade the database from 1 version to 2.
What ever we include in update.sql file above one all this statements need to go to individual respective files. like alter statement has to go to table as a new column (table script has to be modifed not Alter statement is added after create table script in the file) in the same way new tables, roles etc.
So whenever if user wants to upgrade he will use the first update.sql file to upgrade.
If he want to build from scrach then he will use the build.sql which already having all the above statements, it makes the database in sync.
sriRamulu
Sriramis4u#yahoo.com
In my case, I build a SH script for this work: https://github.com/reduardo7/db-version-updater
How is an open question
In my case I am trying to create something simple that is easy to use for developers and I do it under the following scheme
Things I tested:
File-based script handling in git using GitlabCI
It does not work, collisions are created and the Administration part has to be done by hand in case of disaster and the development part is too complicated
Use of permissions and access via mysql clients
There is no traceability on changes to the database and the transition to production is manual
Use of programs mentioned here
They require uploading the structures and many adaptations and usually you end up with change control just like the word
Repository usage
Could not control the DRP part
I could not properly control the backups
I don't think it is a good idea to have the backups on the same server and you generate high lasgs for the process
This was what worked best
Manage permissions per user and generate traceability of everything that is sent to the database
Multi platform
Use of development-Production-QA database
Always support before each modification
Manage an open repository for change control
Multi-server
Deactivate / Activate access to the web page or App through Endpoints
the initial project is in:
In case the comment manager reads this part, I understand the self-promotion but please just remove this part and leave the rest since I think it complies with the answer to the question reacted in the post ...
https://hub.docker.com/r/arelis/gitdb
I hope this reaches you since I see that several
There is an interesting article with new URL at: https://blog.codinghorror.com/get-your-database-under-version-control/
It a bit old but the concepts are still there. Good Read!

How to keep Stored Procedures and other scripts in SVN/Other repository?

Can anyone provide some real examples as to how best to keep script files for views, stored procedures and functions in a SVN (or other) repository.
Obviously one solution is to have the script files for all the different components in a directory or more somewhere and simply using TortoiseSVN or the like to keep them in SVN, Then whenever a change is to be made I load the script up in Management Studio etc. I don't really want this.
What I'd really prefer is some kind of batch script that I can run periodically (nightly?) that would export all the stored procedures / views etc that had changed in a given timeframe and then commit them to SVN.
Ideas?
Sounds like you're not wanting to use Revision Control properly, to me.
Obviously one solution is to have the
script files for all the different
components in a directory or more
somewhere and simply using TortoiseSVN
or the like to keep them in SVN
This is what should be done. You would have your local copy you are working on (Developing new, Tweaking old, etc) and as single components/procedures/etc get finished, you would commit them individually until you have to start the process over.
Committing half-done code just because it's been 'X' time since it was last committed is sloppy and guaranteed to cause anyone else using the repository grief.
I find it best to treat Stored Procedures just like any other compilable code: Code lives in the repository, you check it out to make changes and load it in your development tool to compile or deploy the code.
You can create a batch file and schedule it:
delete the contents of your scripts directory
using something like ExportSQLScript to export all objects to script/scripts
svn commit
Please note: That although you'll have the objects under source control, you'll not have the data or it's progression (is that a renamed field, or 1 new field and 1 deleted?).
This approach is fine for maintaining change history. But, of course, you should never be automatically committing to the "production build" (unless you like broken builds).
Although you didn't ask for it: This approach also won't produce a set of scripts that will upgrade a current DB. You'll only have initial creation scripts. Recording data progression and creation upgrade scripts is beyond basic source control systems.
I'd recommend Redgate SQL Compare for this - it allows you to compare database versions and generate change scripts - it's also fairly easily scriptable.
Based on your expanded question, you really want to use DDL triggers. Check out this article that details how to create a changelog system for your database.
Not sure on your price range, however DB Ghost could be an option for you.
I don't work for this company (or own the product) but in my researching of the same issue, this product looked quite promising.
I should've been a little more descriptive. The database in question is for an internal ERP system and thus we don't have many versions of our database, just Production/Testing/Development. When we've done a change request, some new fancy feature or something, we simply execute a script or series of scripts to update the procedures in question on the Testing database, if that is all good, then we do the same to Production.
So I'm not really after a full schema script per se, just something that can keep track of the various edits to the stored procedures over time. For example, PROCESS_INVOICE does stuff. It gets updated in some minor way in March. Some time later in say May it is discovered that in a rare case customers get double invoiced (or some other crazy corner case). I'd like to be able to see what has happened over time to this procedure. Currently the way the development environment is setup here I don't have that, which I'm trying to change.
I can recommend DBPro which is part of Visual Studio Team Edition. Have been using it for a few months for storing all parts of the database in Team Foundation Server as well as for deployment and database compares, etc.
Of course, as someone else mentioned, it does depend on your environment and price range.
I wrote a utility for dumping all of the relevant parts of my db into a directory structure that I use SVN on. I never got around to trying to incorporate it into the Manager but, if you're interested, it's here: http://www.reluctantdba.com/dbas-and-programmers/sqltools/svnforsql2005.aspx
It's free and, since I regularly run it, you know any bugs get fixed quickly.
You can always try integrating SourceSafe with SQL Server. Here's a quick start : link . To work with it you've got to have Managment Studio Developers Edition.

Resources