I have multiple databases (>100) with an identical structure.
For business monitoring, I have about 80 queries that check information in the database.
Now I want to execute each of these queries on each of these databases and load the results into Splunk.
In Splunk, is it possible to define the >100 database connections once and the 80 queries once, and then have some "magic" step execute each statement against each database?
I don't want to create a new connection for each combination of database and query.
The only way Splunk has to connect to a database "itself" is via DB Connect (docs).
From Splunk's perspective, there is no way to connect to 100 databases without having unique connections to each.
So far as I know, there is no tool that will connect to more than one database without unique connections - that's something database servers enforce in transactional models.
That being said, if you have a way to enumerate all the databases you want to connect to, and a place to save the queries you want to run, you could build either
a scripted-input add-on that uses your language of choice (whatever's available on the Splunk server(s)/endpoint(s) it's running on) to iterate through each database, run each query, and ship the results back to Splunk, or
in similar fashion to the scripted-input option, a script (or set of scripts) that executes the queries in question against the databases you're targeting and submits the results to the HTTP Event Collector (HEC) (HL has a great write-up on HEC over here, and here's George Starcher's Python class for HEC).
Related
I'm working on a Microsoft BI project.
I am currently in the process of connecting my systems to SQL Server. I want to connect my Active Directory to a table in SQL Server and sync it to that one table once per hour, so that every hour the details of the Active Directory will be updated.
I realized that it is necessary to use SSIS to do this. I would be happy for help connecting my AD to SQL Server with the help of SSIS.
There are two routes available to you to sync AD user objects to a table. You can use an ADO source in an SSIS Data Flow Task, or you can write custom .NET code as part of a Script Source. The right answer depends on your team's ability to maintain and troubleshoot a particular solution, as well as the size of your AD tree/forest. If you're a small shop (under a thousand accounts), anything is going to work. If you're a larger shop, then you need to worry about the query mechanism and the total rows returned, as there is an upper bound on how many results can be returned in a single query. In that case, a script task likely makes more sense, as you can more easily write a query to pull all the accounts that start with A, B, etc. I've never worked with Hebrew, so I assume one could do a similar filter for aleph, bet, etc.
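If you do need to break the pull up alphabetically, a minimal sketch of that per-letter filter might look like the following. The LDAP path and attribute list are placeholders, and this assumes the wildcard syntax of the directory services SQL dialect:

SELECT
    sAMAccountName
  , mail
FROM
    'LDAP://DC=domain,DC=net'
WHERE
    sAMAccountType = 805306368
    AND sAMAccountName = 'a*'

You would run one such query per starting letter (or per OU) and combine the results on the SQL Server side.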
General steps
Identify your domain controller, as you need to know what server to ask information from. I do not know how to deal with Azure Active Directory requests as I believe it works a bit differently there, but I haven't had client work that needed it.
Create a Connection Manager for ADO.NET. Use the ".Net Providers for OleDb\OLE DB Provider for Microsoft Directory Services" and point that to your DC.
Write a query to pull back the data you need. Based on the comment, it seems you want something like this:
SELECT
distinguishedName
, mail
, samaccountname
, mobile
, telephoneNumber
, objectSid
, userAccountControl
, title
, sn
FROM
'LDAP://DC=domain,DC=net'
WHERE
sAMAccountType = 805306368
ORDER BY
sAMAccountName ASC
Using that query, we'll add a Data Flow Task and within it, add an ADO.NET Source. Configure it to use our ADO.NET Connection Manager and use the above query (adjusting the LDAP line and any other fields you do/don't need).
Add an OLE DB Connection Manager to your package and point it to the database that will record the data.
Add an OLE DB Destination to the Data Flow and connect the output line from the ADO.NET Source to this destination. Pick the table in the drop-down list and, on the Columns tab, make sure you have all of your columns connected. You might run into issues where the data types don't match, so you'll need to figure out how to handle that: either change your table definition to match the source, or add data conversion/derived column components to the data flow to mangle the data into the correct shape.
You might be tempted to pull in group membership. Do not. Make that a separate task as a person might be a member of many groups (at one client, I am in 94 groups). Also, the MemberOf data type is a DistinguishedName, DN, which SSIS cannot handle. So, check your types before you add them into an AD query.
References
ldap query to get disabled user records with whenchanged within 30 days
http://billfellows.blogspot.com/2011/04/active-directory-ssis-data-source.html
http://billfellows.blogspot.com/2013/11/biml-active-directory-ssis-data-source.html
Is there a particular part of the AD that you want? In any but the smallest corporations the AD tends to be huge. Making a SQL copy of an entire forest every hour is a very strange thing that may have many adverse effects on your AD, network, security and domain-wide performance.
If you are just looking to backup your AD, I believe that there are other options available, specific to the Windows AD (maybe even built-in, I'm not an AD expert).
If you really, truly want to do this here is a link to get you started: https://social.technet.microsoft.com/Forums/ie/en-US/79bb4879-4d82-4a41-81a4-c62afc6c4b1e/copy-all-ad-objects-to-sql-database?forum=winserverDS. You can find many more articles on this just by Googling "Copy AD to Sql".
However, heed the warnings well: the AD is effectively a multi-domain-wide distributed database, attempting to copy it into a centralized database like SQL Server every hour is contra-indicated. You are really fighting against its design.
UPDATE Based on the Comments:
Basically you've got too much in one question here. SQL Server, SSIS and Active Directory (AD) are each huge subjects in and of themselves, and the first time you attempt to use all of them together you will run into many individual issues depending on your environment, experience and specific project goals. We cannot anticipate all of them in a single answer on this site.
You need to start using the information you have from the following links to begin to implement this yourself, and then ask specific questions as you run into problems along the way.
Here are the links that you can start with:
The link I provided above from MS: https://social.technet.microsoft.com/Forums/ie/en-US/79bb4879-4d82-4a41-81a4-c62afc6c4b1e/copy-all-ad-objects-to-sql-database?forum=winserverDS
The link that you provided in the comments that explains how to setup ADSI as a linked server and how to use T-SQL on it: https://yiengly.wordpress.com/2018/04/08/query-active-directory-in-sql-server-with-linked-server/
This one explain how to use AD from within an SSIS DataFlow task (but is limited to 1000 rows): https://dataqueen.unlimitedviz.com/2012/05/importing-data-from-active-directory-using-ssis/
This related one explains how to use AD within an SSIS Script task to get around the DataFlow task limits: https://dataqueen.unlimitedviz.com/2012/09/get-around-active-directory-paging-on-ssis-import/
As you work your way through this you may run into specific problems, which you can ask about at https://dba.stackexchange.com, which has more specific expertise with SQL Server and SSIS.
Based on your goals, I think that you will want to use a staging table approach. That is, use your AD/Sql query to import all of the AD users records into a new/empty temporary table that has the same column definition as your production table, then use a Merge query to find and update the changed user records and insert the new user records (this is called a Differential or Type II update).
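As a rough illustration of that staging-table approach (the table and column names here are hypothetical, and NULL handling is simplified), the hourly load would refill a staging table from AD and then merge it into the production table:

-- Empty the staging table, then let the SSIS data flow load it from AD
TRUNCATE TABLE dbo.StagingADUsers;

-- Apply the differential: update changed rows, insert new ones
MERGE dbo.ADUsers AS tgt
USING dbo.StagingADUsers AS src
    ON tgt.sAMAccountName = src.sAMAccountName
WHEN MATCHED AND (tgt.mail <> src.mail OR tgt.title <> src.title)
    THEN UPDATE SET tgt.mail = src.mail, tgt.title = src.title
WHEN NOT MATCHED BY TARGET
    THEN INSERT (sAMAccountName, mail, title)
         VALUES (src.sAMAccountName, src.mail, src.title);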
I work on multiple Azure databases, using AzureAD - Universal with MFA in SSMS.
I usually have a number of different scripts on the go, comparing results between the different databases; then an hour or so later the connection is terminated by the server, and if I try to re-run a script it tells me it doesn't recognise the table, because it no longer knows which database to connect to.
Is there a way to specify the database within SSMS? If I try USE dbDocument before the script, it tells me: Msg 40508, Level 16, State 1, Line 1 - USE statement is not supported to switch between databases. Use a new connection to connect to a different database.
Is there any way around this, without having to right-click the database in question and hit New Query?
The USE statement is not supported on Azure SQL Database. One solution is to split your work into several steps according to the database that you want to use, and connect to the right database for each step.
If you want to execute the same query on multiple Azure Databases then you can use one of these options:
Use horizontally partitioned databases based on a shard map with the Elastic Database tools. Here you can find how to query a remote database.
You can also use elastic queries; a minimal sketch of that option follows this list.
Or you can create your solution based on Azure Data Lake.
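For the elastic query option, a rough sketch run in the database you are connected to looks roughly like this (the credentials, server name, database name and table definition are all placeholders):

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';
CREATE DATABASE SCOPED CREDENTIAL RemoteCred
    WITH IDENTITY = '<sql login>', SECRET = '<password>';

-- External data source pointing at the other Azure SQL database
CREATE EXTERNAL DATA SOURCE DocumentDbSource WITH (
    TYPE = RDBMS,
    LOCATION = 'myserver.database.windows.net',
    DATABASE_NAME = 'dbDocument',
    CREDENTIAL = RemoteCred
);

-- Local external table mirroring the remote table's columns
CREATE EXTERNAL TABLE dbo.Document (
    DocumentId INT,
    Title NVARCHAR(200)
) WITH (DATA_SOURCE = DocumentDbSource);

-- The remote data can now be queried without a USE statement
SELECT TOP (10) * FROM dbo.Document;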
Looking at this MS doc, I doubt this is possible at all...
If a database other than the current database is provided, the USE statement does not switch between databases, and error code 40508 is returned. To change databases, you must directly connect to the database. The USE statement is marked as not applicable to SQL Database at the top of this page, because even though you can have the USE statement in a batch, it doesn't do anything.
Our team is trying to create an ETL into Redshift to be our data warehouse for some reporting. We are using Microsoft SQL Server and have partitioned our database out into 40+ data sources. We are looking for a way to pipe the data from all of these identical data sources into one Redshift DB.
Looking at AWS Glue, it doesn't seem possible to achieve this. Since they open up the job script to be edited by developers, I was wondering if anyone else has had experience with looping through multiple databases and transferring the same table into a single data warehouse. We are trying to avoid having to create a job for each database... unless we can programmatically loop through and create multiple jobs for each database.
We've taken a look at DMS as well, which is helpful for getting the schema and current data over to Redshift, but it doesn't seem like it would handle the multiple partitioned data source issue either.
This sounds like an excellent use-case for Matillion ETL for Redshift.
(Full disclosure: I am the product manager for Matillion ETL for Redshift)
Matillion is an ELT tool - it will Extract data from your (numerous) SQL Server databases and Load them, via an efficient Redshift COPY, into some staging tables (which can be stored inside Redshift in the usual way, or can be held on S3 and accessed from Redshift via Spectrum). From there you can add Transformation jobs to clean/filter/join (and much more!) into nice queryable star schemas for your reporting users.
If the table schemas on your 40+ databases are very similar (your question doesn't clarify how you are breaking your data down into those servers - horizontal or vertical) you can parameterise the connection details in your jobs and use iteration to run them over each source database, either serially or with a level of parallelism.
Pushing down transformations to Redshift works nicely because all of those transformation queries can utilize the power of a massively parallel, scalable compute architecture. Workload Management configuration can be used to ensure ETL and User queries can happen concurrently.
Also, you may have other sources of data you want to mash-up inside your Redshift cluster, and Matillion supports many more - see https://www.matillion.com/etl-for-redshift/integrations/.
You can use AWS DMS for this.
Steps:
set up and configure a DMS instance
set up a target endpoint for Redshift
set up a source endpoint for each SQL Server instance (see https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.SQLServer.html)
set up a task for each SQL Server source; you can specify the tables to copy/synchronise, and you can use a transformation rule to specify which schema name(s) on Redshift you want to write to
You will then have all of the data in identical schemas on Redshift.
If you want to query all of those together, you can do that either by running some transformation code inside Redshift to combine the data into new tables, or you may be able to use views.
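Once DMS has landed each source database into its own Redshift schema, a minimal sketch of the view approach (the schema, table and column names here are hypothetical, and the column lists must match across branches) is a UNION ALL view:

CREATE VIEW orders_all AS
    SELECT 'db01' AS source_db, order_id, customer_id, amount FROM source_db01.orders
    UNION ALL
    SELECT 'db02' AS source_db, order_id, customer_id, amount FROM source_db02.orders;
    -- ...repeat a branch for each of the 40+ source schemas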
I have two databases, one on a remote server the other local. (SQL Server 2008)
The database on my local server has the entire structure set up but no data. I would like to copy the data from the remote server to my server, and I am wondering about the best method to do this.
The main issue I am experiencing is that the user I have for the remote database has limited permissions. I cannot read the stored procedures or user-defined functions, so when I use the Import/Export wizard I do not get the schema, etc. A regular dump/restore is not working for me either, as it restores the tables without the primary keys/foreign keys and the stored procedures.
I'd like to do something like this:
INSERT INTO localtable SELECT * FROM remotedb.table
I was having issues because of the IDENTITY fields and I had to explicitly name all of the columns. Also, I am not sure if SQL Server Management Studio allows you to work with two different databases, remote and local, so I was looking for any advice.
I have also tried applications like SQL FTP and Backup, and it fails because it runs out of memory (I have 16 GB of memory on the machine and the DB is only about 4 GB). I can also use the SQL Server Import/Export wizard, but then I don't get the schema information. I also tried SQL Compare from Red Gate and it runs into issues with the permissions. Unfortunately I do not have the time to request and gain access to a new user, so I was hoping someone had a creative idea.
You can definitely use SQL Server backups for this. They will not run out of memory. If it does fail, please tell us the message (because you are likely misinterpreting it). This is the fastest and most complete solution.
You can tell the export wizard to also script the schema. It is hidden under "advanced" somewhere (terrible UI). But the script will be extremely big and I know of no way to execute it.
You can drop all schema objects except PKs in the target database. Then you can use remote queries to copy all the data over. You will not get any problems with foreign keys and identity columns if you drop them beforehand. After you are done you can recreate all of those objects. It is probably best to use a transaction for all of this, because that way you get consistent source data from a point in time.
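A rough sketch of that remote-query copy for one table, assuming you have created a linked server (here called REMOTESRV, a placeholder) pointing at the remote instance and that the local table has an identity column:

-- Allow explicit values to be written into the identity column
SET IDENTITY_INSERT dbo.Customer ON;

INSERT INTO dbo.Customer (CustomerId, Name, CreatedAt)
SELECT CustomerId, Name, CreatedAt
FROM REMOTESRV.RemoteDb.dbo.Customer;

SET IDENTITY_INSERT dbo.Customer OFF;

You would repeat this per table, wrapped in the transaction described above.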
I've run into an issue where I need to query 2 separate databases (same instance) in one query.
I am used to doing this with MySQL, but I'm not sure how to do it with DB2.
In MySQL it would be something like:
SELECT user_info.*, game.*
FROM user_info, second_db.game_stats as game
WHERE user_info.uid = game.uid
So the question is: how do I translate a query like that into DB2 syntax?
Is there a reason why you have the tables in a separate database? MySQL doesn't support the concept of schemas, because in MySQL a "schema" is the same thing as a "database". In DB2, a schema is simply a collection of named objects that lets you group them together.
In DB2, a single database is much closer to an entire MySQL server, as each DB2 database can have multiple schemas. With multiple schemas inside the same database, your query can run more or less unchanged from how it is written.
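For example, if both sets of tables lived as separate schemas in one DB2 database (the schema names here are just placeholders), the query would barely change:

SELECT u.*, g.*
FROM APPDATA.USER_INFO u
JOIN GAMESTATS.GAME_STATS g
    ON u.UID = g.UID;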
However, if you really have 2 separate DB2 databases (and, for some reason, don't want to migrate to a single database with multiple schemas): You can do this by defining a nickname in your first database.
This requires a somewhat convoluted process of defining a wrapper (CREATE WRAPPER), a server (CREATE SERVER), user mapping(s) (CREATE USER MAPPING) and finally the nickname (CREATE NICKNAME). It is generally easiest to do these tasks using the Control Center GUI because it will walk you through the process of defining each of these.
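A hedged sketch of that setup, run against the first database, looks roughly like this. The server name, version, credentials, schema and table names are placeholders; it assumes federation is enabled on the instance and that the second database has been catalogued locally:

CREATE WRAPPER DRDA;

CREATE SERVER SECOND_DB TYPE DB2/UDB VERSION 11.5 WRAPPER DRDA
    AUTHORIZATION "db2user" PASSWORD "secret"
    OPTIONS (DBNAME 'SECONDDB');

CREATE USER MAPPING FOR USER SERVER SECOND_DB
    OPTIONS (REMOTE_AUTHID 'db2user', REMOTE_PASSWORD 'secret');

CREATE NICKNAME GAME_STATS FOR SECOND_DB.GAMESCHEMA.GAME_STATS;

-- After that, the nickname can be joined like a local table:
SELECT u.*, g.*
FROM USER_INFO u
JOIN GAME_STATS g
    ON u.UID = g.UID;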