I have a table Users in SQL Server of users logging into a platform. Among other columns, it has:
user_id (a guid that uniquely identifies a user)
Date (a dd/mm/yyyy date) - this is when the user logs in to the platform
version (a platform version)
The table is populated with thousands of lines
I’d like to know what % of my users still log on 1, 2, 3, ... n months after first use, but I'm not sure how to tackle this problem.
I am thinking about a concept of “new” users, i.e. the ids that appear for the first time in a given month, and “existing” users, i.e. users who were already there before. I am not sure if SQL is the best tool for this report, or whether I should use another tool such as Power BI. Any advice appreciated.
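One way to make the "new" versus "existing" idea concrete is to take each user's first login month as their cohort and count how many users are active n months later. The following is only a minimal sketch in Python, assuming the rows have already been read out of the Users table as (user_id, date) pairs; the function names and sample data are hypothetical. The percentage for month n is computed only over users whose first month is at least n months before the last month in the data.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Number months consecutively so that differences are month offsets."""
    return d.year * 12 + d.month


def retention_by_month(logins):
    """logins: iterable of (user_id, date).

    Returns {n: percent of eligible users who logged in n months after
    the month of their first login}.
    """
    active_months = defaultdict(set)
    for user_id, day in logins:
        active_months[user_id].add(month_index(day))

    # The cohort of a user is the month of their first login.
    first_month = {u: min(ms) for u, ms in active_months.items()}
    last_month = max(max(ms) for ms in active_months.values())

    returned = defaultdict(int)
    for user_id, months in active_months.items():
        for m in months:
            returned[m - first_month[user_id]] += 1

    result = {}
    for n in sorted(returned):
        # Only users who have had a chance to reach month n are counted.
        eligible = sum(1 for f in first_month.values() if f + n <= last_month)
        result[n] = 100.0 * returned[n] / eligible
    return result


# Hypothetical sample data standing in for rows of the Users table.
logins = [
    ("u1", date(2020, 1, 5)), ("u1", date(2020, 2, 9)), ("u1", date(2020, 3, 1)),
    ("u2", date(2020, 1, 20)), ("u2", date(2020, 3, 15)),
    ("u3", date(2020, 2, 2)),
    ("u4", date(2020, 2, 11)), ("u4", date(2020, 3, 30)),
]
retention = retention_by_month(logins)
print(retention)
```

The same grouping (first month per user, then month offset per login) can be expressed in SQL or in Power BI; the sketch is only meant to pin down the definition.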
I'm working on a Microsoft BI project.
I am currently in the process of connecting my systems to SQL Server. I want to connect my Active Directory to a table in SQL Server and I want to sync to one table per hour. This means that every hour the details of the Active Directory will be updated.
I understand that SSIS is needed to do this. I would be happy for help connecting my AD to SQL Server with SSIS.
There are two routes available to you to sync AD user classes to a table. You can use an ADO.NET source in an SSIS Data Flow Task or you can write custom .NET code as part of a Script Source. The right answer depends on your team's ability to maintain and troubleshoot a particular solution as well as the size of your AD tree/forest. If you're a small shop (under a thousand accounts) anything is going to work. If you're a larger shop, then you need to worry about the query mechanism and the total rows returned, as there is an upper boundary on how many results can be returned in a single query. In that case, a Script Source likely makes more sense, as you can more easily write a query to pull all the accounts that start with A, B, etc. I've never worked with Hebrew, so I assume one could do a similar filter for aleph, bet, etc.
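The "accounts that start with A, B, etc." idea can be sketched as generating one LDAP filter per prefix, so that each query stays under the result limit. This is an illustration in Python rather than the .NET code a Script Source would actually contain; the function name and the choice of prefixes are assumptions, and the sAMAccountType value is the one used in the query in this answer. Accounts starting with any other character would need their own prefix.

```python
import string

# sAMAccountType value for user accounts, as used in the query in this answer.
USER_ACCOUNT_TYPE = 805306368


def prefix_filters(prefixes=string.ascii_lowercase + string.digits):
    """Build one LDAP filter per account-name prefix (hypothetical helper)."""
    return [
        f"(&(sAMAccountType={USER_ACCOUNT_TYPE})(sAMAccountName={p}*))"
        for p in prefixes
    ]


filters = prefix_filters()
for f in filters[:3]:
    print(f)
```

Each filter would then be run as its own query and the results appended to the same output.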
General steps
Identify your domain controller, as you need to know what server to ask for information. I do not know how to deal with Azure Active Directory requests; I believe it works a bit differently there, but I haven't had client work that needed it.
Create a Connection Manager for ADO.NET. Use the ".Net Providers for OleDb\OLE DB Provider for Microsoft Directory Services" and point that to your DC.
Write a query to pull back the data you need. Based on the comment, it seems you want something like this:
SELECT
distinguishedName
, mail
, samaccountname
, mobile
, telephoneNumber
, objectSid
, userAccountControl
, title
, sn
FROM
'LDAP://DC=domain,DC=net'
WHERE
sAMAccountType = 805306368
ORDER BY
sAMAccountName ASC
Using that query, we'll add a Data Flow Task and within it, add an ADO.NET Source. Configure it to use our ADO.NET Connection manager and use the above query (adjusting for the LDAP line and any other fields you do/don't need)
Add an OLE DB Connection Manager to your package and point it to the database that will record the data.
Add an OLE DB Destination to the Data Flow and connect the output line from the ADO.NET Source to this destination. Pick the table in the drop down list and on the Columns tab, make sure you have all of your columns connected. You might run into issues where the data types don't match so you'll need to figure out how to handle that - either change your table definition to match the source or you need to add data conversion/derived columns components to the data flow to mangle the data into the correct shape.
You might be tempted to pull in group membership. Do not. Make that a separate task as a person might be a member of many groups (at one client, I am in 94 groups). Also, the MemberOf data type is a DistinguishedName, DN, which SSIS cannot handle. So, check your types before you add them into an AD query.
References
ldap query to get disabled user records with whenchanged within 30 days
http://billfellows.blogspot.com/2011/04/active-directory-ssis-data-source.html
http://billfellows.blogspot.com/2013/11/biml-active-directory-ssis-data-source.html
Is there a particular part of the AD that you want? In any but the smallest corporations the AD tends to be huge. Making a SQL copy of an entire forest every hour is a very strange thing that may have many adverse effects on your AD, network, security and domain-wide performance.
If you are just looking to backup your AD, I believe that there are other options available, specific to the Windows AD (maybe even built-in, I'm not an AD expert).
If you really, truly want to do this here is a link to get you started: https://social.technet.microsoft.com/Forums/ie/en-US/79bb4879-4d82-4a41-81a4-c62afc6c4b1e/copy-all-ad-objects-to-sql-database?forum=winserverDS. You can find many more articles on this just by Googling "Copy AD to Sql".
However, heed the warnings well: the AD is effectively a multi-domain-wide distributed database, and attempting to copy it into a centralized database like SQL Server every hour is contra-indicated. You are really fighting against its design.
UPDATE Based on the Comments:
Basically you've got too much in one question here. Sql Server, SSIS and the Active Directory (AD) are each huge subjects in and of themselves and the first time that you attempt to use all of them together you will run into many individual issues depending on your environment, experience and specific project goals. We cannot anticipate all of them in a single answer on this site.
You need to start using the information you have from the following links to begin to implement this yourself, and then ask specific questions as you run into problems along the way.
Here are the links that you can start with,
The link I provided above from MS: https://social.technet.microsoft.com/Forums/ie/en-US/79bb4879-4d82-4a41-81a4-c62afc6c4b1e/copy-all-ad-objects-to-sql-database?forum=winserverDS
The link that you provided in the comments that explains how to setup ADSI as a linked server and how to use T-SQL on it: https://yiengly.wordpress.com/2018/04/08/query-active-directory-in-sql-server-with-linked-server/
This one explains how to use AD from within an SSIS DataFlow task (but is limited to 1000 rows): https://dataqueen.unlimitedviz.com/2012/05/importing-data-from-active-directory-using-ssis/
This related one explains how to use AD within an SSIS Script task to get around the DataFlow task limits: https://dataqueen.unlimitedviz.com/2012/09/get-around-active-directory-paging-on-ssis-import/
As you work your way through this you may run into specific problems, which you can ask about at https://dba.stackexchange.com which has more specific expertise with Sql Server and SSIS.
Based on your goals, I think that you will want to use a staging table approach. That is, use your AD/SQL query to import all of the AD user records into a new, empty staging table that has the same column definitions as your production table, then use a MERGE query to update the changed user records and insert the new ones (this is a differential update or upsert; because changed rows are overwritten in place, it corresponds to a Type 1 change in slowly-changing-dimension terms, not Type 2).
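The logic of the staging-then-merge step can be illustrated with a small sketch. This is Python over in-memory dictionaries, not the actual MERGE statement; the key (objectSid) and the column names are borrowed from the query earlier in this thread, and the function name and sample rows are hypothetical.

```python
def merge_staging(production, staging):
    """Upsert staging rows into production, keyed by objectSid.

    Returns (inserted_keys, updated_keys). Rows that are identical in
    both tables are left alone.
    """
    inserted, updated = [], []
    for key, row in staging.items():
        if key not in production:
            production[key] = dict(row)   # new user: insert
            inserted.append(key)
        elif production[key] != row:
            production[key] = dict(row)   # changed user: overwrite in place
            updated.append(key)
    return inserted, updated


# Hypothetical data: production is last hour's copy, staging is the fresh import.
production = {
    "S-1": {"samaccountname": "alice", "title": "Analyst"},
    "S-2": {"samaccountname": "bob", "title": "Engineer"},
}
staging = {
    "S-1": {"samaccountname": "alice", "title": "Senior Analyst"},  # changed
    "S-2": {"samaccountname": "bob", "title": "Engineer"},          # unchanged
    "S-3": {"samaccountname": "carol", "title": "Manager"},         # new
}
inserted, updated = merge_staging(production, staging)
print(inserted, updated)
```

Users that have disappeared from AD are deliberately not handled here; whether to delete or flag them is a separate decision.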
I'm new to the ETL world and I'm trying to arrange for a data file to be sent to us from another company so that then it can be ETLed to a data warehouse. I'll be developing the ETL via Integration Services on SQL Server 2014. Basically, I have five types of records: account records (meta records related to a person's account), purchase records, etc.
To make it easier: account records just give information about the account holder (name, account id, sex, etc.), and purchase records show a history of purchases and their amounts, etc.
My Question is: the company sending us the records is asking me this: how do you want the records arranged?
1. Multi-header/trailer: each header indicates the type of records we are getting (header1 will be Account)
2. Multiple files (each type of records will be in a separate file)
3. Mention the person followed by the records that belong to him.
For example:
Person X
Account Records....
Purchase Records...
...
Person Y
Account Records....
Purchase Records...
For SSIS as your ETL tool, always go for the same format within the file.
SSIS can handle header records (in that we can skip them). It cannot handle trailer records (because our columns are no longer consistent).
Options 1 and 3 both violate the above.
As SSIS gives you access to the .NET framework, you can write all the custom parsing and then you can handle any file format, even option 1 or 3, but that's rarely a wise investment on the part of your company unless you're just flush with .NET devs who want to write ETL. Use the out-of-the-box components until they don't meet the task at hand, and then use Script Tasks or Components to compensate. When custom scripting is the starting place for your package, it is usually fraught with peril.
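To give a sense of the custom parsing a multi-header file would need, here is a rough sketch that splits such a file into one list of rows per record type. It is written in Python for brevity rather than the .NET a Script Component would use, and the file layout (a "HDR|" marker line followed by comma-separated rows) is purely an assumption for illustration.

```python
def split_by_header(lines, header_prefix="HDR|"):
    """Group comma-separated rows under the most recent header line.

    The 'HDR|<type>' convention is hypothetical; a real feed would
    define its own marker.
    """
    sections = {}
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith(header_prefix):
            current = line[len(header_prefix):]
            sections.setdefault(current, [])
        elif current is None:
            raise ValueError("data row found before any header")
        else:
            sections[current].append(line.split(","))
    return sections


sample = [
    "HDR|Account",
    "1001,Jane Doe,F",
    "1002,John Roe,M",
    "HDR|Purchase",
    "1001,2014-03-01,19.99",
]
sections = split_by_header(sample)
print(sections)
```

Every record type has a different column list, which is exactly what the standard flat file source cannot cope with; with one file per type, none of this code is needed.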
I have a mini account software. In this software I can store multiple company data. The data is stored in SQL Server 2008 R2 database.
In the current database I have a User table which stores all user names, and a Company Master table which stores company details like name, address, session etc., with user ID as an FK to the User table. Next is the tran table, which links to Company Master and stores voucher details, and other tables such as bill, payment etc. that link to the tran table.
The app is built for small companies and professionals who keep and maintain their clients' data. In that scenario all data is separate and mutually independent. A small company maintains the account-related data of all its subsidiary companies in a single app. Sometimes they receive or send the data of one subsidiary to that company, a government body or an audit firm. It is like mobile phone contacts: I can send all contacts or any selected contact.
Users first select their company from Company Master and then add/edit reference data or view reports based on the selected company ID.
Now my problem is that the data volume has become very high at some client sites, because the data of 50 to 60 companies is stored in a single database. How can I back up or restore the data by company ID? Can SQL Server filegroups help with this? I have no knowledge of filegroups.
Please help me.
Do not split your SQL database into multiple SQL databases (and do not create more filegroups either) just because you need to get data filtered by the CompanyId. Every time your client needed to create a new Company, your application would have to create a new database for it. This would also complicate things like app updates quite a bit.
If you do not face any grave performance problems - like when using SQL Express and your client database is 9 GB (max. database size for Express is 10 GB) - leave 1 database for 1 client.
Be sure all your related tables are well indexed by the CompanyID column. Then you can provide means to export data by CompanyID from your application - custom reports, exports to csv files, Excel etc.
A database backup file is usually not used for passing data to other applications. Its goal is to ensure disaster recovery - when the disk fails etc., your client will be able to recover easily. On the contrary, if he had 50 database files in place of just 1, he would have a hard time restoring all those databases properly.
What I need to do is sync some data between 2 databases. The source can be anything (comma separated file, xls file, any database, ...), the destination is MS SQL Server.
I do not need to sync all data, I just need to sync particular tables.
Example:
I need to sync accounting Software (runs on PostgreSQL) CUSTOMERS table with CRM (runs on SQL Server).
Some problems this tool should be able to face:
1) Accounting software customers table has 1 field that is not mapped in crm customers table. (In this way I want to map this extra field to the field CUSTOMERS_CUSTOM_DATA.EXTRA_FIELD)
2) Having some rules (like sync only customers whose code is between 10000 and 99999)
3) Allowing some post-insert tasks to be performed (for example I am using manually managed sequences for the table IDs, so after inserting 10 records I need to add 10 to the sequence)
4) Having an exception handling mechanism so that if something is wrong it can either call a SQL Server stored procedure (that I already have and that will send an e-mail to me) or simply send a message to notify that something went wrong in the nightly sync.
5) Be easy to schedule when to perform data sync (hourly, daily, INCLUDING MANUAL)
6) Perform data conversion: if Surname field in source table is varchar(20) and in destination table is varchar(15) I want to explicitly say "perform a truncation".
7) Have different rules for insert or update. For example, the e-mail field is not present in the source, but I want to populate it in the destination, and I decide to perform this operation on insert only, not on update. (For example, as I insert a new customer I want to populate the e-mail field by concatenating name and surname, but then I want to let the users modify it. This first insertion is just to simplify data entry; after that this particular case will be handled manually. So I want to say: on insert populate the e-mail field, on update don't do anything with the e-mail field.)
8) In case of delete in the source db don't delete on the destination but only change the varchar(10) STATUS to DELETED.
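To make a few of these requirements concrete (the code range in 2, the truncation in 6, the insert-only e-mail in 7 and the soft delete in 8), here is a minimal sketch of the rules as plain Python over dictionaries keyed by customer code. It is not tied to any particular tool; the field names, the e-mail format and the "ACTIVE" status value are assumptions made for illustration.

```python
def sync_customers(source, destination):
    """Apply the sync rules in place on destination (both keyed by customer code)."""
    for code, src in source.items():
        if not 10000 <= code <= 99999:          # rule 2: only codes 10000..99999
            continue
        surname = src["surname"][:15]           # rule 6: explicit truncation to 15 chars
        if code not in destination:
            destination[code] = {
                "name": src["name"],
                "surname": surname,
                # rule 7: e-mail is generated on insert only (format is hypothetical)
                "email": f"{src['name']}.{src['surname']}".lower(),
                "status": "ACTIVE",
            }
        else:
            # rule 7: on update the e-mail field is left untouched
            destination[code].update(name=src["name"], surname=surname)

    for code, row in destination.items():
        if code not in source:                  # rule 8: soft delete, never remove
            row["status"] = "DELETED"


source = {
    10001: {"name": "Ann", "surname": "Featherstonehaugh-Smythe"},
    10002: {"name": "Bob", "surname": "Lee"},
    5: {"name": "Out", "surname": "OfRange"},
}
destination = {
    10002: {"name": "Robert", "surname": "Lee", "email": "custom@example.com", "status": "ACTIVE"},
    10003: {"name": "Cy", "surname": "Gone", "email": "cy.gone", "status": "ACTIVE"},
}
sync_customers(source, destination)
print(destination)
```

Whatever tool is chosen needs a way to express this kind of per-operation logic, which is a useful checklist when evaluating candidates.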
Note: I know that Integration Services will be perfect for this, but I must support the Express Edition, so SSIS is not an option.
I created a bunch of scripts and scheduled stored procedures that at present do what I need, but they are very hard to maintain and the total lack of a GUI makes the work much slower. I remember having seen TALEND some time ago; maybe that tool is the answer I need. Anyway, I need to provide a quick answer to management, so I have no time now to investigate all the tools on the market, and I would prefer a suggestion from an expert.
I believe SQL Server Integration Services does all that, and I believe SQL Server Management Studio allows you to create and package your SSIS jobs so that they can be deployed elsewhere.
In the end I went for TALEND. I never really used SSIS, I just saw a live demo of it at a SQL Server conference. Anyway, Talend is a free (and quite rich) alternative to SSIS, so it will suit the needs of all customers, including the ones (95%) that have SQL Server Express.
I am stuck with a problem of implementing security at dimension level in SSAS. Here is what I did -
1. Defined a role in SSAS and applied security at dimension level (Unchecking cube dimensions that I don't want this role to access and setting Allowed & denied Sets).
2. Tested using Cube Browser, it worked fine.
3. Tested using SSRS, no change, I was still able to query the dimensions & get results that I don't want.
Question - Is it possible to propagate the security I define at Cube level to SSRS? I would like to believe yes it is.
If yes then here is what I need -
Users will logon to the Report Manager using Windows Identity (Integrated Authentication on IIS turned on -done)
Capture this identity to find out SSAS role that they belong to - I guess this would be through a query, does not seem to work automatically (How to do this?)
User works within the restrictions of this role in SSRS (role based security applied at SSAS level) i.e. if dimension X is not available to user, he/she should not be able to query it. (How to do this?)
I have referred quite a few blogs on this and even found one - http://www.sqlmag.com/Article/ArticleID/96763/sql_server_96763.html
but this one seems to have more information on how to set it up within SSAS, rather than how to use this in SSRS.
Anyone who has worked on this approach OR have an understanding of this please let me know.
I think you need to look at your data source in SSRS on the report server and make sure it is set to use the logged-in user's Windows credentials once authenticated; it might be what you are looking for.
All you need to do is:
In the data source in SSRS report, specify the Role Name created in SSAS database like this:
Data Source=LOCALHOST;Initial Catalog=XXXXX;Roles=RoleName
Thanks
Sameer
I haven't done this in SSAS, but I've done it in the engine. Jeremiah Peschka has a blog about row-based security setup, and if you're going to do this with integrated Windows security, then you can use the user_name() function to grab the current login's name. You'll be using a lookup table for each dimension, with a row for each dimension row plus the user's name. When querying, join to the dimension security table like this:
FROM dbo.Customers cs
INNER JOIN dbo.CustomersSecurity css ON cs.CustomerId = css.CustomerId AND css.UserName = User_Name()
That way, your join will only return records for customers that the user can see.
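The effect of the security join can be tried out on a toy dataset. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, so the user name is passed in as a parameter instead of coming from User_Name(); the sample rows and the helper function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE CustomersSecurity (CustomerId INTEGER, UserName TEXT);
INSERT INTO Customers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
INSERT INTO CustomersSecurity VALUES (1, 'alice'), (2, 'alice'), (3, 'bob');
""")


def visible_customers(user_name):
    """Return the customer names the given user is allowed to see."""
    rows = conn.execute(
        """
        SELECT cs.Name
        FROM Customers cs
        INNER JOIN CustomersSecurity css
            ON cs.CustomerId = css.CustomerId AND css.UserName = ?
        ORDER BY cs.CustomerId
        """,
        (user_name,),
    )
    return [r[0] for r in rows]


alice_sees = visible_customers("alice")
bob_sees = visible_customers("bob")
print(alice_sees, bob_sees)
```

A user with no rows in the security table simply gets an empty result, which is the behaviour the join is meant to give.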
The drawback is that if you're using partitioning, the engine won't build a good execution plan to only pluck the right records from the right partitions based on what your user can see. For example, if you log in as a user that can only see records in Florida, and your data is partitioned by state, it won't matter - the engine will still scan all partitions, because it won't be able to predict the user's info when the plan is built.