data model for performance monitoring for Tableau Server - sql-server

I have a question regarding performance monitoring.
I am going to connect to the Postgres repository database, and what I want to do is extract the relevant information from the Tableau Server database into our own database.
I am currently following a document that covers the steps needed to retrieve the performance information from Postgres, but what I really need to do is set up a data model for our own database.
I'm not a strong DBA, so I may need help designing the data model, but the requirement is:
We want a model in place so that we can see how long workbooks take to load, and if any of them take longer than, say, 5 seconds, we are alerted so we can go in and investigate.
My current idea for the data model, in very basic terms, is to have the following tables:
Users – Projects – Workbooks – Views – Performance
Essentially we have users who access various projects that contain their own workbooks. The Views table records workbook views so that we can see how many times a workbook has been viewed and when. Finally, the Performance table holds the load times.
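Purely as an illustration, a minimal sketch of that model in T-SQL might look like the following (table and column names are my own assumptions, not an established Tableau monitoring schema):

    CREATE TABLE dbo.Users (
        UserId    INT IDENTITY(1,1) PRIMARY KEY,
        UserName  NVARCHAR(100) NOT NULL
    );

    CREATE TABLE dbo.Projects (
        ProjectId   INT IDENTITY(1,1) PRIMARY KEY,
        ProjectName NVARCHAR(200) NOT NULL
    );

    CREATE TABLE dbo.Workbooks (
        WorkbookId   INT IDENTITY(1,1) PRIMARY KEY,
        ProjectId    INT NOT NULL REFERENCES dbo.Projects (ProjectId),
        WorkbookName NVARCHAR(200) NOT NULL
    );

    -- One row per time a user opens a workbook view.
    CREATE TABLE dbo.Views (
        ViewId     BIGINT IDENTITY(1,1) PRIMARY KEY,
        WorkbookId INT NOT NULL REFERENCES dbo.Workbooks (WorkbookId),
        UserId     INT NOT NULL REFERENCES dbo.Users (UserId),
        ViewedAt   DATETIME2 NOT NULL
    );

    -- One row per measured load, so slow loads can be found and alerted on.
    CREATE TABLE dbo.Performance (
        PerformanceId BIGINT IDENTITY(1,1) PRIMARY KEY,
        WorkbookId    INT NOT NULL REFERENCES dbo.Workbooks (WorkbookId),
        LoadTimeMs    INT NOT NULL,
        RecordedAt    DATETIME2 NOT NULL
    );

    -- Example alert check: anything slower than 5 seconds.
    -- SELECT WorkbookId, LoadTimeMs, RecordedAt FROM dbo.Performance WHERE LoadTimeMs > 5000;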
This is a very basic description of what we require, but my question is simply: does anyone with knowledge of Tableau and data models have advice on designing a very basic model and schema for this? It will need to be scalable so that it can perform well across as many Tableau servers as possible.
Thank you very much,

I've found an article on the blog of a monitoring tool that I like, and maybe it could help you with your monitoring. I'm not an expert in PostgreSQL, but it's worth a look:
http://blog.pandorafms.org/how-to-monitor-postgress/
Hope this can help!

Related

SQL Server copy data across databases

I'm using SQL Server 2019. I have a "MasterDB" database that is exposed to a GUI application.
I'm also going to have as many as 40 user databases, like "User1DB", "User2DB", etc., and all of these user databases will have the exact same schema and tables.
I have a requirement to copy table data (overwriting the target) from one user database (let's say "User1DB") to another (say "User2DB"). I need to do this from within the "MasterDB" database, since the GUI client app is going to have access to only that database. How do I handle this dynamic situation? I'd prefer static SQL (in the form of stored procedures) rather than dynamic SQL.
Any suggestion will be greatly appreciated. Thanks.
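For reference, a single hard-coded source/target pair can be done in static SQL with three-part names; the sketch below assumes a hypothetical dbo.Orders table, and covering all 40 databases without dynamic SQL would mean one such procedure (or one parameter-driven branch) per pair:

    USE MasterDB;
    GO
    -- Hypothetical sketch: overwrite dbo.Orders in User2DB with the rows from User1DB.
    -- Assumes both user databases sit on the same instance and share an identical schema.
    CREATE OR ALTER PROCEDURE dbo.CopyOrders_User1_To_User2
    AS
    BEGIN
        SET NOCOUNT ON;
        BEGIN TRANSACTION;
            DELETE FROM User2DB.dbo.Orders;                 -- overwrite the target
            INSERT INTO User2DB.dbo.Orders (OrderId, CustomerName, Amount)
            SELECT OrderId, CustomerName, Amount
            FROM User1DB.dbo.Orders;                        -- read from the source
            -- (If OrderId is an IDENTITY column, SET IDENTITY_INSERT would also be needed.)
        COMMIT TRANSACTION;
    END;
    GO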
Check out this question here for transferring data from one database to another.
Aside from that, I agree with #DaleK here. There is no real reason to have a database per user if we are making the assumption that a user is someone who is logging into your frontend app.
I can somewhat understand replicating your schema per customer if you are running some multi-billion record enterprise application where you physically have so much data per customer that it makes sense to split it up, but based on your question that doesn't seem to be the case.
So, if our assumptions are correct, you just need to have a user table, where your fields might be...
UserTable
UserId
FName
LName
EmailAddress
...
Edit:
I see in the comments you are referring to "source control data" ... I suggest you study up on databases: how they're meant to be designed and implemented, and how data should be transacted. There are a ton of great articles and books out there on this with a simple Google search.
If you are looking to replicate your data for backup purposes, look into some data warehouse design principles, maybe creating a separate datastore in a different geographic region for that. The latter is a very complex subject that I can't cover in this answer, but it sounds like it goes far beyond your current needs. My suggestion is to backtrack and hash out the needs of your application, while understanding some of the fundamentals of databases (or different methods of storing data). Implement something and then see where it can be expanded upon / refactored.
Beyond that, I can't be more detailed than the original question you posted. Hope this helps.

split table rows of data into multiple tables according to column obeying constraints

I have a source flat file with about 20 columns of data and roughly 11K records. Each record (row) contains info such as
PatientID, PatientSSN, PatientDOB, PatientSex, PatientName, PatientAddress, PatientPhone, PatientWorkPhone, PatientProvider, PatientReferrer, PatientPrimaryInsurance, PatientInsurancePolicyID.
My goal is to move this data to a SQL database.
I have created a database with the data model below.
I now want to do a bulk insert to move all the records, but I am unsure how to do that because, as you can see, there are (and have to be) constraints in order to ensure referential integrity. What should my approach be? Am I going about this all wrong? Thus far I have used SSIS to import the data into a single staging table, and now I must figure out how to write the 11k-plus records to the individual tables to which they belong... so record 1 of the staging table will create one record in almost all of the tables, minus perhaps the ones on the "one" side of one-to-many relationships like "Provider" and "Referrer", since one provider will be linked to many patients but one patient can only have one provider.
I hope I have explained this well enough. Please help!
As the question is generic, I'll approach the answer in a generic way as well - in an attempt to at least get you asking the right questions.
Your goal is to get flat-file data into a relational database. This is a very common operation and is at least a subset of the ETL process. So you might want to start your search by reading more on ETL.
Your fundamental problem, as I see it, is two-fold. First, you have a large amount of data to insert. Second, you are inserting into a relational database.
Starting with the second problem first: not all of your data can be inserted each time. For example, you have a provider table that holds a 1:many relationship with patients. That means that for each patient row in your flat table you will have to ask whether the provider already exists or needs creating. Also, you have seeded IDs, meaning that in some instances you have to maintain your order of creation so that you can reference the ID of a created entry in the next created entry. What this means for you is that your effort will be more complex than a simple set of SQL inserts. You need logic associated with the effort. There are several ways to approach this:
Pure SQL/T-SQL: it can be accomplished (see the sketch below) but would be a lot of work and hard to debug/troubleshoot
Write a program: This gives you a lot of flexibility, but means you will have to know how to program and use programming tools for working with a database (such as an ORM)
Use an automated ETL tool
Use SQL Server's flat-file import abilities
Use an IDE with import capabilities - such as Toad, Datagrip, DBeaver, etc.
Each of these approaches will take some research and learning on your part -- this forum cannot teach you how to use them. And the decision as to which one you want to use will somewhat depend on how automated the process should be.
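To make the pure T-SQL option a little more concrete, a minimal sketch for one of the lookup relationships might look like this (the staging and target table/column names are assumptions based on the question, not your actual schema):

    -- 1. Create any providers referenced in staging that don't exist yet.
    INSERT INTO dbo.Provider (ProviderName)
    SELECT DISTINCT s.PatientProvider
    FROM dbo.PatientStaging AS s
    WHERE s.PatientProvider IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM dbo.Provider AS p
                      WHERE p.ProviderName = s.PatientProvider);

    -- 2. Insert patients, looking up the seeded ProviderId by name.
    INSERT INTO dbo.Patient (PatientSSN, PatientName, PatientDOB, ProviderId)
    SELECT s.PatientSSN, s.PatientName, s.PatientDOB, p.ProviderId
    FROM dbo.PatientStaging AS s
    LEFT JOIN dbo.Provider AS p
        ON p.ProviderName = s.PatientProvider;

The same pattern (create-if-missing, then join back for the ID) would repeat for Referrer, Insurance, and so on.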
Concerning your first issue -- large data inserts: SQL Server has a facility for bulk inserts (see the BULK INSERT docs), but you will have to condition your data first.
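As a minimal example of that facility (the file path and options are placeholders), loading the flat file into the staging table could look like:

    BULK INSERT dbo.PatientStaging
    FROM 'C:\import\patients.csv'       -- hypothetical path to the flat file
    WITH (
        FIRSTROW = 2,                   -- skip the header row
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n',
        TABLOCK
    );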
Personally (as per my comments), I am a .Net developer. But given this task, I would still script it up in Python. The learning curve is very kind in Python, and it has lots of great tools for working with files and databases. .Net and EF carry with them a lot of overhead in terms of what you need to know to get started that Python doesn't -- but that is just me.
Hope this helps get you started.
Steve you are a boss, thank you. Ed thanks to you as well!
I have taken everyone's guidance into consideration and concluded that I will not be able to get away with a simple solution for this.
There are bigger implications, so it makes sense to accomplish this groundwork task in a way that allows me to exploit my efforts for future projects. I will be proceeding with a simple .net web app using EF to take care of the data model, and write a simple import procedure to pull the data in.
I have a notion of how I will accomplish this, but with the help of this board I'm sure success is to follow! Thanks all - Joey
For the record, the tools I plan on using (I agree with the complexity and learning-curve opinions, but have an affinity for MS products):
Azure SQL Database (data store)
Visual Studio 2017 CE (ide)
C# (Lang)
.net MVC (project type)
EF 6 (orm)
Grace (cause I'm only human :-)

SQL Server switching live database

A client has one of my company's applications which points to a specific database and tables within the database on their server. We need to update the data several times a day. We don't want to update the tables that the users are looking at in live sessions. We want to refresh the data on the side and then flip which database/tables the users are accessing.
What is the accepted way of doing this? Do we have two databases and rename the databases? Do we put the data into separate tables, then rename the tables? Are there other approaches that we can take?
Based on the information you have provided, I believe your best bet would be partition switching. I've included a couple of links for you to check out, because it's much easier to direct you to a source that already explains it well. There are several approaches you can take with partition switching.
Links: Microsoft and Cathrine Wilhelmsen's blog
Hope this helps!
I think I understand what you're saying: if the user is on a screen, you don't want the screen updating with new information while they're viewing it, only updating when they pull up a new screen after the new data has been loaded? Correct me if I'm wrong. And Mike's question is also a good one: how is this data being fed to the users? Possibly there's a way to pause that or something while the new data is being loaded. There are more elegant ways to load data, like partitioning the table, using a staging table, replication, having the users view snapshots, etc. But we need to know what you mean by 'live sessions'.
Edit: with the additional information you've given me, partition switching could be the answer. The process takes virtually no time; it just changes the pointers from the old records to the new ones. The only issue is that you have to partition on something partitionable, like a date or timestamp, to differentiate old and new data. It's also an Enterprise-Edition feature and I'm not sure what version you're running.
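As a rough sketch of the "flip" (table names are hypothetical; SWITCH requires the tables to have identical structure on the same filegroup, and the receiving table or partition must be empty):

    -- Load the refreshed data into dbo.Sales_Staging on the side, then flip:
    BEGIN TRANSACTION;
        ALTER TABLE dbo.Sales_Live    SWITCH TO dbo.Sales_Old;   -- move the current rows out
        ALTER TABLE dbo.Sales_Staging SWITCH TO dbo.Sales_Live;  -- move the new rows in
    COMMIT TRANSACTION;
    TRUNCATE TABLE dbo.Sales_Old;   -- discard the previous data once the flip is committed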
Possibly a better thing to look at is Read Committed Snapshot Isolation. It will ensure that your users only look at the new data after it's committed; it provides a statement-level consistent view of the data and has minimal concurrency issues, though there is more overhead in TempDB. Here are some resources for more research:
http://www.databasejournal.com/features/mssql/snapshot-isolation-level-in-sql-server-what-why-and-how-part-1.html
https://msdn.microsoft.com/en-us/library/tcbchxcb(v=vs.110).aspx
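For reference, RCSI is switched on at the database level (the database name is a placeholder; the ROLLBACK IMMEDIATE clause kicks out other connections so the change can take effect):

    ALTER DATABASE YourClientDb
        SET READ_COMMITTED_SNAPSHOT ON
        WITH ROLLBACK IMMEDIATE;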
Hope this helps and good luck!
The question details are a little vague, so to clarify:
What is a live session? Is it a session in the application itself (with app code managing its own connections to the database), or is it a low-level connection-per-user/session situation? Are users just running reports, or actively reading/writing from the database during the session? When is a session over, and how do you know?
Some options:
1) Pull all the data into the client for the entire session.
2) Use read committed snapshot isolation or partition switching as mentioned in other answers (however, this requires careful setup for your queries and increases requirements on the database)
3) Use replica database for all queries, pause/resume replication when necessary (updating data should be faster than your process but it still might take a while depending on volume and complexity)
4) Use replica database and automate a backup/restore from the master (this might be the quickest depending on the overall size of your database)
5) Use multiple replica databases with replication or backup/restore and then switch the connection string (this allows you to update the master constantly and then update a replica and switch over at a certain predictable time)

Best practices for analyzing/reporting database with 'flexible' schema

I have been given a task to create views (Excel, websites, etc. - not database 'views') for a SQL Server table with a 'flexible' schema like the one below:
Session(guid) | Key(int) | Value(string)
My first thought is to create a series of 'standard' relational data tables/views that speak to the analysis/reporting requests. They could be either new tables updated by a daemon service that transforms data on a schedule, or just a series of views with deeply nested queries. Then, use SSAS, SSRS and other established ways to do the analysis and reporting. But I'm totally uncertain whether that's the right line of thinking.
So my questions are:
Is there a terminology for this kind of 'flexible' schema so that I can search for related information?
Do my thoughts make sense, or are they totally off?
If my thoughts make sense, should I create views with deep queries or new tables + data transform service?
I would start with an SSAS cube to expose all the values, presuming you can get some descriptive info from the key. The cube might have one measure (count) and three dimensions, one for each of your attributes.
This cube would have little value for end users (too confusing), but I would use it to validate whether any particular data is actually usable before proceeding. I think this is important because usually this data structure masks weak data validation and integrity in the source system.
Once a subject has been validated I would build physical tables via SSIS in preference to views - I find them easier to test and tune.
Finally found the terminology - it's called the entity-attribute-value (EAV) pattern, and there are a lot of discussions and resources around it.
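For reference, "flattening" an EAV table like the one above into a relational shape usually looks something like the sketch below (the table name, key values, and attribute names are assumptions):

    -- One row per session, one column per known key.
    CREATE VIEW dbo.SessionFlat
    AS
    SELECT  [Session],
            MAX(CASE WHEN [Key] = 1 THEN [Value] END) AS BrowserName,
            MAX(CASE WHEN [Key] = 2 THEN [Value] END) AS PageUrl,
            MAX(CASE WHEN [Key] = 3 THEN [Value] END) AS DurationSeconds
    FROM    dbo.SessionKeyValue
    GROUP BY [Session];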

What is the best approach for decoupled database design in terms of data sharing?

I have a series of Oracle databases that need to access each other's data. The most efficient way to do this is to use database links - by setting up a few database links I can get data from A to B with a minimum of fuss. The problem for me is that you end up with a tightly-coupled design, and if one database goes down it can bring the coupled databases with it (or perhaps part of an application on those databases).
What alternative approaches have you tried for sharing data between Oracle databases?
Update after a couple of responses...
I wasn't thinking so much about replication, more about accessing "master data". For example, if I have a central database with currency conversion rates and I want to pull a rate into a separate database (application). For such a small dataset, igor-db's suggestion of materialized views over DB links would work beautifully. However, when you are dynamically sampling from a very large dataset, the option of caching locally starts to become trickier. What options would you go for in those circumstances? I wondered about an XML service, but tuinstoel (in a comment to le dorfier's reply) rightly questioned the overhead involved.
Summary of responses...
On the whole I think igor-db is closest, which is why I've accepted that answer, but I thought I'd add a little to bring out some of the other answers.
For my purposes, where I'm looking at data replication only, it looks like Oracle BASIC replication (as opposed to ADVANCED) is the one for me. Using materialized view logs on the master site and materialized views on the snapshot site looks like an excellent way forward.
Where this isn't an option, perhaps where the data volumes make full table replication an issue, then a messaging solution seems the most appropriate Oracle solution. Oracle Advanced Queueing seems the quickest and easiest way to set up a messaging solution.
The least preferable approach seems to be roll-your-own XML web services but only where the relative ease of Advanced Queueing isn't an option.
Streams is the Oracle replication technology.
You can use MVs over database links (so database 'A' has a materialized view of the data from database 'B'. If 'B' goes down, the MV can't be refreshed but the data is still in 'A').
Mileage may depend on DB volumes, change volumes...
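To make that concrete, a minimal sketch using the currency-rates example (all object names, link credentials, and the TNS alias are hypothetical):

    -- On database B (the master site): a log so the MV can be fast-refreshed.
    CREATE MATERIALIZED VIEW LOG ON currency_rates;

    -- On database A (the snapshot site): a link to B and a local, refreshable copy.
    CREATE DATABASE LINK rates_link
        CONNECT TO app_user IDENTIFIED BY app_password
        USING 'DBB';

    CREATE MATERIALIZED VIEW currency_rates_mv
        REFRESH FAST ON DEMAND
        AS SELECT currency_code, rate, rate_date
           FROM currency_rates@rates_link;

    -- Refresh on whatever schedule suits, e.g. from a DBMS_SCHEDULER job:
    -- EXEC DBMS_MVIEW.REFRESH('CURRENCY_RATES_MV');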
It looks to me like it's by definition tightly coupled if you need simultaneous synchronous access to multiple databases.
If this is about transferring data, for instance, and it can be asynchronous, you can install a message queue between the two and have two processes, with one reading from the source and the other writing to the sink.
The OP has provided more information. He states that the dataset is very large. Well how large is large? And how often are the master tables changed?
With the use of materialized view logs Oracle will only propagate the changes made in the master table. A complete refresh of the data isn't necessary. Oracle streams also only communicate the modifications to the other side.
Buying storage is cheap, so why not local caching? Much cheaper than programming your own solutions.
An XML service doesn't help you when its database is not available, so I don't see how it would help. Oracle has many options for replication; explore them.
Edit:
I've built XML services. They provide interoperability between different systems with a clear interface (contract). You can build an XML service in C# and consume the service from Java. However, XML services are not fast.
Why not use Advanced Queuing? Why roll your own XML service to move messages (DML) between Oracle instances? It's already there. You can have propagation move messages from one instance to another when they are both up. You can process them as needed in the destination servers. AQ is really rather simple to set up and use.
Why do they need to be separate databases?
Having a single database/instance with multiple schemas might be easier.
Keeping one database up (with appropriate standby databases etc) will be easier than keeping N up.
What kind of immediacy do you need and how much bi-directionality? If the data can be a little older and can be pulled from one "master source", create a series of simple ETL scripts run on a schedule to pull the data from the "source" database into the others.
You can then tailor the structure of the data to feed the needs of the client database(s) more precisely and you can change the structure of the source data until you're blue in the face.
