Is there a way to catalog data in snowflake? - snowflake-cloud-data-platform

Is there a way in snowflake to describe/catalog schema so it is easier to search? For example, column name is "s_id", which really represents "system identifier". Is there a way to specify that s_id is "system identifier" and user can search "system identifier", which would return s_id as column name.

Just about every object in Snowflake allows for a comment to be added. You could leverage the comment field to be more descriptive of the columns intent, which can then be searched in information_schema.columns. I've seen many Snowflake customers use the comments as a full-blown data dictionary. My only suggestion is to not use any apostrophe's in your comments, as they screw up your get_ddl() results.

Adding comments along with table and field can or may serve the purpose but if you are in a complex enterprise environment and a lot of groups are part of it, it will be hard to follow this approach for long. Either you must have single ownership for data models tool which promotes such changes to the snowflake. Managing such information via comment is error-prone and that's why snowflake has partners which help on the different development requirement.
If you go to their partner website (link), you will find a lot of companies that have good integration with SF and one of them is Alation (link). You can try it for free for 30 days and see if that serves your purpose of data cataloging or not.

Related

SQL Server copy data across databases

I'm using SQL Server 2019. I have a "MasterDB" database that is exposed to GUI application.
I'm also going to have as many as 40 user database like "User1DB", "User2DB", etc. And all these user databases will have "exact same" schema and tables.
I have a requirement to copy tables data (overwriting target) from one user database (let's say "User1DB") to the other (say "User2DB"). I need to do this from within "MasterDB" database since the GUI client app going to have access to only this database. How to handle this dynamic situation? I'd prefer static SQL (in form of Stored Procedures) rather than dynamic SQL.
Any suggestion will greatly be appreciated. Thanks.
Check out this question here for transferring data from one database to another.
Aside from that, I agree with #DaleK here. There is no real reason to have a database per user if we are making the assumption that a user is someone who is logging into your frontend app.
I can somewhat understand replicating your schema per customer if you are running some multi-billion record enterprise application where you physically have so much data per customer that it makes sense to split it up, but based on your question that doesn't seem to be the case.
So, if our assumptions are correct, you just need to have a user table, where your fields might be...
UserTable
UserId
FName
LName
EmailAddress
...
Edit:
I see in the comments you are referring to "source control data" ... I suggest you study up on databases and how they're meant to be designed, implemented, and how data should be transacted. There are a ton of great articles and books out there on this with a simple Google search.
If you are looking to replicate your data for backup purposes, look into some data warehouse design principles, maybe creating a separate datastore in a different geographic region for that. The latter is a very complex subject to which I can't go over in this answer, but it sounds like that goes far beyond your current needs. My suggestion is to backtrack and hash out the needs for your application, while understanding some of the fundamentals of databases (or different methods of storing data). Implement something and then see where it can be expanded upon / refactored.
Beyond that, I can't be more detailed than the original question you posted. Hope this helps.

Help with MS ACCESS database structure

Help with MS ACCESS database structure
Using MS Acess Database 200 (*.mdb). I am trying to create a little databse app for myself for storing hoto knowledge about a number of our apps.
We currently have three applications, with a number of versions amongest each
I would like have a search able database where i can search for knowledge using keywords, title, app, or version
Would the following be correct?
Apps
id
app
verions
id
appId
version
knowledge
id
versionId
question
answer
keywords
id
knowledgeId
keyword
If you want more control over keywords (like who can add a new keyword), you may need a Keywords table, so you can try and control duplicates: (access, Access, MS Access). A KnowledgeKeywords table would be needed to create a many-to-many relationship between Knowledge and Keywords.
knowledge
id versionId question answer
knowledgekeywords
id knowledgeId keywordid
keywords
id keyword
It depends. Are you trying to get a database up and running that meets your needs? Or is it the intellectual challenge of making this thing that interests you?
If you want to make this database as a learning experience, then you're off to a good start. You can put forms, reports, tables, and of course some VBA. You'll probably need some more ID fields though. But the beauty of Access is that you can change the design relatively easily as you go.
If you already know your VBA or have other things to do, then there's a wide variety of freeware apps that will meet your needs. I googled "tree notes software" and found this: http://www.freedownloadscenter.com/Information_Management/Notes_Management_Tools/Tree_Notes.html. If you look at the screenshot, you'll see how you can put in software, versions, questions, etc. I haven't tried this, but I assume you can search for specific keywords. (This isn't the only Personal Information Manager of its kind. Search for one that suits you best).

how do I integrate the aspnet_users table (asp.net membership) into my existing database

i have a database that already has a users table
COLUMNS:
userID - int
loginName - string
First - string
Last - string
i just installed the asp.net membership table. Right now all of my tables are joined into my users table foreign keyed into the "userId" field
How do i integrate asp.net_users table into my schema? here are the ideas i thought of:
Add a membership_id field to my users table and on new inserts, include that new field in my users table. This seems like the cleanest way as i dont need to break any existing relationships.
break all existing relationship and move all of the fields in my user table into the asp.net_users table. This seems like a pain but ultimately will lead to the most simple, normalized solution
any thoughts?
I regularly use all manner of provider stacks with great success.
I am going to proceed with the respectful observation that your experience with the SqlProvider stack is limited and that the path of least resistance seems to you to be to splice into aspnet_db.
The abstract provider stack provides cleanly separated feature sets that compliment and interact with each other in an intuitive way... if you take the time to understand how it works.
And by extension, while not perfect, the SqlProviders provide a very robust backing store for the extensive personalization and security facilities that underly the asp.net runtime.
The more effort you make to understand the workings of these facilities, focusing less on how to modify (read: break) the existing schema and more on how to envision how your existing data could fit into the existing schema the less effort you will ultimately expend in order to ultimately end up with a robust, easily understandable security and personalization system that you did not have to design, write, test and maintain.
Don't get me wrong, I am not saying not to customize the providers. That is the whole point of an abstract factory pattern. But before you take it upon yourself to splice into a database/schema/critical infrastructural system it would behoove you to better understand it.
And once you get to that point you will start to see how simple life can be if you concentrate on learning how to make systems that have thousands of man hours in dev time and countless users every minute of every day work for you the more actual work you will get done on the things that really interest you and your stakeholders.
So - let me suggest that you import your users into the aspnet_db/sqlprovider stack and leverage the facilities provided.
The userId in aspnet_db is a guid and should remain that way for very many reasons. If you need to retain the original integral user identifier - stash it in the mobile pin field for reference.
Membership is where you want to place information that is relevant to security and identification. User name, password, etc.
Profiles is where you want to place volitile meta like names and site preferences.
Anyway - what I am trying to say is that you need to have a better understanding of the database and the providers before you hack it. Start of by understanding how to use it as provided and your experience will be more fruitful.
Good luck.
In my experience, the "ASP.NET membership provider" introduces more complexity than it solves. So I'd go for option 2: a custom user table.
P.S. If anyone has been using the "ASP.NET membership provider" with success, please comment!

Why use database schemas?

I'm working on a single database with multiple database schemas,
e.g
[Baz].[Table3],
[Foo].[Table1],
[Foo].[Table2]
I'm wondering why the tables are separated this way besides organisation and permissions.
How common is this, and are there any other benefits?
You have the main benefit in terms of logically groupings objects together and allowing permissions to be set at a schema level.
It does provide more complexity in programming, in that you must always know which schema you intend to get something from - or rely on the default schema of the user to be correct. Equally, you can then use this to allow the same object name in different schemas, so that the code only writes against one object, whilst the schema the user is defaulted to decides which one that is.
I wouldn't say it was that common, anecdotally most people still drop everything in the dbo schema.
I'm not aware of any other possible reasons besides organization and permissions. Are these not good enough? :)
For the record - I always use a single schema - but then I'm creating web applications and there is also just a single user.
Update, 10 years later!
There's one more reason, actually. You can have "copies" of your schema for different purposes. For example, imagine you are creating a blog platform. People can sign up and create their own blogs. Each blog needs a table for posts, tags, images, settings etc. One way to do this is to add a column
blog_id to each table and use that to differentiate between blogs. Or... you could create a new schema for each blog and fresh new tables for each of them. This has several benefits:
Programming is easier. You just select the approppriate schema at the beginning and then write all your queries without worrying about forgetting to add where blog_id=#currentBlog somewhere.
You avoid a whole class of potential bugs where a foreign key in one blog points to an object in another blog (accidental data disclosure!)
If you want to wipe a blog, you just drop the schema with all the tables in it. Much faster than seeking and deleting records from dozens of different tables (in the right order, none the less!)
Each blog's performance depends only (well, mostly anyway) on how much data there is in that blog.
Exporting data is easier - just dump all the objects in the schema.
There are also drawbacks, of course.
When you update your platform and need to perform schema changes, you need to update each blog separately. (Added yet later: This could actually be a feature! You can do "rolling udpates" where instead of updating ALL the blogs at the same time, you update them in batches, seeing if there are any bugs or complaints before updating the next batch)
Same about fixing corrupted data if that happens for whatever reason.
Statistics for all the platform together are harder to calculate
All in all, this is a pretty niche use case, but it can be handy!
To me, they can cause more problems because they break ownership chaining.
Example:
Stored procedure tom.uspFoo uses table tom.bar easily but extra rights would be needed on dick.AnotherTable. This means I have to grant select rights on dick.AnotherTable to the callers of tom.uspFoo... which exposes direct table access.
Unless I'm completely missing something...
Edit, Feb 2012
I asked a question about this: SQL Server: How to permission schemas?
The key is "same owner": so if dbo owns both dick and tom schema, then ownership chaining does apply. My previous answer was wrong.
There can be several reasons why this is beneficial:
share data between several (instances
of) an application. This could be the
case if you have group of reference
data that is shared between
applications, and a group of data
that is specific for the instance. Be careful not to have circular references between entities in in different schema's. Meaning don't have a foreign key from an entity in schema 1 to another entity in schema 2 AND have another foreign key from schema 2 to schema 1 in other entities.
data partitioning: allows for data to be stored on different servers
more easily.
as you mentioned, access control on DB level

How would you design your database to allow user-defined schema

If you have to create an application like - let's say a blog application, creating the database schema is relatively simple. You have to create some tables, tblPosts, tblAttachments, tblCommets, tblBlaBla… and that's it (ok, i know, that's a bit simplified but you understand what i mean).
What if you have an application where you want to allow users to define parts of the schema at runtime. Let's say you want to build an application where users can log any kind of data. One user wants to log his working hours (startTime, endTime, project Id, description), the next wants to collect cooking recipes, others maybe stock quotes, the weekly weight of their babies, monthly expenses they spent for food, the results of their favorite football teams or whatever stuff you can think about.
How would you design a database to hold all that very very different kind of data? Would you create a generic schema that can hold all kind of data, would you create new tables reflecting the user data schema or do you have another great idea to do that?
If it's important: I have to use SQL Server / Entity Framework
Let's try again.
If you want them to be able to create their own schema, then why not build the schema using, oh, I dunno, the CREATE TABLE statment. You have a full boat, full functional, powerful database that can do amazing things like define schemas and store data. Why not use it?
If you were just going to do some ad-hoc properties, then sure.
But if it's "carte blanche, they can do whatever they want", then let them.
Do they have to know SQL? Umm, no. That's your UIs task. Your job as a tool and application designer is to hide the implementation from the user. So present lists of fields, lines and arrows if you want relationships, etc. Whatever.
Folks have been making "end user", "simple" database tools for years.
"What if they want to add a column?" Then add a column, databases do that, most good ones at least. If not, create the new table, copy the old data, drop the old one.
"What if they want to delete a column?" See above. If yours can't remove columns, then remove it from the logical view of the user so it looks like it's deleted.
"What if they have eleventy zillion rows of data?" Then they have a eleventy zillion rows of data and operations take eleventy zillion times longer than if they had 1 row of data. If they have eleventy zillion rows of data, they probably shouldn't be using your system for this anyway.
The fascination of "Implementing databases on databases" eludes me.
"I have Oracle here, how can I offer less features and make is slower for the user??"
Gee, I wonder.
There's no way you can predict how complex their data requirements will be. Entity-Attribute-Value is one typical solution many programmers use, but it might be be sufficient, for instance if the user's data would conventionally be modeled with multiple tables.
I'd serialize the user's custom data as XML or YAML or JSON or similar semi-structured format, and save it in a text BLOB.
You can even create inverted indexes so you can look up specific values among the attributes in your BLOB. See http://bret.appspot.com/entry/how-friendfeed-uses-mysql (the technique works in any RDBMS, not just MySQL).
Also consider using a document store such as Solr or MongoDB. These technologies do not need to conform to relational database conventions. You can add new attributes to any document at runtime, without needing to redefine the schema. But it's a tradeoff -- having no schema means your app can't depend on documents/rows being similar throughout the collection.
I'm a critic of the Entity-Attribute-Value anti-pattern.
I've written about EAV problems in my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
Here's an SO answer where I list some problems with Entity-Attribute-Value: "Product table, many kinds of products, each product has many parameters."
Here's a blog I posted the other day with some more discussion of EAV problems: "EAV FAIL."
And be sure to read this blog "Bad CaRMa" about how attempting to make a fully flexible database nearly destroyed a company.
I would go for a Hybrid Entity-Attribute-Value model, so like Antony's reply, you have EAV tables, but you also have default columns (and class properties) which will always exist.
Here's a great article on what you're in for :)
As an additional comment, I knocked up a prototype for this approach using Linq2Sql in a few days, and it was a workable solution. Given that you've mentioned Entity Framework, I'd take a look at version 4 and their POCO support, since this would be a good way to inject a hybrid EAV model without polluting your EF schema.
On the surface, a schema-less or document-oriented database such as CouchDB or SimpleDB for the custom user data sounds ideal. But I guess that doesn't help much if you can't use anything but SQL and EF.
I'm not familiar with the Entity Framework, but I would lean towards the Entity-Attribute-Value (http://en.wikipedia.org/wiki/Entity-Attribute-Value_model) database model.
So, rather than creating tables and columns on the fly, your app would create attributes (or collections of attributes) and then your end users would complete the values.
But, as I said, I don't know what the Entity Framework is supposed to do for you, and it may not let you take this approach.
Not as a critical comment, but it may help save some of your time to point out that this is one of those Don Quixote Holy Grail type issues. There's an eternal quest for probably over 50 years to make a user-friendly database design interface.
The only quasi-successful ones that have gained any significant traction that I can think of are 1. Excel (and its predecessors), 2. Filemaker (the original, not its current flavor), and 3. (possibly, but doubtfully) Access. Note that the first two are limited to basically one table.
I'd be surprised if our collective conventional wisdom is going to help you break the barrier. But it would be wonderful.
Rather than re-implement sqlservers "CREATE TABLE" statement, which was done many years ago by a team of programmers who were probably better than you or I, why not work on exposing SQLSERVER in a limited way to the users -- let them create thier own schema in a limited way and leverage the power of SQLServer to do it properly.
I would just give them a copy of SQL Server Management Studio, and say, "go nuts!" Why reinvent a wheel within a wheel?
Check out this post you can do it but it's a lot of hard work :) If performance is not a concern an xml solution could work too though that is also alot of work.

Resources