I get that Snowflake has its own extensions to ANSI SQL, such as UNDROP.
However, I would like to know: what "dialect" of SQL (for want of a better word) does Snowflake use?
I read in the docs:
Snowflake supports standard SQL, including a subset of ANSI SQL:1999 and the SQL:2003 analytic extensions
https://docs.snowflake.com/en/user-guide/querying.html
So, is the SQL that Snowflake uses recognised anywhere? I am wondering about the situation where I have a SQL syntax problem and want to post here on Stack Overflow. Do I use tag:snowflake-sql, which seems to be just a synonym for snowflake-cloud-data-platform, or is there another SQL-related tag on Stack Overflow that I should use? Obviously I want to get the best SQL-specific answers, so I don't want to limit the tags too much if possible.
Is there a list anywhere of the differences between ANSI SQL:1999, SQL:2003, and Snowflake?
Just like every DB, they support roughly 90% of Standard X plus 10% that is different, either because it's cheap for them to do so or because it exposes the underlying conceptual framing the DB is expressed in.
Which really means there is "no standard" that captures "what they do".
A different question might be to flip that on its head and ask why one wants a standard in the first place.
For the case of posting here, snowflake-cloud-data-platform is the currently accepted tag. As long as you don't use too many generic sql or other DB-specific tags, any SQL that runs on Snowflake is acceptable.
Many people like Gordon will give answers in standard ANSI SQL, and those answers are wonderful, but they can sometimes be expressed in a more dense form, given Snowflake's less rigid expression tree rules.
The other reason to want to know about standards is the classic "we want to write standard SQL so we can move later", to which I often think: you should write the fastest-performing SQL you can express, and if you move providers, re-express the intentions in the new DB, rather than running poorly optimized "run anywhere" SQL that costs 30% more each and every month because in four years' time there might be a six-month DB move project.
Based on my experience, I would say Snowflake SQL is closer to Postgres than to MySQL / T-SQL.
My sister is going to start taking classes to try to learn how to become a web developer. She's sent me the class lists for a couple of candidate schools for me to help guide her decision.
One of the schools mentions Microsoft Access as the primary tool used in the database classes including relational algebra, SQL, database management, etc.
I'm wondering - if you learn Microsoft Access will you be able to easily pick up another more socially-acceptable database technology later like MySQL, Postgres, etc? My experience with Access was not pleasant and I picked up a whole lot of bad practices when I played around with it during my schooling years.
Basically: Does Microsoft Access use standards-compliant SQL? Do you learn the necessary skills for other databases by knowing how Microsoft Access works?
Access, I would say, has a lot more peculiarities than 'actual' database software. Access can be used easily as a front end for SQL databases, and that's part of the program.
Let's assume the class is using databases built in Access. Then let's break it down into the parts of a database:
Tables
Access uses a simplified model for variables. Basically you can have the typical number fields, text fields, etc. You can fix the number of decimals, for instance, as you could in SQL. You won't see types like varchar(x), though; you will just pick a text field and set the field size to "8", etc. However, like a real database, it will enforce the limits you've put in. Access will support OLE objects, but that quickly becomes a mess. An Access database is stored as a single file and can become incredibly large and bloat quickly, so beyond storing address books, text databases, or linking to external sources via code, you have to be careful about how much information you store, simply because the file will get too big to use.
Queries
Access implements a lot of things along the lines of SQL. I'm not aware that it is SQL compliant. I believe you can just export your Access database into something SQL can use. In code, you interact with the database via DAO, ADO, ADODB and the Jet or Ace engines (some are outdated, but they work on older databases). However, once you get to just making queries, many things are similar. Typical commands (select, from, where, order, group, having, etc.) are normal and work as you'd see them work elsewhere. The peculiar things happen when you get into using calculated expressions and complicated joins (Access does not implement some kinds of joins, but you will see arguably the most important: inner join/union). For instance, the behavior of distinct is different in Access than in other database architectures. You are also limited in the way you use aggregate functions (sum/max/min/avg). In essence, Access works for a lot of tasks, but it is incredibly picky, and you will have to write queries just to work around these problems that you wouldn't have to write in a real database.
Forms/Reports
I think the key feature of Access is that it is much more approachable to users that are not computer experts. You can easily navigate the tables and drag and drop to create forms and reports. So even though it's not a database in my book officially, it can be very useful...particularly if few people will be using the database and they highly prefer ease of use/light setup versus a more 'enterprise level' solution. You don't need crystal reports or someone to code a lot of stuff to make an Access database give results and allow users to add data as needed.
Why Access isn't a database
It's not meant to handle lots of concurrent connections. One person can hold the lock and there's no negotiating about it: if one person is editing certain parts of the database, it will lock all other users out, or at least limit them to read-only. Also, if you try to use Access with a lot of users or send it many requests via code, it will break after about 10-20 concurrent connections. It's just not meant for the kinds of things Oracle and MySQL are built for. It's meant for the 'everyman' computer user, if you will, but it has a lot of useful things programmers can exploit to make the user experience much better.
So will this be useful for you to learn more about?
I don't see how it would be a bad thing. It's an environment that you can more easily see the relational algebra and understand how to organize your data appropriately. It's a similar argument to colleges that teach Java, C++, or Python and why each has its merits. Since you can immediately move from Access to Access being the front-end (you load links to the tables) for accessing a SQL database, I'm sure you could teach a very good class with it.
MS Access is a good sandpit in which to build databases and learn the basic (elementary) design and structure of a database.
MS Access's SQL implementation is just about equivalent to SQL 1.x syntax. Again, Access is a great app for learning the interaction between queries, tables, and views.
Make sure she doesn't get used to the macros available in Access, as their structure doesn't translate to mainstream RDBMSs. The closest equivalent is stored procedures (sprocs) in professional RDBMSs, but sprocs have a thousandfold more utility and functionality than any Access macro could provide.
Have her play with MS Access to get a look and feel for a DBMS, but once she gets comfortable with database design, have her migrate to MS SQL Express, MySQL, or both. SQL Express is as close as you can get to the real thing without paying for MS SQL Standard; MySQL is good for LAMP web infrastructures.
I don't have experience in database development, so I need your suggestions on choosing a database that can be used with FireMonkey.
I need to store HTML files (without media for now, though media may be added later); their total size is around 20 GB (uncompressed text). A key feature must be maximally fast searching of text in the database, and it must be possible to implement human-style searching (like Google). Compression would also help (20 GB is too much to store), but if compression makes searching slow, it's not required.
What kinds of databases are appropriate for my needs?
Thanks a lot for your suggestions!
Edited
Requirements:
Price: free
Location: local or remote
Operating system support: Windows
System requirements: a database with a large footprint is fine (hopefully in exchange for better performance)
Performance: fast text searching
Concurrent users: 20
Full text indexing and searching: human (Google-like) fast text searching is required
Manageability: doesn't matter much
I know an online legal database on the web that can search words through 100 GB of information in milliseconds. I need the same performance, and Google-like searching is required.
The Delphi database access layer is separate from FireMonkey; it's the same one used by the VCL (although FM, AFAIK, relies only on LiveBindings to access data, but that's not an issue in your case).
Today, 20 GB is really not much data. Almost any database will handle it without much effort if properly configured. Which engine to choose depends on:
Price: how much are you going to spend on it?
Location: do you need a local database (same machine) or a remote one (LAN or WAN)?
Operating system support: which OS should it run on?
System requirements: do you need a database with a small footprint, or can you use one with a larger footprint (hopefully in exchange for better performance)?
Performance: what level of performance is required?
Concurrent users: how many users will connect to the database concurrently?
Full text indexing and searching: not all databases offer it out of the box.
Manageability: some databases may require more management than others.
There is no "one database fits all" yet.
I'm no DBA, so I can't say directly, and honestly I'm not sure that any one person could give a direct answer to this question, as it's one of those "it just depends" scenarios.
http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems
That's a good starting point to compare features and platform compatibility. I think the major thing to consider here is what hardware will be running it and how you can best utilize that to accomplish the task at hand.
If you have a server farm, be sure your DB supports distribution and some sort of load balancing (most do to some degree, from what I understand).
To speed up searching, unless you code up a custom algorithm that somehow searches the compressed version, I think you're going to want to keep the data uncompressed. That said, searching the compressed data might actually be faster if you're able to compare your plain-text search parameters against an index of the compressed file: you look for matching keys in the index first, and only check within the compressed data when the index finds a hit. Without tons of custom code, though, I haven't heard of any DB that supports this idea of searching compressed text (though I could easily be wrong on this point).
If the entire data set needs to be decompressed before doing the search, it will very likely be much slower (memory is relatively cheap compared to CPU time). It looks like FireMonkey has a limited selection of DBs to use, so that will help to narrow your choices down as well.
What I would suggest, based on your edited question, is to write (or find) a parser or regular expression to extract all the important elements from the HTML that you would like to be searchable. Then store those in a database along with a reference to where they were found in the HTML. As for Google-like searching, if you mean the way it can correct misspellings and use synonyms, you probably need some custom code to do dictionary lookups for spelling and thesaurus lookups for synonyms. I believe full text searching in any modern DB will handle the need to query with LIKE or similar statements in the WHERE clause.
Looks like ldsandon's answer covers most of this anyhow. TL;DR: if not, thanks for reading.
I would recommend PostgreSQL for this task. It has good performance, and built in full text search capability for Google-like searching. And it's free and open source.
Unfortunately Delphi doesn't come with Postgres data access components out of the box. You can connect by ODBC, or you can purchase components available from, for example, Devart, DA-Soft or microOLAP.
Have you considered NoSQL databases? The Wikipedia article explains their differences from SQL databases and also mentions that they are suited as document stores.
http://en.wikipedia.org/wiki/NoSQL
The article lists around a dozen implementations in the document store category; many are open source (Jackrabbit, CouchDB, MongoDB).
This question on Stackoverflow contains some pointers to Delphi clients:
Delphi and NoSQL
I would also consider caching on the application server, to speed up search. And of course a text indexing solution like Apache Lucene.
I would take Microsoft SQL Server Express Edition. I think 2008 R2 is the latest stable version, but there is also Denali (2011). It matches all the criteria you have.
You can use ADO to work with it.
Try the Advantage Database Server.
It's easy to manage and configure.
Both dbase-like and SQL data management languages.
Fast indexed full text search capabilities.
Plus, unparalleled support from the developers themselves.
The local server (stand-alone version, as opposed to the network based server) is free.
devzone.advantagedatabase.com
There is a Firebird version with full text search, according to its documentation (http://www.red-soft.biz/en/document_21); it uses Apache Lucene, a popular search engine.
For the first time in years I've been doing some T-SQL programming in SQL Server 2008 and had forgotten just how bad the language really is:
Flow control (all the begin/end stuff) feels clunky
Exception handling is poor. Exceptions don't bubble in the same way they do in every other language. There's no re-throwing unless you code it yourself (see the first sketch after this list), and the raiserror function isn't even spelt correctly (this caused me some headaches!)
String handling is poor
The only sequence type is a table. I had to write a function to split a string based on a delimiter and store it in a table that held the string parts along with a value indicating each part's position in the sequence (roughly the second sketch after this list).
If you need to do a lookup in the stored proc, then manipulating the results is painful. You either have to use cursors or hack together a WHILE loop with a nested lookup if the results contain some sort of ordering column.
I realize I could code up my stored procedures using C#, but this will require permissioning the server to allow CLR functions, which isn't an option at my workplace.
Does anyone know if there are any alternatives to T-SQL within SQL Server, or if there are any plans to introduce something. Surely there's got to be a more modern alternative...
PS: This isn't intended to start a flame-war, I'm genuinely interested in what the options are.
There is nothing wrong with T-SQL; it does the job it was intended for (except perhaps for the addition of control flow structures, but I digress!).
Perhaps take a look at LINQ? You can write CLR stored procedures, but I don't recommend this unless it is for some feature that's missing (or heavy string handling).
All other database stored procedure languages (PL/SQL, SQL/PSM) have about the same issues. Personally, I think these languages are exactly right for what they are intended to be used for: they are best used for data-driven logic, especially if you want to reuse it across multiple applications.
So I guess my counter-question to you is: why do you want your program to run as part of the database server process? Isn't what you're trying to do better solved at the application or middleware level? There you can take any language or data-processing tool of your choosing.
From my point of view, the only alternative to T-SQL within SQL Server is to not use SQL Server.
Regarding your point about handling strings with delimiters: where do these strings come from?
You could try Integration Services and SSIS packages for converting data from one form to another.
There is also a nice way to access non-SQL data over linked servers.
I am about to write a program to keep track of my school assignments, and I was wondering: what database language would be the most efficient and simple to implement for tracking the metadata of the assignments? I am thinking about XML, but it would require several documents.
I currently have at least ten assignments per week for 45 weeks. The data that has to be stored includes name, issue date, due date, path, and various states of completion. Whatever language it's in would have to be able to take a large increase in both the number of assignments and the amount of metadata without requiring large changes to either the format or the retrieval system.
Quite frankly, if you pick a full-fledged database you run the risk of spending more time on data entry than you do on your homework. If you really need to keep track of this, I would seriously recommend a spreadsheet.
First, I think you are confusing a relational database system with a database language. In all likelihood, you will be using a database that uses SQL. From there, you will need another programming platform to build an application around it. If you wanted, you could use a Microsoft Access database, which lets you build a simple front end stored in the same file as the database. In this case you would be programming with VBA.
Pretty much any modern database system would be suitable for your needs; even Access can handle orders of magnitude more work than you are describing.
Some possible database systems are, again, Microsoft Access, Microsoft SQL Server Express, VistaDB, and SQLite (probably the best choice after Access for your needs), and of course there are many others.
You could either build a web front end or a desktop app; I assume you are using Windows. You could use Visual Studio C# Express for this if you wanted, or you could go with VB.NET, VB6, or what have you.
My answer isn't directly related, but as you are designing your database structures, you might want to take a look at some of the objects in the SIF specification, in particular the Assignment and GradingAssignment objects.
As for how to store the data, you could use an RDBMS (SQLite, MySQL) or perhaps a key-value database (ZODB, link).
Of course, if this is just a small personal project, you could just serialize the data to something like XML, JSON, or CSV and store it as a file. It might be better in the long run to use a database, though; a database format will probably scale a lot more easily.
I would recommend Oracle Express (with Application Express). It will scale up to 4 GB of user data; beyond that, you would have to start paying. Application Express makes it very simple to build CRUD applications, which is what it sounds like yours is.
For a project like that I would use SQLite or MySQL; either would be fast enough, and both are easy to set up.
Say I want to make an "Optimized query generator". Basically a SQL query optimizer that is a lot better than what can be put in an SQL server based on time/space limitations. It would take a query and DB stats as input and generate an SQL query tailored for the target system that will quickly optimize to a nearly ideal plan.
How much of SQL would need to be supported? Is there a subset of SQL that is flexible enough to easily describe most useful queries but enough smaller than full SQL to make it worth trimming it down to? Also is there a better way to describe queries if you don't need to stick "close to the machine"?
I'm not thinking of a program that you would process existing SQL through, but rather a tool for creating new SQL. It wouldn't actually need to take SQL as input, as long as the input language is able to describe the requirements of the query.
I guess another form of the question would be: are there any parts of SQL that are only there for performance and never improve readability/understandability?
As pointed out by someone, doing this would require "tons of product-specific knowledge", and that (e.g. nested subqueries vs. whatever, what kinds of indexes should be used, that sort of thing) is exactly what the tool would be intended to encapsulate, so that the user doesn't need to learn it.
note: I am not interested in generating actual query plans as that is the DBMS's job and can't be done from SQL anyway. I'm interested in a system that can automate the job of making good SQL for a given DBMS from an input that doesn't need to be tuned for that DBMS.
I'm surprised to hear you describe SQL as "close to the machine". SQL itself is declarative rather than procedural, and one of the interesting aspects of relational databases is the freedom implementers have to innovate, since SQL itself dictates very little about how the queries should be executed.
I think for sheer utility, it would be very difficult to improve on SQL. I'm not saying it's the perfect language, but it is the lingua franca of relational (and even some non-relational) databases.
Bramha, I'm not sure you know what you are asking. SQL optimization isn't simply a matter of making sure that query components are in the right order. You seem to recognize that you'll need intimate knowledge of the indices, data page layouts, etc., but you'd still be left with just reordering query clauses unless you gain the appropriate "hooks" into the SQL Server query processor. Because that is what MS does: it essentially "compiles" queries down into a deeper, more fundamental level to optimize the data access.
Umm... there are (I think; too lazy to Google it) nine relational operators (scan, jump, hash-merge, etc.) that are used to construct the execution plan of a SQL query. The choice of operators is based on the usage statistics of the target database tables, available indices, et al.
It sounds like you're trying to recreate what the query planner already does...?
EDIT:
I don't think that most queries have that many options in how they can be executed, and
I don't think there is anything you could do to the SQL to force the DB engine to create an execution plan "your way", even if you did find a more optimal solution,
unless you are planning on creating your own database engine!
I am very confused by this question; it looks like reinventing the wheel but with no wagon to mount it on!?
You might find the patterns in "SQL Queries for Mere Mortals" useful as they work through a structured canonical format starting with English descriptions.
Online at Safari, if you want to take a quick peek.
Is your intent to write this for a single specific database engine? If not, I suspect that you'll have a rather difficult time of this. Optimization of database queries relies heavily on the exact specifics of the engine's implementation and internals, as well as the tables, indexes, primary/foreign key relations, type and distribution of data, etc, etc. The actual logic of creating an optimized query would likely have very little overlap between different database engines. (For that matter, at least for MySQL the table type would make a huge difference on optimizations.) Each release of each supported DB engine may have significantly different characteristics, as well -- keep in mind that if you're generating SQL, then you need to be able to predict how the engine's own optimizer/query planner will handle the SQL you've generated.
The thing is, query optimization relies only weakly on relational theory, and very heavily on detailed knowledge of the DB's guts and the data being held. Even if you're able to extract the DB's metadata, I suspect that you'll have a difficult time producing a better query plan than the DB itself would -- and if you're not getting the DB's metadata, then your cause is hopeless.
Good luck - you've chosen to compete with such companies as Microsoft and Oracle, who live or die by how well their query optimizers do exactly what you propose. The first and primary way to compare one database product with another is with benchmark testing, where the same query workload is applied to each of them, timing measurements are taken, and the winner in most cases is determined by speed of execution.
The world will be impressed if you can do significantly better than the publisher on any of these benchmarks, using their products. At least you'll have a solid career opportunity with whichever one(s) you use.
This is a very old question by now, and I agree with most of the other answers that it is perhaps a bit misguided. But there is something to it. Have you read Gulutzan and Pelzer's "SQL Performance Tuning" (Addison-Wesley, 2003)? It compares a number of DBMSs and how equivalent but differently formulated queries impact the execution time. In other words, which idiosyncrasies and bugs exist in the query optimizers.
For example, they found that in most systems a WHERE clause such as WHERE column1 = 'A' AND column2 = 'B' will be evaluated from left to right, but from right to left in Oracle (under certain conditions, and in the particular version of Oracle that was current when they wrote the book). Therefore, the least likely condition should be put last in Oracle, but first in most other systems.