MS Access Queries Conversion to SQL Server

I am converting a lot of Access queries to SQL Server stored procedures, so the SQL needs to conform to T-SQL (for example, IIf and so on).
Is there a tool that can convert big Access queries to T-SQL? What is the best way of doing this?

As far as a "tool" that will just convert the queries for you, I'm not aware of one. Neither is anyone on this thread or this site.
There are a couple places I can direct you, though, that can possibly help with the transition.
Here is a cheat sheet you can use as a quick glance when converting your queries.
If your queries use any [Forms]! references, there could also be an issue with that. (I've never tried it, but I am going to assume it doesn't work.)
This resource has probably the most detailed explanations of things you might need to learn in SQL Server, from stored queries to handling NULLs to some of the other differences. There are also differences in MS Access SQL compared to T-SQL. Gordon Linoff briefly describes 10 important differences in his blog.
Access does not support the case statement, so conditional logic is done with the non-standard IIf() or Switch() functions.
Access requires parentheses around each pair-wise join, resulting in a proliferation of nesting in from clauses that only serves to confuse people learning SQL.
Access join syntax requires the INNER for INNER JOIN. While it may be a good idea to use inner for clarity, it is often omitted in practice (in other databases).
Access does not support full outer join.
Access does not allow union or union all in subqueries.
Access requires the AS for table aliases. In most databases, this is optional, and I prefer to only use as for column aliases. Ironically, the use of as for table aliases is forbidden in Oracle.
Access uses double quotes to delimit strings (as opposed to single quotes) and is the only database (to my knowledge) that uses & as a string concatenation operator.
Access uses * for the wildcard in like rather than %.
Access allows BETWEEN with the higher value first (BETWEEN high AND low). This is allowed in other databases, but will always evaluate to false.
Access does not support window/analytic functions (using the over and partition by clauses).
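As a rough illustration of several of the differences listed above (IIf versus CASE, & versus +, double versus single quotes, * versus %), here is a hand conversion of one query. The Customers and Orders tables and their columns are made up for the example; they are not taken from the question.

```sql
-- Access SQL (hypothetical tables Customers and Orders):
--   SELECT c.CustomerName & " (" & c.City & ")" AS Label,
--          IIf(o.Total > 1000, "Large", "Small") AS SizeClass
--   FROM Customers AS c INNER JOIN Orders AS o ON c.CustomerID = o.CustomerID
--   WHERE c.CustomerName LIKE "Smi*";

-- Hand-converted T-SQL version of the same query:
SELECT c.CustomerName + ' (' + c.City + ')' AS Label,       -- & becomes +, "..." becomes '...'
       CASE WHEN o.Total > 1000 THEN 'Large' ELSE 'Small' END AS SizeClass  -- IIf becomes CASE
FROM Customers AS c
JOIN Orders AS o ON c.CustomerID = o.CustomerID             -- INNER is optional in T-SQL
WHERE c.CustomerName LIKE 'Smi%';                           -- * becomes %
```

One thing to watch when converting concatenation: in Access, & treats a Null operand as an empty string, while + in T-SQL normally returns NULL if either operand is NULL, so nullable columns may need ISNULL(c.City, '') or similar.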
In sum, no, there is no tool that I have seen.

Related

Would SQL Server always go through the second condition in case of "WHERE @Arg IS NULL OR Name=@Arg"?

We often need to select rows where either no argument was supplied or a column equals the argument. An expression like the one below is what comes to mind:
WHERE @Arg IS NULL OR Name = @Arg
Almost all programming languages will skip the second condition if @Arg is NULL. However, I found that this runs very slowly in SQL Server, and it looks as if the second condition is always evaluated.
Could anyone help?
This is not a question of boolean short circuit (which is not guaranteed in SQL, see On SQL Server boolean operator short-circuit) but a problem of compilation. SQL Server has to compile a query plan for you, and the plan has to work for any value of @Arg. This requirement eliminates many possible optimizations (e.g. if you have an index on Name, it cannot be used for a seek), resulting in unnecessarily slow queries.
This pattern is repeated again and again, typically in search forms, and the recommendation is to create different queries. The underlying problem is an incorrect API design that uses the same call to do different things (search by Name, search by ID etc).
The article linked by Martin goes to great length explaining pros and cons of various approaches. Erland's advice is more nuanced (use IF for simple cases, dynamic SQL for complex cases) but it boils down to the same thing: use different queries for each case, whether hard coded (IFs) or code-generated (dynamic SQL).
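A minimal sketch of the "use IF for simple cases" advice, assuming a hypothetical dbo.People table with PersonID and Name columns and a hypothetical procedure name. Each branch is a separate statement, so each can get a plan suited to it:

```sql
-- Hypothetical procedure: dbo.People, PersonID and Name are assumed names.
CREATE PROCEDURE dbo.SearchPeople
    @Arg nvarchar(100) = NULL
AS
BEGIN
    SET NOCOUNT ON;

    IF @Arg IS NULL
        -- No filter supplied: return every row.
        SELECT PersonID, Name
        FROM dbo.People;
    ELSE
        -- Filter supplied: a plain equality predicate that can seek on an index over Name.
        SELECT PersonID, Name
        FROM dbo.People
        WHERE Name = @Arg;
END;
```

With more than a couple of optional arguments the number of branches grows quickly, which is where the dynamic SQL approach mentioned above becomes the more practical choice.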

Query equivalence evaluation

My question is rooted in a T-SQL, SQL Server environment, but its scope is not confined to this technology. I am working on a database with quite complex business logic, with existing views and stored procedures, and new ones to be designed. From comparing different queries or parts of them, I have a strong feeling that there are sections performing the same job with a different arrangement, but of course to refactor the whole mess I need something more than a feeling; so I am trying to find a way to demonstrate that two statements are equivalent.
An obvious but weak approach would be to check that the two queries A and B produce the same recordset: if A is a subset of B and B is a subset of A, they are the same recordset. But I am not sure that this is a good idea because, of course, a recordset is not a query; the results could depend on the data and on specific parameter values. My question is: is there a method to prove the equivalence of two different queries? I would say yes, because the optimization performed by the database should rely on this. Could someone give me some pointers to documentation or books that dig into this? If there is no general method to prove equivalence, is there some smart approach based on regression testing, performed according to some effective heuristic, that does the job?
Edited later: could reverse engineering the queries (by hand?) by means of relational algebra be a better way to assess query equivalence than using other queries and/or the computer? Are there automated tools that help with this "reverse engineering"?
Thanks a lot for helping
You probably can't prove it, since the problem seems to be NP-complete; check this SO question on query equivalence (that one is about Oracle, but there are a couple of answers / links that should be relevant for you).
You can check the execution plans of the two queries. If they are the same, you have your answer!
You can only check it by the execution plan. Apart from that, I don't think there is any way to prove this.
You'll need to implement some "canonical query plan" generator for this (an "optimal query plan" as generated by the DBMS can be nondeterministic). In most cases, using alphabetical ordering of terms and tables as a tie-breaker will get you there.
I doubt you are going to be able to formally prove or disprove this, but my take on this would be to
identify all use cases
identify all boundary values
identify all parameters
and derive a test plan from that. It would require you to
create testdata for each case
run both queries against that data
compare the results
If you don't find any differences after testing, you can be reasonably assured that both statements are equivalent.
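The "compare the results" step can be sketched in T-SQL with EXCEPT in both directions, which is the "A is a subset of B and B is a subset of A" check from the question. This assumes the two statements have been wrapped as hypothetical views dbo.QueryA and dbo.QueryB that return the same columns:

```sql
-- dbo.QueryA and dbo.QueryB are hypothetical views wrapping the two statements under test.
-- An empty result means both return the same set of distinct rows for the current test data.
SELECT 'only in A' AS Side, a.*
FROM (SELECT * FROM dbo.QueryA
      EXCEPT
      SELECT * FROM dbo.QueryB) AS a
UNION ALL
SELECT 'only in B' AS Side, b.*
FROM (SELECT * FROM dbo.QueryB
      EXCEPT
      SELECT * FROM dbo.QueryA) AS b;
```

EXCEPT works on distinct rows, so it will not notice that one query returns a row twice and the other once; comparing COUNT(*) from both views alongside this check covers that case. As noted above, an empty result only shows agreement on the data tested, not equivalence in general.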

Can Joins between views and table hurt performance?

I am new to SQL Server.
My manager has given me a job where I have to assess the performance of a query in SQL Server 2008.
The query is very complex, with joins between views and tables. I read on the internet that joins between views and tables can hurt performance. Is that true?
Can any expert help me with this? Is there a good link where I can learn about it? How do I measure query performance in SQL Server?
Look at the query plan - it could be anything. Missing indexes on one of the underlying view tables, missing indexes on the join table or something else.
Take a look at this article by Gail Shaw about finding performance issues in SQL Server - part 1, part 2.
A view (that isn't indexed/materialised) is just a macro: no more, no less.
That is, it expands into the outer query. So a join with 3 views (with 4, 5, and 6 joins respectively) becomes a single query with 15 JOINs.
This is a common "I can reuse it" mistake: DRY doesn't usually apply to SQL code.
Otherwise, see Oded's answer
Oded is correct, you should definitely start with the query plan. That said, there are a couple of things that you can already see at a high level, for example:
CONVERT(VARCHAR(8000), CN.Note) LIKE '%T B9997%'
LIKE searches with a wildcard at the front are bad news in terms of performance, because of the way indexes work. If you think about it, it's easy to find all people in the phone book whose name starts with "Smi". If you try to find all people who have "mit" anywhere in their name, you will find that you need to read the entire phone book. SQL Server does the same thing - this is called a full table scan, and is typically quite slow. The other issue is that the left side of the condition uses a function to modify the column (specifically, converting it to a varchar). Essentially, this again means that SQL Server cannot use an index, even if there was one for the column CN.Note.
My guess is that the column is a text column, and that you will not be allowed to change the filter logic to remove the wildcard at the beginning of the search. In this case, I would recommend looking into Full-Text Search / Indexing functionality. By enabling full text indexing, and using specific keywords such as CONTAINS, you should get better performance.
Again (as with all performance optimisation scenarios), you should still start with the query plan to see if this really is the biggest problem with the query.
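A rough sketch of the full-text approach mentioned above. The table name dbo.CustomerNotes, its unique key index PK_CustomerNotes, the NoteID column and the catalog name are all assumptions made for illustration; only the Note column and the search text come from the query fragment in the question:

```sql
-- One-time setup (assumed names): a catalog, then a full-text index on the Note column.
-- KEY INDEX must name an existing unique, single-column, non-nullable index on the table.
CREATE FULLTEXT CATALOG NotesCatalog;

CREATE FULLTEXT INDEX ON dbo.CustomerNotes (Note)
    KEY INDEX PK_CustomerNotes
    ON NotesCatalog;

-- Phrase search through the full-text index instead of LIKE '%T B9997%'.
SELECT CN.NoteID, CN.Note
FROM dbo.CustomerNotes AS CN
WHERE CONTAINS(CN.Note, '"T B9997"');
```

Full-text search matches words and phrases rather than arbitrary substrings, and very short words may be dropped as stopwords, so the results are not guaranteed to be identical to the LIKE version. It is worth comparing both on real data before switching.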

How does an index work on a SQL User-Defined Type (UDT)?

This has been bugging me for a while and I'm hoping that one of the SQL Server experts can shed some light on it.
The question is:
When you index a SQL Server column containing a UDT (CLR type), how does SQL Server determine what index operation to perform for a given query?
Specifically I am thinking of the hierarchyid (AKA SqlHierarchyID) type. The way Microsoft recommends that you use it - and the way I do use it - is:
Create an index on the hierarchyid column itself (let's call it ID). This enables a depth-first search, so that when you write WHERE ID.IsDescendantOf(@ParentID) = 1, it can perform an index seek.
Create a persisted computed Level column and create an index on (Level, ID). This enables a breadth-first search, so that when you write WHERE ID.GetAncestor(1) = @ParentID, it can perform an index seek (on the second index) for this expression.
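For concreteness, the two-index setup described above might look like the following. The table name dbo.OrgNode, the Name column and the index name are made up; ID and Level follow the names used in the question:

```sql
-- Hypothetical table: the clustered primary key on ID is the depth-first index.
CREATE TABLE dbo.OrgNode
(
    ID      hierarchyid   NOT NULL PRIMARY KEY CLUSTERED,
    [Level] AS ID.GetLevel() PERSISTED,   -- persisted computed column
    Name    nvarchar(100) NOT NULL
);

-- Breadth-first index on (Level, ID).
CREATE UNIQUE INDEX IX_OrgNode_Level_ID ON dbo.OrgNode ([Level], ID);

DECLARE @ParentID hierarchyid = hierarchyid::GetRoot();

-- All descendants of @ParentID (the depth-first case).
SELECT ID.ToString() AS NodePath, Name
FROM dbo.OrgNode
WHERE ID.IsDescendantOf(@ParentID) = 1;

-- Direct children of @ParentID only (the breadth-first case).
SELECT ID.ToString() AS NodePath, Name
FROM dbo.OrgNode
WHERE ID.GetAncestor(1) = @ParentID;
```

Looking at the actual execution plans for the two SELECT statements is the way to confirm which index is used and whether it is a seek.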
But what I don't understand is how is this possible? It seems to violate the normal query plan rules - the calls to GetAncestor and IsDescendantOf don't appear to be sargable, so this should result in a full index scan, but it doesn't. Not that I am complaining, obviously, but I am trying to understand if it's possible to replicate this functionality on my own UDTs.
Is hierarchyid simply a "magical" type that SQL Server has a special awareness of, and automatically alters the execution plan if it finds a certain combination of query elements and indexes? Or does the SqlHierarchyID CLR type simply define special attributes/methods (similar to the way IsDeterministic works for persisted computed columns) that are understood by the SQL Server engine?
I can't seem to find any information about this. All I've been able to locate is a paragraph stating that the IsByteOrdered property makes things like indexes and check constraints possible by guaranteeing one unique representation per instance; while this is somewhat interesting, it doesn't explain how SQL Server is able to perform a seek with certain instance methods.
So the question again - how do the index operations work for types like hierarchyid, and is it possible to get the same behaviour in a new UDT?
The query optimizer team is trying to handle scenarios that don't change the order of things. For example, cast(someDateTime as date) is still sargable. I'm hoping that as time continues, they fix up a bunch of old ones, such as dateadd/datediff with a constant.
So... handling Ancestor is effectively like using the LIKE operator with the start of a string. It doesn't change the order, and you can still get away with stuff.
You are correct - HierarchyId and Geometry/Geography are both "magical" types that the Query Optimizer is able to recognize and rewrite the plans for in order to produce optimized queries - it's not as simple as just recognizing sargable operators. There is no way to simulate equivalent behavior with other UDTs.
For HierarchyId, the binary serialization of the type is special in order to represent the hierarchical structure in a binary ordered fashion. It is similar to the mechanism used by the SQL Xml type and described in a research paper ORDPATHs: Insert-Friendly XML Node Labels. So while the QO rules to translate queries that use IsDescendantOf and GetAncestor are special, the actual underlying index is a regular relational index on the binary hierarchyid data, and you could achieve the same behavior if you were willing to write your original queries to do range seeks instead of calling the simple method.

How to implement database engine independent paging?

Task: implement paging of database records suitable for different RDBMS. Method should work for mainstream engines - MSSQL2000+, Oracle, MySql, etc.
Please don't post RDBMS specific solutions, I know how to implement this for most of the modern database engines. I'm looking for the universal solution. Only temporary tables based solutions come to my mind at the moment.
EDIT:
I'm looking for SQL solution, not 3rd party library.
There would have been a universal solution if the SQL specification had included paging as a standard. Nothing that a product must satisfy to be called an RDBMS includes paging support either.
Many database products support SQL with proprietary extensions to the standard language. Some of them support paging, such as MySQL with the LIMIT clause or Oracle with ROWNUM, each handled differently. Other DBMSs will need to add a field called rowid or something like that.
I don't think you can have a universal solution (anyone is free to prove me wrong here; I'm open to debate) unless it is built into the database system itself, or unless there is a company, say ABC, that uses Oracle, MySQL and SQL Server and decides to have its database developers write an implementation of paging for each system behind a universal interface for the code that uses it.
The most natural and efficient way to do paging is using the LIMIT/OFFSET (TOP in the Sybase world) construct. A DB-independent way would have to know which engine it's running on and apply the proper SQL construct.
At least, that's the way I've seen it done in DB independent libraries' code. You can abstract away the paging logic once you get the data from the engine with the specific query.
If you really are looking for a single, one SQL sentence solution, could you show what you have in mind? Like the SQL for the temp table solution. That would probably get you more relevant suggestions.
EDIT:
I wanted to see what you were thinking because I couldn't see a way to do it with temp tables without using an engine-specific construct. You used specific constructs in the example. I still don't see a way to implement paging in the database with only (implemented) standard SQL. You could bring the whole table in standard SQL and page in the application, but that is obviously stupid.
So the question would now be more like "Is there a way to implement paging without using LIMIT/OFFSET or equivalent?" and I guess that the answer is "Sanely, no." You could try using cursors but you'll fall prey to database specific sentences/behavior there as well.
A wacko (read stupid) idea that just occurred to me would be to add a page column to the table, say create table test (id int, name varchar, phone varchar, page int), and then you can get page 1 with select * from test where page = 1. But that means having to add code to maintain that column, which, again, could only be done by either bringing the whole database or using database specific constructs. That is besides having to add a different column for each possible ordering, and many other flaws.
I can't provide proof, but I really think you just can't do it sanely.
Proceed as usual:
Start by implementing it according to the standard. And then handle the corner cases, i.e. the DBMSes which don't implement the standard. How to handle the corner cases depends on your development environment.
You are looking for a "universal" approach. The most universal way to paginate is through the use of cursors, but cursor-based pagination doesn't fit very well with a stateless environment like a web application.
I've written about the standard and the implementations (including cursors) here:
http://troels.arvin.dk/db/rdbms/#select-limit-offset
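As a sketch of what "according to the standard" can look like, here are two forms for fetching rows 21 to 30 of the test table used as an example earlier in this thread. The page size and offset are arbitrary. Neither form is universal: to my knowledge the first works in SQL Server 2012+, Oracle 12c+ and PostgreSQL but not in MySQL, the second needs window function support (SQL Server 2005+, Oracle, PostgreSQL, MySQL 8.0+), and SQL Server 2000 supports neither.

```sql
-- Standard OFFSET / FETCH syntax.
SELECT id, name, phone
FROM test
ORDER BY id
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;

-- Standard window function ROW_NUMBER().
-- The derived table alias is written without AS because Oracle rejects AS for table aliases.
SELECT id, name, phone
FROM (
    SELECT id, name, phone,
           ROW_NUMBER() OVER (ORDER BY id) AS rn
    FROM test
) numbered
WHERE rn BETWEEN 21 AND 30
ORDER BY rn;
```

Both forms depend on a deterministic ORDER BY; if id is not unique, a tie-breaking column should be added so that pages do not overlap.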
SubSonic can do this for you if you can tolerate Open Source...
http://subsonicproject.com/querying/webcast-using-paging/
Other than that, I know NHibernate does as well.
JPA lets you do it with the Query class:
Query q = ...;
q.setFirstResult (0);
q.setMaxResults (10);
gives you the first 10 results in the result set.
If you want a DBMS independent raw SQL solution, I'm afraid you're out of luck. All the vendors do it differently.
@Vinko Vrsalovic,
as I wrote in the question, I know how to do it in most DBs. I want to find a universal solution or get proof that it doesn't exist.
Here is one stupid solution based on temporary table. It's obviously bad, so no need to comment on it.
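-- N = upper bound, M = lower bound (row positions of the requested page)
-- the identity column numbers the first N keys 1..N
create table #temp (Id int identity, originalId int)
insert into #temp (originalId)
select top N KeyColumn from MyTable
where ...
order by KeyColumn -- an explicit order keeps the numbering stable between calls
-- return only rows M..N of the numbered keys
select MyTable.* from MyTable
join #temp t on t.originalId = MyTable.KeyColumn
where t.Id between M and N
order by t.Id asc
drop table #temp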

Resources