Creating sequence in query - sql-server

I am trying to add an incremental column that tallies time in 30-second increments for a report. I need to add this column to an existing query, but I cannot find a way to do it without rebuilding the report, and I don't have that kind of time.
IDENTITY and SEQUENCE just give me errors, because I am working with an OPENQUERY table generated from a Wonderware Historian query and then querying that table. I need to add a column that puts a 30-second increment on each row, starting at zero for row one.
Sorry if I am wording this terribly; I'm unsure how else to ask. Can anyone help me with some code to add a generated incremental INT column to a query without having to make a bunch of extra tables with joins?
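One way to get this without IDENTITY, SEQUENCE, or extra tables is a windowed ROW_NUMBER(). A minimal sketch, assuming the OPENQUERY result is queried as a regular table; HistorianData, DateTime, and Value are placeholder names for whatever the Historian query actually returns:

```sql
-- Sketch: derive a 0-based, 30-second increment per row.
-- HistorianData, DateTime and Value are assumed names; order by whatever
-- column defines the row sequence in the real result set.
SELECT
    (ROW_NUMBER() OVER (ORDER BY h.[DateTime]) - 1) * 30 AS SecondsElapsed,
    h.[DateTime],
    h.[Value]
FROM HistorianData AS h;
```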

Related

Search code for all column references to a given table in SQL Server

In a migration effort we're being asked to optimize our code, because our new data source has billions of rows versus the million-row subsets we had before, which are being taken away.
To optimize, we need to do what was previously being done for us: retrieve only our subset of the data. To that end, we've been asked to retrieve from a given table only the columns we actually use.
The only way I can envision doing that is to create a stored procedure that queries the metadata of a given table and then searches our code for each column. I was hoping to find someone who already had such a utility and would be willing to share it, as I'm under a severe time constraint.
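A minimal sketch of that idea, assuming the code to be searched lives in SQL Server modules (procedures, views, functions): list the table's columns from sys.columns and probe sys.sql_modules for each name. 'dbo.MyTable' is a placeholder, and a plain LIKE match will return false positives (substrings, comments), so treat the output as a starting point rather than a definitive list.

```sql
-- Sketch: for each column of a given table, find SQL modules whose
-- definition mentions that column name. 'dbo.MyTable' is a placeholder.
DECLARE @table sysname = N'dbo.MyTable';

SELECT c.name AS column_name,
       OBJECT_SCHEMA_NAME(m.object_id) + N'.' + OBJECT_NAME(m.object_id) AS referencing_module
FROM sys.columns AS c
LEFT JOIN sys.sql_modules AS m
       ON m.definition LIKE N'%' + c.name + N'%'
WHERE c.object_id = OBJECT_ID(@table)
ORDER BY c.name;
```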

Pentaho ETL Table Input Iteration

Context
I have a table with customer information. I want to find the repeat customers in the table based on information like:
First_Name
Last_Name
DOB
Doc_Num
FF_Num
etc.
Now to compare one customer with the rest of the records in the same table, I need to:
read one record at a time
and compare this record with the rest, such that if one column does not match,
I then compare the other columns for those records.
Question
Is there a way to make the Table_Input step read or output one record at a time, but read the next record automatically once the previous record has been processed? This process should continue until all the records in the table have been checked/processed.
Also, I would like to know whether we can iterate the same procedure instead of reading one record at a time from Table_Input.
Making your Table Input read and write row by row doesn't seem like the best solution, and I don't think it would achieve what you want (e.g. keeping track of previous records).
You could try using the Unique rows step, which can redirect a duplicate row (using the key you want) to another flow where it will be treated differently (or drop it if you don't want it). From what I can see, you'll want multiple Unique rows steps to check each one of the columns.
Is there a way to make the Table_Input step read or output one record at a time but it should read the next record automatically after the processing of the previous record is complete?
Yes, it is possible to change the row buffer between the steps: set the Nr of rows in rowset to 1. But it is not recommended to change this property unless you are running low on memory, as it might make the tool behave abnormally.
Now, as per the comments shared, I see there are two questions:
1. You need to check the count of duplicate entries:
You can achieve this either with a Group By step or with the Unique rows step, as answered by astro11. You can get the count of names easily, and if the count is greater than 1 you can consider it a duplicate (see the SQL sketch below).
2. Comparing two data rows:
You want to validate two names such as "John S" and "John Smith". Both should ideally be treated as the same name, hence a duplicate.
First of all, this is a data quality issue, and no tool will consider these rows the same out of the box. What you can do is use a step called "Fuzzy match". Based on the algorithm you choose, this step will give you a measure of the closest match for the names. But to achieve this you need a separate MASTER table with all the possible names. You can use the "Jaro Winkler" algorithm to get the closest match.
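For point 1, a minimal SQL sketch of the duplicate-count idea (the same logic the Group By step performs); Customer, First_Name, Last_Name and DOB are assumed names:

```sql
-- Sketch: count potential repeat customers by the key columns.
SELECT First_Name, Last_Name, DOB, COUNT(*) AS occurrences
FROM Customer
GROUP BY First_Name, Last_Name, DOB
HAVING COUNT(*) > 1;
```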
Hope this helps :)

KeyLookup in massive columns in sql server

I have one simple query that selects many columns (more than 1000).
When I run it with a single column it returns in 2 seconds with a proper index seek; logical reads, CPU and everything else are under thresholds.
But when I select more than 1000 columns it takes 11 minutes to return and the plan shows a key lookup.
Have you folks faced this type of issue?
Any suggestions?
Normally, I would suggest adding those columns to the INCLUDE list of your non-clustered index. Adding them to the INCLUDE removes the LOOKUP from the execution plan. But as with everything in SQL Server, it depends. Depending on how the table is used, i.e. if you're updating the table more than just plain SELECTing from it, the LOOKUP might be OK.
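A minimal sketch of a covering index with INCLUDE columns; dbo.Orders, CustomerId, OrderDate and TotalAmount are assumed names, and with 1000+ columns you would only include the ones the query truly needs:

```sql
-- Sketch: cover the query so the key lookup disappears.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_Covering
ON dbo.Orders (CustomerId)
INCLUDE (OrderDate, TotalAmount);
```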
If this query is run once per year, the overhead of an additional index is probably not worth it. If you need a quick response time for that single time of year when it has to run, look into 'pre-executing' it and just presenting the result to the user.
The difference in your query plan might be because of join elimination (if your query contains JOINs with multiple tables), or simply because the additional columns you are requesting do not exist in your existing indexes.

Does every record have a unique field in SQL Server?

I'm working in Visual Studio - VB.NET.
My problem is that I want to delete a specific row in SQL Server but the only unique column I have is an Identity that increments automatically.
My process of work:
1. I add a row to the table (the identity is incremented, but I don't know its value)
2. I want to delete the previous row
Is there some sort of unique ID that every new record has?
It's possible that my table has two records that are exactly the same, with only the sequence (identity) differing.
Any ideas how to handle this problem?
SQL Server has a few functions that return the generated ID for the last rows inserted, each with its own specific strengths and weaknesses.
Basically:
@@IDENTITY works if you do not use triggers
SCOPE_IDENTITY() works for the code you explicitly called.
IDENT_CURRENT('tablename') works for a specific table, across all scopes.
In almost all scenarios SCOPE_IDENTITY() is what you need, and it's a good habit to use it as opposed to the other options.
A good discussion on the pros and cons of the approaches is also available here.
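A minimal sketch of capturing the generated key at insert time so the exact row can be deleted later by its key; dbo.Orders, OrderId and CustomerName are assumed names:

```sql
-- Sketch: remember the identity value that was just generated,
-- then delete by key instead of by "position".
DECLARE @NewId int;

INSERT INTO dbo.Orders (CustomerName)
VALUES (N'Contoso');

SET @NewId = SCOPE_IDENTITY();

-- Later: delete exactly that row.
DELETE FROM dbo.Orders
WHERE OrderId = @NewId;
```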
I want to delete the previous row
And that is your problem. There is no such concept in SQL as a 'previous row'. The word 'previous' implies order, and order applies only to queries, where it is achieved by adding an ORDER BY clause. Tables have no order. You need to rephrase this in terms of "I need to delete the record that satisfies <this> condition." This may sound like pedantic gibberish, but you will never find a solution until you acknowledge the problem.
Looking for a way to read the inserted identity value and then subtract 1 from it is flawed in many, many ways. It is incorrect under concurrency. It is incorrect in the presence of rollbacks. It is incorrect after ETL jobs. Overall, never expect identity values to be monotonically increasing without gaps; they're free to skip values, and your code should be correct in the presence of gaps.

Incremental reports with JasperReports

I am using JasperReports to generate reports from SQL Server on a daily basis. The problem is that every day the report reads the data from the beginning, but I want it to exclude records read earlier and include only new rows. The database is old and its tables don't have timestamp columns, so there is no way to identify which records are 'new' and which are 'old'.
I am not allowed to modify it either.
Please suggest any other way if possible.
You could create a new table and, every time you print records on your report, insert those records into it. Then you can use a query with a NOT EXISTS condition that selects from the original table only the rows not yet present in the new table.
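A minimal sketch of that approach; SourceTable, ReportedRows, and Id are hypothetical names standing in for the real table, the new tracking table, and whatever key identifies a row:

```sql
-- Sketch: report only rows not yet recorded in the tracking table.
SELECT s.*
FROM SourceTable AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM ReportedRows AS r
    WHERE r.Id = s.Id
);

-- After the report runs, remember what was printed so it is excluded next time.
INSERT INTO ReportedRows (Id)
SELECT s.Id
FROM SourceTable AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM ReportedRows AS r
    WHERE r.Id = s.Id
);
```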
The obvious drawbacks of this approach are the space consumed in the database and the extra work of inserting records into the new table, but if you cannot modify the original table, it's the only solution.
Otherwise the Alex K suggestion is very good.
