Query to find the row count of all tables at a particular instant of time - TimesTen

Is it possible to find out the row count of all tables at a particular instant of time (not depending on any column values)?
Thanks!

try:
select sum(num_rows)
from user_tables;
user_tables is a TimesTen system view with useful information about the tables in the schema. It has a num_rows column that will give you what you are looking for.
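TimesTen isn't runnable here, but the same pattern — enumerate the schema's tables from the catalog, then sum the row counts — can be sketched with SQLite. The table names and data below are made up for illustration.

```python
import sqlite3

# Illustrative analog: SQLite's catalog (sqlite_master) plays the role
# that user_tables plays in the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER);
    CREATE TABLE customers (id INTEGER);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO customers VALUES (1), (2);
""")

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Sum the live row count of every table in the schema.
total = sum(conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
            for t in tables)
print(total)  # 5
```

Note that num_rows in many catalogs is refreshed by statistics gathering, so a live COUNT(*) per table (as above) is the only way to get an exact point-in-time figure.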

How to join query_id & METADATA$ROW_ID in SnowFlake

I am working on tracking changes in data along with a few audit details, like the user who made the changes.
Streams in Snowflake give delta record details and a few audit columns, including METADATA$ROW_ID.
Another table, information_schema.query_history, contains query history details including query_id, user_name, DB name, schema name, etc.
I am looking for a way to join query_id and METADATA$ROW_ID so that I can find the user_name corresponding to each change in data.
Any lead will be much appreciated.
Regards,
Neeraj
The METADATA$ROW_ID column in a stream uniquely identifies each row in the source table so that you can track its changes using the stream.
It isn't there to track who changed the data; rather, it is used to track how the data changed.
To my knowledge, Snowflake doesn't track who changed individual rows. This is something you would have to build into your application yourself, by having a column like updated_by, for example.
The only way I have found is to add
SELECT * FROM table(information_schema.QUERY_HISTORY_BY_SESSION()) ORDER BY start_time DESC LIMIT 1
during report/table/row generation.
Assuming you have not changed settings so that you can run multiple queries at the same time in one session, that gets the running query's id. Change it to a CTE and cross join it in the last part of the SELECT to attach it to all rows.
This way you get all the variables from the query_history table. Also remember that Snowflake keeps SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY (and other data) for up to one year, so I recommend a weekly or monthly job which merges the data into a long-term history table. That way you can also manage access to history data much more easily than by giving users the ACCOUNTADMIN role.
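The CTE-plus-cross-join idea above can be sketched locally. Snowflake's QUERY_HISTORY_BY_SESSION() isn't available in SQLite, so a literal query_id stands in for the captured value; the table and ids are made up for illustration.

```python
import sqlite3

# Sketch: capture the running query's id in a one-row CTE and CROSS JOIN
# it onto every output row, so each generated row carries the query_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report (item TEXT)")
conn.executemany("INSERT INTO report VALUES (?)", [("a",), ("b",)])

rows = conn.execute("""
    WITH current_query AS (
        SELECT 'qid-123' AS query_id  -- stand-in for QUERY_HISTORY_BY_SESSION()
    )
    SELECT r.item, q.query_id
    FROM report r
    CROSS JOIN current_query q
""").fetchall()
print(sorted(rows))  # [('a', 'qid-123'), ('b', 'qid-123')]
```

Because the CTE produces exactly one row, the cross join simply stamps that query_id onto every row of the result without changing the row count.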

Unrecognized name: _PARTITIONTIME

I am trying to find the total number of partitions in a BigQuery partitioned table. I am using the query below:
SELECT
  _PARTITIONTIME AS pt, COUNT(1)
FROM
  `dataset_name.table_name`
GROUP BY 1
ORDER BY 1 DESC
I took a break from BigQuery for almost 4 months, and I remember this query used to work earlier. Am I missing something?
As @ElliottBrossard mentioned in the comments, _PARTITIONTIME is a pseudo column available only on ingestion-time partitioned tables. If your table is not partitioned, the query will not work.
You can find more information regarding partitioned tables here.
This query works for me.
Please share your result and the error message if you need additional help.
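BigQuery isn't runnable locally, but the counting pattern the question's query performs — rows per partition — can be illustrated with SQLite by grouping on an ordinary date column (the table and dates below are made up):

```python
import sqlite3

# Rough analog of GROUP BY _PARTITIONTIME: count rows per date bucket,
# using a plain date column in place of the partition pseudo column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_date TEXT, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [
    ("2023-01-01", "x"), ("2023-01-01", "y"), ("2023-01-02", "z"),
])

per_partition = conn.execute("""
    SELECT event_date AS pt, COUNT(1)
    FROM events
    GROUP BY 1
    ORDER BY 1 DESC
""").fetchall()
print(per_partition)  # [('2023-01-02', 1), ('2023-01-01', 2)]
```

In BigQuery itself, this only works when the table actually exposes the pseudo column, i.e. it was created with ingestion-time partitioning.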

Comparing two rows in SQL Server

Scenario
A very large size query returns a lot of fields from multiple joined tables.
Some records seem to be duplicated.
You run some checks and some grouping, and focus on a couple of records for further investigation.
Still, there are too many fields to check each value.
Question
Is there any built-in function that compares two records, returning TRUE if the records match, and otherwise FALSE plus the set of non-matching fields?
The CHECKSUM function should help identify matching rows
SELECT CHECKSUM(*) FROM table
Maybe this is what you are looking for:
SELECT * FROM YourTable
GROUP BY <<ColumnList>>
HAVING COUNT(*) > 1
Just developing on the suggestion provided by Podiluska, to find the records which are duplicates:
SELECT CHECKSUM(*)
FROM YourTable
GROUP BY CHECKSUM(*)
HAVING COUNT(*) > 1
I would suggest using the HASHBYTES function to compare rows; it is less collision-prone than CHECKSUM.
What about creating a ROW_NUMBER() partitioned by all the columns, and then selecting all the rows that have rn of 2 and above? This method is not slow, it gives you exact matches, and it returns the full data of each duplicated row. I would go with this instead of relying on hashing techniques.
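The ROW_NUMBER() approach can be sketched with SQLite (3.25+ for window functions); the SQL inside is the same shape you would write in SQL Server, with the table and columns made up for illustration:

```python
import sqlite3

# Partition by every column: within each group of identical rows,
# ROW_NUMBER() assigns 1, 2, 3, ... so any row numbered 2 or higher
# is a duplicate of an earlier row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [("x", 1), ("x", 1), ("y", 2)])

dupes = conn.execute("""
    SELECT a, b FROM (
        SELECT a, b,
               ROW_NUMBER() OVER (PARTITION BY a, b ORDER BY a) AS rn
        FROM t
    )
    WHERE rn >= 2
""").fetchall()
print(dupes)  # [('x', 1)]
```

Unlike the CHECKSUM approach, this can't produce false positives from hash collisions, and it returns the full duplicated rows rather than just a hash value.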

Query Database Vs Query Datatable

I have a situation where I want to return a count of members in a database by category. I have 6 categories in all and approximately 15,000 members.
Is it better to query the database 6 separate times using something like SELECT COUNT(*), or to return all records (only the category column) and then count each of the 6 categories in the resulting data table?
The first method queries the db six times, but each result is small and needs no further processing.
The second method limits the db queries to one, but returns more data which has to be processed further.
I guess what I'm asking is: is the database engine quicker, or is .NET? I'm using SQL Server 2008 with .NET 4.
Is there any best practice, or reasons people know of why I should use one method over the other?
Thanks
I understand you only need Category and Count, so you can do it with a single query as follows:
SELECT CATEGORY, COUNT(CATEGORY) AS TOTAL_COUNT
FROM TABLE
GROUP BY CATEGORY
It doesn't seem to be a good idea to query the DB 6 times when you have a GROUP BY, not to mention if you need to join the category to a category table. Besides, it has the drawback that you'll have to either hardcode the categories or query the table just to get them (if there is no separate category lookup table). If you query the database to get the 6 categories dynamically, how would you do it? With a SELECT DISTINCT? With a GROUP BY?
In any case, just getting the categories present in all rows is a heavy query. So, if you're going to perform a heavy query, at least do it in the simplest way:
select category, count(*) CategoryCount from table
group by category
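The single-round-trip approach both answers recommend can be demonstrated end to end with SQLite (table and data made up for illustration):

```python
import sqlite3

# One grouped query returns every category's count in a single round trip,
# instead of six separate SELECT COUNT(*) calls.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT, category TEXT)")
conn.executemany("INSERT INTO members VALUES (?, ?)", [
    ("a", "gold"), ("b", "gold"), ("c", "silver"),
])

counts = dict(conn.execute("""
    SELECT category, COUNT(*) AS CategoryCount
    FROM members
    GROUP BY category
""").fetchall())
print(counts)  # {'gold': 2, 'silver': 1}
```

This also answers the "database engine or .NET?" question in practice: the engine aggregates 15,000 rows down to 6 before anything crosses the wire, which is almost always cheaper than shipping 15,000 category values to the client and counting there.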

SQL Server Select Query

I have to write a query to get the following data as result.
I have four columns in my database. ID is not null, all others can have null values.
EMP_ID EMP_FIRST_NAME EMP_LAST_NAME EMP_PHONE
1 John Williams +123456789
2 Rodney +124568937
3 Jackson +124578963
4 Joyce Nancy
Now I have to write a query which returns the columns which are not null.
I do not want to specify the column name in my query.
I mean, I want to use SELECT * FROM TABLE WHERE - and add the filter, but I do not want to specify the column name after the WHERE clause.
This question may be foolish, but correct me wherever necessary. I'm new to SQL and working on a project with C# and SQL.
The reason I do not want to use the column names is that I have more than 250 columns and 1500 rows. If I select any row, at least one column will have a null value. I want to select the row, but any column which has a null value for that particular row should not appear in the result.
Please advise. Thank you in advance.
Regards,
Vinay S
Every row returned from a SQL query must contain exactly the same columns as the other rows in the set. There is no way to select only those columns which do not return null unless all of the results in the set have the same null columns and you specify that in your select clause (not your where clause).
To Anders Abel's comment on your question: you could avoid a good deal of the query complexity by separating your data into tables which serve common purposes (called normalizing).
For example, you could put names in one table (Employee_ID, First_Name, Last_Name, Middle_Name, Title), places in another (Address_ID, Address_Name, Street, City, State), relationships in another, then tiny 2-4 column tables which link them all together. Structuring your data this way avoids duplication of individual facts, like, "who is John Williams's supervisor and how do I contact that person."
Your question reads:
I want to get all the columns that don't have a null value.
And at the same time:
But I don't want to specify column names in the WHERE clause.
These are conflicting goals. Your only option is to use the sys.tables and sys.columns catalog views to build a series of dynamic SQL statements. In the end, this is going to be more work than just writing one query by hand the first time.
You can do this with a dynamic PIVOT / UNPIVOT approach, assuming your version of SQL Server supports it (you'll need SQL Server 2005 or better), which would be based on the concepts found in these links:
Dynamic Pivot
PIVOT / UNPIVOT
Effectively, you'll select a row, transform your columns into rows in a pivot table, filter out the NULL entries, and then unpivot it back into a single row. It's going to be ugly and complex code, though.
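Since SQL result sets must have a fixed column list, a lighter-weight alternative to the PIVOT/UNPIVOT route is to drop the NULL columns client-side. A sketch with Python and SQLite standing in for C# and SQL Server (table and data taken from the question's example):

```python
import sqlite3

# Fetch the whole row, then keep only the non-null columns in application
# code: cursor.description supplies the column names.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE emp (
    EMP_ID INTEGER, EMP_FIRST_NAME TEXT, EMP_LAST_NAME TEXT, EMP_PHONE TEXT)""")
conn.execute("INSERT INTO emp VALUES (4, 'Joyce', 'Nancy', NULL)")

cur = conn.execute("SELECT * FROM emp WHERE EMP_ID = 4")
names = [d[0] for d in cur.description]
row = cur.fetchone()
non_null = {col: val for col, val in zip(names, row) if val is not None}
print(non_null)
# {'EMP_ID': 4, 'EMP_FIRST_NAME': 'Joyce', 'EMP_LAST_NAME': 'Nancy'}
```

With 250 columns this stays a single simple SELECT, and the per-row filtering happens where variable-shaped output is natural: in the application.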