DBeaver row count estimate is 0 - database

QUESTION
I have connected to my Redshift database using DBeaver, and I saw that some tables (which do have a lot of data) show 0 as the Row Count Estimate (I tried to refresh but still got 0).
I want to know whether this is because the table is too large, or for some other reason.
Is there any way to make those 0s become a real estimated row count for my tables?
Thanks!!
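One thing worth checking (an assumption on my part, not something confirmed in this thread): DBeaver usually takes this estimate from the catalog statistics, so if Redshift's statistics are stale the estimate can sit at 0 until ANALYZE runs. A quick way to look at the catalog figure and refresh it (my_schema.my_big_table is a placeholder name):

-- what the catalog currently thinks the row count is
SELECT "table", tbl_rows
FROM   svv_table_info
WHERE  "schema" = 'my_schema' AND "table" = 'my_big_table';

-- refresh the statistics the estimate is based on
ANALYZE my_schema.my_big_table;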

Related

How to increase SQL Server select query performance?

I have a table with 102 columns and 43200 rows. The Id column is an identity column and 2 columns have a unique index.
When I just execute
Select *
from MyTable
it takes 8+ minutes over the network.
This table has a Status column that contains 1 or 0. If I select with WHERE Status = 1, I get 31565 rows and the select takes 6+ minutes. For your information, status 1 means completed and those rows will never change again, whereas status 0 means work in progress and different users keep changing different column values in those rows at different stages.
When I select with Status = 0, it takes 1.43 minutes and returns 11568 rows.
How can I increase performance for the completed and WIP status queries, separately or cumulatively? Can I somehow use caching?
SQL Server takes care of caching, at least as long as there is enough free RAM. When it takes this long to get the data, you first need to find the bottleneck.
RAM: Is there enough to hold the full table? And is SQL Server configured to use it?
Is there an upper limit on RAM usage? If not, SQL Server assumes unlimited RAM, and this often ends with caching spilling into the page file, which causes massive slowdowns (see the sketch after this answer).
You said "8+ minutes over the network". How long does it take when executed locally? Maybe the network is slow.
Hard drive: When the table is too big to be held in RAM, it gets read from the hard drive. HDDs are somewhat slow. Maybe defragmenting the indexes could help here (at least somewhat).
If none of that helps, SQL Profiler might show you where the bottleneck actually is.
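On the RAM-limit point in the answer above: in SQL Server that cap is normally the 'max server memory' setting. A minimal sketch of setting it (the 16384 MB value is purely illustrative, not a recommendation):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 16384;  -- cap in MB, value is illustrative
RECONFIGURE;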
This is an interesting question, but it's a little open-ended; more info is needed. I totally agree with allmhuran's comment that maybe you shouldn't be using "SELECT * ..." on a large table. (It could in fact be posted as an answer; it deserves upvotes.)
I suspect there may be design issues - are you using BLOBs? Is the data at least partially normalized? ref https://en.wikipedia.org/wiki/Database_normalization
I suggest creating a nonclustered index on the Status column. It improves your queries whose WHERE clause uses this column.
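For example, something along these lines (a sketch; MyTable is the table from the question, the index name is my own):

CREATE NONCLUSTERED INDEX IX_MyTable_Status
    ON dbo.MyTable (Status);

Note that with SELECT * over 102 columns the optimizer may still prefer a scan, because this index only covers Status; narrowing the select list to the columns you actually need gives the index a better chance of being used.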

Oracle Histogram and reading the wrong index

I have 2 databases: the main database, which many users work on, and a test database, which is loaded from a dump of the main DB.
I have a select query with join conditions and a UNION ALL on a table TAB11 that contains 40 million rows.
The problem is that the query uses the wrong index in the main DB, but in the test DB it uses the correct index. Note that both have the latest gathered statistics on the table and the same row count. I started digging into histograms and skewed data, and I noticed that in the main DB the table has histograms on 37 of its columns, whereas in the test DB only 14 columns have histograms. So apparently those histograms are affecting the query plan and making it use the wrong index (right?). (Those histograms were created automatically by Oracle, not by anyone manually.)
My questions:
-Should I remove the histograms from those columns, so that when I gather statistics again Oracle will create only the needed ones and use them correctly? I'm afraid it will affect the performance of the table.
-Should I add method_opt=>'for all columns size skewonly' when I gather table statistics? I'm not sure whether the data is skewed or not.
-Should I gather index stats on the desired index so that Oracle might use it?
How can I make the query use the right index, without dropping the other one or forcing the index with a hint?
There are too many possible reasons for choosing a different index in one DB vs. another (including object life-cycle differences, e.g. when data gets loaded, deletions/truncations/inserts, stats gathering, index rebuilds ...). Having said that, in cases like this I usually do a parameter-by-parameter comparison of the initialization parameters on each DB, and also an object-by-object comparison (you've already observed a delta in the histograms; there may be others that are impacting this as well).
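As a starting point for that object-by-object comparison, here is roughly what one could run on both DBs (TAB11 comes from the question; MY_OWNER is a placeholder schema):

-- which columns have histograms, and of what kind
SELECT column_name, histogram, num_buckets
FROM   dba_tab_col_statistics
WHERE  owner = 'MY_OWNER' AND table_name = 'TAB11'
ORDER BY column_name;

-- the method_opt mentioned in the question, letting Oracle decide which columns are skewed
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'MY_OWNER',
    tabname    => 'TAB11',
    method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY');
END;
/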

SQL Server - insert large number of rows?

I've read many answers here about this topic, but everyone suggests BCP or the SqlBulkCopy class from .NET.
I have a query which inserts into targetTable the union of 5 selects from different tables.
I have the correct indexes on the tables being selected, and only 1 clustered identity index on the targetTable. However, this takes a long time (~25 min). I'm talking about 5M rows (x 20 columns).
When I look at sp_who2, the session is suspended most of the time...
I want to use bulk copy but not from .NET (the DB already fetches the data, so I don't need to go to C#).
Questions
How can I use bulk insert (not bcp) in my SELECT command?
Also, why is it suspended most of the time? How can I give my query a higher priority?
Thank you.
P.S. I can't use bcp here because of security restrictions... I don't have permission to run it.
You're right: this is taking longer than usual. You're getting about 3k rows per second, when you should easily get 10k or 20k per second, and in the best case 200k per second per CPU core.
I suspect you are inserting all over the table, not just at the end. In this case, 3k rows per second is not unusual.
In any case, bulk copy cannot help you. It does not insert faster than a server-only insert statement.
What you can do, though, is insert using multiple threads. Partition your row source into N distinct ranges and insert each range concurrently from a separate connection. This will help if you are CPU bound. It won't if you are IO bound.
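A rough sketch of that range-partitioned approach (targetTable is from the question; SourceUnionView stands in for the UNION of the 5 selects, and the SourceId boundaries are illustrative; each statement runs at the same time from its own connection):

-- connection 1: first half of the key range
INSERT INTO dbo.targetTable (SourceId, Col1, Col2)
SELECT SourceId, Col1, Col2
FROM   dbo.SourceUnionView
WHERE  SourceId BETWEEN 1 AND 2500000;

-- connection 2: second half
INSERT INTO dbo.targetTable (SourceId, Col1, Col2)
SELECT SourceId, Col1, Col2
FROM   dbo.SourceUnionView
WHERE  SourceId BETWEEN 2500001 AND 5000000;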

Bulkcopy inserts with DBCC CheckIdent

Our team needs to insert a cruel amount of data into our SQL Server 2008 database, and we're looking for a good solution. We came up with one, but I have doubts about it, simply because it doesn't feel right. So I'm asking here whether this seems like a good solution. An extra challenge is that it's a peer-to-peer replicated database across 4 servers! :)
Imagine we have 1 million rows to insert:
Start a transaction
Increase the current identity value on the table by 1 million (sketched in T-SQL below)
Have a DataSet/DataTable ready with 1 million rows and the correct IDs
BulkCopy the data into the database
Commit the transaction
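A sketch of the identity-bump step above in T-SQL (dbo.TargetTable is a placeholder name):

DECLARE @newSeed bigint;
SELECT  @newSeed = IDENT_CURRENT('dbo.TargetTable') + 1000000;   -- reserve 1 million values
DBCC CHECKIDENT ('dbo.TargetTable', RESEED, @newSeed);
-- the bulk-copied rows then carry explicit ids in the reserved range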
Is this a good solution? Might we run into concurrency issues, have too-large transactions, etc.?
You'll only get problems (as far as I can see, so there might be things I'm overlooking!) if the database is online and users can insert rows into that table. Increasing the identity value for new rows at the meta level simply means that the next row inserted by the system will use that number, so if you bump it by 1 million, it means you have reserved those numbers up front.
Identity columns are 'nice' but have the side effect that they're not transferable. So if you have to migrate the data to another DB, realize that you will likely have to adjust the inserted data to match the database you insert it into (as that's the scope of the data, which means identity values could collide with rows already in the table).
If this is a one-time affair, it might work out. If you're planning to do this regularly, I'd look into a higher-level migration system where you migrate the data to new identity values, or use GUIDs with NEWSEQUENTIALID() so you get proper indexes and also unique, transferable IDs.
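For the GUID alternative, a minimal sketch (table and column names are made up; NEWSEQUENTIALID() can only be used as a column default):

CREATE TABLE dbo.TransferableRows
(
    RowId   uniqueidentifier NOT NULL
            CONSTRAINT DF_TransferableRows_RowId DEFAULT NEWSEQUENTIALID(),
    Payload nvarchar(200)    NULL,
    CONSTRAINT PK_TransferableRows PRIMARY KEY CLUSTERED (RowId)
);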

Selecting rows between x and y from database

I've got a query which returns 30 rows. I'm writing code that will paginate those 30 rows into 5 records per page via an AJAX call.
Is there any reason to return just those 5 records up to the presentation layer? Would there be any benefit in terms of speed, or does it just get all the rows under the hood anyway?
If so, how do I actually do it in Sybase? I know Oracle has ROWNUM and MS SQL has something similar, but I can't seem to find a similar function in Sybase.
Unless your record length is huge, the difference between 5 and 30 rows should be completely unnoticeable to the user. In fact, there's a significant chance that the multiple DB calls will harm performance more than help. Just return all 30 rows either to your middle tier or your presentation layer, whichever makes more sense.
Some info here:
Selecting rows N to M without Oracle's rownum?
I've never worked with Sybase, but here's a link that explains how to do something similar:
http://www.dbforums.com/sybase/1616373-sybases-rownum-function.html
Since the solution involves a temp table, you can also use it for pagination. In your initial query, put the 30 rows into a temporary table and add a column for the page number (the first five rows would be page 1, the next five page 2, and so on). On subsequent page requests, you query the temp table by page number.
I'm not sure how you'd go about cleaning up the temp table, though. Perhaps when the user's session times out?
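A rough sketch of that temp-table approach, assuming Sybase ASE (the Orders table and its columns are made-up names; identity(10) is the usual ASE trick for numbering rows in a SELECT INTO, with the usual caveat that SELECT INTO plus ORDER BY does not strictly guarantee the numbering order):

-- number the rows once and tag each one with a page
select row_id = identity(10), page_no = 0, order_no, customer_name, total
into   #paged
from   Orders
order by order_no

update #paged
set    page_no = convert(int, (row_id - 1) / 5) + 1

-- each AJAX request then reads a single page
select order_no, customer_name, total
from   #paged
where  page_no = 2
order by row_id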
For 30 records, it's probably not even worth bothering with pagination at all.
I think in Sybase you can use
select top 5 *
from   my_table
where  order_by_field > @last_value_seen   -- the order-by value from the last row of the previous page
order by order_by_field
Just make sure you use the same ORDER BY each time.
As for the benefit, I guess it depends on how many rows we're talking about, how big the table is, etc.
I agree completely with jmgant. However, if you want to do it anyway, the process goes something like this:
Select top 10 items and store in X
Select top 5 items and store in Y
X-Y
This entire process can happen in one SQL statement (sketched below).
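Sketched as a single statement in T-SQL-style syntax (Orders and order_no are made-up names, order_no is assumed unique, and whether TOP plus ORDER BY is allowed in a subquery like this depends on the Sybase version):

-- "top 10 minus top 5", i.e. rows 6-10
select top 5 order_no, customer_name, total
from   Orders
where  order_no not in (select top 5 order_no from Orders order by order_no)
order by order_no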
