Changing tempdb "in-query" - sybase

Good day,
Is it possible to change the tempdb my current session is using?
I have a very heavy query that is meant for HD usage.
Ideally, I'd like the query to be run using a tempdb we have specifically for such heavy things.
(Main issue is the query creates a very large temp table)
I'd like something along the lines of:
use tempdb <tempdbname>
<query>
use tempdb <normaltempdb>
If this is at all possible, even if by other means, please let me know.
Right now, the only way I know of to do this is to bind a user to a different tempdb, and then have HD login using that user, instead of the normal user.
Thanks in advance,
ziv.

In Sybase ASE you cannot change your tempdb in-flight; your tempdb is automagically assigned at login.
You have a few options:
1 (recommended) - have the DBA create a login specifically for this process and bind said login to the desired tempdb (eg, sp_tempdb 'bind', ...; see the sketch below); have your process use this new login
2 (not recommended) - instead of creating #temp tables, create permanent tables with a 'desired_tempdb_name..' prefix; you'll likely piss off your DBA if you forget to manually drop said tables when you're done with them
3 (ok, if you've got the disk space) - as Rich has suggested, make sure all tempdb's are sized large enough to support your process
NOTE: If you're using Sybase's SQLAnywhere, IQ or Advantage RDBMSs... sorry, I don't know how temporary databases are assigned for these products.
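A minimal sketch of option 1, run by the DBA (the login name heavy_batch_login and tempdb name big_tempdb are made up for illustration):

-- bind the login to the dedicated tempdb; takes effect at the login's next connection
sp_tempdb 'bind', 'lg', 'heavy_batch_login', 'DB', 'big_tempdb'
go

-- list current bindings, and remove the binding once it's no longer needed
sp_tempdb 'show'
go
sp_tempdb 'unbind', 'lg', 'heavy_batch_login'
go

The binding is applied at connect time, so sessions that are already logged in keep whatever tempdb they were assigned.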

If your main concern is the impact on tempdb and other users, you could consider creating multiple default tempdbs of the same size and structure. Add these to the default group; sessions are then assigned to one of them at connection time, which lessens the risk of one large query impacting the whole dataserver.
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00841.1502/html/phys_tune/phys_tune213.htm
You could also consider using a login trigger for specific logins that checks the name of the connecting program to decide which tempdb to use (e.g. Business Objects could go to a much larger DSS tempdb or similar).
There is no way to change your session tempdb in-flight that I'm aware of though as tempdb bindings are set on connection.

It sounds like you do have at least one other tempdb created by the DBA. You can bind to this by application name as well as by login ID. Set the application name in your client session (how you do this depends on the client). Use sp_tempdb (DBA only) to bind that application name to the alternative tempdb, and your # table will be created in that tempdb. Any session with that application name will use that tempdb.
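For example (run by the DBA; the application name and tempdb name below are made up):

sp_tempdb 'bind', 'ap', 'HeavyReportApp', 'DB', 'big_tempdb'
go
-- any new connection identifying itself as HeavyReportApp now creates its # tables in big_tempdb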
tempdbs do not have to be the same size or structure and you can have separate log and data (a good idea,) with more log and less data depending on what you are doing.
markp mentions permanent tables in tempdbs, and says "not recommended". This can be a good technique though. You do need to be careful about how big they get and when they are dropped. You might not need or want to drop them straightaway, for example if you need to bcp from them and/or have them visible for Support purposes, but you do need to be clear about space usage, when to drop and how.


Shrinking pg_toast on RDS instance

I have a Postgres 9.6 RDS instance and it is growing 1 GB a day. We have made some optimizations to the relation associated with the pg_toast, but the pg_toast size is not changing.
Autovacuum is on, but since autovacuum/VACUUM FREEZE do not reclaim space and VACUUM FULL takes an exclusive lock, I am not sure anymore what the best approach is.
The data in the table is core to our user experience and although following this approach makes sense, it would take away the data our users expect to see during the vacuum full process.
What are the other options here to shrink the pg_toast?
Here is some data about table sizes. You can see in the first two images that the relation scoring_responsescore is the one associated with the pg_toast.
Autovacuum settings:
Results from the currently running autovacuum process for that specific pg_toast, in case it helps.
VACUUM (FULL) is the only method PostgreSQL provides to reduce the size of a table.
Is the bloated TOAST table such a problem for you? TOAST tables are always accessed via the TOAST index, so the bloat shouldn't be a performance problem.
I know of two projects that provide table reorganization with only a short ACCESS EXCLUSIVE lock, namely pg_squeeze and pg_repack, but you probably won't be able to use those in an Amazon RDS database.
To keep the problem from getting worse, you should first try to raise autovacuum_vacuum_cost_limit to 2000 for the affected table, and if that doesn't do the trick, lower autovacuum_vacuum_cost_delay to 0. You can use ALTER TABLE to change the settings for a single table.
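For example, something along these lines (using the scoring_responsescore relation from the question; the toast.* variants apply the same settings to the associated TOAST table):

ALTER TABLE scoring_responsescore SET (
    autovacuum_vacuum_cost_limit = 2000,
    toast.autovacuum_vacuum_cost_limit = 2000
);

-- if that doesn't do the trick, remove the throttling delay as well
ALTER TABLE scoring_responsescore SET (
    autovacuum_vacuum_cost_delay = 0,
    toast.autovacuum_vacuum_cost_delay = 0
);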
pg_repack still does not allow reducing the size of TOAST segments in RDS.
Also, in RDS we cannot run pg_repack with superuser privileges; we have to use the "--no-superuser-check" option, and with that it is not able to access the pg_toast.* tables.

Is it possible to create fast (in-memory, non-ACID, etc) tables/databases in SQL Server?

In SQLite, there's an option to create an in-memory database, and another to not wait for things to be written to the filesystem, and to put the journal in memory or disable it. Are there any settings like this for SQL Server?
My use case is storage for data that should persist for about a day in normal use, but wouldn't be a big deal if it was lost. I would use something like memcached for it, but I want to be able to control the cache time, not just hope I have enough memory.
No.
tempdb has a bit less logging than regular databases, as it doesn't have to support the "D" in ACID or redo of transactions, but that's about it.
Yes as of MSSQL 2014.
There is a new feature in MSSQL 2014 named In-Memory OLTP.
For a detailed feature introduction:
http://technet.microsoft.com/en-us/library/dn133186(v=sql.120).aspx
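As a rough sketch of how that looks (database, filegroup, path and table names are made up; DURABILITY = SCHEMA_ONLY means the contents are neither logged nor persisted across restarts, which matches "fine to lose" data):

-- one-time setup: the database needs a memory-optimized filegroup
ALTER DATABASE MyDb ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE MyDb ADD FILE (NAME = 'imoltp_dir', FILENAME = 'C:\Data\imoltp_dir') TO FILEGROUP imoltp_fg;

-- non-durable, memory-optimized table
CREATE TABLE dbo.DayCache
(
    CacheKey  INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    Payload   NVARCHAR(2000) NULL,
    ExpiresAt DATETIME2 NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);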
Not really. You can get something like this in SQL Server by implementing a custom solution through the SQLCLR. You can use temp tables or table variables too, but these can still write to disk. You can also improve performance (by reducing blocking), at the cost of consistency, by using a different isolation level such as READ UNCOMMITTED.
In brief if you really want what you ask, SQLCLR is the solution.
You could store the table on a ramdisk. That way it would always be in memory.
However, I would first try a normal table. SQL Server does a pretty good job about caching tables in memory.
Table variables:
DECLARE @name TABLE (id int IDENTITY(1,1), ...);
Table variables are kept in memory and not logged. Under memory pressure, they can spill to tempdb. However, because they are restricted to the scope of a batch execution, it would be hard (but not impossible) to store data in them for 'about a day'. I would definitely not recommend an in-memory, non-ACID solution based on SQL Server table variables. But, as Martin already pointed out, real tables in tempdb are a viable alternative to improve latency. You can achieve similar results on durable DBs too, with proper transaction management (batch commits) and file placement (a dedicated high-throughput log disk).
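A sketch of that alternative with a made-up table name: a regular table created directly in tempdb is minimally logged, survives across connections, and simply disappears when the instance restarts.

CREATE TABLE tempdb.dbo.AppCache
(
    Id        INT IDENTITY(1,1) PRIMARY KEY,
    Payload   NVARCHAR(2000) NULL,
    CreatedAt DATETIME2 NOT NULL DEFAULT SYSDATETIME()
);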

why are multiple DBs actually needed?

I was looking at godaddy.com, which says they offer up to 10 MySQL DBs, but I don't know why you would ever need more than 1, since a DB can have multiple tables. Can't multiple DBs be integrated into a single DB? Is there an example case where it's better, or simply not feasible, to have only one? And how do you differentiate between them when you want to call them, from their directory or from a name?
Best,
I guess separation of concerns would be the most obvious answer. In the same way that you could put all of your functionality in one humongous class in object-oriented programming but shouldn't, it's a good idea to keep non-related information separate. It's easier to wrap your head around smaller chunks of data, and future developers might otherwise start to think tables are related and aggregate data in a way they were never meant to.
Imagine that you're doing two different projects with two different teams. Maybe you don't want one team to access the other team's tables.
There can also be a space limit per database, and each one can be configured with specific parameters to optimize performance.
On the other hand, a different end user can be assigned to back up each database, and you don't want one user backing up the other DB, because they could restore it somewhere else and access the first database's data.
I'm sure there are some pretty good DBAs on the forum who can answer this in detail.
Storing tables in different databases makes sense because you are able to back them up individually. Furthermore, you will be able to control access to each database under different NT groups (e.g. admins vs. users). Although this can be done at the individual table level, sometimes it makes sense to grant or deny access to an entire database to a particular group.
When you need to call them in SQL Server you need to append the database name to the query like this SELECT * FROM [MyDatabase].[dbo].[MyTable].
One other reason to use separate databases relates to whether you need full transactional recovery or not. For instance, if I have a bunch of tables that are populated on a schedule through import processes and never by the users, putting them in a separate database allows me to set the recovery mode to simple, which reduces the logging (a good thing when you are loading millions of records at once). I also don't have to do transaction log backups every fifteen minutes like I do for the database holding the user-inserted data. It can also make recovery faster when needed, as the databases are smaller and thus individually take less time to recover. That won't help much when the whole server crashes, but it can help a lot if only one database gets corrupted for some reason.
If the data relates to different applications, having it in separate databases simplifies security as well. And of course sometimes we have commercial databases we can't add tables to, so we may need a separate database to hold the things we want to build around that data (we do this, for instance, with our project management software: we have a separate database where we extract and summarize data from the PM system for reporting, and then write all our custom reports off that).
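A sketch of that split with made-up database names:

-- staging database that is reloaded on a schedule: simple recovery, less logging during bulk loads
ALTER DATABASE ImportStaging SET RECOVERY SIMPLE;

-- database holding user-entered data: full recovery plus frequent log backups (e.g. from a scheduled job)
ALTER DATABASE UserData SET RECOVERY FULL;
BACKUP LOG UserData TO DISK = N'D:\Backups\UserData.trn';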

SQL Server performance with a large number of tables in database

I am updating a piece of legacy code in one of our web apps. The app allows the user to upload a spreadsheet, which we will process as a background job.
Each of these user uploads creates a new table to store the spreadsheet data, so the number of tables in my SQL Server 2000 database will grow quickly - thousands of tables in the near term. I'm worried that this might not be something that SQL Server is optimized for.
It would be easiest to leave this mechanism as-is, but I don't want to leave a time-bomb that is going to blow up later. Better to fix it now if it needs fixing (the obvious alternative is one large table with a key associating records with user batches).
Is this architecture likely to create a performance problem as the number of tables grows? And if so, could the problem be mitigated by upgrading to a later version of SQL Server?
Edit: Some more information in response to questions:
Each of these tables has the same schema. There is no reason that it couldn't have been implemented as one large table; it just wasn't.
Deleting old tables is also an option. They might be needed for a month or two, no longer than that.
Having many tables is not an issue for the engine; the catalog metadata is optimized for very large sizes. There are also some advantages to having each user own its own table, like the ability to set separate security ACLs per table, separate table statistics for each user's content, and, not least, better query performance for the 'accidental' table scan.
What is a problem, though, is maintenance. If you leave this design in place you must absolutely set up tasks for automated maintenance; you cannot leave this as a manual job for your admins.
I think this is definitely a problem that will be a pain later. Why would you need to create a new table every time? Unless there is a really good reason to do so, I would not do it.
The best way would be to simply create an ID and associate all uploaded data with an ID, all in the same table. This will require some work on your part, but it's much safer and more manageable to boot.
Having all of these tables isn't ideal for any database. After the upload, does the web app use the newly created table? Maybe it gives some feedback to the user on what was uploaded?
Does your application utilize all of these tables for any reporting etc.? You mentioned keeping them around for a few months - not sure why. If not, move the contents to a central table and drop the individual tables.
Once the backend is taken care of, recode the website to save uploads to a central table. You may need two tables: an UploadHeader table to track the upload batch (who uploaded, when, etc.), linked to a detail table with the individual records from the Excel upload.
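A sketch of that design; the table and column names are made up, and the detail columns would mirror whatever the per-upload tables contain today:

CREATE TABLE dbo.UploadHeader
(
    UploadId   INT IDENTITY(1,1) PRIMARY KEY,
    UploadedBy NVARCHAR(128) NOT NULL,
    UploadedAt DATETIME NOT NULL DEFAULT GETDATE()
);

CREATE TABLE dbo.UploadDetail
(
    DetailId INT IDENTITY(1,1) PRIMARY KEY,
    UploadId INT NOT NULL REFERENCES dbo.UploadHeader (UploadId),
    -- one column per spreadsheet column, matching the schema the per-upload tables use now
    Col1     NVARCHAR(255) NULL
);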
I suggest you store this data in a single table. On the server side you can create a console from which a user/operator can manually start the task of freeing up table entries: ask them for the range of dates whose data is no longer needed, and delete that data from the DB.
You can take this a step further and set up a scheduled job to wipe records after a specified time period, and again add a UI from which the user/operator/admin can set that data-validity limit.
That way the system auto-deletes junk data after a retention period that the admin controls, while still providing a console they can use to manually delete additional unwanted data.

Should static database data be in its own Filegroup?

I'm creating a new DB and have a bunch of static data that won't change. If it does, it will be a manual process AND it will happen very rarely.
This data is a mix of varchars and Geographies.
I'm guessing it could be around 100K or so in total, over 4 or so tables.
Questions
Should I put these on a READ ONLY filegroup?
Can I create the tables in the designer and define the filegroup during creation? Or is it only possible via a script?
Once the data is in the table (on a read only filegroup), can I change it later? Is it really hard to do that?
thanks.
It is worth it for VLDB (very large databases) for assorted reasons.
For 100,000 rows or 100 KB, I wouldn't bother.
This SQL Server support engineering team article discusses one of the associated "urban legends".
There is another one (can't find it) where you need 300 GB - 1B of data before you should consider multiple files/filegroups.
But, to answer specifically
Personal choice (there is no hard and fast rule)
Yes (edit:) In SSMS 2005, design mode, go to Indexes/Keys, "data space specification". The data lives where the clustered index is. Without a clustered index, you can only do it via CREATE TABLE (..) ON filegroup
Yes, but you'll have to ALTER DATABASE myDB MODIFY FILEGROUP foo READ_WRITE with the database in single-user exclusive mode
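A sketch, assuming a database MyDb with the static tables on a filegroup named StaticFG:

-- make the filegroup read-only (needs exclusive access to the database)
ALTER DATABASE MyDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE MyDb MODIFY FILEGROUP StaticFG READ_ONLY;
ALTER DATABASE MyDb SET MULTI_USER;

-- on the rare occasion the data must change, flip it back first, edit, then repeat the step above
ALTER DATABASE MyDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE MyDb MODIFY FILEGROUP StaticFG READ_WRITE;
ALTER DATABASE MyDb SET MULTI_USER;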
It is unlikely to hurt to put the data into a read-only space, but I am unsure you will gain significantly. A read-only filegroup (or tablespace in Oracle) can give you two advantages: less to back up each time a full backup is taken, and a higher level of security over the data (e.g. it cannot be changed by a bug, by someone accessing the DB via another tool, etc.). The backup advantage matters most with larger DBs where backup windows are tight, so putting a small amount of effort into excluding filegroups is valuable there. The security one depends on the nature of the site, data, etc. (If you do exclude the read-only space from regular backups, make sure you get a copy on any retained backup tapes; I tend to back up read-only spaces once a month.)
I am not familiar with designer.
Changing to and from read only is not onerous.
I think anything you read here is likely to be speculation, unless you have any evidence that it's been actually tried and recommended - to me it looks like a novel but unlikely idea. Do you have some reason to suspect that conventional practices will be unsatisfactory? It should be fairly easy to just try it and find out. Post your results if you get a chance.