How can I see the query history for Sybase?
I'm using "Sybase Central 16".
I'm looking for something like this for SQL Server:
SELECT t.[text]
FROM sys.dm_exec_cached_plans AS p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) AS t
I need to see how my client's application runs queries so I can find some needle-in-a-haystack data tables & columns.
(I'm far from being a Sybase expert. I know SQL Server very well, but I'm doing a consulting job with Sybase).
Thanks.
Assuming this is Sybase ASE, you can pull a good bit of performance related data/metrics from the MDA (monitoring) tables (master..mon%).
Plans and query text can usually be pulled from: monProcessSQLText (currently running query), monSysSQLText (recently run queries), monSysPlanText (query plans for recently run queries).
Keep in mind that the monSys% tables are queues, and the volume of data they can maintain will depend on the amount of memory allocated to said queues plus the volume of activity being monitored.
For historical purposes the DBA will typically set up a process to periodically pull data from the MDA tables and store it in a repository database from which queries can be run.
There are a handful of 3rd party products on the market that claim to capture/store MDA table data, though a) you'll need to pay $$ for said product and/or b) the product may not capture/store the data in an easily-accessed format (e.g., some products try to roll up the data into summaries ... not very useful if you need to do a deep dive into individual queries).
One free product you might want to look at is ASEMON. I typically install this product at every client I work for, oftentimes supplanting the expensive 3rd party products, which tend to spend more effort on summarizing and colorizing the data than on presenting the raw data typically needed for detailed P&T work ... ymmv
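If monitoring is enabled, here is a rough sketch of pulling recent query text and plans straight from the MDA tables. The config option names and columns are from memory, so verify them against your ASE 16 docs; your login also needs mon_role.

```sql
-- Sketch: read recently executed SQL batches from the MDA queues.
-- Assumes config options such as 'enable monitoring', 'SQL batch capture'
-- and 'max SQL text monitored' are already turned on.
SELECT s.SPID, s.KPID, s.BatchID, s.SequenceInBatch, s.SQLText
FROM   master..monSysSQLText s
ORDER BY s.BatchID, s.SequenceInBatch

-- Plan text for recently run queries lives in a separate queue:
SELECT p.SPID, p.BatchID, p.SequenceNumber, p.PlanText
FROM   master..monSysPlanText p
ORDER BY p.BatchID, p.SequenceNumber
```

Remember these are queues, not history tables: once a row ages out of the configured pipe memory it is gone, which is why DBAs snapshot them into a repository on a schedule.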
One of my application has the following use-case:
user inputs some filters and conditions about orders (delivery date ranges,...) to analyze
the application computes a lot of data and saves it in several support tables (potentially thousands of records for each analysis)
the application starts a report engine that uses data from these tables
when exiting, the application deletes the computed records from the support tables
Currently I'm analyzing how to enhance query performance by adding indexes/statistics to the support tables, and the SQL Profiler suggests creating 3-4 indexes and 20-25 statistics.
The records in the support tables are constantly created and removed: is it correct to create all these indexes/statistics, or is there a risk that they will quickly become outdated (with the only result being a constant overhead for maintaining the indexes/statistics)?
DB server: SQL Server 2005+
App language: C# .NET
Thanks in advance for any hints/suggestions!
First, this seems like a good situation for a data cube. Second, yes, you should update stats before running your query once the support tables are populated. You should disable your indexes when inserting the data; then the rebuild command will bring your indexes and stats up to date in one go. Profiler these days is usually quite good at these suggestions, but test the combinations to see what actually gives the best performance gains. For a look at OSS cubes, see: What are the open source tools and techniques to build a complete data warehouse platform?
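A sketch of that load pattern on SQL Server 2005+, assuming a support table named dbo.SupportTable (hypothetical name):

```sql
-- 1) Disable nonclustered indexes before the bulk insert.
--    (Don't disable the clustered index: that makes the table inaccessible.)
ALTER INDEX IX_SupportTable_Filter ON dbo.SupportTable DISABLE;

-- 2) ... populate the support table here ...

-- 3) Rebuild: this re-enables the index AND refreshes its statistics in one go.
ALTER INDEX ALL ON dbo.SupportTable REBUILD;

-- 4) Refresh any standalone column statistics before the report queries run.
UPDATE STATISTICS dbo.SupportTable;
```

Since the tables are repopulated for every analysis, the rebuild cost is paid once per load rather than row by row during the inserts.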
I have an ASP.NET MVC 4 application that uses a database in US or Canada, depending on which website you are on.
This program lets you filter job data on various filters and the criteria gets translated to a SQL query with a good number of table joins. The data is filtered then grouped/aggregated.
However, now I have a new requirement: query and do some grouping and aggregation (avg salary) on data in both the Canada Server and US server.
Right now, the lookup tables are duplicated on both database servers.
Here's the approach I was thinking:
Run the query on the US server, run the query again on the Canada server and then merge the data in memory.
Here is one use case: rank the companies by average salary.
In terms of the logic, I am just filtering and querying a job table and grouping the results by company and average salary.
What are some other ways to do this? I was thinking of populating a reporting table with a nightly job and running the queries against that reporting table.
To be honest, the queries themselves are not that fast to begin with; running the query again against the Canada database seems like it would make the site much slower.
Any ideas?
Quite a number of variables here. If you don't have too much data then doing the queries on each DB and merging is fine so long as you get the database to do as much of the work as it is able to (i.e. the grouping, averaging etc.).
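One caution on the merge step: an average of two averages is wrong unless you weight it. Have each server return the pieces of the average instead, then combine in memory. Table and column names here (Job, CompanyId, Salary) are illustrative:

```sql
-- Run the same query against the US and Canada servers; each returns
-- SUM and COUNT rather than AVG so the combined average is exact.
SELECT CompanyId,
       SUM(Salary) AS SalarySum,
       COUNT(*)    AS JobCount
FROM   dbo.Job
WHERE  1 = 1  -- user's filter criteria go here
GROUP BY CompanyId;

-- The application then merges the two result sets per CompanyId:
--   combined avg = (SalarySum_US + SalarySum_CA) / (JobCount_US + JobCount_CA)
```

Ranking companies is then a sort on the combined average in memory, which is cheap compared to the per-server queries themselves.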
Other options include linking your databases and doing a single query, but there are a few downsides to this, including:
Having to link databases
Security associated with a linked database
A single query will require both databases to be online, whereas you can most likely work around that with two queries
Scheduled, prebuilt tables have some advantages & disadvantages but probably not really relevant to the root problem of you having 2 databases where perhaps you should have one (maybe, maybe not).
If the query is quite slow and called many times, then a single snapshot could save you some resources, provided the data "as at" the time of the snapshot is relevant and useful to your business need.
A hybrid is to create an "Indexed View" which can let the DB create a running average for you. That should be fast to query and relatively unobtrusive to keep up to date.
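A rough sketch of that indexed view (names are illustrative, and Salary is assumed NOT NULL). Note that indexed views can't contain AVG directly, so you store SUM and COUNT_BIG and divide at query time:

```sql
CREATE VIEW dbo.vCompanySalary
WITH SCHEMABINDING
AS
SELECT CompanyId,
       SUM(Salary)  AS SalarySum,   -- Salary must be non-nullable here
       COUNT_BIG(*) AS JobCount     -- COUNT_BIG is required in indexed views
FROM   dbo.Job
GROUP BY CompanyId;
GO

-- Materialize the view; SQL Server now maintains it as rows change.
CREATE UNIQUE CLUSTERED INDEX IX_vCompanySalary
    ON dbo.vCompanySalary (CompanyId);
GO

-- Querying the running average (NOEXPAND forces use of the index
-- on non-Enterprise editions):
SELECT CompanyId, SalarySum / JobCount AS AvgSalary
FROM   dbo.vCompanySalary WITH (NOEXPAND);
```

You would still need one such view per server and the in-memory merge, but each per-server query becomes a trivial index scan.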
Hope some of that helps.
I did read posts about transactional and reporting databases.
We have a single table which is used for both reporting (historical) and transactional purposes,
e.g., an order table with fields:
orderid, ordername, orderdesc, datereceived, dateupdated, confirmOrder
Is it a good idea to split this table into neworder and orderhistory?
The neworder table records the current day's transactions (select, insert and update activity every ms) for the orders received on that day. Later we merge this table with order history.
Is this a recommended approach?
Do you think this would minimize the load and processing time on the database?
PostgreSQL supports basic table partitioning which allows splitting what is logically one large table into smaller physical pieces. More info provided here.
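As a sketch, using modern PostgreSQL's declarative syntax (version 10+; older versions need inheritance plus triggers) and the column names from your order table, with illustrative date ranges:

```sql
-- Logically one table, physically split by datereceived.
CREATE TABLE orders (
    orderid      bigint,
    ordername    text,
    orderdesc    text,
    datereceived date NOT NULL,
    dateupdated  date,
    confirmorder boolean
) PARTITION BY RANGE (datereceived);

-- A "hot" partition for the current month and a history partition:
CREATE TABLE orders_current PARTITION OF orders
    FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');
CREATE TABLE orders_history PARTITION OF orders
    FOR VALUES FROM (MINVALUE) TO ('2024-06-01');
```

Applications keep querying `orders` as one table; the planner prunes partitions automatically, so transactional queries touch only the small current partition.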
To answer your second question: No. Moving data from one place to another is an extra load that you otherwise wouldn't have if you used the transactional table for reporting. But there are some other questions you need to ask before you make this decision.
How often are these reports run?
If you are running these reports once an hour, it may make sense to keep them in the same table. However, if this report takes a while to run, you'll need to take care not to tie up resources for the other clients using it as a transactional table.
How up-to-date do these reports need to be?
If the reports are run less than daily or weekly, it may not be critical to have up to the minute data in the reports.
And this is where the reporting table comes in. The approaches I've seen typically involve having a "data warehouse," whether that be implemented as a single table or an entire database. This warehouse is filled on a schedule with the data from the transactional table, which subsequently triggers the generation of a report. This seems to be the approach you are suggesting, and is a completely valid one. Ultimately, the one question you need to answer is when you want your server to handle the load. If this can be done on a schedule during non-peak hours, I'd say go for it. If it needs to be run at any given time, then you may want to keep the single-table approach.
Of course there is nothing saying you can't do both. I've seen a few systems that have small on-demand reports run on transactional tables, scheduled warehousing of historical data, and then long-running reports against that historical data. It's really just a matter of how real-time you want the data to be.
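If you do go the neworder/orderhistory route, the nightly merge can be a simple two-statement job. This sketch uses the table names from the question and PostgreSQL syntax; wrapping both statements in one transaction keeps the move atomic:

```sql
BEGIN;

-- Copy yesterday-and-older rows into the history table ...
INSERT INTO orderhistory (orderid, ordername, orderdesc,
                          datereceived, dateupdated, confirmorder)
SELECT orderid, ordername, orderdesc, datereceived, dateupdated, confirmorder
FROM   neworder
WHERE  datereceived < CURRENT_DATE;

-- ... then remove them from the hot table, using the same predicate.
DELETE FROM neworder
WHERE  datereceived < CURRENT_DATE;

COMMIT;
```

Scheduled during non-peak hours, this keeps the transactional table small while the reporting side sees a stable, append-only history.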
As part of my role at the firm I'm at, I've been forced to become the DBA for our database. Some of our tables have rowcounts approaching 100 million, and many of the things that I know how to do in SQL Server (like joins) simply break down at this level of data. I'm left with a couple of options:
1) Go out and find a DBA with experience administering VLDBs. This is going to cost us a pretty penny and come at the expense of other work that we need to get done. I'm not a huge fan of it.
2) Most of our data is historical data that we use for analysis. I could simply create a copy of our database schema and start from scratch with data, putting on hold any analysis of our current data until I find a proper way to solve the problem (this is my current "best" solution).
3) Reach out to the developer community to see if I can learn enough about large databases to get us through until I can implement solution #1.
Any help that anyone could provide, or any books you could recommend would be greatly appreciated.
Here are a few thoughts, but none of them are quick fixes:
Develop an archival strategy for the data in your large tables. Create tables with similar formats to the existing transactional table and copy the data out into those tables on a periodic basis. If you can get away with whacking the data out of the tx system, then fine.
Develop a relational data warehouse to store the large data sets, complete with star schemas consisting of fact tables and dimensions. For an introduction to this approach there is no better book (IMHO) than Ralph Kimball's Data Warehouse Toolkit.
For analysis, consider using MS Analysis Services for pre-aggregating this data for fast querying.
Of course, you could also look at your indexing strategy within the existing database. Be careful with any changes, as you could add indexes that would improve querying at the cost of insert and transactional performance.
You could also research partitioning in SQL Server.
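On that last point, a minimal sketch of SQL Server range partitioning by date (Enterprise Edition; names and boundary values are illustrative):

```sql
-- Map date ranges to partitions ...
CREATE PARTITION FUNCTION pfByYear (datetime)
    AS RANGE RIGHT FOR VALUES ('2008-01-01', '2009-01-01');

-- ... and partitions to filegroups (all on PRIMARY here for simplicity;
-- older ranges could go to filegroups on slower disks).
CREATE PARTITION SCHEME psByYear
    AS PARTITION pfByYear ALL TO ([PRIMARY]);

-- Create (or rebuild) the large historical table on the scheme.
CREATE TABLE dbo.BigHistory (
    Id        bigint       NOT NULL,
    EventDate datetime     NOT NULL,
    Payload   varchar(100) NULL
) ON psByYear (EventDate);
```

Queries that filter on the partitioning column only touch the relevant partitions, and whole partitions of old data can later be switched out for archival as a metadata operation.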
Don't feel bad about bringing in a DBA on a contract basis to help out...
To me, your best bet would be to begin investigating movement of that data out of the transactional system if it is not necessary for day to day use.
Of course, you are going to need to pick up some new skills for dealing with these amounts of data. Whatever you decide to do, make a backup first!
One more thing you should do is ensure that your I/O is being spread appropriately across as many spindles as possible. Your data files, log files and SQL Server tempdb data files should all be on separate drives with a database system that large.
DBAs are worth their weight in gold, if you can find a good one. They specialize in doing the very thing that you are describing. If this is a one-time problem, maybe you can subcontract one.
I believe Microsoft offers a similar service. You might want to ask.
You'll want to get a DBA in there, at least on contract to performance tune the database.
Joining to a 100-million-record table shouldn't bring the database server to its knees. My company's customers do it many hundreds (possibly thousands) of times per minute on our system.
We have a system that is concurrently inserting a large amount of data from multiple stations while also exposing a data-querying interface. The schema looks something like this (sorry about the poor formatting):
[SyncTable]
SyncID
StationID
MeasuringTime
[DataTypeTable]
TypeID
TypeName
[DataTable]
SyncID
TypeID
DataColumns...
Data insertion is done in a "Synchronization" and goes like this (we only insert data into the system, we never update):
INSERT INTO SyncTable(StationID, MeasuringTime) VALUES (X,Y); SELECT @@IDENTITY
INSERT INTO DataTable(SyncID, TypeID, DataColumns) VALUES
(SyncIDJustInserted, InMemoryCachedTypeID, Data)
... lots (500) similar inserts into DataTable ...
And queries go like this (for a given station, measuring time and data type):
SELECT SyncID FROM SyncTable WHERE StationID = @StationID
AND MeasuringTime = @MeasuringTime
SELECT DataColumns FROM DataTable WHERE SyncID = @SyncIDJustSelected
AND DataTypeID = @TypeID
My question is how can we combine the transaction level on the inserts and NOLOCK/READPAST hints on the queries so that:
We maximize the concurrency in our system while favoring the inserts (we need to store a lot of data, something as high as 2000+ records a second)
Queries only return data from "committed" synchronizations (we don't want a result set with a half-inserted synchronization, or a synchronization with some skipped entries due to lock-skipping)
We don't care if the "newest" data is included in the query; we care more for consistency and responsiveness than for "live" and up-to-date data
These may be very conflicting goals and may require a high transaction isolation level, but I am interested in all tricks and optimizations to achieve high responsiveness on both inserts and selects. I'll be happy to elaborate if more details are needed to flesh out more tweaks and tricks.
UPDATE: Just adding a bit more information for future replies. We are running SQL Server 2005 (probably 2008 within six months) on a SAN with 5+ TB of storage initially. I'm not sure what kind of RAID the SAN is set up with or precisely how many disks we have available.
If you are running SQL 2005 and above, look into implementing snapshot isolation. You will not be able to get consistent results with NOLOCK.
Solving this on SQL 2000 is much harder.
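A minimal sketch of setting it up (the database name is illustrative, and the @-variables stand in for whatever your client code binds):

```sql
-- One-time database settings: enable row versioning.
ALTER DATABASE StationData SET ALLOW_SNAPSHOT_ISOLATION ON;
-- Optionally make plain READ COMMITTED reads use row versions too:
ALTER DATABASE StationData SET READ_COMMITTED_SNAPSHOT ON;

-- Readers opt in per session; they see only data committed as of the
-- start of their transaction and never block (or get blocked by) the inserts.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT SyncID FROM SyncTable
WHERE  StationID = @StationID AND MeasuringTime = @MeasuringTime;
COMMIT;
```

This gives you exactly the "committed synchronizations only" guarantee without NOLOCK's dirty reads or READPAST's skipped rows, at the cost of some tempdb version-store overhead.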
This is a great scenario for SQL Server 2005/2008 Enterprise's Partitioning feature. You can create a partition for each StationID, and each StationID's data can go into its own filegroup (if you want, may not be necessary depending on your load.)
This buys you some advantages with concurrency:
If you partition by StationID, then users can run select queries for StationIDs that aren't currently loading, and they won't run into any concurrency issues at all
If you partition by StationID, then multiple stations can insert data simultaneously without concurrency issues (as long as they're on different filegroups)
If you partition by SyncID range, then you can put the older data on slower storage.
If you partition by SyncID range, AND if your ranges are small enough (meaning not a range with thousands of SyncIDs), then you can do loads at the same time your users are querying without running into concurrency issues
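A sketch of the load pattern this enables (all names and the partition number are hypothetical, and the staging table must match the partitioned table's schema, constraints and filegroup): bulk-load into a staging table, then switch it in as a near-instant metadata-only operation.

```sql
-- Load a station's batch into an empty staging table; readers of the
-- main table are unaffected while this runs.
BULK INSERT dbo.DataTable_Staging
FROM 'c:\loads\station_batch.dat';

-- Metadata-only swap of the staging table into the target partition.
ALTER TABLE dbo.DataTable_Staging
    SWITCH TO dbo.DataTable PARTITION 42;
```

Readers querying other partitions never see the load in progress, which is the concurrency win described above.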
The scenario you're describing has a lot in common with data warehouse nightly loads. Microsoft did a technical reference project called Project Real that you might find interesting. They published it as a standard, and you can read through the design docs and the implementation code in order to see how they pulled off really fast loads:
http://www.microsoft.com/technet/prodtechnol/sql/2005/projreal.mspx
Partitioning is even better in SQL Server 2008, especially around concurrency. It's still not a silver bullet - it requires manual design and maintenance by a skilled DBA. It's not a set-it-and-forget-it feature, and it does require Enterprise Edition, which costs more than Standard Edition. I love it, though - I've used it several times and it's solved specific problems for me.
What type of disk system will you be using? If you have a large striped RAID array, writes should perform well. If you can estimate your required reads and writes per second, you can plug those numbers into a formula and see if your disk subsystem will keep up. Maybe you have no control over hardware...
Wouldn't you wrap the inserts in a transaction, which would make them unavailable to the reads until the insert is finished?
This should follow if your hardware is configured correctly and you're paying attention to your SQL coding, which it seems you are.
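To make that concrete, here is the question's synchronization wrapped in a single transaction (variables are illustrative; SCOPE_IDENTITY() is a safer alternative to @@IDENTITY if triggers are ever added):

```sql
BEGIN TRANSACTION;

INSERT INTO SyncTable (StationID, MeasuringTime)
VALUES (@StationID, @MeasuringTime);
SET @SyncID = SCOPE_IDENTITY();

INSERT INTO DataTable (SyncID, TypeID, DataColumns)
VALUES (@SyncID, @TypeID, @Data);
-- ... the remaining ~500 DataTable inserts for this synchronization ...

COMMIT TRANSACTION;
-- Readers at READ COMMITTED (or snapshot isolation) see none of these rows
-- until the COMMIT, so a synchronization is all-or-nothing to queries.
```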
Look into the SQLIO.exe and SQLIOStress.exe tools:
SQLIOStress.exe
SQLIOStress.exe simulates various patterns of SQL Server 2000 I/O behavior to ensure rudimentary I/O safety.
The SQLIOStress utility can be downloaded from the Microsoft Web site. See the following article.
• How to Use the SQLIOStress Utility to Stress a Disk Subsystem such as SQL Server
http://support.microsoft.com/default.aspx?scid=kb;en-us;231619
Important The download contains a complete white paper with extended details about the utility.
SQLIO.exe
SQLIO.exe is a SQL Server 2000 I/O utility used to establish basic benchmark testing results.
The SQLIO utility can be downloaded from the Microsoft Web site. See the following:
• SQLIO Performance Testing Tool (SQL Development) – Customer Available
http://download.microsoft.com/download/f/3/f/f3f92f8b-b24e-4c2e-9e86-d66df1f6f83b/SQLIO.msi