I have a Postgres (version 9.4.5) database running on CentOS 6.7.
How would I be able to upgrade my current Postgres distribution to v9.5.x?
This must be done with minimal risk to the data and minimal downtime, as this server is currently in production.
I have considered installing a separate instance of Postgres 9.5 and swapping the balancer to point to the new instance, then replaying inserts sequentially in the background. However, this would cause problems with serial (sequence-backed) columns, so it is definitely not the ideal scenario.
For anyone curious, this change is being made because Postgres 9.4.5 has no way to perform an upsert without risking data loss and/or concurrency errors.
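For reference, the atomic upsert added in 9.5 looks like this (the table and column names are illustrative placeholders):

    -- PostgreSQL 9.5+: atomic upsert via ON CONFLICT.
    -- Requires a unique index or constraint on the conflict target (id).
    INSERT INTO metrics (id, hits)
    VALUES (42, 1)
    ON CONFLICT (id)
    DO UPDATE SET hits = metrics.hits + EXCLUDED.hits;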
It should be noted that the database is currently 300GB in size and is mainly used for data warehousing.
The hardware specifications are:
CPU: http://ark.intel.com/products/83356/Intel-Xeon-Processor-E5-2630-v3-20M-Cache-2_40-GHz
RAM: 128GB
STORAGE: 1.5TB PCIE SSD
Related
We are running Dynamics GP 2010 on two load-balanced Citrix servers. For the past three weeks we have had severe performance hits when users run Fixed Assets reporting.
The database is large, but when I run the reports locally on the SQL server they run great. The SQL server seems to be performing adequately even while users are seeing slow performance.
Any ideas?
Just because your DB seems unstressed does not mean it is fine; it could contain other bottlenecks. Typically, if a DB server never maxes out its CPUs, even occasionally, it means there is a much bigger problem elsewhere.
The standard process for troubleshooting performance problems in a data-driven app goes like this:
Tune DB indexes. If you haven't tried it yet, the Database Engine Tuning Advisor (launchable from SSMS) is a great starting point.
Check resource utilization: CPU and RAM. If your CPU is maxed out, consider adding/upgrading CPUs, optimizing code, or splitting your tiers. If your RAM is maxed out, consider adding RAM or splitting your tiers.
Check HDD usage: if your disk queue length goes above 1 very often (more than once per 10 seconds), upgrade disk bandwidth or scale out your disks (RAID, multiple MDF/LDF files, DB partitioning). A query-based check is sketched after this list.
Check network bandwidth
Check for problems on your app (Dynamics) server
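If you prefer checking I/O health from inside SQL Server rather than watching Perfmon queue-length counters, the standard file-stats DMV gives a rough per-file latency picture; this is only a sketch, and the thresholds you act on are a judgment call:

    -- Average I/O stall per read/write, per database file.
    -- Sustained double-digit milliseconds on a busy file suggests
    -- the disk subsystem is a bottleneck.
    SELECT DB_NAME(vfs.database_id) AS db,
           vfs.file_id,
           vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
           vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    ORDER BY avg_read_ms DESC;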
Shared report dictionaries are the bane of reporting in GP. They do tend to slow things down, and modifying reports becomes impossible because somebody always has the dictionary open.
Use local report dictionaries and have a system to keep them synced with a "master" reports.dic.
One of our customers has been on SQL Server 2012 Express for the last couple of years, and their database has grown to 1 GB. At any given time they have around 15 or more workstations connected to it. They have a dedicated 2008 server for this database. Occasionally I see slow responses, but most of the time it is fine. I cannot tell whether suggesting SQL Server Standard would improve performance or whether it would be a waste of money. Can anybody suggest what to check before I make this decision?
In Task Manager there are two sqlservr.exe processes, both using 0% CPU, but one is using 2.2 GB of memory and the other 68 MB.
Am I already pushing the envelope too far?
Please advise.
This cannot be answered without knowing how the system was developed. The vast majority of slowness issues I have run across in many years of database work come down to inefficient code or missing indexes. A higher edition of SQL Server won't fix either of those two issues.
Your problem could also be caused by physical equipment that is reaching its limits, or by network issues.
You are nowhere near SQL Server Express's data storage cap (10 GB per database in recent versions), so I would investigate other things first, as SQL Server Standard edition is quite a bit more expensive.
Your best bet would be to get a good book on performance tuning and read it. There are hundreds of things that can cause slowness.
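On the missing-index point: SQL Server records the indexes its optimizer wished it had. This sketch uses the standard missing-index DMVs; treat the output as hints to review, not scripts to run blindly:

    -- Top suggestions by estimated impact since the last restart.
    SELECT TOP (10)
           d.statement AS table_name,
           d.equality_columns,
           d.inequality_columns,
           d.included_columns,
           s.user_seeks,
           s.avg_user_impact
    FROM sys.dm_db_missing_index_details AS d
    JOIN sys.dm_db_missing_index_groups AS g
         ON g.index_handle = d.index_handle
    JOIN sys.dm_db_missing_index_group_stats AS s
         ON s.group_handle = g.index_group_handle
    ORDER BY s.user_seeks * s.avg_user_impact DESC;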
I'm looking at solutions to store a massive quantity of information while consuming as little disk space as possible.
The information structure is very simple and the queries will also be very simple.
I've looked at solutions like Apache Cassandra and relational databases, but couldn't find a comparison that mentions disk usage.
Any ideas on this would be great.
Speaking of Apache Cassandra - it is a disk-space hog. 200 MB of logs turned into 1.2 GB of files produced by Cassandra, and the keyspace was just 4 columns of 200-character strings.
Take a look at Oracle Berkeley DB - very simple robust database (key/value):
"Berkeley DB enables the development of custom data management solutions, without the overhead traditionally associated with such custom projects. Berkeley DB provides a collection of well-proven building-block technologies that can be configured to address any application need from the handheld device to the datacenter, from a local storage solution to a world-wide distributed one, from kilobytes to petabytes."
Redis might be worth a look if you can store your data as key-value pairs.
The newest version of Microsoft SQL Server (2008) supports several kinds of compression (row compression and page compression, in addition to backup compression). Might be worth investigating.
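As a hedged sketch, you would estimate the savings first and then rebuild with compression. dbo.Events is a hypothetical table here, and page compression in 2008 requires Enterprise edition:

    -- Estimate how much space page compression would save.
    EXEC sp_estimate_data_compression_savings
         @schema_name = 'dbo', @object_name = 'Events',
         @index_id = NULL, @partition_number = NULL,
         @data_compression = 'PAGE';

    -- If the numbers look good, rebuild the table compressed.
    ALTER TABLE dbo.Events REBUILD WITH (DATA_COMPRESSION = PAGE);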
Some relevant resources:
Linchi Shea shows that compression can sometimes improve performance
Official MS Best Practices doc for SQL 2008 compression
We have a 500 GB database that performs about 10,000 writes per minute.
This database has a requirement for real-time reporting. To service this need we have 10 reporting databases hanging off the main server.
The 10 reporting databases are all fed from the 1 master database using transactional replication.
The issue is that the server and replication are starting to fail, with heavy PAGEIOLATCH_SH waits - these seem to be caused by the master database being overworked. We are upgrading the server to a quad-processor, quad-core machine.
As this database and the need for reporting are only going to grow (20% per month), I wanted to know: should we start looking at hardware (or another third-party application) to manage the replication, and if so, what should we use? Or should we change the topology from the master replicating to each of the reporting databases to the master replicating to reporting server 1, reporting server 1 replicating to reporting server 2, and so on?
Ideally the solution will cover us up to a 1.5 TB database with 100,000 writes per minute.
Any help greatly appreciated
One common model is to have your main database replicate to 1 other node, then have that other node deal with replicating the data out from there. It takes the load off your main server and also has the benefit that if, heaven forbid, your reporting system's replication does max out it won't affect your live database at all.
I haven't gone much further than a handful of replicated hosts, but if you add enough nodes that your distribution node can't replicate it all, it's probably sensible to expand the hierarchy so that your distributor is itself replicated to other distributors, which then replicate to the nodes you report from.
How many databases you can replicate off a single node will depend on how up to date your reporting data needs to be (e.g. whether replicating once a day is fine or whether you need it up to the second) and how much data you're replicating at a time. It might be worth some experimentation to find out exactly how many nodes one distributor could power if it didn't have the overhead of actually running your main services.
Depending on what you're inserting, a load of 100,000 writes/min is pretty light for SQL Server. In my book, I show an example that generates 40,000 writes/sec (2.4M/min) on a machine with simple hardware. So one approach might be to see what you can do to improve the write performance of your primary DB, using techniques such as batch updates, multiple writes per transaction, table valued parameters, optimized disk configuration for your log drive, etc.
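To make "multiple writes per transaction" concrete, here is a minimal illustration (the table and batch size are invented): committing once per 1,000 rows means one log flush per batch instead of one per statement.

    -- Illustrative only: dbo.Readings is a made-up table.
    BEGIN TRANSACTION;
    DECLARE @i int = 0;
    WHILE @i < 1000
    BEGIN
        INSERT INTO dbo.Readings (sensor_id, reading)
        VALUES (@i % 50, RAND() * 100);
        SET @i += 1;
    END;
    -- A single commit (and log flush) for the whole batch.
    COMMIT TRANSACTION;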
If you've already done as much as you can on that front, the next question I have is: what kind of queries are you running that require 10 reporting servers? That seems unusual, even for pretty large sites. There may be a bunch you can do to optimize there too, such as offloading aggregation queries to Analysis Services or improving disk throughput. While you can manage it, scaling up is usually a better way to go than scaling out.
I tend to view replication as a "solution of last resort." Once you've done as much optimization as you can, I would look into horizontal or vertical partitioning for your reporting requirements. One reason is that partitioning tends to result in better cache utilization, and therefore higher total throughput.
If you finally get to the point where you can't escape replication, then the hierarchical approach suggested by fyjham is definitely a reasonable one.
In case it helps, I cover most of these issues in depth in my book:
Ultra-Fast ASP.NET.
Check that your publisher and distributor's transaction log files don't have too many VLFs (Virtual Log Files) as detailed here (step 8):
http://www.sqlskills.com/BLOGS/KIMBERLY/post/8-Steps-to-better-Transaction-Log-throughput.aspx
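A quick way to count VLFs on a pre-SQL 2012 instance (run it in the publisher and distribution databases):

    -- Each row returned is one virtual log file. Many hundreds of
    -- rows usually means the log grew in lots of small increments.
    DBCC LOGINFO;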
If your distribution database is co-located with your publisher database, consider moving it to its own dedicated server.
Our Postgres server is about to hit its capacity and we're looking into adding a second database server. Are there any scaling solutions that are particularly good for a Postgres setup?
You are looking at a limited set of choices, very dependent on what your specific requirements are (read-to-write ratios and how tolerant your application is of occasional inconsistent reads [synchronous vs. asynchronous replication? master-slave vs. multi-master?], how strongly connected your tables are [clustering], etc.)
http://www.postgresql.org/download/products/3
http://pgpool.projects.postgresql.org/
http://www.slony.info/
UPDATE
Over six years have elapsed since the original answer. Please refer to the High Availability, Load Balancing, and Replication chapter in the PostgreSQL documentation for the latest solutions available to you.
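If you end up on built-in streaming replication (the most common master-standby setup today), the primary-side configuration is only a few settings. This is a sketch with 9.x-era parameter names; the standby side is omitted:

    -- ALTER SYSTEM needs PostgreSQL 9.4+; wal_level and
    -- max_wal_senders require a server restart to take effect.
    ALTER SYSTEM SET wal_level = 'hot_standby';
    ALTER SYSTEM SET max_wal_senders = 3;     -- one per standby, plus headroom
    ALTER SYSTEM SET wal_keep_segments = 64;  -- WAL retained for lagging standbys
    -- The standby still needs a base backup (pg_basebackup), a recovery
    -- configuration pointing at the primary, and a replication entry in
    -- pg_hba.conf, all omitted here.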
Did you check what your bottleneck is? What are the queries that make your server work hard? Maybe they can be tuned better.
If tuning does not help, it is often much easier to upgrade the server than to set up replication: add disks in RAID 1 or RAID 10, add RAM, or move to more cores and a faster processor. A good RAID controller with a battery-backed cache would make a big difference too.
Replication is good for high availability, but a bigger server will often be more cost-effective if you have performance problems.
Postgres Advanced Server and Continuent Tungsten are also worth looking into for an enterprise-class solution.