This question may seem awkward, but I searched for a whole week to understand the difference between Netezza and PureData. I would appreciate it if anyone could help me.
If you could provide a link, that would be even better.
PureData is a family of IBM products. In the family, there are:
IBM PureData for Analytics - Formerly known as Netezza. It is based on Netezza and is aimed at business intelligence and data warehousing (OLAP) applications.
IBM PureData for Operational Analytics - Formerly known as IBM Smart Analytics System. It is based on DB2 DPF and is aimed at real-time decision making and data warehousing (OLAP) applications.
IBM PureData for Transactions - It is based on DB2 pureScale and is aimed at transactional (OLTP) applications.
IBM PureData for Hadoop - It is based on InfoSphere BigInsights and is aimed at Big Data applications.
The simple answer is that Netezza and PureData System for Analytics (aka PDA) are the same thing.
The longer answer is that Netezza (a company) produced a data warehouse appliance (also commonly referred to as Netezza). IBM acquired Netezza in 2010, and subsequently re-branded the appliance as the PureData System for Analytics. Within IBM the PureData family of systems is an umbrella for their appliances and what they refer to as Expert Integrated Systems.
The software that runs on today's PDA systems is still known as Netezza Performance Server (aka NPS).
This link is somewhat dated, but explains the rebranding simply, and also provides this link to the current product page.
I know that Snowflake takes away the headache of managing servers and sizing with its Virtual Warehouse concept, but I wanted to know the physical specs of each individual server that Snowflake uses as part of its Virtual Warehouses (VW clusters). Can someone help?
There's no official documentation for the physical specs of each warehouse.
Part of Snowflake's goal is to keep warehouse performance equivalent across the three supported clouds. The specific hardware used for each instance will keep changing as Snowflake works to optimize users' experience and the offerings of each platform.
https://docs.snowflake.com/en/user-guide/warehouses-overview.html
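To illustrate how abstract the sizing is: a warehouse is declared only by a "t-shirt size", never by instance types or hardware specs. A minimal sketch using the snowflake-connector-python package follows; the account, credentials, and warehouse name are all placeholders.

    # Sketch: warehouses are defined only by abstract size, not by hardware specs.
    # Requires: pip install snowflake-connector-python
    # Account/user/password values below are placeholders.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account",   # placeholder account identifier
        user="my_user",
        password="my_password",
    )
    cur = conn.cursor()

    # The only sizing knob is the t-shirt size; the physical servers behind it
    # are chosen (and periodically changed) by Snowflake.
    cur.execute("""
        CREATE WAREHOUSE IF NOT EXISTS demo_wh
          WITH WAREHOUSE_SIZE = 'MEDIUM'
               AUTO_SUSPEND = 60
               AUTO_RESUME = TRUE
    """)

    # Scaling up is just another ALTER; no instance types are ever exposed.
    cur.execute("ALTER WAREHOUSE demo_wh SET WAREHOUSE_SIZE = 'LARGE'")

    cur.close()
    conn.close()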
In any case, this question seems to be a duplicate of What are the specifications of a Snowflake server?.
I need to import into HANA data that is currently stored in Oracle 9/10 and Informix 11 databases. What is the best practice supported by SAP to achieve this?
Is SAP HANA Smart Data Integration with JDBC the right tool?
Ok, so this is a one-time data transfer.
The option probably best suited for this is CSV export/import.
It's well supported by all DBMSs and is commonly well-optimized in terms of data throughput.
But the OP wrote that this is not the preferred option.
Depending on available licenses, HANA's Smart Data Integration (SDI) feature allows using arbitrary JDBC clients to connect. That could be a viable option.
A caveat here is that data type/precision differences might go unnoticed due to the built-in JDBC-to-JDBC mapping.
Don't get me wrong here. SDI definitely works a treat, but data migration is never a click-and-forget thing.
Having worked extensively with both Oracle and HANA (and a little bit with Informix a long time ago), I'd still go with CSV for this one-time effort and use SDI for on-going integration.
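For completeness, here is a rough sketch of what the export half of the CSV route could look like. The source table, credentials, and file path are placeholders, and the cx_Oracle driver is just one example; on the HANA side the file can then be loaded with the IMPORT FROM CSV FILE statement (or whatever import tooling you prefer).

    # Rough sketch of a one-time CSV transfer (Oracle -> CSV -> HANA).
    # Table name, credentials, and paths are placeholders.
    import csv
    import cx_Oracle  # pip install cx_Oracle

    conn = cx_Oracle.connect("scott", "tiger", "orahost/ORCL")  # placeholder DSN
    cur = conn.cursor()
    cur.execute("SELECT * FROM inventory")  # hypothetical source table

    with open("/tmp/inventory.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])  # header row
        for row in cur:
            writer.writerow(row)

    cur.close()
    conn.close()

    # On the HANA side the file can then be loaded with something like:
    #   IMPORT FROM CSV FILE '/tmp/inventory.csv' INTO "MYSCHEMA"."INVENTORY"
    #   WITH RECORD DELIMITED BY '\n' FIELD DELIMITED BY ',' SKIP FIRST 1 ROW;
    # Check the import log for rows rejected due to data type or precision mismatches.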
I'm trying to find a database solution and I came across Infobright and Amazon Redshift as potential solutions. Both are columnar databases. Infobright has been around for quite some time, whereas Amazon Redshift is newer.
What is the DBA effort between Infobright and Amazon Redshift?
How accessible is Infobright (API, query interface, etc.) vs AWS?
Where do both sit in your system architecture? Do they operate as a layer on top of your traditional RDBMS?
What is the DevOps effort to set up both Infobright and Redshift?
I'm leaning a bit more towards Redshift because my application is hosted on AWS and I thought this would create tangible benefits in the long-run since everything is in AWS. Thank you in advance!
Firstly, I'll admit that I work for Infobright. I've done significant research into Redshift, and I feel I can give an honest opinion. I just wrote up a comparison between the two technologies; it can be found here: https://www.infobright.com/wp-content/plugins/download-monitor/download.php?id=37
DBA Effort - Infobright requires very little administration. There is no indexing, and you don't need to partition, etc. It's an SMP architecture and scales well, so you won't be dealing with multiple nodes. Redshift is also fairly simple, but you will need to maintain sort order (VACUUM) and ensure ANALYZE is run often enough.
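As a hedged illustration of that Redshift maintenance (the host, credentials, and table name are placeholders; psycopg2 works because Redshift speaks the PostgreSQL wire protocol):

    # Sketch of the routine Redshift housekeeping mentioned above.
    import psycopg2

    conn = psycopg2.connect(
        host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",  # placeholder
        port=5439, dbname="analytics", user="admin", password="secret",
    )
    conn.autocommit = True  # VACUUM cannot run inside a transaction block
    cur = conn.cursor()

    cur.execute("VACUUM sales;")   # re-sort rows and reclaim space
    cur.execute("ANALYZE sales;")  # refresh planner statistics

    cur.close()
    conn.close()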
Infobright uses a MySQL shell, so any tool that can work with MySQL can work with Infobright; you have the same set of tools/interfaces/APIs for Infobright as you do with MySQL. Redshift does have a SQL interface and some API capabilities, but it does require that you load directly from S3. Infobright loads from flat files and named pipes on local or remote servers.
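For example, a typical Redshift bulk load is a COPY from S3; the bucket, table, and IAM role below are placeholders, and the connection details are as in the previous sketch.

    # Sketch of a typical Redshift bulk load: COPY from S3.
    import psycopg2

    conn = psycopg2.connect(host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",
                            port=5439, dbname="analytics", user="admin", password="secret")
    with conn, conn.cursor() as cur:
        cur.execute("""
            COPY sales
            FROM 's3://my-bucket/exports/sales/'
            IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
            CSV GZIP;
        """)
    # Infobright, by contrast, would ingest the same data from a local flat file
    # or named pipe using MySQL-style LOAD DATA INFILE mechanics.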
Both databases are analytic databases. You would not want to use either as a transactional database. Instead, you typically push data from your transactional system to your analytic database.
The DevOps effort to set up Infobright will be lower than for Redshift. However, Redshift is not overly complicated either. Maintenance of the environment is more of an ongoing requirement for Redshift, though.
Infobright does have many installations running on AWS. In fact, we have implementations that approach nearly 100 TB of raw storage on one server. That said, Redshift with many nodes can take an implementation to petabyte scale.
There are other factors that can impact your choice. For example, Redshift has very nice failover/HA options built in. On the flip side, Infobright can support many concurrent queries and users, whereas Redshift limits concurrent queries to 15 regardless of cluster size.
Take a look at the document, and feel free to contact me if you have any specific questions about either technology.
I have a mobile application (Windows CE) running on a Motorola Symbol 3090 that should allow scanning of inventory items and changing their properties on our SQL Server (the table has existed on the server for years, and now they want a way to use mobile devices to update it).
Here is the problem I am facing that needs addressing.
Our warehouse is very large and spans multiple locations, so inevitably we have dead zones where a constant connection is not possible. What I have proposed is a way to go into offline mode with an up-to-date copy of our inventory on the local device. This would allow all transactions to be recorded locally; when the device is returned to its cradle or comes back into Wi-Fi range, it updates the database. With this proposal, I'm not sure whether SQL replication is the best way to handle this type of application.
I was hoping some more experienced mobile-device developers might have input on the design.
I have only been developing on these types of systems (Motorola Symbol 3090) for about two months and have no background in SQL replication. I understand the basics of what replication does, but that is about the extent of my knowledge on the subject.
As ErikEJ points out in the comment above, Merge Replication would work well for this scenario. Here are some resources to get you started:
MSDN's 'explanation'
Rob Tiffany's Book
Chris Fairbairn's blog
Erik's Library
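If you just want to see the general shape of the offline, store-and-forward pattern independent of the Merge Replication plumbing, here is a very rough, platform-agnostic sketch. The local schema, the sync callable, and the table names are all hypothetical; on the actual device, Merge Replication (via the SQL Server Compact APIs) handles this bookkeeping for you.

    # Very rough sketch of the store-and-forward idea: record scans locally
    # while offline, push them when connectivity returns.
    import sqlite3

    local = sqlite3.connect("scans.db")
    local.execute("""CREATE TABLE IF NOT EXISTS pending_scans (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        item_barcode TEXT,
        new_location TEXT,
        scanned_at TEXT DEFAULT CURRENT_TIMESTAMP,
        synced INTEGER DEFAULT 0)""")

    def record_scan(barcode, location):
        """Always write locally first; works with or without connectivity."""
        local.execute("INSERT INTO pending_scans (item_barcode, new_location) VALUES (?, ?)",
                      (barcode, location))
        local.commit()

    def sync_to_server(push_row):
        """When back on Wi-Fi / in the cradle, push unsynced rows to the server.
        `push_row` is a hypothetical callable that writes one row to SQL Server."""
        cur = local.execute(
            "SELECT id, item_barcode, new_location FROM pending_scans WHERE synced = 0")
        for row_id, barcode, location in cur.fetchall():
            push_row(barcode, location)  # e.g. an UPDATE against the central table
            local.execute("UPDATE pending_scans SET synced = 1 WHERE id = ?", (row_id,))
        local.commit()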
As a developer using DB2 for the first time, I'm not familiar with what the best database performance analysis tools are for it.
I'm wondering what others have found useful in terms of tools that come with DB2, and any third-party tools available for it.
E.g., is anything notably better than the others for things like query planning, CPU measurement, index usage, etc.?
You don't specify which version/release of DB2 you're running, or whether you're running the mainframe (z/OS) version or DB2 for Linux, UNIX, and Windows (also known as DB2 for LUW).
If you're running DB2 on z/OS, talk to your DBA and you'll find out exactly which monitoring and analysis tools have been licensed.
If it's DB2 for LUW you're using, there are various structures and routines you can access directly in DB2 to capture detailed performance information. IBM adds more of these features with each new DB2 release (e.g. version 9.5 vs. 9.7), so be sure to access the version of the documentation for your specific release. Here is the monitoring guide for 9.5 and here is the 9.7 monitoring guide.
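As one hedged example of those built-in structures: in DB2 9.7 the MON_GET_* table functions expose detailed metrics through plain SQL, which you can query with the ibm_db driver. The connection string below is a placeholder.

    # Sketch: pulling a few table-level metrics from DB2 9.7's monitoring
    # table functions via the ibm_db driver. DSN values are placeholders.
    import ibm_db

    conn = ibm_db.connect(
        "DATABASE=SAMPLE;HOSTNAME=db2host;PORT=50000;UID=db2inst1;PWD=secret;", "", ""
    )

    # MON_GET_TABLE(schema, table, member): NULLs and -2 mean "all tables, all members".
    sql = """
    SELECT TABSCHEMA, TABNAME, ROWS_READ, TABLE_SCANS
    FROM TABLE(MON_GET_TABLE(NULL, NULL, -2)) AS t
    ORDER BY ROWS_READ DESC
    FETCH FIRST 10 ROWS ONLY
    """
    stmt = ibm_db.exec_immediate(conn, sql)

    row = ibm_db.fetch_assoc(stmt)
    while row:
        print(row["TABSCHEMA"], row["TABNAME"], row["ROWS_READ"], row["TABLE_SCANS"])
        row = ibm_db.fetch_assoc(stmt)

    ibm_db.close(conn)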
The challenge will be to capture and analyze that performance data in some useful way. BMC, CA, DBI, IBM, and even HP have very good third-party tools to help you do that. Some of them are even free.
On the open-source side, monitors from GroundWork Open Source and Hyperic HQ Open Source have some DB2 support, but you'll need to spend some time configuring either of those environments to access your DB2 server.
Many of the tools mentioned above track some combination of DB2 health and performance indicators, and may even alert you when something about DB2 or its underlying server has entered a problem status. You will face choices over what to use as the criteria for triggering alerts, versus the KPIs you simply want to capture without ever alerting.
There are a lot of monitoring tools out there that can be taught how to watch DB2, but one of the most versatile and widely used is RRDtool, either on its own with a collection of custom DB2 scripts, or as part of a Cacti or Munin installation, which automates many (but not all) aspects of working with RRDtool. The goal of RRDtool is to capture any kind of numeric time-series data so it can be rendered into various graphs; it has no built-in alerting capabilities. Implementing RRDtool involves choosing and describing the data points you intend to capture and allocating RRDtool data files to store them. I use it a lot to identify baseline performance and resource-utilization trends for a database or an application. The PNG images it produces can be integrated into a wide variety of IT dashboards, provided those dashboards are customizable.
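To illustrate the custom-script approach (the RRD layout, the metric, and the connection string are all hypothetical), a small script run from cron every few minutes can sample one DB2 number and hand it to rrdtool update:

    # Hypothetical sketch: sample one DB2 metric and feed it to RRDtool.
    # Assumes an RRD was created beforehand, e.g. with:
    #   rrdtool create db2_rows_read.rrd --step 300 \
    #       DS:rows_read:COUNTER:600:0:U RRA:AVERAGE:0.5:1:2016
    # Run from cron every 5 minutes; the connection string is a placeholder.
    import subprocess
    import ibm_db

    conn = ibm_db.connect(
        "DATABASE=SAMPLE;HOSTNAME=db2host;PORT=50000;UID=db2inst1;PWD=secret;", "", ""
    )
    stmt = ibm_db.exec_immediate(
        conn, "SELECT SUM(ROWS_READ) AS ROWS_READ FROM TABLE(MON_GET_TABLE(NULL, NULL, -2)) AS t"
    )
    rows_read = int(ibm_db.fetch_assoc(stmt)["ROWS_READ"])
    ibm_db.close(conn)

    # 'N' means "now"; RRDtool appends the value to the time-series file,
    # and graphs are rendered later with `rrdtool graph`.
    subprocess.run(["rrdtool", "update", "db2_rows_read.rrd", f"N:{rows_read}"], check=True)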