According to IDG, when customers consider updating to the latest release of a product, they expect new features, enhanced security, and better performance, but increasingly want a more streamlined upgrade process. With each new release of CDP Private Cloud, this is exactly what we strive to deliver. Along with a host of new features and capabilities, we are improving the upgrade process to be as painless as possible. In this blog we will cover the new features in the 7.1.6 release and the new in-place upgrade from HDP that completely does away with replacing infrastructure and data migrations.
The CDP Private Cloud Base stack, as shown below, has provided an upgrade path for customers from CDH 5.13 – 5.16 and HDP 2.6.5. With this release, we also support upgrades from HDP 3.1.5.
CDP Private Cloud Base 7.1.6 delivers benefits in the following categories:
- Better Upgrade Support
- Support in-place upgrades from HDP 3.1.5 to CDP Private Cloud Base with enhancements to automated tools to easily transition from Ambari to Cloudera Manager.
- Support rollbacks of upgrades from HDP 2.6.5 and CDH 5.13 – 5.16, with supporting documentation
- Improved Fair Scheduler to Capacity Scheduler conversion with an enhanced tool that handles more complex placement rules.
- Platform Enhancements
- The new YARN placement rules engine improves Fair Scheduler to Capacity Scheduler conversion and simplifies placement rule management
- Automatic Dynamic Queue support for parent and child queues
- Addition of weight mode to support easier transition for CDH customers.
- Added support for standalone NiFi/Kafka clusters
- We’ve added OS support for RHEL / CentOS 7.9 and DB support for MySQL 8 and PostgreSQL 12 to further assist with migrations.
- Object Store
- Ozone is a distributed Key Value Object Store that provides 20x the scalability of traditional HDFS and reduces cluster sprawl, removes small files limitation, and simplifies cluster management.
- Ozone supports dense node configurations of 350 TB, which increases the current usable storage capacity by 350% compared to HDFS and reduces storage cost by 50%.
- SDX – Security and Governance
- Ranger audit filters provide better audit management. JSON-defined filters control which audit events are captured, reducing audit volume by including only relevant events.
- Improvements to the Ranger Audit UI, such as adjustable columns and an option to select which columns are visible in the UI.
- Data Engineering
- Adopt Spark 3, now released as a separate parcel, and double your performance
- Hive Warehouse Connector (HWC) makes data engineering simpler and faster.
- Better Hive-Spark interaction with HWC which makes data engineering applications simpler and more efficient to create.
- Data Warehouse
- Directed Acyclic Graphs (DAGs) and data transfer primitives with Hive on Tez improve query performance compared to traditional MapReduce.
- Impala improvements increase performance between 2x and 7x.
- Faster Hive queries with materialized views and query caching
- We’ve enabled role related statements in Impala to enable the use of Ranger as the authorization provider instead of Sentry for CDH users. Details available here.
- Operational Database
- Added transaction support with Phoenix 5.1
- Supports both SQL and NoSQL with 15 – 20% better throughput.
- Support for complex cross-row and cross-table distributed transactions (enough to run TPC-C benchmarks), alongside support for ANSI SQL, makes it easy to migrate from MySQL databases to Operational Database.
- We have added “OpDB powered by Apache Accumulo”, based on Accumulo 2.0, enabling HDP customers that utilize Accumulo to upgrade to CDP Private Cloud Base with features such as semantic versioning, bulk imports, and simplified scripts.
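To make the Ranger audit filters from the list above a little more concrete, here is a minimal Python sketch of how an ordered list of JSON filters could decide whether an event is audited. The filter entries, the event fields, and the first-match-wins evaluation are assumptions for illustration and should be checked against the Ranger documentation; this is not Ranger's implementation.

```python
import json

# Hypothetical filter list. Field names follow the style of Ranger's JSON
# audit filters, but treat them as assumptions rather than a reference.
AUDIT_FILTERS = json.loads("""
[
  {"accessResult": "DENIED", "isAudited": true},
  {"users": ["hive"], "actions": ["listStatus"], "isAudited": false}
]
""")


def should_audit(event, filters, default=True):
    """Return the isAudited value of the first filter that matches the event."""
    for f in filters:
        if "accessResult" in f and f["accessResult"] != event["accessResult"]:
            continue
        if "users" in f and event["user"] not in f["users"]:
            continue
        if "actions" in f and event["action"] not in f["actions"]:
            continue
        return f["isAudited"]
    # No filter matched: fall back to the default (assumed here to be "audit").
    return default


noisy = {"user": "hive", "action": "listStatus", "accessResult": "ALLOWED"}
denied = {"user": "hive", "action": "listStatus", "accessResult": "DENIED"}
print(should_audit(noisy, AUDIT_FILTERS))
print(should_audit(denied, AUDIT_FILTERS))
```

In this sketch, routine directory listings by a service user are dropped from the audit trail, while any denied access is always kept because its filter comes first.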
Now let’s draw your attention to 3 of these features and expand on what they bring to the platform.
Platform – HDP 3 in place upgrade enhancements
HDP 3.1.5 customers can now upgrade their HDP 3 clusters directly to CDP Private Cloud Base without having to build a new cluster and migrate workloads or data.
The upgrade path consists of the following steps:
A new version of the AM2CM tool (1.2.0) has been created to support the transition from an Ambari managed cluster to Cloudera Manager managed cluster.
The AM2CM tool takes the Ambari 2.7.5 blueprint as input and converts it to a Cloudera Manager deployment template. Next, customers can migrate the deployment template to Cloudera Manager, which lets them start the CDP cluster through Cloudera Manager.
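Before handing a blueprint to AM2CM, it can help to take stock of what the blueprint describes. The Python sketch below does not call or imitate AM2CM; it only summarizes a trimmed, hypothetical Ambari blueprint by listing the components in each host group. The sample host group and component names are assumptions for illustration.

```python
import json

# Trimmed, hypothetical blueprint; a real export carries many more fields
# (for example a "configurations" section).
BLUEPRINT = json.loads("""
{
  "Blueprints": {"stack_name": "HDP", "stack_version": "3.1"},
  "host_groups": [
    {"name": "master", "cardinality": "1",
     "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}]},
    {"name": "worker", "cardinality": "3",
     "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}]}
  ]
}
""")


def components_by_host_group(blueprint):
    """Map each host group name to a sorted list of its component names."""
    return {
        group["name"]: sorted(c["name"] for c in group["components"])
        for group in blueprint["host_groups"]
    }


inventory = components_by_host_group(BLUEPRINT)
for group, components in inventory.items():
    print(group, components)
```

An inventory like this gives a simple baseline to compare against the roles that appear in Cloudera Manager after the transition.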
Additionally, rollback procedures are now available for upgrades from HDP 2 and CDH 5 clusters.
Platform – Fair Scheduler upgrade tool enhancements
The Fair Scheduler to Capacity Scheduler (FS2CS) conversion tool delivers improved scheduler transition for customers upgrading from a previous CDH release.
With the help of the tool, customers can run their jobs or applications with the same or better SLA, without any disruption or code change. Once the cluster is upgraded to CDP, customers can use the YARN Queue Manager to tune cluster resource management configuration in a more user-friendly way.
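To give a feel for what a scheduler conversion involves, the Python sketch below reads queue weights from a small, hypothetical Fair Scheduler allocation file and emits Capacity Scheduler properties with weight-style capacities. It is not the FS2CS tool and covers only flat queues with weights; the queue names and the choice of weight-style capacity values as the target are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical Fair Scheduler allocation file with two top-level queues.
FAIR_SCHEDULER_XML = """
<allocations>
  <queue name="prod"><weight>3.0</weight></queue>
  <queue name="dev"><weight>1.0</weight></queue>
</allocations>
"""


def to_capacity_properties(fs_xml):
    """Translate top-level Fair Scheduler queue weights into
    Capacity Scheduler property names and weight-style values."""
    root = ET.fromstring(fs_xml)
    props = {}
    names = []
    for queue in root.findall("queue"):
        name = queue.get("name")
        # Assume a weight of 1.0 when the queue does not declare one.
        weight = float(queue.findtext("weight", default="1.0"))
        names.append(name)
        props[f"yarn.scheduler.capacity.root.{name}.capacity"] = f"{weight}w"
    props["yarn.scheduler.capacity.root.queues"] = ",".join(names)
    return props


for key, value in to_capacity_properties(FAIR_SCHEDULER_XML).items():
    print(f"{key} = {value}")
```

The real tool also has to handle nested queues, placement rules, and limits, which is where the enhancements described in this post come in.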
We have introduced the following new features:
- Enhanced Placement Rule Engine
- Dynamic Queue Support
- Weight mode
Placement rules determine the queues to which applications and jobs are assigned. The new placement rule evaluation engine has been enhanced to deliver the following:
- Support both static queues and dynamic queues from a single parent.
- Additional policy options with fallback action configuration, which defines the action taken if the target queue of a placement rule does not exist or cannot be created
- Introduction of placement rule policies, which provide a better solution than mapping rule creation and add shortcuts for the most common use cases.
- The placement rule engine now supports the create flag, which creates non-existing queues when automatic dynamic queue creation is enabled.
- Automatic conversion of older placement rules (queue mappings) to the new JSON-based format
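The interplay of policies, the create flag, and fallback actions described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the YARN implementation: the rule fields mirror the style of the JSON-based rule format, but the exact field names, the queue names, and the behavior when no rule matches are assumptions for illustration.

```python
# Hypothetical rules, evaluated top to bottom.
RULES = [
    {"type": "user", "matches": "alice", "policy": "custom",
     "customPlacement": "root.analytics",
     "fallbackResult": "skip", "create": False},
    {"type": "user", "matches": "*", "policy": "user",
     "parentQueue": "root.users",
     "fallbackResult": "placeDefault", "create": True},
]


def place(user, rules, existing_queues, dynamic_creation_enabled=True):
    """Return the queue for `user`, or None if no rule places the application."""
    for rule in rules:
        if rule["matches"] not in ("*", user):
            continue
        if rule["policy"] == "user":
            target = f'{rule["parentQueue"]}.{user}'
        else:
            target = rule["customPlacement"]
        if target in existing_queues:
            return target
        # The create flag only takes effect when dynamic creation is enabled.
        if rule.get("create") and dynamic_creation_enabled:
            existing_queues.add(target)
            return target
        fallback = rule.get("fallbackResult", "skip")
        if fallback == "placeDefault":
            return "root.default"
        if fallback == "reject":
            raise ValueError(f"application from {user} rejected")
        # "skip": try the next rule.
    return None


queues = {"root.default", "root.users"}
print(place("alice", RULES, queues))
print(place("bob", RULES, queues, dynamic_creation_enabled=False))
```

Here the first rule targets a queue that does not exist and may not be created, so its "skip" fallback hands evaluation to the second rule, which creates a per-user queue on demand. With dynamic creation disabled, the second rule's fallback sends the application to the default queue instead.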
New placement rules are created from a single page that allows for configuration of all options:
Prior to 7.1.6, customers could allocate resources to queues by using either Absolute mode, in which resources are allocated in absolute units, or Relative mode, in which resources are allocated as a percentage of total available resources. In this release we’ve added a new mode for allocating resources, called Weight mode. The features of Weight mode are:
- Allocate capacity as a numerical value suffixed with “w”. A weight expresses a fraction of the total resource. The queue priorities are used as weights to determine the fraction of total resources that each app should get.
- Switch between relative and weight modes with very few clicks
- Enable auto dynamic child creation for a queue with one click. This allows a parent queue to have both static and dynamic child queues. Static queues have rules and expressions with pre-created target queues and user mappings. Dynamic queues allow the automatic creation of queues based on rules and expressions. This feature can be easily enabled through the YARN Queue Manager UI.
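As a worked example of the weight arithmetic, the Python sketch below turns “w”-suffixed values into fractions by dividing each weight by the sum of the weights. It assumes the queues are siblings under one parent and that the fraction applies to that parent's resources; the queue names are hypothetical.

```python
def weight_shares(weights):
    """Map each sibling queue to its fraction of the parent's resources."""
    parsed = {queue: float(value.rstrip("w")) for queue, value in weights.items()}
    total = sum(parsed.values())
    return {queue: weight / total for queue, weight in parsed.items()}


two_queues = weight_shares({"prod": "3w", "dev": "1w"})
three_queues = weight_shares({"prod": "3w", "dev": "1w", "adhoc": "1w"})
print(two_queues)
print(three_queues)
```

With weights of 3 and 1, the split is three quarters to one quarter. Adding a third queue with weight 1 changes every share without editing the existing values, which is one reason weights pair well with dynamically created child queues.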
Operational Database – Apache Phoenix 5.1
We’ve released Apache Phoenix 5.1 as part of Operational Database in CDP Private Cloud in order to provide the following capabilities:
- A scale-out RDBMS that is built on top of Apache HBase
- Star schema support and evolutionary schema support
- Views and secondary index support
- Full support for Apache Omid
With Phoenix 5.1, we add complex cross-row, cross-table transaction support (supporting TPC-C benchmarks out of the box). Prior to this release, Phoenix only supported single-row atomic transactions. With this release, it is easier to move sharded MySQL and PostgreSQL deployments to Cloudera, where partition management is fully automated and scale doesn’t mean added operational complexity.
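To illustrate what a cross-row, cross-table transaction looks like from application code, here is a minimal Python sketch written against a generic DB-API connection with autocommit turned off. The `orders` and `stock` tables, their columns, and the function name are hypothetical, and the sketch assumes both tables were created as transactional tables; how the connection is obtained (for example, through a Phoenix Python driver) is left out.

```python
def place_order(conn, order_id, item_id, qty):
    """Record an order and decrement stock as one atomic unit.

    `conn` is assumed to be a DB-API connection with autocommit disabled.
    Either both writes become visible on commit, or neither does.
    """
    cur = conn.cursor()
    try:
        # Write to the first table.
        cur.execute(
            "UPSERT INTO orders (order_id, item_id, qty) VALUES (?, ?, ?)",
            (order_id, item_id, qty),
        )
        # Update a row in a second table within the same transaction.
        cur.execute(
            "UPSERT INTO stock (item_id, on_hand) "
            "SELECT item_id, on_hand - ? FROM stock WHERE item_id = ?",
            (qty, item_id),
        )
        conn.commit()
    except Exception:
        # Undo the partial work so the two tables never disagree.
        conn.rollback()
        raise
```

With only single-row atomicity, an application would have to detect and repair the case where the order is written but the stock update fails; with transactions, the rollback handles it.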
This release also brings improvements to our secondary indexes, ensuring that index updates stay strongly consistent with data inserts and upserts.
With the new features, enhancements, and improved upgrade path that the 7.1.6 release adds, there’s no better time to transition your existing CDH or HDP clusters to CDP Private Cloud Base. To plan your migration, please refer to CDP Upgrade & Migration Paths for more information, or contact your Cloudera account team to discuss the best approach.
- CDP Private Cloud Base 7.1.6 Release Notes
- YARN resource allocation
- How to use YARN Dynamic Queues
- What’s new in 7.1.6
- HDP Upgrade overview
- HDP 3 to CDP Upgrade
- Journey advisor tool
- Knowledge Hub
The following folks all contributed to the blog through reviews, edits and suggestions:
Krishna Maheshwari, Sunil Govindan, Sushant Rao, Wim Stoop