In this blog post, we take a look at some of the OpDB-related security features of a CDP Private Cloud Base deployment. We cover encryption, authentication, and authorization.
Transparent data-at-rest encryption is available through the Transparent Data Encryption (TDE) feature in HDFS.
TDE provides the following features:
- Transparent, end-to-end encryption of data
- Separation of duties between cryptographic and administrative responsibilities
- Mature key lifecycle management features
The master key that encrypts the encryption zone keys (EZKs) can itself be placed in escrow in a hardware security module (HSM), such as Safenet Luna, Amazon KMS, or Thales nShield.
In addition, our cloud deployments for cloud-native stores can also support encryption key escrow with cloud vendor-provided infrastructure, such as AWS KMS or Azure Key Vault.
OpDB uses the Transport Layer Security (TLS) protocol for wire encryption. TLS provides authentication, privacy, and data integrity between applications communicating over a network. OpDB supports the Auto-TLS feature, which greatly simplifies the process of enabling and managing TLS encryption on your cluster. Both Apache Phoenix and Apache HBase (Web UIs, Thrift Server, and REST Server) support Auto-TLS.
Ranger Key Management Service
Ranger KMS houses the encryption zone keys (EZKs) required to decrypt the data encryption keys that are necessary to read the encrypted content of files. Through Ranger KMS, users can implement policies for key access that are separate and distinct from the policies for access to the underlying data. The EZKs are stored in a secure database within the KMS. This database can be deployed in a secured mode on selected cluster nodes.
The EZKs are encrypted with a master key, which is externalized into HSMs for additional security. The configuration and policy management interfaces enable key rotation and key versioning. Access audits in Apache Ranger support the tracking of key access.
Decryption happens only at the client and no zone key leaves the KMS during the decryption process.
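The key hierarchy described above can be sketched as a minimal envelope-encryption model. This is an illustration under stated assumptions, not the HDFS or Ranger KMS implementation: `ToyKMS`, `toy_cipher`, and the key name `opdb_zone` are hypothetical, and the XOR keystream is a toy stand-in for a real cipher that must never be used in production. The point it shows is that the client only ever receives a data encryption key (DEK), while the zone key stays inside the KMS object.

```python
import hashlib
import os


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for a real cipher such as AES; NOT secure.
    Applying it twice with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyKMS:
    """Hypothetical KMS: holds zone keys and never hands them out."""

    def __init__(self):
        self._zone_keys = {}

    def create_zone_key(self, name):
        self._zone_keys[name] = os.urandom(32)

    def generate_encrypted_dek(self, zone_key_name):
        # Create a fresh DEK and return it only in encrypted form (EDEK).
        dek = os.urandom(32)
        return toy_cipher(self._zone_keys[zone_key_name], dek)

    def decrypt_dek(self, zone_key_name, edek):
        # The zone key is used inside the KMS; only the DEK is returned.
        return toy_cipher(self._zone_keys[zone_key_name], edek)


kms = ToyKMS()
kms.create_zone_key("opdb_zone")

# Write path: the EDEK is stored with the file, the data is encrypted with the DEK.
edek = kms.generate_encrypted_dek("opdb_zone")
dek = kms.decrypt_dek("opdb_zone", edek)
ciphertext = toy_cipher(dek, b"row-1: sensitive value")

# Read path: the client asks the KMS for the DEK, then decrypts locally.
dek_again = kms.decrypt_dek("opdb_zone", edek)
plaintext = toy_cipher(dek_again, ciphertext)
```

In this model, a key access policy would sit in front of `decrypt_dek`: a caller who is refused the DEK can still hold the ciphertext and the EDEK but cannot read the data.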
Because of the separation of duties (for example, platform operators cannot access encrypted data at rest), you can control at a very fine-grained level who can access decrypted content and under what conditions. This separation is handled natively in Apache Ranger through fine-grained policies that limit operators' access to decrypted data.
Key rotation and rollover can be performed from the same management interface provided in Ranger KMS.
Security Certification Standards
Cloudera’s platform provides several of the key compliance and security controls required for specific customer deployments to be certified for compliance with standards such as PCI, HIPAA, GDPR, ISO 27001, and more.
For example, many of these standards require encryption of data at rest and in motion. Our platform natively provides robust, scalable encryption for data at rest through HDFS TDE and for data in motion through the Auto-TLS feature. Ranger KMS is also provided, enabling key policies, lifecycle management, and key escrow into tamper-proof HSMs. Key escrow is also supported with cloud vendor-provided infrastructure.
Combined with other AAA (Authentication, Authorization and Audits) controls available for our platform, in a CDP Data Center deployment our OpDB can fulfill many of the requirements of PCI, HIPAA, ISO 27001 and more.
Our Operational Services offerings are also certified for SOC compliance. For more information, see Operational Services.
Cloudera’s platform supports the following forms of user authentication:
- LDAP username/password
- OAuth (using Apache Knox)
Attribute-Based Access Control
Cloudera’s OpDBMS provides Role-Based Access Control (RBAC) and Attribute-Based Access Control (ABAC) through Apache Ranger, which is included as part of the platform.
Authorization can be provided at the cell level, column family level, table level, namespace level, or globally. This allows flexibility in defining roles such as global admins, namespace admins, and table admins, as well as roles with finer granularity or any combination of these scopes.
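The scope levels above can be pictured as prefixes of a resource path. The following is a minimal sketch, assuming a grant at one level covers everything beneath it; the `covers` and `is_authorized` helpers, the user names, and the resource names are all hypothetical, and the finest level is approximated here by a column qualifier rather than an individual cell.

```python
# A resource is addressed as (namespace, table, column_family, qualifier).
# A grant scope is a prefix of that tuple; the empty tuple means "global".

def covers(grant_scope, resource):
    """Return True if the grant scope is a prefix of the resource path."""
    return resource[:len(grant_scope)] == grant_scope


# Hypothetical grants, one per scope level described in the text.
GRANTS = {
    "global_admin": [()],
    "ns_admin": [("sales",)],
    "table_admin": [("sales", "orders")],
    "analyst": [("sales", "orders", "details", "amount")],
}


def is_authorized(user, resource):
    """A user is authorized if any of their grant scopes covers the resource."""
    return any(covers(scope, resource) for scope in GRANTS.get(user, []))


amount = ("sales", "orders", "details", "amount")
customer = ("sales", "orders", "details", "customer")
ssn = ("hr", "people", "pii", "ssn")

print(is_authorized("ns_admin", amount))   # namespace grant covers the column
print(is_authorized("analyst", customer))  # column grant does not cover a sibling
```

Combining scopes for one user is then just a matter of listing more than one prefix for that user.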
Apache Ranger provides the centralized framework to define, administer, and manage security policies consistently across the big data ecosystem. ABAC-based policies can include a combination of the subject (user), action (for example, create or update), resource (for example, table or column family), and environmental properties to create a fine-grained policy for authorization.
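To make the four ingredients of an ABAC policy concrete, here is a minimal sketch of a policy match. It is not Ranger's policy model or API: the `Request` and `AbacPolicy` classes, their field names, and the `geo` environment attribute are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    user_groups: set      # subject attributes
    action: str           # for example "read" or "update"
    table: str            # resource
    column_family: str    # resource
    environment: dict = field(default_factory=dict)


@dataclass
class AbacPolicy:
    groups: set
    actions: set
    table: str
    column_families: set
    required_env: dict = field(default_factory=dict)

    def matches(self, req):
        """All four parts (subject, action, resource, environment) must match."""
        return (
            bool(self.groups & req.user_groups)
            and req.action in self.actions
            and req.table == self.table
            and req.column_family in self.column_families
            and all(req.environment.get(k) == v
                    for k, v in self.required_env.items())
        )


policy = AbacPolicy(
    groups={"finance"},
    actions={"read"},
    table="orders",
    column_families={"billing"},
    required_env={"geo": "EU"},
)

ok = Request({"finance", "staff"}, "read", "orders", "billing", {"geo": "EU"})
wrong_geo = Request({"finance"}, "read", "orders", "billing", {"geo": "US"})
wrong_action = Request({"finance"}, "update", "orders", "billing", {"geo": "EU"})
```

Here `policy.matches(ok)` succeeds, while the other two requests fail on the environment and the action respectively.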
Apache Ranger also provides some advanced features, such as security zones (logical division of security policies), deny policies, and policy expiration periods (setting up a policy that is enabled only for a limited time). These features, in combination with the other features described above, create a strong base for defining effective, scalable, and manageable OpDBMS security policies.
For large-scale OpDB environments, descriptive attributes can be used to precisely control OpDBMS access using a minimal set of access control policies. Examples of descriptive attributes include:
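Deny policies and expiration periods can be illustrated with a small evaluation routine. This is a sketch under assumptions, not Ranger's evaluation engine: the `Policy` class and `evaluate` function are hypothetical, and the sketch assumes that an active deny policy takes precedence over any allow policy.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Policy:
    user: str
    table: str
    effect: str                           # "allow" or "deny"
    valid_from: Optional[datetime] = None
    valid_to: Optional[datetime] = None

    def active(self, now):
        """A policy outside its validity window is ignored."""
        if self.valid_from is not None and now < self.valid_from:
            return False
        if self.valid_to is not None and now > self.valid_to:
            return False
        return True


def evaluate(policies, user, table, now):
    applicable = [p for p in policies
                  if p.user == user and p.table == table and p.active(now)]
    # Assumption: any active deny wins over allow.
    if any(p.effect == "deny" for p in applicable):
        return False
    return any(p.effect == "allow" for p in applicable)


policies = [
    # Temporary access for a contractor during January 2024 only.
    Policy("contractor", "orders", "allow",
           datetime(2024, 1, 1), datetime(2024, 1, 31, 23, 59, 59)),
    # An intern is allowed by one policy but explicitly denied by another.
    Policy("intern", "orders", "allow"),
    Policy("intern", "orders", "deny"),
]

in_window = evaluate(policies, "contractor", "orders", datetime(2024, 1, 15))
expired = evaluate(policies, "contractor", "orders", datetime(2024, 2, 1))
denied = evaluate(policies, "intern", "orders", datetime(2024, 1, 15))
```

With these policies, the contractor's access lapses on its own once the window closes, without anyone having to remember to revoke it.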
- Active Directory (AD) group
- Apache Atlas-based tags or classifications
- Geo-location and other attributes of the subjects, resources, and environment
Once defined, Apache Ranger policies can also be exported and then imported, with minimal effort, into another OpDBMS environment that requires the same access control.
This approach enables compliance personnel and security administrators to define precise and intuitive security policies required by regulations, such as GDPR, at a fine-grained level.
Database Administrator Authorization
Apache Ranger provides fine-grained control to allow specific administration of databases using policies or specific schemes such as grant and revoke mechanisms. It also provides fine-grained permission mapping for specific users and groups. That makes it possible to authorize DBAs for specific resources (columns, tables, column families and so on) with only the required permissions.
In addition, when TDE capabilities are used to encrypt data in HDFS, the administrators or operators can be selectively blocked from being able to decrypt data. This is achieved with specific key access policies, meaning that even though they can perform administrative operations they cannot view or change the underlying encrypted data because they do not have key access.
Detecting and blocking unauthorized usage
Several of Cloudera’s query engines support variable binding and query compilation, which makes the code less vulnerable to user input and helps prevent SQL injection. For every customer-facing release, dynamic penetration testing and static code scans are performed across our platform to detect SQL injection and other vulnerabilities, which are then remediated in each component.
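The effect of variable binding is easy to demonstrate on any SQL engine that supports bound parameters. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in, not one of Cloudera's query engines; the `users` table and the input string are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "analyst")],
)

# Input crafted to change the meaning of the query if pasted into the SQL text.
malicious = "nobody' OR '1'='1"

# Unsafe: string concatenation lets the input rewrite the WHERE clause.
unsafe_sql = f"SELECT name FROM users WHERE name = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the value is bound as data and is never parsed as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # 2: every row leaks
print(len(safe_rows))    # 0: no user has that literal name
```

The bound version treats the whole input as a single literal value, so the injected `OR` condition never reaches the SQL parser.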
Unauthorized usage can be blocked by suitable policies using Apache Ranger’s comprehensive security framework.
Least Privilege Model
Apache Ranger provides default-deny behavior in OpDB. If a user has not been explicitly granted permission to access a resource by a policy, access is automatically denied.
Explicit privileged operations have to be authorized by policies. Privileged users and operations are mapped to specific roles.
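As a minimal sketch of default deny combined with role mapping, consider the following. The role names, user names, and the `is_permitted` helper are hypothetical and only illustrate the principle; they are not Ranger objects.

```python
# Privileged operations are attached to roles, and users are attached to roles.
ROLE_PERMISSIONS = {
    "table_admin": {"create_table", "alter_table"},
    "reader": {"read"},
}

USER_ROLES = {
    "dana": {"table_admin"},
    "eli": {"reader"},
}


def is_permitted(user, operation):
    """Allow only when some role of the user explicitly grants the operation."""
    for role in USER_ROLES.get(user, ()):
        if operation in ROLE_PERMISSIONS.get(role, ()):
            return True
    # Default deny: no matching grant means no access.
    return False


print(is_permitted("dana", "alter_table"))   # True
print(is_permitted("eli", "alter_table"))    # False
print(is_permitted("mallory", "read"))       # False
```

Note that there is no "deny" entry anywhere in the tables: an unknown user, an unknown role, or an unlisted operation all fall through to the final `return False`.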
Delegated administration facilities are also available in Apache Ranger to provide explicit privileged operations and management for specific resource groups through policies.
This was Part 1 of the Operational Database Security blog post. We looked at various security features and capabilities that Cloudera’s OpDB provides.
A Part 2 blog post with more information about the security-related features and capabilities of Cloudera’s OpDB is coming soon!
For more information about Cloudera’s Operational Database offering, see Cloudera Operational Database.