
Integrating Kerberos Authentication into a Big Data Platform for Secure Data Access

This article explains the background, principles, key concepts, workflow, and step‑by‑step deployment of Kerberos authentication on a Cloudera CDH big‑data cluster to protect sensitive medical data, and discusses its reliability and best‑practice configurations.

HaoDF Tech Team

Background – Good Doctor Online (HaoDF), an internet‑based medical platform, needs strong data security. Its existing CDH big‑data platform relied only on network firewalls, which is not enough to protect highly sensitive medical data.

What is Kerberos? – Kerberos, created by MIT, is a network authentication protocol that uses symmetric‑key cryptography and a trusted Key Distribution Center (KDC) to verify identities of clients and services, preventing credential theft over insecure networks.

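Why use Kerberos? – Networks cannot be assumed secure, and protocols that send credentials unprotected expose passwords to sniffing attacks. Kerberos provides mutual authentication without sending passwords over the network, and it issues session keys that clients and services can use to protect their communication. Kerberos handles authentication only; fine‑grained access control has to come from a separate authorization layer. Strong authentication is nevertheless the foundation for protecting medical data.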

Main concepts – The system revolves around Principals (users or services identified as primary/instance@realm), the KDC (containing an Authentication Service, AS, and a Ticket‑Granting Service, TGS), Ticket‑Granting Tickets (TGT), Service Tickets (ST), and session keys that secure client‑server interactions.
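
The principal format can be illustrated with a short Python sketch. It is a simplification, not a full parser: it assumes at most one instance component and no escaped characters, and the names `Principal`, `parse_principal`, and the example principals are chosen here for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    primary: str             # user or service name
    instance: Optional[str]  # optional qualifier, often a host name for services
    realm: str               # Kerberos realm, conventionally upper case


def parse_principal(name: str) -> Principal:
    """Split 'primary/instance@REALM' into its parts (instance is optional)."""
    local, sep, realm = name.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError(f"principal must look like primary[/instance]@REALM: {name!r}")
    primary, _, instance = local.partition("/")
    if not primary:
        raise ValueError(f"principal has an empty primary: {name!r}")
    return Principal(primary, instance or None, realm)


# Hypothetical examples: a service principal and a user principal.
service = parse_principal("hdfs/cdh1@CDH.COM")
user = parse_principal("alice@CDH.COM")
```

Here `service.instance` is the host the service runs on, while `user.instance` is `None` because ordinary user principals usually carry no instance.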

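Kerberos workflow – The process has seven steps: (1) the client requests a TGT from the AS; (2) the AS looks up the principal and returns a TGT encrypted with the TGS's key, together with a session key encrypted with the client's key, which is derived from the user's password; (3) the client decrypts the session key (it cannot read the TGT itself); (4) the client presents the TGT and an authenticator to the TGS to request a service ticket; (5) the TGS validates the TGT and issues an ST encrypted with the target service's key; (6) the client presents the ST to the target service; (7) the service validates the ST and establishes a secure session.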
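
To make the exchange concrete, here is a toy Python simulation of the seven steps. It is a teaching sketch under loud assumptions: `seal` and `unseal` are a home‑made XOR‑plus‑HMAC construction standing in for real Kerberos encryption, SHA‑256 stands in for the real password‑to‑key derivation, timestamps and ticket lifetimes are omitted, and the principals `alice@CDH.COM` and `hdfs/cdh1@CDH.COM` are hypothetical. What it does show faithfully is who can open what: the TGT is sealed for the TGS, the service ticket is sealed for the service, and the client only ever opens the session keys addressed to it.

```python
import hashlib
import hmac
import json
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, obj: dict) -> bytes:
    """Toy authenticated encryption (XOR keystream + HMAC tag). Not for real use."""
    plaintext = json.dumps(obj).encode()
    nonce = os.urandom(16)
    body = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def unseal(key: bytes, blob: bytes) -> dict:
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered message")
    return json.loads(bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body)))))


def derive_key(password: bytes) -> bytes:
    return hashlib.sha256(password).digest()  # stand-in for Kerberos string-to-key


# Long-term keys held by the KDC (hypothetical demo values).
CLIENT_KEYS = {"alice@CDH.COM": derive_key(b"alice-password")}
TGS_KEY = os.urandom(32)      # known to the AS and the TGS
SERVICE_KEY = os.urandom(32)  # known to the TGS and the service (its keytab)


def as_exchange(client: str):
    """Steps 1-2: the AS returns a TGT for the TGS and a session key for the client."""
    session = os.urandom(32).hex()
    tgt = seal(TGS_KEY, {"client": client, "key": session})
    return tgt, seal(CLIENT_KEYS[client], {"key": session})


def tgs_exchange(tgt: bytes, authenticator: bytes, service: str):
    """Steps 4-5: the TGS opens the TGT, checks the authenticator, issues an ST."""
    ticket = unseal(TGS_KEY, tgt)
    tgs_session = bytes.fromhex(ticket["key"])
    if unseal(tgs_session, authenticator)["client"] != ticket["client"]:
        raise ValueError("authenticator does not match TGT")
    session = os.urandom(32).hex()
    st = seal(SERVICE_KEY, {"client": ticket["client"], "key": session})
    return st, seal(tgs_session, {"service": service, "key": session})


def service_accepts(st: bytes, authenticator: bytes) -> str:
    """Steps 6-7: the service opens the ST with its own key and checks the client."""
    ticket = unseal(SERVICE_KEY, st)
    if unseal(bytes.fromhex(ticket["key"]), authenticator)["client"] != ticket["client"]:
        raise ValueError("authenticator does not match service ticket")
    return ticket["client"]


def run_flow(password: bytes = b"alice-password") -> str:
    me = "alice@CDH.COM"
    tgt, reply = as_exchange(me)
    # Step 3: only the holder of the password-derived key can open the reply.
    tgs_session = bytes.fromhex(unseal(derive_key(password), reply)["key"])
    st, reply = tgs_exchange(tgt, seal(tgs_session, {"client": me}), "hdfs/cdh1@CDH.COM")
    service_session = bytes.fromhex(unseal(tgs_session, reply)["key"])
    return service_accepts(st, seal(service_session, {"client": me}))
```

Calling `run_flow()` with the right password returns the authenticated client name, while a wrong password fails at step 3 with a `ValueError`, because the client cannot recover the session key. The password itself never crosses the network in either case.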

Applying Kerberos to the big‑data platform – Deployment steps include installing the JCE unlimited‑strength policy files into the JRE (needed for AES‑256), configuring DNS/hosts, installing the KDC and client packages, editing krb5.conf and kdc.conf, creating the Kerberos database, adding admin principals, distributing the configuration files, and enabling Kerberos in CDH. An example krb5.conf:

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 # realm name must be consistent across the configuration
 default_realm = CDH.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 # ticket lifetime
 ticket_lifetime = 24h
 # maximum renewable lifetime
 renew_lifetime = 2d
 forwardable = true

[realms]
 CDH.COM = {
   kdc = cdh1
   admin_server = cdh1
 }

[domain_realm]
 .cdh1.com = CDH.COM
 cdh1.com = CDH.COM

[kdc]
 profile = /var/kerberos/krb5kdc/kdc.conf

Corresponding kdc.conf excerpt:

[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 CDH.COM = {
   max_renewable_life = 2d
   acl_file = /var/kerberos/krb5kdc/kadm5.acl
   dict_file = /usr/share/dict/words
   admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
   supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
 }
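
The `supported_enctypes` line above mixes AES with legacy encryption types. Single DES was deprecated for Kerberos by RFC 6649, and 3DES and RC4 by RFC 8429, so it can be worth checking which entries a realm still allows. The following Python sketch does that; the helper name `weak_enctypes` and the prefix list are assumptions made for this example, not part of any Kerberos tooling.

```python
from typing import List

# Name prefixes of enctypes deprecated by RFC 6649 (DES) and RFC 8429 (3DES, RC4).
WEAK_PREFIXES = ("des-", "des3-", "arcfour-")


def weak_enctypes(supported_enctypes: str) -> List[str]:
    """Return the weak enctype names found in a supported_enctypes value."""
    # Each entry has the form 'enctype:salttype'; only the enctype matters here.
    names = [entry.split(":")[0] for entry in supported_enctypes.split()]
    return [name for name in names if name.startswith(WEAK_PREFIXES)]


value = (
    "aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal "
    "arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal"
)
flagged = weak_enctypes(value)
```

For the excerpt above, `flagged` contains every entry except the two AES types. Whether the legacy types can be dropped depends on the clients that must still authenticate against the realm.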

After configuring, the cluster services are restarted, and verification confirms that Kerberos authentication is active for all components.
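
One simple check during verification is whether the current user holds a valid ticket at all. The Python sketch below assumes the MIT Kerberos client tools are installed and relies on `klist -s`, which prints nothing and exits with status 0 only when the credential cache is readable and unexpired. The function name `has_valid_ticket` and the injectable `run` parameter are choices made for this example.

```python
import subprocess
from typing import Callable


def has_valid_ticket(run: Callable = subprocess.run) -> bool:
    """Return True if `klist -s` reports a readable, unexpired credential cache."""
    try:
        result = run(["klist", "-s"], check=False)
    except FileNotFoundError:
        # The Kerberos client tools are not installed on this host.
        return False
    return result.returncode == 0
```

On a Kerberized cluster one would typically expect requests to a service to be rejected while this returns `False`, and to succeed after obtaining a ticket with `kinit`. The `run` parameter exists only so the function can be exercised without a real KDC.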

Is Kerberos safe? – Kerberos remains one of the most robust authentication protocols in wide use; with strong password policies and modern encryption types such as AES, it provides a high level of security.

Conclusion – The article summarizes the background, theory, and practical steps for integrating Kerberos into a big‑data environment, and hints at future coverage of Apache Sentry for fine‑grained authorization.
