Commit 72a32ca

add doc for encryption at rest
Use blog https://github.com/dgraph-io/open/blob/master/content/post/encryption-at-rest-dgraph-badger.md to create a doc page about encryption at rest
1 parent 31342dc commit 72a32ca

2 files changed: +163 -0 lines changed

docs/encryption-at-rest.md

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
# Encryption at Rest in Dgraph and Badger

Badger provides encryption at rest using AES, which can help meet the requirements of security
standards such as HIPAA and PCI DSS. This feature was introduced in Badger v2 and is available to
all systems built on Badger, including Dgraph.

## Overview

Badger implements encryption at the storage layer, so systems like Dgraph inherit encryption
without implementing it themselves. This separation of concerns means:

- Badger manages data security and encryption at the disk level
- Higher-level systems like Dgraph focus on distributed operations and graph semantics
- All Badger-based systems benefit from encryption improvements

## Encryption Algorithm

Badger uses the
[Advanced Encryption Standard (AES)](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard),
which is standardized by NIST and widely adopted across databases including MongoDB, SQLite, and
RocksDB. AES is a symmetric encryption algorithm: the same key encrypts and decrypts data.

AES supports key sizes of 128, 192, or 256 bits. All three provide strong security; even a 128-bit
key is computationally infeasible to brute force.

## Key Management

Badger uses a two-tier key system: a master key and data keys.

### Master Key

The master key is the user-provided AES key that encrypts the data keys. Its length determines the
AES variant:

- 16 bytes: AES-128
- 24 bytes: AES-192
- 32 bytes: AES-256

**Important:** Use a cryptographically secure random key. Never use predictable strings. Generate
keys using a password manager or secure random generator.

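As a minimal sketch of the "secure random generator" option, the following Go program draws a key from the operating system's random source and rejects lengths that do not map to an AES variant. The helper name `newMasterKey` is hypothetical and is not part of Badger.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newMasterKey returns n cryptographically secure random bytes.
// n must be 16, 24, or 32, selecting AES-128, AES-192, or AES-256.
// (Hypothetical helper for illustration; not a Badger API.)
func newMasterKey(n int) ([]byte, error) {
	switch n {
	case 16, 24, 32:
	default:
		return nil, fmt.Errorf("invalid key length %d: want 16, 24, or 32", n)
	}
	key := make([]byte, n)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	return key, nil
}

func main() {
	key, err := newMasterKey(32)
	if err != nil {
		panic(err)
	}
	// Report only the size; never log the key itself.
	fmt.Printf("generated a %d-byte master key (AES-%d)\n", len(key), len(key)*8)
}
```

The resulting bytes would then be stored somewhere safe and passed to Badger as the master key.
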
### Data Keys

Data keys are auto-generated keys that encrypt the actual data on disk. Each data key is encrypted
with the master key and stored alongside the encrypted data. The master key encrypts data keys, not
data directly.

**Benefits:**

- Master key rotation only requires re-encrypting data keys (small, fast operation)
- Data keys rotate automatically without re-encrypting all data
- Minimal performance impact during key rotation

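The two-tier scheme can be illustrated with a small, self-contained Go sketch. It is not Badger's implementation: the use of AES-CTR for wrapping, the shared IV, and the helper names (`xorCTR`, `rewrap`, `randomBytes`, `must`) are all assumptions made for the example. The point it shows is that rotating the master key rewrites only the wrapped data key, while the payload stays untouched.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// randomBytes returns n bytes from the operating system's secure random source.
func randomBytes(n int) ([]byte, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return nil, err
	}
	return b, nil
}

// xorCTR encrypts or decrypts src with AES in CTR mode. CTR is its own
// inverse, so the same call works in both directions. iv must be 16 bytes.
func xorCTR(key, iv, src []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	if len(iv) != aes.BlockSize {
		return nil, fmt.Errorf("iv must be %d bytes, got %d", aes.BlockSize, len(iv))
	}
	dst := make([]byte, len(src))
	cipher.NewCTR(block, iv).XORKeyStream(dst, src)
	return dst, nil
}

// rewrap unwraps a data key with the old master key and wraps it again with
// the new one. The data encrypted under the data key is never touched.
func rewrap(oldMaster, newMaster, iv, wrapped []byte) ([]byte, error) {
	dataKey, err := xorCTR(oldMaster, iv, wrapped)
	if err != nil {
		return nil, err
	}
	return xorCTR(newMaster, iv, dataKey)
}

// must panics on error to keep the demo in main short.
func must(b []byte, err error) []byte {
	if err != nil {
		panic(err)
	}
	return b
}

func main() {
	oldMaster, newMaster := must(randomBytes(32)), must(randomBytes(32))
	// One IV is shared below only for brevity; the keys differ in each use.
	dataKey, iv := must(randomBytes(32)), must(randomBytes(16))

	wrapped := must(xorCTR(oldMaster, iv, dataKey))       // data key at rest
	payload := must(xorCTR(dataKey, iv, []byte("hello"))) // data at rest

	// Rotate the master key: only the 32-byte wrapped key is rewritten.
	rewrapped := must(rewrap(oldMaster, newMaster, iv, wrapped))

	recovered := must(xorCTR(newMaster, iv, rewrapped))
	fmt.Println("data key unchanged:", bytes.Equal(recovered, dataKey))
	fmt.Printf("payload: %s\n", must(xorCTR(recovered, iv, payload)))
}
```
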
## Key Rotation

### Data Key Rotation

Badger automatically rotates data keys every 10 days by default. Configure the rotation interval
using `Options.WithEncryptionKeyRotationDuration`.

All historical data keys are retained to decrypt older data. Each data key is 32 bytes, so 1000 keys
consume 32 KB. At 10-day intervals, 1000 keys cover approximately 27 years.

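The storage estimate above is simple arithmetic, reproduced here as a short Go sketch so that other rotation intervals can be tried. The function name `keyRegistryEstimate` is hypothetical, and the calculation counts only raw key bytes, not any per-key metadata Badger may store.

```go
package main

import (
	"fmt"
	"time"
)

// keyRegistryEstimate reports how many bytes n retained data keys occupy and
// how much time it takes to accumulate them at the given rotation interval.
// (Hypothetical helper; counts raw key bytes only.)
func keyRegistryEstimate(n, keySize int, rotation time.Duration) (sizeBytes int, span time.Duration) {
	return n * keySize, time.Duration(n) * rotation
}

func main() {
	size, span := keyRegistryEstimate(1000, 32, 10*24*time.Hour)
	years := span.Hours() / 24 / 365
	fmt.Printf("%d bytes, about %.1f years\n", size, years) // 32000 bytes, about 27.4 years
}
```
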
### Master Key Rotation

Master keys are not rotated automatically. Rotate them manually with the `rotate` command:

```shell
badger rotate --dir=badger_dir --old-key-path=old/path --new-key-path=new/path
```

**Requirements:**

- Database must be offline during master key rotation
- Only data keys are re-encrypted (fast operation)
- Future versions may support online rotation

## Initialization Vectors

To prevent identical plaintext from producing identical ciphertext, Badger uses Initialization
Vectors (IVs).

### SSTable Encryption

Each 4 KB block in an SSTable uses a unique 16-byte IV, stored in plaintext at the end of the
encrypted block. Storage overhead is about 0.4% (16 bytes per 4 KB block).

**Security:** IVs can safely be stored in plaintext. Decryption requires the data key, which in
turn requires the master key. Knowledge of the IV alone is insufficient.

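The block layout described above (ciphertext followed by a plaintext IV) can be sketched in Go as follows. This is an illustration under stated assumptions rather than Badger's code: AES-CTR is assumed as the cipher mode, and `encryptBlock` and `decryptBlock` are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// encryptBlock encrypts one block with a fresh random IV and appends the IV,
// in plaintext, after the ciphertext. (Assumes AES-CTR; illustrative only.)
func encryptBlock(dataKey, plain []byte) ([]byte, error) {
	c, err := aes.NewCipher(dataKey)
	if err != nil {
		return nil, err
	}
	iv := make([]byte, aes.BlockSize)
	if _, err := rand.Read(iv); err != nil {
		return nil, err
	}
	out := make([]byte, len(plain), len(plain)+aes.BlockSize)
	cipher.NewCTR(c, iv).XORKeyStream(out, plain)
	return append(out, iv...), nil
}

// decryptBlock splits the trailing 16-byte IV off the block and decrypts the rest.
func decryptBlock(dataKey, enc []byte) ([]byte, error) {
	if len(enc) < aes.BlockSize {
		return nil, fmt.Errorf("block too short: %d bytes", len(enc))
	}
	c, err := aes.NewCipher(dataKey)
	if err != nil {
		return nil, err
	}
	split := len(enc) - aes.BlockSize
	plain := make([]byte, split)
	cipher.NewCTR(c, enc[split:]).XORKeyStream(plain, enc[:split])
	return plain, nil
}

func main() {
	dataKey := make([]byte, 32)
	if _, err := rand.Read(dataKey); err != nil {
		panic(err)
	}
	block := bytes.Repeat([]byte("A"), 4096) // one 4 KB block

	a, err := encryptBlock(dataKey, block)
	if err != nil {
		panic(err)
	}
	b, err := encryptBlock(dataKey, block)
	if err != nil {
		panic(err)
	}
	// Identical plaintext, different IVs, so the ciphertexts differ.
	fmt.Println("ciphertexts differ:", !bytes.Equal(a, b))
	fmt.Println("overhead in bytes:", len(a)-len(block))

	plain, err := decryptBlock(dataKey, a)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", bytes.Equal(plain, block))
}
```
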
### Value Log Encryption

Value log entries are encrypted individually to match access patterns. To minimize storage overhead,
Badger combines a 12-byte file-level IV with the 4-byte value offset to form the 16-byte IV, rather
than storing a separate IV for every entry.

**Benefits:**

- Saves 16 bytes per value entry
- 12-byte overhead per vlog file (vs 16 bytes per value)
- For 10,000 entries: 12 bytes total vs 160,000 bytes with per-value IVs

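A minimal Go sketch of how a per-entry IV could be derived from the file-level IV and the entry offset is shown below. The function name `entryIV` is hypothetical, and the big-endian byte order and the position of the offset in the last four bytes are assumptions for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// entryIV builds the 16-byte IV for a value log entry from the 12-byte
// file-level base IV and the entry's 4-byte offset within the file.
// (Byte order and layout are assumptions made for this sketch.)
func entryIV(baseIV []byte, offset uint32) ([]byte, error) {
	if len(baseIV) != 12 {
		return nil, fmt.Errorf("base IV must be 12 bytes, got %d", len(baseIV))
	}
	iv := make([]byte, 16)
	copy(iv, baseIV)
	binary.BigEndian.PutUint32(iv[12:], offset)
	return iv, nil
}

func main() {
	base := []byte("0123456789ab") // placeholder; a real base IV would be random
	for _, off := range []uint32{0, 4096} {
		iv, err := entryIV(base, off)
		if err != nil {
			panic(err)
		}
		fmt.Printf("offset %d -> iv %x\n", off, iv)
	}
}
```

Because every entry in a file sits at a different offset, each entry gets a distinct IV while only the 12-byte base needs to be stored.
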
## Enabling Encryption

### New Database

Enable encryption when creating a new database:

```go
// masterKey must be 16, 24, or 32 bytes long.
opts := badger.DefaultOptions("/tmp/badger").
	WithEncryptionKey(masterKey).
	WithEncryptionKeyRotationDuration(dataKeyRotationDuration) // defaults to 10 days
```

### Existing Database

Enable encryption on an unencrypted database:

```shell
badger rotate --dir=badger_dir --new-key-path=new/path
```

**Note:** This enables encryption for new files only. Existing data is encrypted during compaction
as new files are generated. Until then, Badger operates in hybrid mode, tracking encryption status
per file.

### Immediate Full Encryption

To encrypt all existing data immediately:

1. Export the database: `badger backup --dir=badger_dir -f backup.bak`
2. Create a new encrypted database instance
3. Restore the data: `badger restore --dir=new_badger_dir -f backup.bak`

Alternatively, use the Stream Framework and StreamWriter interface for in-place encryption with high
throughput.

## Security Considerations

### Key Security

- Store master keys securely (key management service, secure vault)
- Rotate master keys regularly
- Use strong, randomly generated keys
- Protect physical access to systems performing encryption

### Key Leakage

Keeping keys secret matters more than key size. Threats include:

- Side-channel attacks (electromagnetic radiation analysis)
- Key reuse patterns enabling cryptanalysis
- Physical access to encryption systems

Regular key rotation mitigates these risks by limiting how long any single key stays in use.

## Terminology

In this document, "key" can refer to either of the following:

- **Database key**: The key in a key-value pair stored in Badger
- **Encryption key**: The cryptographic key used for encryption/decryption (master key or data key)

When ambiguous, this document uses "encryption key" for cryptographic keys.

docs/index.md

Lines changed: 2 additions & 0 deletions
@@ -15,4 +15,6 @@ with each release.
[Design](design.md)

+[Encryption at rest](encryption-at-rest.md)
+
[Troubleshooting](troubleshooting.md)
