Commit b406ac5

feat: Adding a detailed readme for HBase replication support. (#3527)
hbase-migration-tools/bigtable-hbase-replication/README.md

Lines changed: 314 additions & 0 deletions
# HBase-Bigtable Replication

## Overview

HBase provides [async replication](https://hbase.apache.org/book.html#_cluster_replication)
between clusters for various use cases like disaster recovery, data aggregation,
etc. The Bigtable HBase replication library enables Cloud Bigtable to be added
as an HBase replication sink. HBase to Cloud Bigtable replication enables
customers to keep Cloud Bigtable up to date with the mutations happening on
their HBase cluster. This feature enables near zero downtime migrations from
HBase to Cloud Bigtable.

Replication between HBase and Cloud Bigtable is eventually consistent. This is
a result of the async nature of HBase replication. The Cloud Bigtable HBase
replication library provides the same ordering guarantees as HBase.

The Cloud Bigtable HBase replication library is deployed to HBase region
servers. The jar file contains a replication endpoint responsible for
replicating mutations to Cloud Bigtable. As with HBase, the destination Cloud
Bigtable cluster should have all the resources (tables, column families)
created before replication is enabled. You can use the
[HBase schema translator](https://cloud.google.com/blog/products/databases/create-bigtable-tables-from-existing-apache-hbase-tables)
to create pre-split Cloud Bigtable tables.

The service account running the replication library should be assigned the IAM
role [roles/bigtable.user](https://cloud.google.com/bigtable/docs/access-control#roles).
Please visit the
[Cloud Bigtable documentation](https://cloud.google.com/bigtable/docs/authentication)
to configure authentication. The library issues MutateRows RPCs.

## Near zero downtime migration from HBase to Cloud Bigtable

CBT is a natural destination for HBase workloads as it is a managed service
compatible with the HBase API. Customers running business critical applications
want to migrate to CBT without taking extended downtime. The CBT HBase
replication library is a critical component of such near zero downtime
migrations.

HBase to Cloud Bigtable replication enables users to keep their Cloud Bigtable
instance in sync with the production HBase cluster without downtime. Adding
Cloud Bigtable as an HBase replica guarantees that mutations are applied to
Cloud Bigtable in the same order as on HBase. This is the preferred method for
the "dual write" step of near zero downtime migrations because it guarantees
ordering and tolerates CBT unavailability without data loss.

Near zero downtime database migration is a multi-step procedure. HBase
replication does not support backfilling of existing data, so the CBT HBase
replication library only streams "current changes". Users have to use
[offline migration tools](https://cloud.google.com/architecture/hadoop/hadoop-gcp-migration-data-hbase-to-bigtable)
to copy existing data. To avoid race conditions between the backfill process
and replication writes, users should pause replication before starting the
backfill job, similar to enabling replication for an HBase table with existing
data. Near zero downtime migrations include the following steps:

1. Install the HBase to Cloud Bigtable library on the HBase master and region
   servers.
2. Configure Cloud Bigtable
   [authentication](https://cloud.google.com/bigtable/docs/authentication).
3. Update hbase-site.xml with the destination Cloud Bigtable project ID,
   instance ID, and service account JSON file.
4. Add a CBT replication peer in HBase. In the HBase shell, execute
   `add_peer '2', ENDPOINT_CLASSNAME => 'com.google.cloud.bigtable.hbase.replication.HbaseToCloudBigtableReplicationEndpoint'`.
   Use add_peer options to enable replication for select tables.
5. Immediately disable the CBT replication peer; this allows WAL logs to
   accumulate on HDFS. In the HBase shell, execute `disable_peer '2'`.
6. Check the replicated tables by executing `list_replicated_tables` and enable
   table level replication by executing `enable_table_replication "table_name"`.
7. Copy the existing data using
   [offline migration tooling](https://cloud.google.com/architecture/hadoop/hadoop-gcp-migration-data-hbase-to-bigtable).
8. After all the data is copied (and verified), enable the CBT replication
   peer. In the HBase shell, execute `enable_peer '2'`.
9. Eventually, replication will catch up and the two databases will be in sync.
   In the HBase shell, execute `status 'replication'` to check the status of
   replication for the peer ('2' in this example).
10. Run validation steps to ensure compatibility and performance on CBT.
11. Once ready, switch over to CBT:
    1. Turn down the applications writing to HBase.
    2. Wait for replication to catch up (this should be fast in the absence of
       new writes).
    3. Turn on the applications that write directly to CBT.
12. Decommission the HBase cluster.

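Collecting the shell commands from the steps above, the replication part of the
migration can be sketched as an HBase shell session (the peer id `'2'` and the
table name are examples):

```
add_peer '2', ENDPOINT_CLASSNAME => 'com.google.cloud.bigtable.hbase.replication.HbaseToCloudBigtableReplicationEndpoint'
disable_peer '2'                       # let WALs accumulate during the backfill
list_replicated_tables
enable_table_replication "table_name"
# ...copy existing data with the offline migration tooling, then:
enable_peer '2'
status 'replication'                   # watch the peer catch up
```
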
## Prerequisites

- The HBase cluster is configured with the setting `hbase.replication` set to
  `true` in hbase-site.xml
- A Cloud Bigtable instance is created
- The Cloud Bigtable instance has all the tables with all column families
- Cloud Bigtable authentication is configured

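For reference, the replication flag is a standard HBase setting in
`hbase-site.xml`:

```
<property>
  <name>hbase.replication</name>
  <value>true</value>
</property>
```
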
## HBase configuration

Set the properties below in `hbase-site.xml` and add it to the HBase
classpath.

```
<property>
  <name>google.bigtable.project.id</name>
  <value>PROJECT_KEY</value>
  <description>
    Cloud Bigtable Project ID
  </description>
</property>
<property>
  <name>google.bigtable.instance.id</name>
  <value>INSTANCE_KEY</value>
  <description>
    Cloud Bigtable Instance ID
  </description>
</property>
```

Next, configure Cloud Bigtable authentication. Create a service account and
download a JSON key file as shown
[here](https://cloud.google.com/docs/authentication/production#create_service_account).
Assign the role
[roles/bigtable.user](https://cloud.google.com/bigtable/docs/access-control#roles)
to the newly created service account to grant it write permissions to Cloud
Bigtable. Pass the JSON file to the Cloud Bigtable client by adding the
following to the `hbase-site.xml` file.

```
<property>
  <name>google.bigtable.auth.json.keyfile</name>
  <value>/path/to/downloaded/json/file</value>
  <description>
    Service account JSON file to connect to Cloud Bigtable
  </description>
</property>
```

Please refer to
[HBaseToCloudBigtableReplicationConfiguration](bigtable-hbase-replication-core/src/main/java/com/google/cloud/bigtable/hbase/replication/configuration/HBaseToCloudBigtableReplicationConfiguration.java)
for other properties that can be set.

## Deployment

Use the replication library version corresponding to your HBase version: for
HBase 1.x clusters, please use bigtable-hbase-1.x-replication.jar; for HBase
2.x clusters, please use bigtable-hbase-2.x-replication.jar. Following are the
steps to configure HBase to Cloud Bigtable replication:

1. Install the Cloud Bigtable replication library on the HBase servers (both
   master and region servers).
    1. Download the library from Maven on all the master and region server
       nodes.
    2. Copy the library to a folder on the HBase classpath, for example copy
       the jar to /usr/lib/hbase/lib/
2. Add the CBT configs to hbase-site.xml as discussed
   [above](#hbase-configuration). Specifically, `google.bigtable.project.id`,
   `google.bigtable.instance.id`, and `google.bigtable.auth.json.keyfile` must
   be set.
3. Restart all the HBase master nodes by running
   `sudo systemctl restart hbase-master`; this allows the masters to load the
   replication jar and become aware of the classes in it. Users should follow
   their operational playbooks to perform a rolling restart of the HBase
   masters.
4. Restart all the region servers by running
   `sudo systemctl restart hbase-regionserver` on each region server. Users
   should follow their operational playbooks to perform a rolling restart of
   the HBase cluster.
5. HBase replication can be enabled at the cluster, table, or column family
   level. `TABLE_CFS` is used to specify the column families that should be
   replicated. Enable replication to CBT by running this command in the HBase
   shell:
   `add_peer '<PEER_ID>', ENDPOINT_CLASSNAME => 'com.google.cloud.bigtable.hbase.replication.HbaseToCloudBigtableReplicationEndpoint', TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }`
6. All the replicated tables/column families must be present in the target
   Cloud Bigtable instance. When you enable HBase replication, changes from the
   beginning of the current WAL will be replicated, meaning you may see changes
   from before replication was enabled in Cloud Bigtable. This behavior is
   consistent with enabling replication to another HBase cluster.
7. Use your operational playbooks to monitor replication metrics. The CBT HBase
   replication library emits the standard HBase replication peer metrics.
8. Users can also monitor replication status by running `status 'replication'`
   in the HBase shell. The metrics for CBT replication will appear under the
   peer id used in the previous step.

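The installation and restart steps above can be sketched as a shell sequence
(the jar name, version placeholder, and paths are examples; follow your own
rolling-restart playbook):

```
# On every HBase master and region server node:
sudo cp bigtable-hbase-1.x-replication-<version>.jar /usr/lib/hbase/lib/

# On master nodes:
sudo systemctl restart hbase-master

# On region servers:
sudo systemctl restart hbase-regionserver
```
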
## Error handling

HBase replication is push based. Each region server reads the WAL entries and
passes them to each replication endpoint. If the replication endpoint fails to
apply WAL entries, the WALs accumulate on the HBase region servers.

If a Bigtable cluster is temporarily unavailable, the WAL logs will accumulate
on the region servers. Once the cluster becomes available again, replication
can continue.

For any non-retryable error, like a non-existent column family, replication
will pause and WAL logs will build up. Users should monitor and alert on
replication progress via
[HBase replication monitoring](https://hbase.apache.org/book.html#_monitoring_replication_status).
The replication library cannot skip a replication entry, as a single WAL entry
represents an atomic transaction; skipping a message would result in divergence
between the source and target tables.

### Incompatible Mutations

Certain HBase delete APIs are
[not supported on CBT](https://cloud.google.com/bigtable/docs/hbase-differences#mutations_and_deletions).
If such mutations are issued on HBase, the CBT client in the replication
library will fail to propagate them and all replication to the CBT endpoint
will stall. To avoid such stalling, the library logs such mutations and skips
them. The following is a summary of unsupported operations and some supported
operations that can be modified during the WAL write.

|Type of mutation |HBase WAL write behavior |CBT replication library action|
|-----------------|-------------------------|------------------------------|
|DeleteLatestVersion|Resolves the latest version and writes a DeleteCell with that timestamp|Supported, as it is a normal DeleteCell|
|DeleteFamilyAtVersion|Not modified|Logged and skipped|
|DeleteFamilyBeforeTimestamp|Not modified|Converted to DeleteFamily if the timestamp is within a configurable threshold|
|DeleteFamily|Converted to DeleteFamilyBeforeTimestamp with timestamp=now|See DeleteFamilyBeforeTimestamp|
|DeleteRow|Converted to DeleteFamilyBeforeTimestamp with timestamp=now for all families|See DeleteFamilyBeforeTimestamp|

The goal of this replication library is to allow migration from HBase to CBT.
Since CBT will not support these mutations after users migrate to CBT, they are
recommended to come up with alternative ways to handle these incompatible APIs
and not issue them while replication is on.

Another special case is mutations with custom cell timestamps. HBase uses a
`long` to store milliseconds while Cloud Bigtable uses a `long` to store
microseconds. This
[difference in granularity](https://cloud.google.com/bigtable/docs/hbase-differences#timestamps)
means HBase can store cell timestamps 1000 times higher than Cloud Bigtable.
The impacted use case is custom cell timestamps, where customers use
`Long.MAX_VALUE - now` as the cell timestamp. Such timestamps may get truncated
in Cloud Bigtable.

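To illustrate the range difference, here is a small standalone sketch (not part
of the library; the class name is ours, and the numbers only demonstrate why
`Long.MAX_VALUE - now` style timestamps do not fit once expressed in
microseconds):

```java
public class TimestampRange {
  public static void main(String[] args) {
    // HBase cell timestamps are long milliseconds; Cloud Bigtable stores
    // microseconds in a long, so its representable range is 1000x smaller.
    long maxBigtableTsMillis = Long.MAX_VALUE / 1000;

    // A "reversed" custom timestamp, as used by some HBase schemas:
    long reversedTs = Long.MAX_VALUE - System.currentTimeMillis();

    // Such a timestamp exceeds what fits in a long once converted to micros.
    System.out.println(reversedTs > maxBigtableTsMillis); // prints: true
  }
}
```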
Users can inject a custom implementation of IncompatibleMutationHandler. Please
refer to the
[IncompatibleMutationAdapter](bigtable-hbase-replication-core/src/main/java/com/google/cloud/bigtable/hbase/replication/adapters/IncompatibleMutationAdapter.java)
javadocs for more details.

## Monitoring

The replication library emits metrics into the HBase metrics ecosystem. There
are 3 kinds of metrics that the replication library publishes:

1. HBase tracks
   [replication metrics](https://hbase.apache.org/book.html#_replication_metrics)
   for the CBT peer on its own. These include
   [metrics](https://hbase.apache.org/book.html#_understanding_the_output) for
   replication sinks, for example ageOfLastShippedOp.
2. Cloud Bigtable client side metrics. These include latencies and failures of
   various CBT APIs.
3. Custom metrics from the replication library, for example
   NumberOfIncompatibleMutations.

Please refer to the javadocs for the class
HBaseToCloudBigtableReplicationMetrics for the list of available metrics.

## Troubleshooting

### Replication stalling

#### Symptom: High value of ageOfLastShippedOp

If remedial action is not taken, this will result in disk full errors.

#### Causes and remediation

The following are possible causes of replication stalling:

- **Incompatible mutations**: Incompatible mutations can stall replication, as
  the CBT client will fail to ship them and return a permanent error. The
  default IncompatibleMutationHandler strategies shipped with CBTEndpoint log
  and drop incompatible mutations to prevent replication from stalling. If
  users provide custom IncompatibleMutationHandling strategies, they must make
  sure that all incompatible mutations are either adapted or dropped.
- **CBT resources do not exist**: If any of the CBT resources do not exist,
  replication will stall. Users can resume replication by creating the
  appropriate resources. The most common case is the creation of a new
  replicated HBase table that is not present on CBT.
- **Unavailability of the CBT cluster**: We recommend using a single cluster
  routing app profile to ship changes to CBT, as it guarantees ordering of
  mutations. But if the target cluster is down, replication will stall. For
  larger outages, users may need to route the traffic to other clusters or
  just wait for CBT to be available again.
- **Slow CBT cluster**: An under-provisioned CBT cluster can cause replication
  to slow down or even stall. Users should monitor the CBT cluster's CPU and
  ideally keep it under 80% utilization. Cloud Bigtable now supports
  autoscaling, which allows the cluster to scale up when CPU utilization is
  high.

### Inconsistencies between HBase and CBT

If there are widespread inconsistencies between HBase and CBT, users may need
to restart the migration from the beginning.

#### Causes and remediation

The following are potential causes and steps to avoid them:

- **CBTEndpoint using multi cluster routing**: If HBase replication uses a
  multi cluster routing app profile, it may write to different clusters, and
  CBT replication's "last writer wins" conflict resolution may lead to a
  different order of mutations than HBase's ordering.
- **Dropped incompatible mutations**: Users should stop issuing incompatible
  mutations before starting replication to CBT, since they will not have
  access to these APIs after migration. If CBTEndpoint drops incompatible
  mutations, the two databases will diverge.
- **HBase replication gaps**: There are rare
  [cases](https://hbase.apache.org/book.html#_serial_replication) where HBase
  replication does not converge. Such cases can lead to divergence. These are
  unavoidable, and users should perform remediation at the migration
  verification step.

### Big Red Button

In extreme situations, Cloud Bigtable may pose a risk to HBase availability,
for example when replication is stalled and the cluster is running out of HDFS
storage. In such cases, users should use the big red button and disable
replication by running the `delete_peer 'peer_id'` command from the HBase
shell. This deletes the Cloud Bigtable replication peer and allows the WAL
logs to be garbage collected.

hbase-migration-tools/bigtable-hbase-replication/bigtable-hbase-replication-core/README.md

Lines changed: 0 additions & 31 deletions
This file was deleted.
