# HBase-Bigtable Replication

## Overview

HBase
provides [async replication](https://hbase.apache.org/book.html#_cluster_replication)
between clusters for use cases such as disaster recovery and data aggregation.
The Bigtable HBase replication library enables Cloud Bigtable to be added as an
HBase cluster replication sink. HBase to Cloud Bigtable replication enables
customers to keep Cloud Bigtable up to date with the mutations happening on
their HBase cluster, which makes near zero downtime migrations from HBase to
Cloud Bigtable possible.

Replication between HBase and Cloud Bigtable is eventually consistent, a
consequence of the asynchronous nature of HBase replication. The Cloud Bigtable
HBase replication library provides the same ordering guarantee as HBase.

The Cloud Bigtable HBase replication library is deployed to HBase region
servers. The jar file contains a replication endpoint responsible for
replicating mutations to Cloud Bigtable. As with HBase, the destination Cloud
Bigtable cluster should have all the resources (tables, column families)
created before replication is enabled. You can use
the [HBase schema translator](https://cloud.google.com/blog/products/databases/create-bigtable-tables-from-existing-apache-hbase-tables)
to create pre-split Cloud Bigtable tables.

The service account running the replication library should be assigned the IAM
role [roles/bigtable.user](https://cloud.google.com/bigtable/docs/access-control#roles).
Please visit the
[Cloud Bigtable documentation](https://cloud.google.com/bigtable/docs/authentication)
to configure authentication. The library issues MutateRows RPCs.

## Near zero downtime migration from HBase to Cloud Bigtable

Cloud Bigtable (CBT) is a natural destination for HBase workloads, as it is a
managed service compatible with the HBase API. Customers running business
critical applications want to migrate to CBT without taking extended
application downtime. The CBT HBase replication library is a critical component
of such near zero downtime migrations.

HBase to Cloud Bigtable replication enables users to keep their Cloud Bigtable
instance in sync with the production HBase cluster without taking downtime.
Adding Cloud Bigtable as an HBase replica guarantees that mutations are applied
to Cloud Bigtable in the same order as on HBase. This is the preferred method
for the “dual write” step of near zero downtime migrations because it
guarantees ordering and tolerates CBT unavailability without data loss.

Near zero downtime database migration is a multi-step procedure. HBase
replication does not support backfilling existing data, so the CBT HBase
replication library only streams “current changes”. Users have to
use [offline migration tools](https://cloud.google.com/architecture/hadoop/hadoop-gcp-migration-data-hbase-to-bigtable)
to copy existing data. To avoid race conditions between the backfill process
and replication writes, users should pause replication before starting the
backfill job, just as when enabling replication for an HBase table with
existing data. Near zero downtime migrations include the following steps:

1. Install the HBase to Cloud Bigtable library on the HBase master and region
   servers.
2. Configure Cloud
   Bigtable [authentication](https://cloud.google.com/bigtable/docs/authentication).
3. Update hbase-site.xml with the destination Cloud Bigtable project ID,
   instance ID and service account JSON file.
4. Add a CBT replication peer in HBase. In the HBase shell, execute
   `add_peer '2', ENDPOINT_CLASSNAME => 'com.google.cloud.bigtable.hbase.replication.HbaseToCloudBigtableReplicationEndpoint'`.
   Use add_peer options to enable replication for selected tables.
5. Immediately disable the CBT replication peer; this allows WAL logs to
   accumulate on HDFS. In the HBase shell, execute: `disable_peer '2'`
6. Check the replicated tables by executing `list_replicated_tables`, and
   enable table level replication by executing
   `enable_table_replication "table_name"`
7. Copy the existing data
   using [offline migration tooling](https://cloud.google.com/architecture/hadoop/hadoop-gcp-migration-data-hbase-to-bigtable)
8. After all the data is copied (and verified), enable the CBT replication
   peer. In the HBase shell, execute `enable_peer '2'`
9. Eventually, replication will catch up and the two databases will be in
   sync. In the HBase shell, execute `status 'replication'` to check the
   status of replication for the peer ("2" in this example)
10. Run validation steps to ensure compatibility and performance on CBT
11. Once ready, switch over to CBT
    1. Turn down the applications writing to HBase
    2. Wait for replication to catch up (this should be fast in the absence of
       new writes)
    3. Turn on the applications that write directly to CBT
12. Decommission the HBase cluster
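
The shell commands used in the steps above can be sketched as a single HBase
shell session. This is an illustrative sketch: the peer ID `'2'` and the table
name `'my_table'` are examples, not required values.

```
# Steps 4-5: register the CBT peer, then pause it so WALs accumulate on HDFS
add_peer '2', ENDPOINT_CLASSNAME => 'com.google.cloud.bigtable.hbase.replication.HbaseToCloudBigtableReplicationEndpoint'
disable_peer '2'

# Step 6: enable replication on the tables being migrated
list_replicated_tables
enable_table_replication 'my_table'

# Steps 8-9: after the backfilled data is copied and verified, resume the
# peer and watch replication catch up
enable_peer '2'
status 'replication'
```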

## Prerequisites

- The HBase cluster is configured with `hbase.replication` set to `true` in
  hbase-site.xml
- A Cloud Bigtable instance is created
- The Cloud Bigtable instance has all the tables, with all their column
  families
- Cloud Bigtable authentication is configured

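The replication flag from the first prerequisite can be set in `hbase-site.xml`
as sketched below. This is a sketch: check the defaults for your HBase version,
as newer releases enable replication by default.

```
<property>
  <name>hbase.replication</name>
  <value>true</value>
  <description>
    Enable cluster level replication, required before any peer can be added
  </description>
</property>
```
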
## HBase configuration

Set the properties below in `hbase-site.xml` and make sure the file is on the
HBase classpath.

```
<property>
  <name>google.bigtable.project.id</name>
  <value>PROJECT_ID</value>
  <description>
    Cloud Bigtable Project ID
  </description>
</property>
<property>
  <name>google.bigtable.instance.id</name>
  <value>INSTANCE_ID</value>
  <description>
    Cloud Bigtable Instance ID
  </description>
</property>
```

Next, configure Cloud Bigtable authentication. Create a service account and
download its JSON key file as shown
[here](https://cloud.google.com/docs/authentication/production#create_service_account).
Assign the
role [roles/bigtable.user](https://cloud.google.com/bigtable/docs/access-control#roles)
to the newly created service account to grant it write permissions to Cloud
Bigtable. Pass the JSON file to the Cloud Bigtable client by adding the
following to the `hbase-site.xml` file.

```
<property>
  <name>google.bigtable.auth.json.keyfile</name>
  <value>/path/to/downloaded/json/file</value>
  <description>
    Service account JSON file to connect to Cloud Bigtable
  </description>
</property>
```

Please refer
to [HBaseToCloudBigtableReplicationConfiguration](bigtable-hbase-replication-core/src/main/java/com/google/cloud/bigtable/hbase/replication/configuration/HBaseToCloudBigtableReplicationConfiguration.java)
for other properties that can be set.

## Deployment

Use the replication library version corresponding to your HBase version: for
HBase 1.x clusters use bigtable-hbase-1.x-replication.jar, and for HBase 2.x
clusters use bigtable-hbase-2.x-replication.jar. The following are the steps to
configure HBase to Cloud Bigtable replication:

1. Install the Cloud Bigtable replication library on the HBase servers (both
   master and region servers).
    1. Download the library from Maven on all the master and region server
       nodes.
    2. Copy the library to a folder on the HBase classpath, for example
       /usr/lib/hbase/lib/.
2. Add the CBT configs to hbase-site.xml as discussed [above](#hbase-configuration).
   Specifically, `google.bigtable.project.id`, `google.bigtable.instance.id`
   and `google.bigtable.auth.json.keyfile` must be set.
3. Restart all the HBase master nodes by running
   `sudo systemctl restart hbase-master`; this allows the masters to load the
   replication jar and become aware of the classes in it. Users should follow
   their operational playbooks to perform a rolling restart of the HBase
   masters.
4. Restart all the region servers by running
   `sudo systemctl restart hbase-regionserver` on each region server. Users
   should follow their operational playbooks to perform a rolling restart of
   the HBase cluster.
5. HBase replication can be enabled at the cluster, table or column family
   level. `TABLE_CFS` is used to specify the column families that should be
   replicated. Enable replication to CBT by running this command in the HBase
   shell:
   `add_peer '<PEER_ID>', ENDPOINT_CLASSNAME => 'com.google.cloud.bigtable.hbase.replication.HbaseToCloudBigtableReplicationEndpoint', TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }`
6. All the replicated tables/column families must be present in the target
   Cloud Bigtable instance. When you enable HBase replication, changes from the
   beginning of the current WAL are replicated, meaning you will see changes
   from before replication was enabled appear in Cloud Bigtable. This behavior
   is consistent with enabling replication to another HBase cluster.
7. Use your operational playbooks to monitor replication metrics. The CBT HBase
   replication library emits standard HBase replication peer metrics.
8. Users can also monitor replication status by running `status 'replication'`
   in the HBase shell. The metrics for CBT replication appear under the peer ID
   used in the previous step.
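
As an illustrative sketch of the add_peer scoping options described above, the
following HBase shell session adds a peer that replicates only one column
family of one table, then verifies the peer. The peer ID `'2'` and the
table/family names `orders`/`cf1` are hypothetical examples.

```
# Replicate only column family 'cf1' of table 'orders' to CBT
add_peer '2', ENDPOINT_CLASSNAME => 'com.google.cloud.bigtable.hbase.replication.HbaseToCloudBigtableReplicationEndpoint', TABLE_CFS => { "orders" => ["cf1"] }

# Verify the peer is registered and check replication progress
list_peers
status 'replication'
```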

## Error handling

HBase replication is push based: each region server reads the WAL entries and
passes them to each replication endpoint. If the replication endpoint fails to
apply WAL entries, the WALs accumulate on the HBase region servers.

If a Bigtable cluster is temporarily unavailable, the WAL logs accumulate on
the region servers. Once the cluster becomes available again, replication can
continue.

For any non-retryable error, such as a non-existent column family, replication
will pause and WAL logs will build up. Users should monitor and alert on
replication progress
via [HBase replication monitoring](https://hbase.apache.org/book.html#_monitoring_replication_status).
The replication library cannot skip a replication entry, as a single WAL entry
represents an atomic transaction; skipping an entry would result in divergence
between the source and target tables.

### Incompatible Mutations

Certain HBase delete APIs
are [not supported on CBT](https://cloud.google.com/bigtable/docs/hbase-differences#mutations_and_deletions).
If such mutations are issued on HBase, the CBT client in the replication
library will fail to propagate them, and all replication to the CBT endpoint
will stall. To avoid such stalling, the library logs such mutations and skips
them. The following is a summary of the unsupported operations, and of some
supported operations that are modified during the WAL write.

| Type of mutation            | HBase WAL write behavior                                                     | CBT replication library action                                                |
|-----------------------------|------------------------------------------------------------------------------|-------------------------------------------------------------------------------|
| DeleteLatestVersion         | Resolves the latest version and writes a delete cell with its timestamp      | Supported, as it is a normal delete cell                                      |
| DeleteFamilyAtVersion       | Not modified                                                                 | Logged and skipped                                                            |
| DeleteFamilyBeforeTimestamp | Not modified                                                                 | Converted to DeleteFamily if the timestamp is within a configurable threshold |
| DeleteFamily                | Converted to DeleteFamilyBeforeTimestamp with timestamp=now                  | See DeleteFamilyBeforeTimestamp                                               |
| DeleteRow                   | Converted to DeleteFamilyBeforeTimestamp with timestamp=now for all families | See DeleteFamilyBeforeTimestamp                                               |

The goal of this replication library is to enable migration from HBase to CBT.
Since CBT will not support these mutations after users migrate to it, users are
advised to find alternative ways to handle these incompatible APIs and to stop
issuing them while replication is on.

Another special case is mutations with custom cell timestamps. HBase uses
a `long` to store milliseconds, while Cloud Bigtable uses a `long` to store
microseconds. This
[difference in granularity](https://cloud.google.com/bigtable/docs/hbase-differences#timestamps)
means that HBase can store cell timestamps 1000 times higher than Cloud
Bigtable can. The impacted use case is custom cell timestamps, where customers
use `Long.MAX_VALUE - now` as the cell timestamp. Such timestamps may get
truncated in Cloud Bigtable.

Users can inject a custom implementation of IncompatibleMutationHandler. Please
refer to
the [IncompatibleMutationAdapter](bigtable-hbase-replication-core/src/main/java/com/google/cloud/bigtable/hbase/replication/adapters/IncompatibleMutationAdapter.java)
javadocs for more details.

## Monitoring

The replication library emits its metrics into the HBase metrics ecosystem.
There are three kinds of metrics that the replication library publishes:

1. HBase
   tracks [replication metrics](https://hbase.apache.org/book.html#_replication_metrics)
   for the CBT peer on its own. These
   include [metrics](https://hbase.apache.org/book.html#_understanding_the_output)
   for replication sinks, for example ageOfLastShippedOp.
2. Cloud Bigtable client side metrics. These include latencies and failures of
   the various CBT APIs.
3. Custom metrics from the replication library, for example
   NumberOfIncompatibleMutations.

Please refer to the javadocs for the class
HBaseToCloudBigtableReplicationMetrics for the list of available metrics.

## Troubleshooting

### Replication stalling

#### Symptom: high value of ageOfLastShippedOp

If remedial action is not taken, this will eventually result in disk full
errors.

#### Causes and remediation

The following are the possible causes of replication stalling:

- **Incompatible mutations**: Incompatible mutations can stall replication, as
  the CBT client will fail to ship them and will return a permanent error. The
  default IncompatibleMutationHandler strategies shipped with the CBT endpoint
  log and drop incompatible mutations to prevent replication from stalling. If
  users provide custom IncompatibleMutationHandling strategies, they must make
  sure that all incompatible mutations are either adapted or dropped.
- **CBT resources do not exist**: If any of the CBT resources do not exist,
  replication will stall. Users can resume replication by creating the
  appropriate resources. The most common case is the creation of a new
  replicated HBase table that is not present on CBT.
- **Unavailability of the CBT cluster**: We recommend using a single cluster
  routing app profile to ship changes to CBT, as it guarantees the ordering of
  mutations. But if the target cluster is down, replication will stall. For
  larger outages, users may need to route the traffic to other clusters or
  wait for CBT to become available again.
- **Slow CBT cluster**: An under-provisioned CBT cluster can cause replication
  to slow down or even stall. Users should monitor the CBT cluster’s CPU and
  ideally keep it under 80% utilization. Cloud Bigtable supports autoscaling,
  which allows the cluster to scale up when CPU utilization is high.

### Inconsistencies between HBase and CBT

If there are widespread inconsistencies between HBase and CBT, users may need
to restart the migration from the beginning.

#### Causes and remediation

The following are potential causes and steps to avoid them:

- **CBTEndpoint using multi cluster routing**: If HBase replication uses a
  multi cluster routing app profile, it may write to different clusters, and
  CBT replication’s “last writer wins” conflict resolution may lead to a
  different order of mutations than HBase’s ordering.
- **Dropped incompatible mutations**: Users should stop issuing incompatible
  mutations before starting replication to CBT, since they will not have
  access to these APIs after migration. If the CBTEndpoint drops incompatible
  mutations, the two databases will diverge.
- **HBase replication gaps**: There are
  rare [cases](https://hbase.apache.org/book.html#_serial_replication) where
  HBase replication does not converge. Such cases can lead to divergence. These
  are unavoidable, and users should perform remediation at the migration
  verification step.

### Big Red Button

In extreme situations, Cloud Bigtable replication may pose a risk to HBase
availability, for example when replication is stalled and the cluster is
running out of HDFS storage. In such cases, users should use the big red button
and disable replication by executing the `delete_peer 'peer_id'` command from
the HBase shell. This deletes the Cloud Bigtable replication peer and allows
the WAL logs to be garbage collected.