refactor: upgrade HBase and replace custom hbase-shaded-endpint #3021

Open
vaijosh wants to merge 3 commits into apache:master from vaijosh:Hbase-265-upgrade

Conversation

@vaijosh vaijosh commented May 11, 2026

Title

fixes #3016

feat(hbase): upgrade to HBase 2.6.5 and replace custom shaded endpoint with official Apache artifacts

Background

This PR modernizes HugeGraph’s HBase integration by replacing the custom hbase-shaded-endpoint dependency with official Apache HBase 2.6.5 artifacts, and adds a reproducible Docker-based local test environment for HBase backend development/verification.

What changed

1) HBase dependency upgrade (hugegraph-server/hugegraph-hbase/pom.xml)

  • Added hbase.version property: 2.6.5
  • Replaced:
    • com.baidu.hugegraph:hbase-shaded-endpoint:2.0.6
  • With:
    • org.apache.hbase:hbase-endpoint:${hbase.version}
    • org.apache.hbase:hbase-shaded-client:${hbase.version}
  • Added exclusions on hbase-endpoint to avoid pulling heavyweight server/hadoop transitive components not needed by HugeGraph runtime.
  • Kept dependency order (hbase-endpoint before hbase-shaded-client) to preserve AggregationClient/LongColumnInterpreter compatibility.
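The ordering constraint above can be checked mechanically. Maven builds the classpath in declaration order, so when two jars carry the same class, the one declared earlier is the one that is loaded. The following is a minimal sketch, not part of this PR: the helper name `declared_before` is hypothetical, and it matches on the first `<artifactId>` occurrence only, so an `<exclusion>` that names the same artifact earlier in the file would confuse it.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeeds only if artifact $2 is
# declared before artifact $3 in the POM file $1.
declared_before() {
    pom=$1
    # Line number of the first occurrence of each artifactId element.
    first=$(grep -nF "<artifactId>$2</artifactId>" "$pom" | head -n 1 | cut -d: -f1)
    second=$(grep -nF "<artifactId>$3</artifactId>" "$pom" | head -n 1 | cut -d: -f1)
    # Both must exist, and the first must appear on an earlier line.
    [ -n "$first" ] && [ -n "$second" ] && [ "$first" -lt "$second" ]
}
```

Run from the repository root, `declared_before hugegraph-server/hugegraph-hbase/pom.xml hbase-endpoint hbase-shaded-client` would exit with status 0 if the order described in this PR is in place.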

2) Dockerized HBase standalone environment (new files under docker/hbase/)

  • Added Dockerfile to build HBase 2.6.5 image from official Apache tarballs.
  • Added SHA512 verification with strict default behavior and mirror fallback:
    • primary: downloads.apache.org
    • fallback: archive.apache.org
  • Added robust checksum parsing to support Apache .sha512 formats.
  • Added entrypoint.sh that starts ZooKeeper + Master + RegionServer and blocks until service readiness.
  • Added hbase-site.xml tuned for local standalone/pseudo-distributed usage and HugeGraph defaults.
  • Added docker-compose.hbase.yml with ports, healthcheck, persistent volumes, and overridable download URLs.
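As an illustration of the checksum handling described above, here is a minimal sketch of parsing and verifying an Apache `.sha512` file. It is not the code in this PR's Dockerfile: the function names are hypothetical, and it assumes the checksum file uses one of three layouts (`<hash>  <file>`, `SHA512 (<file>) = <hash>`, or the grouped `<file>: ABCD1234 ...` layout that can wrap across lines). The mirror fallback is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: print the 128-hex-digit SHA512 digest found in a
# checksum file, lowercased. Whitespace is removed first so that grouped
# and line-wrapped digests become one contiguous run of hex digits.
extract_sha512() {
    tr -d ' \t\r\n' < "$1" \
        | grep -oE '[0-9a-fA-F]{128}' \
        | head -n 1 \
        | tr 'A-F' 'a-f'
}

# Hypothetical helper: succeed only if the tarball's digest matches the
# digest recorded in the checksum file.
verify_sha512() {
    tarball=$1
    checksum_file=$2
    expected=$(extract_sha512 "$checksum_file")
    actual=$(sha512sum "$tarball" | cut -d ' ' -f 1)
    # An empty expected digest means parsing failed, which must not pass.
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

Treating an unparseable checksum file as a failure matches the strict default this PR describes; a flag such as ALLOW_UNVERIFIED_DOWNLOAD would have to bypass the check explicitly.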

3) End-to-end usage and troubleshooting docs (docker/HBASE.md)

  • Added full guide for:
    • starting/stopping HBase in Docker
    • HugeGraph server config/init for HBase backend
    • API sanity checks (schema + vertex + gremlin)
    • troubleshooting common issues and cleanup

4) Dependency allowlist update (install-dist/scripts/dependency/known-dependencies.txt)

  • Removed: hbase-shaded-endpoint-2.0.6.jar
  • Added:
    • hbase-endpoint-2.6.5.jar
    • hbase-shaded-client-2.6.5.jar

Why

  • Align HugeGraph HBase integration with official Apache HBase artifacts.
  • Remove reliance on custom shaded endpoint packaging.
  • Provide a consistent and secure local HBase test setup for contributors and CI-like reproduction.
  • Reduce build/setup friction with documented and reproducible steps.

Impact

  • Scope is limited to HBase module dependencies, Docker test tooling, and dependency metadata/docs.
  • Existing non-HBase backends are not directly affected.

How to verify

docker compose -f docker/hbase/docker-compose.hbase.yml build --no-cache hbase
docker compose -f docker/hbase/docker-compose.hbase.yml up -d
docker compose -f docker/hbase/docker-compose.hbase.yml ps

mvn clean install -pl hugegraph-server/hugegraph-hbase -am -DskipTests

bash install-dist/scripts/dependency/check_dependencies.sh
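
Because `up -d` returns before HBase is ready, a scripted run of the steps above may want to wait for the container healthcheck before starting HugeGraph. The following is a minimal sketch of a generic retry helper; `wait_until` is a hypothetical name and is not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it succeeds or the
# attempt budget is used up.
# Usage: wait_until <max_attempts> <sleep_seconds> <command> [args...]
wait_until() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # "$@" is the probe command; success ends the wait immediately.
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}
```

The probe could be a small function that compares the output of `docker inspect --format '{{.State.Health.Status}}' <container>` with `healthy`; the container name depends on the compose file and is left as a placeholder here.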

HBase upgrade verification (HBase backend version 2.0.6, client library version 2.6.5)

  1. Create HBase containers using version 2.0.6 and create a graph.
  2. Apply the patch and start the HugeGraph server, keeping the HBase 2.0.6 container as is.
  3. Run graph queries against the data populated under 2.0.6 and verify that they succeed (the HBase client is upgraded but can still read data from HBase 2.0.6).

Fresh install verification (HBase backend and client version 2.6.5)

  1. Create an HBase 2.6.5 container.
  2. Apply the patch and start the HugeGraph server.
  3. Create a sample graph and run some queries.

Notes

SHA512 verification remains enforced by default during Docker image build.
ALLOW_UNVERIFIED_DOWNLOAD=true is intended only for trusted/restricted test environments.

…int with official artifacts apache#3016

- Added hbase-shaded-client and hbase-endpoint dependencies instead of custom hbase-shaded-endpoint library.
- Added docker files and HBASE.md containing instructions for HBase backend
@dosubot dosubot Bot added the size:XL (This PR changes 500-999 lines, ignoring generated files.) and dependencies (Incompatible dependencies of package) labels May 11, 2026
@vaijosh vaijosh changed the title from "[Improve] Upgrade HBase version and replace custom hbase-shaded-endpint with official artifacts #3016" to "[Improve] Upgrade HBase version and replace custom hbase-shaded-endpint with official artifacts" May 12, 2026
@vaijosh vaijosh changed the title from "[Improve] Upgrade HBase version and replace custom hbase-shaded-endpint with official artifacts" to "[Improve] Upgrade HBase version and replace custom hbase-shaded-endpint with official artifacts #3016" May 12, 2026
@vaijosh vaijosh changed the title from "[Improve] Upgrade HBase version and replace custom hbase-shaded-endpint with official artifacts #3016" to "3016: [Improve] Upgrade HBase version and replace custom hbase-shaded-endpint with official artifacts #3016" May 12, 2026
@imbajin imbajin requested a review from Copilot May 15, 2026 07:34

Copilot AI left a comment

Pull request overview

This PR modernizes the HBase backend by replacing the long-pinned com.baidu.hugegraph:hbase-shaded-endpoint:2.0.6 with the official Apache hbase-endpoint + hbase-shaded-client 2.6.5 artifacts, and ships a Docker-based local HBase test environment plus an end-to-end usage guide so contributors can reproduce HBase-backend validation.

Changes:

  • Upgrade HBase client to 2.6.5 (official Apache artifacts) with transitive exclusions and a dependency-allowlist update.
  • Add a self-contained Docker setup (Dockerfile, entrypoint.sh, hbase-site.xml, docker-compose.hbase.yml) for a standalone HBase 2.6.5 cluster.
  • Add docker/HBASE.md documenting build, run, API sanity checks, and troubleshooting.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| hugegraph-server/hugegraph-hbase/pom.xml | Switch HBase deps to official 2.6.5 with transitive exclusions and ordering comment. |
| install-dist/scripts/dependency/known-dependencies.txt | Replace old shaded-endpoint jar with new endpoint/shaded-client jars. |
| docker/hbase/Dockerfile | Build standalone HBase 2.6.5 image with SHA512 verification + mirror fallback. |
| docker/hbase/entrypoint.sh | Start ZK/master/regionserver and wait for readiness, then tail logs. |
| docker/hbase/hbase-site.xml | Standalone/pseudo-distributed HBase config tuned for HugeGraph defaults. |
| docker/hbase/docker-compose.hbase.yml | Compose service with ports, volumes, healthcheck, build args. |
| docker/HBASE.md | End-to-end Docker/HBase backend setup and troubleshooting guide. |

Comment thread docker/HBASE.md Outdated
Comment thread install-dist/scripts/dependency/known-dependencies.txt
Comment thread hugegraph-server/hugegraph-hbase/pom.xml
Comment thread hugegraph-server/hugegraph-hbase/pom.xml
Comment thread docker/hbase/entrypoint.sh Outdated
Comment thread docker/hbase/Dockerfile Outdated
Comment thread docker/hbase/Dockerfile Outdated
Comment thread hugegraph-server/hugegraph-hbase/pom.xml
Comment thread docker/HBASE.md Outdated
Comment thread docker/HBASE.md
@imbajin imbajin changed the title from "3016: [Improve] Upgrade HBase version and replace custom hbase-shaded-endpint with official artifacts #3016" to "refactor: upgrade HBase and replace custom hbase-shaded-endpint #3016" May 15, 2026
@imbajin imbajin changed the title from "refactor: upgrade HBase and replace custom hbase-shaded-endpint #3016" to "refactor: upgrade HBase and replace custom hbase-shaded-endpint" May 15, 2026
vaijosh added 2 commits May 15, 2026 17:04
apache#3021
Addressed review comments in this update:
- docker/HBASE.md
  - fixed Quick Start step title to match the actual command (image build)
  - aligned manual API examples with the default local server endpoint base (/graphs)
  - clarified idempotency wording around check_exist behavior
- docker/hbase/entrypoint.sh
  - fixed log glob pattern to match runtime-generated hbase-* log files
  - replaced invalid exec+|| fallback with explicit log-file existence handling
- docker/hbase/hbase-site.xml
  - set hbase.rootdir to explicit file:///tmp/hbase for deterministic local-FS mode
- docker/hbase/Dockerfile
  - switched to stable archive URL as primary source
  - fetch checksum from the actually downloaded source first
  - hardened checksum parsing for grouped SHA512 formats
  - removed stale cleanup path
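
The entrypoint.sh log-handling fix listed above can be sketched as follows. An `exec tail ... || fallback` line does not behave as a fallback, because a successful `exec` replaces the shell and the right-hand side is never evaluated. This is a hedged illustration rather than the PR's actual script: the function names and the `hbase-*.log` glob are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the HBase log files in a directory, one per
# line. Returns non-zero when nothing matches.
list_hbase_logs() {
    log_dir=$1
    found=1
    for f in "$log_dir"/hbase-*.log; do
        # An unmatched glob stays literal, so test each candidate explicitly.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            found=0
        fi
    done
    return "$found"
}

# Hypothetical helper: follow the logs if any exist; otherwise keep the
# container's main process alive without relying on `exec ... ||`.
follow_logs_or_idle() {
    logs=$(list_hbase_logs "$1") || {
        echo "no hbase-*.log files in $1 yet; idling" >&2
        exec tail -f /dev/null
    }
    # Unquoted on purpose: split the newline-separated list into arguments
    # (assumes the log paths contain no whitespace).
    exec tail -F $logs
}
```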
Replace custom hbase-shaded-endpoint with a streamlined hbase-endpoint.
This reduces the runtime footprint by excluding heavyweight transitive
dependencies not required by the HugeGraph HBase client.

Key exclusions and rationale:
- Server logic: hbase-server (coprocessors run on RS, not client).
- Batch/Async: hbase-mapreduce, hbase-asyncfs, and hbase-replication.
- Hadoop stack: hadoop-client/auth/common/hdfs. HugeGraph uses the
  ZooKeeper registry directly and avoids the YARN/MapReduce stack.
- Legacy logging: log4j 1.x, slf4j-log4j12, and redundant slf4j-api
  versions were purged to eliminate vulnerabilities and conflicts.
- Native/Compression: snappy-java (handled server-side).

Updated known-dependencies.txt to reflect the minimal allowlist.
Improved pom.xml comments to document exclusion rationales and
addressed automated review feedback regarding dependency management.
Labels

dependencies (Incompatible dependencies of package), size:XL (This PR changes 500-999 lines, ignoring generated files.)

Development

Successfully merging this pull request may close these issues.

[Improve] Upgrade HBase version and replace custom hbase-shaded-endpoint with official artifacts

2 participants