Upgrade llama.cpp from b9022 to b9049 by bernardladenthin · Pull Request #104 · bernardladenthin/java-llama.cpp

bernardladenthin · 2026-05-07T06:59:21Z

Summary

This PR upgrades the pinned llama.cpp dependency from version b9022 to b9049, incorporating upstream improvements and new features for KV cache state management, FWHT support, and backend initialization.

Key Changes

- KV Cache State Management: New LLAMA_STATE_SEQ_FLAGS_ON_DEVICE flag enables on-device KV cache state save/restore without host round-trips. The state data format now includes a 4-byte magic header and seq_id, so state saved with b9022 cannot be loaded by b9049 or later.
- FWHT Support: New ggml_op_hint enum and ggml_mul_mat_set_hint() function added for Fast Walsh-Hadamard Transform (FWHT) support in graph operations.
- Backend Initialization: llama_backend_init() now automatically calls ggml_backend_load_all() if no backends are registered, simplifying the initialization flow.
- Error Handling: Unsupported model architectures now throw std::runtime_error instead of calling GGML_ABORT, so callers can handle the error instead of the process aborting.
- Speculative Decoding: Server context checkpoints now support on-device state flags, intended to improve speculative decoding performance.
- Dependencies: GGML version bumped from 0.10.2 to 0.11.0; cpp-httplib updated from 0.43.2 to 0.43.3 with recursion elimination and OOM-safety improvements.
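
Because the state format change makes older blobs unreadable, a caller that persists sequence state across upgrades may want to reject stale data before handing it to llama.cpp. The following is a minimal sketch of such a pre-check, not code from this PR. The constant kAssumedStateSeqMagic is a hypothetical placeholder (the real magic value is not stated here and must be taken from the pinned llama.cpp sources), the helper names are illustrative, and the magic is assumed to be stored in native byte order at offset 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// HYPOTHETICAL placeholder: the real magic value is not given in this PR.
// Replace it with the value defined in the pinned llama.cpp sources.
constexpr std::uint32_t kAssumedStateSeqMagic = 0x12345678u;

// Returns true if `blob` holds at least 4 bytes and its first 4 bytes equal
// `magic`. Assumes the magic is stored in the host's native byte order.
bool starts_with_magic(const std::vector<std::uint8_t>& blob, std::uint32_t magic) {
    if (blob.size() < sizeof(magic)) {
        return false;  // too short to contain a header at all
    }
    std::uint32_t found = 0;
    std::memcpy(&found, blob.data(), sizeof(found));  // avoids unaligned reads
    return found == magic;
}

// Builds a demonstration blob laid out as [magic][zero-filled payload].
std::vector<std::uint8_t> make_blob(std::uint32_t magic, std::size_t payload_bytes) {
    std::vector<std::uint8_t> blob(sizeof(magic) + payload_bytes, 0);
    std::memcpy(blob.data(), &magic, sizeof(magic));
    return blob;
}
```

This is only a best-effort filter: a pre-b9049 blob could in principle begin with the same four bytes by coincidence, so the result of the actual restore call should still be checked.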

Notes

No JNI layer call-site changes are required. The new on-device state features and FWHT hints are not currently used by the Java bindings but are available for future optimization.
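
Although no call-site changes are needed, the switch from GGML_ABORT to std::runtime_error means a native caller can now turn an unsupported-architecture failure into an ordinary error. The sketch below illustrates that pattern in isolation. load_model_stub is a hypothetical stand-in, not a llama.cpp function, and the error text is invented for the example; a real JNI layer would typically translate the caught message into a Java exception.

```cpp
#include <stdexcept>
#include <string>

// HYPOTHETICAL stand-in for a loader that rejects unknown architectures by
// throwing, mirroring the behavior described for b9049. Not a llama.cpp API.
int load_model_stub(const std::string& arch) {
    if (arch != "llama") {
        throw std::runtime_error("unsupported model architecture: " + arch);
    }
    return 1;  // pretend model handle
}

// Attempts the load and returns an empty string on success, or the error
// message on failure, so the caller can report it instead of crashing.
std::string try_load(const std::string& arch) {
    try {
        load_model_stub(arch);
        return std::string();
    } catch (const std::runtime_error& e) {
        return std::string(e.what());
    }
}
```

With an abort-based failure path, the catch block would never run because the process would terminate first; with exceptions, the failure stays local to the call.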

https://claude.ai/code/session_01TZ2Gvm2dyeRVoy5dCzMXNn

Key changes in this range:
- New LLAMA_STATE_SEQ_FLAGS_ON_DEVICE flag for on-device KV cache save/restore
- State seq data format now prepends 4-byte magic + seq_id header (b9022 state data incompatible)
- ggml_op_hint enum + ggml_mul_mat_set_hint() for FWHT support
- llama_backend_init() auto-loads backends if none registered
- server_prompt_checkpoint_update() gained on_device parameter
- GGML version 0.10.2 → 0.11.0, cpp-httplib 0.43.2 → 0.43.3

bernardladenthin merged commit c84ea9c into master May 7, 2026
16 checks passed

bernardladenthin deleted the claude/update-b9049-compatibility-b5HeJ branch May 7, 2026 08:32

bernardladenthin merged 1 commit into master from claude/update-b9049-compatibility-b5HeJ
