Skip to content

Detect enclosing function for Java and export in SARIF#880

Open
xaldama wants to merge 21 commits intomainfrom
xabier.aldama/detect_method_java
Open

Detect enclosing function for Java and export in SARIF#880
xaldama wants to merge 21 commits intomainfrom
xabier.aldama/detect_method_java

Conversation

@xaldama
Copy link
Copy Markdown

@xaldama xaldama commented Apr 20, 2026

What problem are you trying to solve?

SARIF violations only report file, line and column. There is no information about which method or function contains the violation, which makes it harder for consumers to attribute findings to specific code constructs.

What is your solution?

  • Add a method_name field to Violation
  • Implement Java language enclosing function detection using tree-sitter AST traversal
  • Populate method_name in execute_rule after violations are converted from JS to Rust, reusing the already-parsed tree
  • Export the method name in the SARIF output under locations[].logicalLocations with kind: "function"
{
  "locations": [
    {
      "logicalLocations": [
        {
          "kind": "function",
          "name": "name"
        }
      ],
      "physicalLocation": {
        "artifactLocation": {
          "uri": "file_path"
        },
        "region": {
          "endColumn": x,
          "endLine": x,
          "startColumn": y,
          "startLine": y
        }
      }
    }
  ]
}

Alternatives considered

A generic approach using a shared list of function node types across all languages. Rejected because different languages have edge cases that require specific handling (e.g. async def in Python, constructors in Java and C#, arrow functions assigned to variables in JavaScript).

What the reviewer should know

  • The implementation is per language module. In this PR Java is the implemented one. It follows the same structure as the existing import parsers
  • Two variants are exposed: find_enclosing_function (parses from scratch) and find_enclosing_function_with_tree (reuses an existing tree), consistent with the parse_imports / parse_imports_with_tree pattern
  • runtime.rs uses the _with_tree variant since the tree is always available at that point
  • Languages not yet covered will return None by now. They can be added incrementally.
  • If a new language is added and it is not included in the find_enclosing_function_with_tree list the project will not compile.

Copilot AI review requested due to automatic review settings April 20, 2026 13:05
@xaldama xaldama requested a review from a team as a code owner April 20, 2026 13:05
@xaldama xaldama requested a review from bahar-shah April 20, 2026 13:05
@datadog-prod-us1-3

This comment has been minimized.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an EnclosingFunction concept to violations, implements Java enclosing-method detection, and exports that info into SARIF logicalLocations.

Changes:

  • Introduces EnclosingFunction and wires it into the Violation model.
  • Implements Java enclosing function detection (including basic type resolution for signatures).
  • Emits SARIF logicalLocations when Violation.enclosing_function is present and adds an integration-test script.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
misc/integration-test-method-name.sh New integration script to assert SARIF results contain logicalLocations/method info.
crates/static-analysis-kernel/src/model/violation.rs Adds EnclosingFunction and enclosing_function to the core violation model.
crates/static-analysis-kernel/src/analysis/languages/java/methods.rs Implements Java enclosing method/constructor detection and FQN/signature formatting + unit tests.
crates/static-analysis-kernel/src/analysis/languages/java.rs Exposes the new Java methods module.
crates/static-analysis-kernel/src/analysis/languages.rs Adds language-dispatch helpers for enclosing-function lookup and a test to track supported languages.
crates/static-analysis-kernel/src/analysis/ddsa_lib/runtime.rs Populates Violation.enclosing_function during rule execution (per violation).
crates/static-analysis-kernel/src/analysis/ddsa_lib/js/violation.rs Initializes enclosing_function on converted violations.
crates/cli/src/sarif/sarif_utils.rs Emits SARIF logicalLocations for violations with enclosing_function; adds unit test coverage.
crates/cli/src/rule_utils.rs Initializes enclosing_function for secret-to-rule-result conversion + test updates.
crates/cli/src/file_utils.rs Updates tests to include the new enclosing_function field.
crates/cli/src/csv.rs Updates tests to include the new enclosing_function field.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/static-analysis-kernel/src/model/violation.rs Outdated
Comment thread misc/integration-test-method-name.sh Outdated
Comment thread misc/integration-test-method-name.sh Outdated
Comment thread crates/static-analysis-kernel/src/analysis/languages/java/methods.rs Outdated
Comment thread crates/static-analysis-kernel/src/analysis/languages.rs Outdated
Comment thread crates/static-analysis-kernel/src/model/violation.rs Outdated
Comment thread crates/static-analysis-kernel/src/analysis/languages.rs Outdated
Comment thread crates/static-analysis-kernel/src/analysis/languages.rs Outdated
Comment thread crates/static-analysis-kernel/src/analysis/languages.rs Outdated
Comment thread crates/static-analysis-kernel/src/model/violation.rs
Comment thread misc/integration-test-method-name.sh Outdated
Comment thread crates/static-analysis-kernel/src/analysis/languages.rs Outdated
Copy link
Copy Markdown
Collaborator

@jasonforal jasonforal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Left a small nit on making the integration test a bit easier to read.

Also, we're mixing the words "method" and "function" a lot: a file called methods.rs that contains a function called find_enclosing_function...an integration test named method-name.sh...a SARIF field named "function".

Could we just standardize on "function"? I understand "function" is inaccurate for Java (which this PR implements), but it's the de facto default for cross-language terminology, and we'll add many more languages.

# ---------------------------------------------------------------------------
# Rule: flag every local_variable_declaration.
#
# code (base64 of):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a bit nicer to do it like

code_b64=$(cat <<'EOF' | base64 | tr -d '\n'

function visit(node, filename, code) {
  const n = node.captures["decl"];
  addError(buildError(n.start.line, n.start.col, n.end.line, n.end.col, "test violation"));
}

EOF
)

# }
#
# tree_sitter_query (base64 of):
# (local_variable_declaration) @decl
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicer to do like

query_b64=$(echo '(local_variable_declaration) @decl' | base64)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants