Detect enclosing function for Java and export in SARIF#880
Detect enclosing function for Java and export in SARIF#880
Conversation
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Adds an EnclosingFunction concept to violations, implements Java enclosing-method detection, and exports that info into SARIF logicalLocations.
Changes:
- Introduces
EnclosingFunctionand wires it into theViolationmodel. - Implements Java enclosing function detection (including basic type resolution for signatures).
- Emits SARIF
logicalLocationswhenViolation.enclosing_functionis present and adds an integration-test script.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| misc/integration-test-method-name.sh | New integration script to assert SARIF results contain logicalLocations/method info. |
| crates/static-analysis-kernel/src/model/violation.rs | Adds EnclosingFunction and enclosing_function to the core violation model. |
| crates/static-analysis-kernel/src/analysis/languages/java/methods.rs | Implements Java enclosing method/constructor detection and FQN/signature formatting + unit tests. |
| crates/static-analysis-kernel/src/analysis/languages/java.rs | Exposes the new Java methods module. |
| crates/static-analysis-kernel/src/analysis/languages.rs | Adds language-dispatch helpers for enclosing-function lookup and a test to track supported languages. |
| crates/static-analysis-kernel/src/analysis/ddsa_lib/runtime.rs | Populates Violation.enclosing_function during rule execution (per violation). |
| crates/static-analysis-kernel/src/analysis/ddsa_lib/js/violation.rs | Initializes enclosing_function on converted violations. |
| crates/cli/src/sarif/sarif_utils.rs | Emits SARIF logicalLocations for violations with enclosing_function; adds unit test coverage. |
| crates/cli/src/rule_utils.rs | Initializes enclosing_function for secret-to-rule-result conversion + test updates. |
| crates/cli/src/file_utils.rs | Updates tests to include the new enclosing_function field. |
| crates/cli/src/csv.rs | Updates tests to include the new enclosing_function field. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
jasonforal
left a comment
There was a problem hiding this comment.
Nice work! Left a small nit on making the integration test a bit easier to read.
Also, we're mixing the words "method" and "function" a lot: a file called methods.rs that contains a function called find_enclosing_function...an integration test named method-name.sh...a SARIF field named "function".
Could we just standardize on "function"? I understand "function" is inaccurate for Java (which this PR implements), but it's the de facto default for cross-language terminology, and we'll add many more languages.
| # --------------------------------------------------------------------------- | ||
| # Rule: flag every local_variable_declaration. | ||
| # | ||
| # code (base64 of): |
There was a problem hiding this comment.
Just a bit nicer to do it like
code_b64=$(cat <<'EOF' | base64 | tr -d '\n'
function visit(node, filename, code) {
const n = node.captures["decl"];
addError(buildError(n.start.line, n.start.col, n.end.line, n.end.col, "test violation"));
}
EOF
)| # } | ||
| # | ||
| # tree_sitter_query (base64 of): | ||
| # (local_variable_declaration) @decl |
There was a problem hiding this comment.
Nicer to do like
query_b64=$(echo '(local_variable_declaration) @decl' | base64)
What problem are you trying to solve?
SARIF violations only report file, line and column. There is no information about which method or function contains the violation, which makes it harder for consumers to attribute findings to specific code constructs.
What is your solution?
{ "locations": [ { "logicalLocations": [ { "kind": "function", "name": "name" } ], "physicalLocation": { "artifactLocation": { "uri": "file_path" }, "region": { "endColumn": x, "endLine": x, "startColumn": y, "startLine": y } } } ] }Alternatives considered
A generic approach using a shared list of function node types across all languages. Rejected because different languages have edge cases that require specific handling (e.g. async def in Python, constructors in Java and C#, arrow functions assigned to variables in JavaScript).
What the reviewer should know