
Understanding AST, SAST, Taint Analysis, and CodeQL for Java Security Scanning

This article explains the fundamentals of abstract syntax trees, Java AST analysis with Spoon, the principles of static application security testing and taint analysis, and demonstrates how to use CodeQL to detect unsafe Fastjson usage and Spring web path bindings in a CI/CD pipeline.

58 Tech

Apr 23, 2021

Background

Source code security scanning is a critical part of a secure development lifecycle (SDL). In 58 Group's CI/CD pipeline, thousands of builds run every day, which makes automated white‑box analysis essential. This article introduces the Java‑focused white‑box capabilities built for that environment.

1. Understanding AST (Abstract Syntax Tree)

An AST is a tree‑structured, abstract representation of a program's syntax in which each node corresponds to a language construct. The parser builds it from the token stream produced by lexical analysis, and later stages such as semantic analysis operate on it.

Example C code snippet:

while (i < n) {
    sum += A[i++];
}
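To make the tree shape concrete, here is a minimal sketch in Java that builds an AST for the loop above by hand and prints it with indentation. The `Node` class and the node labels (`While`, `Less`, `AddAssign`, and so on) are names invented for this illustration; a real parser would use its own node types.

```java
import java.util.List;

// Hand-built AST for: while (i < n) { sum += A[i++]; }
// Node labels are illustrative, not those of any real parser.
final class AstSketch {
    // A node is a label (the language construct) plus ordered children.
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = List.of(children);
        }
    }

    static Node whileLoopAst() {
        Node condition = new Node("Less", new Node("Name i"), new Node("Name n"));
        Node element = new Node("ArrayAccess",
                new Node("Name A"),
                new Node("PostIncrement", new Node("Name i")));
        Node body = new Node("Block",
                new Node("AddAssign", new Node("Name sum"), element));
        return new Node("While", condition, body);
    }

    // Pre-order walk: emit the node, then each child one level deeper.
    static String render(Node node, int depth) {
        StringBuilder out = new StringBuilder();
        out.append("  ".repeat(depth)).append(node.label).append('\n');
        for (Node child : node.children) {
            out.append(render(child, depth + 1));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(whileLoopAst(), 0));
    }
}
```

Running `main` prints `While` at the root with two children: the `Less` condition and the `Block` body. Punctuation such as parentheses and braces doesn't appear in the tree, which is what makes it abstract.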

2. Java AST with Spoon

Spoon is an open‑source library that parses Java source files and builds a rich, manipulable AST, with support for Java 11‑13. It can be built from source or pulled in as a Maven artifact (https://github.com/INRIA/spoon). The following command launches Spoon’s GUI on a Spring controller:

java -cp /Users/58src/IDEA/spoon/target/spoon-core-8.4.0-SNAPSHOT-jar-with-dependencies.jar spoon.Launcher -i /Users/58src/IDEA/springboot-mybatis/src/main/java/cn/no7player/controller/HelloController.java --gui

Sample Java controller source used for AST analysis:

package cn.no7player.controller;

import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;

@Controller
public class HelloController {
    // Handles /hello and passes the optional "name" parameter to the view.
    @RequestMapping("/hello")
    public String greeting(@RequestParam(value="name", required=false, defaultValue="World") String name, Model model) {
        model.addAttribute("name", name);
        return "hello";
    }
}

Spoon’s meta‑model consists of three parts: structural declarations (packages, classes, methods, fields), executable code (statements and expressions), and references to program elements. All elements inherit from CtElement, and from version 6.1.0 the meta‑model also covers Java 9 modules (CtModule, CtModuleDirective).
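To show what walking a Java AST looks like in code, the sketch below lists the annotated methods of a controller. It deliberately uses the JDK's built-in Compiler Tree API (`com.sun.source`) rather than Spoon, so that it runs on a plain JDK with no extra dependency; Spoon's own API and its richer meta‑model are not shown here. The class name `AnnotatedMethodFinder` and the shortened controller source are assumptions made for this example.

```java
import com.sun.source.tree.AnnotationTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Sketch using the JDK Compiler Tree API (not Spoon) to walk a Java AST.
final class AnnotatedMethodFinder {
    // In-memory source file, so the sketch needs nothing on disk.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String fileName, String code) {
            super(URI.create("string:///" + fileName), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Parses only (no type resolution) and returns "method @Annotation" entries.
    static List<String> annotatedMethods(String fileName, String code) {
        // Returns null on a runtime that ships without the compiler module.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(fileName, code)));
        List<String> found = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitMethod(MethodTree method, Void unused) {
                        for (AnnotationTree annotation : method.getModifiers().getAnnotations()) {
                            found.add(method.getName() + " @" + annotation.getAnnotationType());
                        }
                        return super.visitMethod(method, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String code = String.join("\n",
                "import org.springframework.web.bind.annotation.RequestMapping;",
                "public class HelloController {",
                "    @RequestMapping(\"/hello\")",
                "    public String greeting(String name) { return \"hello\"; }",
                "}");
        System.out.println(annotatedMethods("HelloController.java", code));
    }
}
```

Because the task only parses and never resolves types, the Spring import does not need to be on the classpath. The trade-off is that annotation names are plain identifiers at this stage, not resolved types.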

3. SAST (Static Application Security Testing) Basics

A SAST tool translates source code into an intermediate representation and then runs semantic, data‑flow, and control‑flow analyses over it to locate unsafe function calls, data leaks, and configuration issues. The analysis pipeline identifies sources, propagates taint, and detects sanitization, and finally produces a detailed vulnerability report.

4. Taint Analysis

Taint analysis models data flow as a triple <sources, sinks, sanitizers>. Sources are untrusted inputs, sinks are security‑sensitive operations, and sanitizers are transformations that neutralize the data. The process involves identifying sources and sinks, propagating taint through explicit and implicit flows, and applying sanitizers.
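The triple can be sketched as a toy forward analysis in Java. The program under analysis is a straight-line list of `target = function(args)` statements, and the function names used as source, sanitizer, and sink (`getParameter`, `escapeSql`, `executeQuery`) are hypothetical examples chosen for this sketch, not a real rule set. Branches, loops, and implicit flows are left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy forward taint propagation over straight-line code.
// All function names are hypothetical examples, not a real rule set.
final class TaintSketch {
    // One statement of the form: target = function(args...)
    static final class Call {
        final String target;
        final String function;
        final List<String> args;

        Call(String target, String function, String... args) {
            this.target = target;
            this.function = function;
            this.args = List.of(args);
        }
    }

    static final Set<String> SOURCES = Set.of("getParameter");
    static final Set<String> SANITIZERS = Set.of("escapeSql");
    static final Set<String> SINKS = Set.of("executeQuery");

    // Reports every sink call that receives a tainted argument.
    static List<String> analyze(List<Call> program) {
        Set<String> tainted = new HashSet<>();
        List<String> findings = new ArrayList<>();
        for (Call call : program) {
            boolean taintedArg = call.args.stream().anyMatch(tainted::contains);
            if (SINKS.contains(call.function) && taintedArg) {
                findings.add("tainted data reaches sink: " + call.target + " = "
                        + call.function + "(" + String.join(", ", call.args) + ")");
            }
            if (SOURCES.contains(call.function)) {
                tainted.add(call.target);        // source: result is untrusted
            } else if (SANITIZERS.contains(call.function)) {
                tainted.remove(call.target);     // sanitizer: result is clean
            } else if (taintedArg) {
                tainted.add(call.target);        // explicit flow: taint propagates
            } else {
                tainted.remove(call.target);     // overwritten with clean data
            }
        }
        return findings;
    }

    static List<Call> unsafeProgram() {
        return List.of(
                new Call("name", "getParameter"),
                new Call("sql", "concat", "prefix", "name"),
                new Call("rows", "executeQuery", "sql"));
    }

    static List<Call> sanitizedProgram() {
        return List.of(
                new Call("name", "getParameter"),
                new Call("clean", "escapeSql", "name"),
                new Call("sql", "concat", "prefix", "clean"),
                new Call("rows", "executeQuery", "sql"));
    }

    public static void main(String[] args) {
        System.out.println(analyze(unsafeProgram()));
        System.out.println(analyze(sanitizedProgram()));
    }
}
```

The first program yields one finding because `name` flows through `concat` into `executeQuery`. The second yields none because the sanitizer breaks the chain before the sink.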

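Explicit flow follows data dependencies: a value computed from tainted data is itself tainted. Implicit flow follows control dependencies: a value assigned under a tainted condition reveals something about that condition. Marking too much data as tainted (over‑taint) produces false positives, while missing real propagation (under‑taint) produces false negatives; the original article discusses techniques for reducing both.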
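The difference between the two kinds of flow is easiest to see side by side. In this small Java sketch (the class and method names are invented for illustration), both methods let a tainted input determine the result, but only the first copies data from it.

```java
// Two ways a tainted value can influence a result.
final class FlowKinds {
    // Explicit flow: the result is computed directly from the tainted value
    // (a data dependency).
    static int explicitFlow(int tainted) {
        return tainted + 1;
    }

    // Implicit flow: the tainted value is never copied, but it decides which
    // branch runs, so the result still reveals it (a control dependency).
    static int implicitFlow(int tainted) {
        int result = 0;
        if (tainted == 1) {
            result = 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(explicitFlow(41));
        System.out.println(implicitFlow(1));
        System.out.println(implicitFlow(0));
    }
}
```

An analysis that tracks only data dependencies would miss `implicitFlow` (under‑taint). One that taints everything assigned under a tainted condition catches it, but tends to flag many harmless assignments as well (over‑taint).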

5. CodeQL Vulnerability Queries

CodeQL uses the same <sources, sinks, sanitizers> model. The example query below, for unsafe Fastjson deserialization, defines a custom taint‑tracking configuration, treats remote flow sources as sources, and treats UnSafeFastJsonSink nodes as sinks. For each flow path it selects the vulnerable method access together with the source and sink.

/**
 * @name FastJson deserializing of user-controlled data
 * @description FastJson deserializing user-controlled data may allow attackers to execute arbitrary code.
 * @kind path-problem
 * @problem.severity error
 * @precision high
 * @id java/unsafe-fastjson-deserialization
 * @tags security
 */

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.security.FastJson
import DataFlow::PathGraph

class UnsafeFastJsonSinkConfig extends TaintTracking::Configuration {
  UnsafeFastJsonSinkConfig() { this = "UnsafeFastJsonConfig" }

  override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

  override predicate isSink(DataFlow::Node sink) { sink instanceof UnSafeFastJsonSink }
}

from DataFlow::PathNode source, DataFlow::PathNode sink, UnsafeFastJsonSinkConfig conf
where conf.hasFlowPath(source, sink)
select sink.getNode().(UnSafeFastJsonSink).getMethodAccess(), source, sink, "Unsafe fastjson deserialization of $@.", source.getNode(), "user input"

Additional queries illustrate how to extract Spring web paths by analyzing controller annotations and method bindings.
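The path-joining logic behind such an extraction can be sketched in Java: a controller's class-level mapping is a prefix for each method-level mapping. This sketch makes two loud assumptions. It uses runtime reflection instead of a static CodeQL query, and it declares a local stand-in annotation named `Mapping` in place of Spring's `@RequestMapping`, so that it runs without Spring on the classpath. The controller and its paths are made up for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Joins class-level and method-level path mappings, using reflection.
final class RoutePaths {
    // Hypothetical stand-in for Spring's @RequestMapping, declared locally.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    @interface Mapping {
        String value();
    }

    // Made-up controller used only for this example.
    @Mapping("/user")
    static final class UserController {
        @Mapping("/list")
        public String list() { return "list"; }

        @Mapping("/detail")
        public String detail() { return "detail"; }

        public String helper() { return "not a route"; }
    }

    // Full path = class-level prefix + method-level path, for mapped methods only.
    static List<String> paths(Class<?> controller) {
        Mapping onClass = controller.getAnnotation(Mapping.class);
        String prefix = onClass == null ? "" : onClass.value();
        // Sorted, because reflection returns methods in no particular order.
        TreeSet<String> result = new TreeSet<>();
        for (Method method : controller.getDeclaredMethods()) {
            Mapping onMethod = method.getAnnotation(Mapping.class);
            if (onMethod != null) {
                result.add(prefix + onMethod.value());
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        System.out.println(paths(UserController.class));
    }
}
```

A static query has to reproduce the same join from annotation values in source code, without loading any classes.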

Conclusion

This article has walked through Java AST analysis with Spoon, the concepts behind SAST and taint analysis, and CodeQL queries that detect unsafe Fastjson usage and retrieve Spring MVC paths, all in support of automated security checks in large CI/CD environments.
