Skip to content

Architecture

This document shows the internal architecture and data flow of cg4j.

ASM Engine Flow

The ASM engine is a lightweight alternative to WALA using the ASM bytecode library with RTA (Rapid Type Analysis) for call resolution. It uses fake-root seeding, lambda synthesis, and a single receiver-event-driven worklist to build precise call graphs.

Kroki

Notes

Class Loaders

  • Primordial: Java runtime classes (JDK) - standard library code
  • Extension: Dependency JARs - external libraries your application uses
  • Application: Target JAR being analyzed - your application code

Keywords

  • RTA (Rapid Type Analysis): Uses instantiated types (from NEW instructions) to prune virtual call targets
  • Worklist Algorithm: Iteratively processes reachable methods and receiver events until no new work remains
  • Receiver Event Dispatch: Incrementally re-dispatches virtual/interface calls as new concrete receiver types are registered
  • ASM: Lightweight Java bytecode manipulation library used for parsing .class files
  • Call Site: A method invocation instruction (INVOKE*) in bytecode
  • INVOKEDYNAMIC: Bytecode instruction used for lambda expressions and method references
  • Lambda Factory: Creates synthetic classes for lambda expressions matching WALA's wala/lambda$... naming; these synthetic classes inherit the caller's loader scope
  • RT (Runtime): Java runtime classes from the JDK (java., javax., etc.)
  • Entry Points: Starting methods for analysis (public, non-abstract methods on public Application classes)
  • Fake Root / <boot>: Synthetic root logic that seeds entry point calls, constructor materialization, and static initializer edges
  • Fixpoint: State where no new reachable methods, receiver events, or edges are discovered