jaegertracing/jaeger

[experiment] Auto-generate documentation for jaeger-v2 configuration structs via AST

Open

#6628 opened on Jan 28, 2025

Labels: documentation, good first issue, help wanted, v2

Description

We are still blocked on the main issue #6186 because the schema-first efforts in the OTEL Collector are not progressing. I wonder if we could instead use Go's AST library to navigate the hierarchy of known config structs and extract the comments and other metadata needed for the docs and/or config examples.

There are various blog posts showing examples of using AST.

The tool could have just a hardcoded list of starting configuration structs, both from Jaeger and from OTEL code base, e.g. cmd/jaeger/internal/extension/jaegerquery/config.go.
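As a rough illustration of the idea, the sketch below uses only the standard go/parser and go/ast packages to pull field comments out of a struct definition. The QueryOptions struct and the fieldComments helper are made up for this example; they are not taken from the Jaeger code base or from #7064.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// src is a made-up config struct used only for illustration.
const src = `package demo

// QueryOptions is a hypothetical config struct.
type QueryOptions struct {
	// BasePath is the URL prefix for the UI.
	BasePath string
	MaxClockSkew int // adjustment limit
}
`

// fieldComments parses Go source and returns, for the named struct type,
// a map from field name to its doc comment or, failing that, its trailing
// line comment. Embedded fields have no names and are skipped here.
func fieldComments(source, typeName string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", source, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	out := map[string]string{}
	ast.Inspect(file, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok || ts.Name.Name != typeName {
			return true // keep walking until the type is found
		}
		st, ok := ts.Type.(*ast.StructType)
		if !ok {
			return false
		}
		for _, f := range st.Fields.List {
			text := f.Doc.Text() // Text is safe to call on a nil comment group
			if text == "" {
				text = f.Comment.Text()
			}
			for _, name := range f.Names {
				out[name.Name] = strings.TrimSpace(text)
			}
		}
		return false
	})
	return out, nil
}

func main() {
	comments, err := fieldComments(src, "QueryOptions")
	if err != nil {
		panic(err)
	}
	fmt.Println(comments["BasePath"])
	fmt.Println(comments["MaxClockSkew"])
}
```

A real tool would parse files from disk rather than a string, and would need to resolve types across packages, which this sketch does not attempt.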

The prototype is available in draft PR #7064.

Rough outline of the milestones:

  • add a new subcommand to jaeger-v2 to generate config schema (done in #7064)
  • collect config objects from OTEL component factories (done in #7064)
  • use reflection on those objects to determine additional structs from field types and embedded structs (partially done in #7064)
  • use "golang.org/x/tools/go/packages" to parse the packages containing the structs to get access to other metadata like comments (partially done in #7064)
  • transform collected data into JSON Schema output (partially done in #7064)
  • run 3rd party tools to convert JSON schema into HTML documentation (done in #7064)
  • enhance the Jaeger docs to include the output from the previous step in the website as part of the release process
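The reflection milestone (discovering additional structs from field types and embedded structs) could look roughly like the sketch below. All types here (Config, Server, TLS, Common) are hypothetical stand-ins for real config structs, and this is not the code from #7064.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// Hypothetical config types, used only to exercise the walk.
type TLS struct{ CAFile string }

type Server struct {
	Endpoint string
	TLS      *TLS
}

type Common struct{ Tenants []string }

type Config struct {
	Common   // embedded struct
	HTTP     Server
	Backends map[string]Server
	Parent   *Config // self-reference, to show that cycles terminate
}

// collectStructs records every struct type reachable from t through fields,
// embedded structs, pointers, slices, arrays, and map values. Map key types
// are ignored in this sketch.
func collectStructs(t reflect.Type, seen map[reflect.Type]bool) {
	for t.Kind() == reflect.Pointer || t.Kind() == reflect.Slice ||
		t.Kind() == reflect.Array || t.Kind() == reflect.Map {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct || seen[t] {
		return // not a struct, or already visited
	}
	seen[t] = true
	for i := 0; i < t.NumField(); i++ {
		collectStructs(t.Field(i).Type, seen)
	}
}

// structNames returns the sorted names of all struct types reachable from cfg.
func structNames(cfg any) []string {
	seen := map[reflect.Type]bool{}
	collectStructs(reflect.TypeOf(cfg), seen)
	names := make([]string, 0, len(seen))
	for t := range seen {
		names = append(names, t.Name())
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(structNames(Config{})) // prints [Common Config Server TLS]
}
```

Each discovered reflect.Type also exposes PkgPath(), which is what would tell the tool which packages to parse for comments.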

This is another outline of the task from Gemini:

Feature: Generate JSON Schema with Comments and Defaults

Goal: Implement a tool or function that generates JSON schema for a collection of Go objects, incorporating comments as descriptions and using the current field values as defaults.

Implementation Outline:

I. Initialization and Package Loading:

  1. Input:
    • A slice or map of Go objects to generate schemas for.
    • The package paths where the types of these objects are defined.
  2. Load Packages:
    • Utilize the "golang.org/x/tools/go/packages" library to load the specified Go packages.
    • Configure packages.Config to include necessary information for parsing comments and type structures (e.g., NeedTypes, NeedSyntax, NeedName, NeedImports, NeedDeps, NeedFiles, NeedCompiledGoFiles, NeedExportFile, NeedModule).
  3. Type Information:
    • For each input Go object, obtain its reflect.Type using the reflect package for runtime inspection.

II. Reflecting and Parsing Types:

  1. Iterate Through Objects: Loop through each Go object in the input collection.
  2. Get reflect.Type and reflect.Value:
    • Obtain the reflect.Type to analyze the structure.
    • Obtain the reflect.Value to access the current field values for defaults.
  3. Find Corresponding ast.TypeSpec:
    • For the reflect.Type, locate the corresponding ast.TypeSpec within the parsed packages (pkg.Syntax).
    • This will involve traversing the syntax trees and matching the ast.TypeSpec.Name.Name with the Go type's name.
    • Handle potential complexities like embedded types and type aliases.
  4. Extract Field Information: For each field of the reflect.Type:
    • Get the field name (field.Name).
    • Get the field type (field.Type).
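    • Extract struct tags (field.Tag), specifically looking for the json tag to determine the JSON property name and omitempty. OTEL Collector config structs typically carry mapstructure tags rather than json tags, so the tag key should be configurable.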
    • Get the current value of the field from the reflect.Value (Value.Field(i)).
  5. Extract Comment Information:
    • Locate the corresponding ast.Field in the ast.TypeSpec.
    • Extract the associated comment from ast.Field.Doc or ast.Field.Comment.
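The field-extraction step can be sketched with the reflect package alone. The Options struct, the fieldInfo type, and the describeFields helper below are hypothetical; the tag key is a parameter so the same code could read json or other tags. The helper assumes it is given a struct or a non-nil pointer to one.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Options is a hypothetical config struct used only for illustration.
type Options struct {
	Endpoint string `json:"endpoint"`
	Timeout  int    `json:"timeout,omitempty"`
	Secret   string `json:"-"`
	internal bool
}

// fieldInfo holds what reflection can tell us about one field.
type fieldInfo struct {
	Name      string // Go field name
	JSONName  string // property name taken from the struct tag
	OmitEmpty bool   // whether the tag carries the omitempty option
	Default   any    // current value of the field, used as the default
}

// describeFields lists the exported fields of cfg. Fields tagged "-" are
// skipped, and fields without a tag fall back to the Go field name.
func describeFields(cfg any, tagKey string) []fieldInfo {
	v := reflect.Indirect(reflect.ValueOf(cfg))
	t := v.Type()
	var out []fieldInfo
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue // reading unexported values via Interface() would panic
		}
		name, opts, _ := strings.Cut(f.Tag.Get(tagKey), ",")
		if name == "-" {
			continue
		}
		if name == "" {
			name = f.Name
		}
		out = append(out, fieldInfo{
			Name:      f.Name,
			JSONName:  name,
			OmitEmpty: strings.Contains(opts, "omitempty"),
			Default:   v.Field(i).Interface(),
		})
	}
	return out
}

func main() {
	cfg := Options{Endpoint: "localhost:16686", Timeout: 30}
	for _, f := range describeFields(cfg, "json") {
		fmt.Printf("%s -> %s omitempty=%t default=%v\n",
			f.Name, f.JSONName, f.OmitEmpty, f.Default)
	}
}
```

Comments are not available through reflection, so they would have to be joined in afterwards by matching the Go field name against the parsed ast.Field entries.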

III. Building the JSON Schema:

  1. Schema Structure:
    • Define a structure for the generated JSON schema, likely using the "definitions" section for type schemas and a top-level schema referencing these definitions.
  2. Type Mapping:
    • Create a mapping between Go types (from reflect.Type) and their corresponding JSON schema types (e.g., string, integer, boolean, array, object).
    • Handle basic types, slices, maps, and nested structs.
  3. Schema Properties: For each Go field, create a property in the JSON schema:
    • type: Mapped from the Go field type.
    • description: The extracted Go field comment.
    • default: The current value of the Go field (serialized appropriately for JSON schema).
    • Potentially include other keywords like format, nullable, and constraints based on struct tags.
  4. Handling Nested Objects:
    • If a field is another Go object, recursively process its type and add a $ref to its definition in the "definitions" section.
  5. Handling Slices and Maps:
    • For slice and map types, define the items or additionalProperties schema, referencing the schema of the element/value type.

IV. Data Structures:

  • TypeCache (Map: reflect.Type -> *ast.TypeSpec): Caches the mapping between reflect.Type and its ast.TypeSpec to avoid redundant lookups.
  • SchemaDefinitions (Map: string -> map[string]interface{}): Stores the generated JSON schema definitions for each Go type, keyed by the type name.
  • ProcessedTypes (Set: reflect.Type): Tracks already processed Go types to prevent infinite recursion with nested or circular dependencies.
  • FieldInfo (Struct): Holds intermediate information about each field:
    type FieldInfo struct {
        Name        string
        JSONName    string
        Type        reflect.Type
        Value       reflect.Value
        Comment     string
        Tags        reflect.StructTag
    }
    
  • PackageInfo (Struct): Stores information about a loaded Go package, including a mapping of type names to their ast.TypeSpec:
    type PackageInfo struct {
        Package   *packages.Package
        TypeSpecs map[string]*ast.TypeSpec
    }
    

V. Output:

  1. Root Schema: Construct the final JSON schema object, including the $schema and the "definitions" section. The root schema might also define properties for the top-level object(s).
  2. Serialization: Serialize the JSON schema structure into a JSON string using encoding/json.

Key Considerations and Challenges:

  • Handling embedded types correctly.
  • Managing type aliases.
  • Detecting and handling circular dependencies between types.
  • Deciding how to handle unexported fields.
  • Mapping custom Go types to appropriate JSON schema types.
  • Implementing robust error handling.
  • Optimizing performance for large and complex type structures.
