jaegertracing/jaeger

[experiment] Auto-generate documentation for jaeger-v2 configuration structs via AST

Open

#6628 created on January 28, 2025

Labels: documentation, good first issue, help wanted, v2

Description

We are still blocked on the main issue #6186 because the schema-first efforts in the OTEL Collector are not progressing. I wonder if we could instead use Go's AST library to navigate the hierarchy of known config structs and extract the comments and other metadata needed for the docs and/or config examples.

There are various blog posts showing examples of using AST.

The tool could have just a hardcoded list of starting configuration structs, both from Jaeger and from OTEL code base, e.g. cmd/jaeger/internal/extension/jaegerquery/config.go.

The prototype is available in draft PR #7064.

Rough outline of the milestones:

  • add a new subcommand to jaeger-v2 to generate config schema (done in #7064)
  • collect config objects from OTEL component factories (done in #7064)
  • use reflection on those objects to determine additional structs from field types and embedded structs (partially done in #7064)
  • use "golang.org/x/tools/go/packages" to parse the packages containing the structs to get access to other metadata like comments (partially done in #7064)
  • transform collected data into JSON Schema output (partially done in #7064)
  • run 3rd party tools to convert JSON schema into HTML documentation (done in #7064)
  • enhance the Jaeger docs to include the output of the previous step on the website as part of the release process
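
To make the reflection milestone more concrete, here is a minimal sketch of how additional structs could be discovered from field types and embedded structs. The config types (`QueryConfig`, `Client`, `TLS`) and the helper names are hypothetical stand-ins, not code from #7064, and types are keyed by their bare name only for brevity.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// Hypothetical config types standing in for real Jaeger/OTEL configs.
type TLS struct {
	Insecure bool
}

type Client struct {
	Endpoint string
	TLS      TLS
}

type QueryConfig struct {
	Client  // embedded struct
	Backups []*Client
	Extra   map[string]TLS
}

// collectStructs records every struct type reachable from t through
// fields, pointers, slices, arrays, and map values. The seen map also
// stops the walk from looping on self-referential types.
func collectStructs(t reflect.Type, seen map[reflect.Type]bool) {
	switch t.Kind() {
	case reflect.Pointer, reflect.Slice, reflect.Array, reflect.Map:
		collectStructs(t.Elem(), seen)
	case reflect.Struct:
		if seen[t] {
			return
		}
		seen[t] = true
		for i := 0; i < t.NumField(); i++ {
			collectStructs(t.Field(i).Type, seen)
		}
	}
}

// structNames returns the sorted names of all structs reachable from root.
func structNames(root any) []string {
	seen := map[reflect.Type]bool{}
	collectStructs(reflect.TypeOf(root), seen)
	names := make([]string, 0, len(seen))
	for t := range seen {
		names = append(names, t.Name())
	}
	sort.Strings(names)
	return names
}

func main() {
	for _, n := range structNames(&QueryConfig{}) {
		fmt.Println(n)
	}
}
```

A real tool would likely key the set by `PkgPath()` plus `Name()`, since the package path is also what is needed to load the sources for the comment-extraction step.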

This is another outline of the task from Gemini:

Feature: Generate JSON Schema with Comments and Defaults

Goal: Implement a tool or function that generates JSON schema for a collection of Go objects, incorporating comments as descriptions and using the current field values as defaults.

Implementation Outline:

I. Initialization and Package Loading:

  1. Input:
    • A slice or map of Go objects to generate schemas for.
    • The package paths where the types of these objects are defined.
  2. Load Packages:
    • Utilize the "golang.org/x/tools/go/packages" library to load the specified Go packages.
    • Configure packages.Config to include necessary information for parsing comments and type structures (e.g., NeedTypes, NeedSyntax, NeedName, NeedImports, NeedDeps, NeedFiles, NeedCompiledGoFiles, NeedExportFile, NeedModule).
  3. Type Information:
    • For each input Go object, obtain its reflect.Type using the reflect package for runtime inspection.

II. Reflecting and Parsing Types:

  1. Iterate Through Objects: Loop through each Go object in the input collection.
  2. Get reflect.Type and reflect.Value:
    • Obtain the reflect.Type to analyze the structure.
    • Obtain the reflect.Value to access the current field values for defaults.
  3. Find Corresponding ast.TypeSpec:
    • For the reflect.Type, locate the corresponding ast.TypeSpec within the parsed packages (pkg.Syntax).
    • This will involve traversing the syntax trees and matching the ast.TypeSpec.Name.Name with the Go type's name.
    • Handle potential complexities like embedded types and type aliases.
  4. Extract Field Information: For each field of the reflect.Type:
    • Get the field name (field.Name).
    • Get the field type (field.Type).
    • Extract struct tags (field.Tag), specifically looking for the json tag to determine the JSON property name and omitempty.
    • Get the current value of the field from the reflect.Value (Value.Field(i)).
  5. Extract Comment Information:
    • Locate the corresponding ast.Field in the ast.TypeSpec.
    • Extract the associated comment from ast.Field.Doc or ast.Field.Comment.
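
The lookup of an ast.TypeSpec and its field comments can be sketched with the standard library alone. This sketch swaps `golang.org/x/tools/go/packages` for `go/parser` on an inlined source string so that it stays self-contained; the `QueryConfig` source and the `fieldComments` helper are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// src is a hypothetical config file, inlined to keep the sketch self-contained.
const src = `package demo

// QueryConfig configures the query service.
type QueryConfig struct {
	// BasePath is the URL prefix for the UI.
	BasePath string
	MaxTraces int // MaxTraces caps the result size.
	Untagged bool
}
`

// fieldComments returns a field name -> comment map for the named struct type.
func fieldComments(file *ast.File, typeName string) map[string]string {
	out := map[string]string{}
	ast.Inspect(file, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok || ts.Name.Name != typeName {
			return true // keep descending until the type is found
		}
		st, ok := ts.Type.(*ast.StructType)
		if !ok {
			return false
		}
		for _, f := range st.Fields.List {
			// Prefer the doc comment above the field, fall back to the trailing one.
			text := f.Doc.Text()
			if text == "" {
				text = f.Comment.Text()
			}
			// f.Names is empty for embedded fields, so they are skipped here.
			for _, name := range f.Names {
				out[name.Name] = strings.TrimSpace(text)
			}
		}
		return false
	})
	return out
}

// parseDemo parses src; parser.ParseComments is required, otherwise
// Field.Doc and Field.Comment are always nil.
func parseDemo() *ast.File {
	file, err := parser.ParseFile(token.NewFileSet(), "config.go", src, parser.ParseComments)
	if err != nil {
		panic(err)
	}
	return file
}

func main() {
	for name, c := range fieldComments(parseDemo(), "QueryConfig") {
		fmt.Printf("%s: %q\n", name, c)
	}
}
```

With `go/packages`, the same `fieldComments` function could be applied to each file in `pkg.Syntax`, since those are also `*ast.File` values.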

III. Building the JSON Schema:

  1. Schema Structure:
    • Define a structure for the generated JSON schema, likely using the "definitions" section for type schemas and a top-level schema referencing these definitions.
  2. Type Mapping:
    • Create a mapping between Go types (from reflect.Type) and their corresponding JSON schema types (e.g., string, integer, boolean, array, object).
    • Handle basic types, slices, maps, and nested structs.
  3. Schema Properties: For each Go field, create a property in the JSON schema:
    • type: Mapped from the Go field type.
    • description: The extracted Go field comment.
    • default: The current value of the Go field (serialized appropriately for JSON schema).
    • Potentially include other keywords like format and constraints based on struct tags. (nullable is an OpenAPI keyword; in plain JSON Schema the equivalent is a type that also allows "null".)
  4. Handling Nested Objects:
    • If a field is another Go object, recursively process its type and add a $ref to its definition in the "definitions" section.
  5. Handling Slices and Maps:
    • For slice and map types, define the items or additionalProperties schema, referencing the schema of the element/value type.

IV. Data Structures:

  • TypeCache (Map: reflect.Type -> *ast.TypeSpec): Caches the mapping between reflect.Type and its ast.TypeSpec to avoid redundant lookups.
  • SchemaDefinitions (Map: string -> map[string]interface{}): Stores the generated JSON schema definitions for each Go type, keyed by the type name.
  • ProcessedTypes (Set: reflect.Type): Tracks already processed Go types to prevent infinite recursion with nested or circular dependencies.
  • FieldInfo (Struct): Holds intermediate information about each field:
    type FieldInfo struct {
        Name        string
        JSONName    string
        Type        reflect.Type
        Value       reflect.Value
        Comment     string
        Tags        reflect.StructTag
    }
    
  • PackageInfo (Struct): Stores information about a loaded Go package, including a mapping of type names to their ast.TypeSpec:
    type PackageInfo struct {
        Package   *packages.Package
        TypeSpecs map[string]*ast.TypeSpec
    }
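
A minimal sketch of how FieldInfo might be populated through reflection, assuming the input is a non-nil struct or pointer to struct. `ServerConfig` and `collectFields` are hypothetical, and Comment is left empty here because it would come from the AST pass.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// FieldInfo mirrors the struct from the outline above.
type FieldInfo struct {
	Name     string
	JSONName string
	Type     reflect.Type
	Value    reflect.Value
	Comment  string
	Tags     reflect.StructTag
}

// ServerConfig is a hypothetical config object with default values.
type ServerConfig struct {
	Host   string `json:"host"`
	Port   int    `json:"port,omitempty"`
	Debug  bool
	secret string // unexported, skipped below
}

// collectFields gathers name, tag, type, and current value for every
// exported field of obj.
func collectFields(obj any) []FieldInfo {
	v := reflect.Indirect(reflect.ValueOf(obj))
	t := v.Type()
	var out []FieldInfo
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue // Value.Interface() would panic on unexported fields
		}
		name, _, _ := strings.Cut(f.Tag.Get("json"), ",")
		if name == "" {
			name = f.Name
		}
		out = append(out, FieldInfo{
			Name:     f.Name,
			JSONName: name,
			Type:     f.Type,
			Value:    v.Field(i),
			Tags:     f.Tag,
		})
	}
	return out
}

func main() {
	cfg := &ServerConfig{Host: "localhost", Port: 16686}
	for _, fi := range collectFields(cfg) {
		fmt.Printf("%s (%s) default=%v\n", fi.JSONName, fi.Type, fi.Value.Interface())
	}
}
```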
    

V. Output:

  1. Root Schema: Construct the final JSON schema object, including the $schema and the "definitions" section. The root schema might also define properties for the top-level object(s).
  2. Serialization: Serialize the JSON schema structure into a JSON string using encoding/json.

Key Considerations and Challenges:

  • Handling embedded types correctly.
  • Managing type aliases.
  • Detecting and handling circular dependencies between types.
  • Deciding how to handle unexported fields.
  • Mapping custom Go types to appropriate JSON schema types.
  • Implementing robust error handling.
  • Optimizing performance for large and complex type structures.
