Tree Reconstruction

This script reconstructs the hierarchical path (branch) from the root node to a specified target node within a filesystem-based tree structure. It identifies nodes using UUIDs stored in `node.json` files and traces parent-child relationships to build the branch representation.

Purpose

The primary purpose of reconstructor.py is to extract and visualize a specific lineage within a larger, potentially complex tree structure stored across directories and JSON files. Given a target node’s UUID, it finds the node and traces its ancestry back to the root, outputting this specific branch as a structured JSON object.

Usage

Run the script from the command line:

python reconstructor.py <node_uuid> [search_directory] [output_json]
  • <node_uuid>: (Required) The unique identifier (UUID) of the target node whose branch should be reconstructed.
  • [search_directory]: (Optional) The directory where the script should search recursively for node.json files. Defaults to output/ if not specified.
  • [output_json]: (Optional) The file path where the resulting JSON output should be saved. If omitted, a path is automatically generated within the output/reconstructor/ directory using a timestamp and a new UUID.

Examples:

# Reconstruct from UUID, search in default 'output/' directory
python reconstructor.py e2dd9b38ab194156

# Reconstruct from UUID, search in a specific directory
python reconstructor.py e2dd9b38ab194156 output/hallucinate-tree

# Reconstruct, search in specific directory, and save to a specific file
python reconstructor.py e2dd9b38ab194156 output/hallucinate-tree my_tree.json

Input (Filesystem)

The script expects a directory structure, typically generated by other scripts like hallucinate-tree.py or expand-node.py. This structure consists of:

  • A base search directory (e.g., output/).
  • Nested subdirectories, where each directory potentially represents a node in the tree.
  • node.json files within these directories. Each node.json file should contain at least:
    • "uuid": A unique string identifier for the node.
    • "step": A string describing the node or step it represents.
  • For nodes that are not the root of the tree, the node.json file must also contain:
    • "parent_uuid": The UUID of the parent node, linking it in the hierarchy.
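To make the expected layout concrete, the following sketch writes a small three-node tree to disk. It is an illustration only: write_node, make_sample_tree, the directory names, and the UUID strings are hypothetical and are not taken from hallucinate-tree.py or expand-node.py. Only the node.json file name and the uuid, step, and parent_uuid fields come from the description above.

```python
import json
import tempfile
from pathlib import Path


def write_node(directory, uuid, step, parent_uuid=None):
    """Write a node.json file into `directory` (hypothetical helper)."""
    directory.mkdir(parents=True, exist_ok=True)
    node = {"uuid": uuid, "step": step}
    if parent_uuid is not None:
        # The root node has no parent_uuid, so the key is omitted for it.
        node["parent_uuid"] = parent_uuid
    (directory / "node.json").write_text(json.dumps(node, indent=2), encoding="utf-8")
    return node


def make_sample_tree(base_dir):
    """Create root -> child -> target under base_dir and return the node dicts."""
    base = Path(base_dir)
    root = write_node(base / "root", "root-uuid", "Root step")
    child = write_node(base / "root" / "child", "child-uuid", "Child step", "root-uuid")
    target = write_node(
        base / "root" / "child" / "target", "target-uuid", "Target step", "child-uuid"
    )
    return [root, child, target]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_sample_tree(tmp)
        for path in sorted(Path(tmp).rglob("node.json")):
            print(path.relative_to(tmp), json.loads(path.read_text(encoding="utf-8")))
```

The nesting of directories here mirrors the hierarchy for readability, but the description above only requires the parent_uuid links, so a flatter layout should work equally well.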

Key Functions

  • find_node_by_uuid(search_uuid, base_dir): Recursively walks the base_dir, opens node.json files, and returns the parsed data of the node matching the search_uuid.
  • get_parent_chain(node_data, search_dir): Starts with the target node_data and iteratively uses find_node_by_uuid to find the parent node based on parent_uuid. It continues this process until it reaches a node without a parent_uuid (the root). Returns an ordered list of node data dictionaries representing the path from the root to the target node.
  • build_tree_branch(node_chain): Takes the ordered list from get_parent_chain and constructs a nested dictionary structure that mirrors the hierarchical branch. Each level contains the step and uuid of a node and a children list (which, in this reconstructed branch, will contain at most one child).
  • parse_command_args(): Parses command-line arguments (sys.argv) to extract the node_uuid, search_directory, and output_json path, handling optional arguments and defaults.
  • main(): The main execution function. It calls parse_command_args, applies defaults, and then runs find_node_by_uuid, get_parent_chain, and build_tree_branch in turn. It then generates metadata with utils.create_output_metadata, assembles the final output dictionary, resolves the output file path with utils.get_output_filepath, and saves the result with utils.save_output.
  • utils functions: The script relies on helper functions from utils.py to save output (save_output), create standard metadata (create_output_metadata), and determine the output file path (get_output_filepath).
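As an illustration of the argument handling described for parse_command_args, here is a minimal sketch. It is not the script's actual code: the optional argv parameter, the DEFAULT_SEARCH_DIR constant, and the choice to return None for a missing output path are assumptions made so the example can be run and tested on its own.

```python
import sys

# Default search directory, as stated in the Usage section.
DEFAULT_SEARCH_DIR = "output/"

USAGE = "usage: python reconstructor.py <node_uuid> [search_directory] [output_json]"


def parse_command_args(argv=None):
    """Return (node_uuid, search_directory, output_json) from positional args.

    `argv` is an assumed convenience parameter; when omitted, sys.argv[1:] is used.
    """
    args = list(sys.argv[1:] if argv is None else argv)
    if not args:
        # The node UUID is the only required argument.
        raise SystemExit(USAGE)
    node_uuid = args[0]
    search_directory = args[1] if len(args) > 1 else DEFAULT_SEARCH_DIR
    # None signals that the caller should generate an output path itself.
    output_json = args[2] if len(args) > 2 else None
    return node_uuid, search_directory, output_json
```

For example, parse_command_args(["e2dd9b38ab194156", "output/hallucinate-tree"]) yields the UUID, the given directory, and None for the output path.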

Reconstruction Logic

  1. Find Target Node: The script starts by searching the specified search_directory (or output/ by default) recursively for a node.json file whose uuid matches the provided <node_uuid>. This is done by the find_node_by_uuid function.
  2. Trace Ancestry: Once the target node is found, the get_parent_chain function takes its data. It looks for the parent_uuid field and calls find_node_by_uuid again to locate the parent node’s data. This process repeats, moving up the hierarchy using parent_uuid links, until a node without a parent_uuid is encountered (this is considered the root of the branch). This results in an ordered list of all nodes from the root down to the target node.
  3. Build Branch Structure: The build_tree_branch function takes the ordered list of nodes. It creates a nested dictionary structure where each node from the list becomes a level in the dictionary. The root node forms the base, and subsequent nodes are added as children to the previous node, effectively recreating the specific branch’s hierarchy.
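The three steps above can be sketched as follows. This is a simplified reimplementation based only on the description in this document, not the code in reconstructor.py. Skipping unreadable node.json files, returning None when no node matches, and stopping on a missing parent or a cycle are all assumptions.

```python
import json
import os


def find_node_by_uuid(search_uuid, base_dir):
    """Walk base_dir and return the parsed node.json whose uuid matches, else None."""
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        if "node.json" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "node.json"), encoding="utf-8") as fh:
                data = json.load(fh)
        except (OSError, json.JSONDecodeError):
            continue  # assumed behaviour: ignore unreadable or malformed files
        if isinstance(data, dict) and data.get("uuid") == search_uuid:
            return data
    return None


def get_parent_chain(node_data, search_dir):
    """Follow parent_uuid links upward and return the nodes ordered root -> target."""
    chain = [node_data]
    seen = {node_data.get("uuid")}
    current = node_data
    while current.get("parent_uuid"):
        parent = find_node_by_uuid(current["parent_uuid"], search_dir)
        if parent is None or parent.get("uuid") in seen:
            break  # assumed behaviour: stop on a missing parent or a cycle
        seen.add(parent["uuid"])
        chain.append(parent)
        current = parent
    chain.reverse()  # collected target -> root, so flip to root -> target
    return chain


def build_tree_branch(node_chain):
    """Nest the chain so each node holds the next one as its only child."""
    branch = None
    # Build from the target upward so each parent can wrap the branch below it.
    for node in reversed(node_chain):
        branch = {
            "step": node.get("step"),
            "uuid": node.get("uuid"),
            "children": [branch] if branch is not None else [],
        }
    return branch
```

Note that in this sketch every parent lookup walks the search directory again, so the cost grows with both the depth of the branch and the number of files under the search directory.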

Output

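The script generates a JSON file containing the reconstructed tree branch and associated metadata. The structure of the JSON output is shown below; the // comments are annotations for this document and are not part of the file, since JSON does not support comments.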

{
  "uuid": "...", // UUID generated for this reconstruction process
  "task": "Tree Reconstruction", // Name of the task
  "date_created": "...", // Timestamp of creation
  "time_taken": "...", // Time elapsed for the script execution
  "source_uuid": "...", // The target node UUID provided as input
  "search_directory": "...", // The directory searched for nodes
  "path_length": N, // Integer: Number of nodes in the reconstructed branch
  "reconstructed_tree": { // Nested dictionary representing the branch
    "step": "Root Node Step",
    "uuid": "root-node-uuid",
    "children": [
      {
        "step": "Child Node Step",
        "uuid": "child-node-uuid",
        "children": [
          // ... further nested nodes down to the target node
          {
            "step": "Target Node Step",
            "uuid": "target-node-uuid",
            "children": []
          }
        ]
      }
    ]
  }
}

The reconstructed_tree provides a clear, hierarchical view of the path from the root node down to the specific node identified by the source_uuid.
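Because each level of reconstructed_tree has at most one child, a consumer can flatten it back into an ordered path with a simple loop. The sketch below assumes the output structure shown above; flatten_branch and the sample values are hypothetical and are not part of reconstructor.py.

```python
def flatten_branch(reconstructed_tree):
    """Return [(uuid, step), ...] from the root down to the target node."""
    path = []
    node = reconstructed_tree
    while node:
        path.append((node["uuid"], node["step"]))
        children = node.get("children") or []
        # A reconstructed branch has at most one child per level.
        node = children[0] if children else None
    return path


# Hand-written sample shaped like the "reconstructed_tree" value above.
sample_tree = {
    "step": "Root Node Step",
    "uuid": "root-node-uuid",
    "children": [
        {
            "step": "Target Node Step",
            "uuid": "target-node-uuid",
            "children": [],
        }
    ],
}

if __name__ == "__main__":
    for uuid, step in flatten_branch(sample_tree):
        print(f"{uuid}: {step}")
```

The length of the returned list should equal the path_length field of the same output file, which gives a quick consistency check.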