Assemble
This script reads data from various JSON files located in a specified directory, processes it, and uses a Jinja2 template (`page-template.html`) to generate a final HTML output file. It handles metadata, content structuring (like trees and timelines), breadcrumbs, and alternative approaches.
Purpose
The primary purpose of assemble.py is to aggregate structured data from multiple JSON sources and render it into a standardized HTML page format using a predefined template. It serves as the final step in generating documentation or report pages from processed data.
Usage
python assemble.py <files_dir> <output_dir>
- `<files_dir>`: The directory containing the input JSON files (`1.json` through `9.json`), `breadcrumbs.txt`, and any alternative approach files (`alt*.json`).
- `<output_dir>`: The directory where the generated HTML file will be saved.
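The two positional arguments could be parsed along these lines. This is a minimal sketch that assumes `argparse`; the source does not say which parsing mechanism `assemble.py` uses, and the function name `parse_args` and the example paths are hypothetical.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the two positional arguments shown in the usage line."""
    parser = argparse.ArgumentParser(prog="assemble.py")
    parser.add_argument("files_dir", type=Path,
                        help="directory containing the input JSON files")
    parser.add_argument("output_dir", type=Path,
                        help="directory where the HTML file is written")
    return parser.parse_args(argv)


# Hypothetical paths, used only to illustrate the call.
args = parse_args(["data/my-topic", "site/pages"])
print(args.files_dir, args.output_dir)
```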
Input Files
The script expects the following files within the <files_dir>:
- `1.json`: Contains page metadata (title, summary, status, progress, etc.).
- `2.json`: Contains the main hierarchical tree structure data for the primary approach.
- `3.json`: Contains timeline data, including historical events and future predictions.
- `4.json`: Contains data related to challenges, obstacles, or risks.
- `5.json`: Contains information about adoption stages or maturity levels.
- `6.json`: Contains implementation assessment details, steps, or levels.
- `7.json`: Contains Return on Investment (ROI) analysis data, including timeframes and key benefits.
- `8.json`: Contains information about relevant future technologies or trends.
- `9.json`: Contains industrial specifications, such as performance metrics and implementation requirements.
- `breadcrumbs.txt`: A plain text file containing the breadcrumb path string (e.g., `/category/sub-category`).
- `alt*.json` (optional): JSON files starting with `alt` and ending with `.json`, each representing an alternative approach or tree structure.
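Loading this file set might look roughly like the sketch below. It treats `1.json` and `2.json` as mandatory and `3.json` through `9.json` as optional, matching the description of `main()` under Key Functions. The helper name `load_inputs`, the return shape, and the sorted order of the alternatives are assumptions made for illustration, not the script's actual code.

```python
import json
import tempfile
from pathlib import Path

MANDATORY = ("1.json", "2.json")
OPTIONAL = tuple(f"{n}.json" for n in range(3, 10))


def load_inputs(files_dir):
    """Load mandatory, optional, and alt*.json inputs from files_dir."""
    files_dir = Path(files_dir)
    data = {}
    for name in MANDATORY:
        # A missing mandatory file raises FileNotFoundError here.
        data[name] = json.loads((files_dir / name).read_text(encoding="utf-8"))
    for name in OPTIONAL:
        path = files_dir / name
        # Optional files fall back to an empty dictionary.
        data[name] = (
            json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        )
    crumbs = files_dir / "breadcrumbs.txt"
    breadcrumbs = crumbs.read_text(encoding="utf-8").strip() if crumbs.exists() else ""
    # Assumption: alternatives are loaded in sorted filename order.
    alternatives = [
        json.loads(p.read_text(encoding="utf-8"))
        for p in sorted(files_dir.glob("alt*.json"))
    ]
    return data, breadcrumbs, alternatives


# Demonstration against a throwaway directory with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "1.json").write_text('{"title": "Demo"}', encoding="utf-8")
    (tmp / "2.json").write_text('{"name": "root"}', encoding="utf-8")
    (tmp / "alt1.json").write_text('{"name": "alt root"}', encoding="utf-8")
    data, breadcrumbs, alternatives = load_inputs(tmp)

print(sorted(data), repr(breadcrumbs), len(alternatives))
```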
Key Functions
- `read_json_file(file_path)`: Reads a specified JSON file and returns its content parsed as a Python dictionary. Handles file opening and JSON decoding.
- `process_bold_text(text)`: Uses regular expressions to find text enclosed in double asterisks (`**text**`) and replaces it with HTML `<strong>text</strong>` tags.
- `generate_tree_preview_text(tree_data)`: Generates a compact, ASCII-art representation of the tree structure found in `tree_data`. It includes the root, main steps (children), and initial sub-steps (grandchildren), using shortened names and UUIDs for brevity. Designed for previews, such as alternative approach cards.
- `process_metadata(metadata, breadcrumb_str)`: Takes the raw metadata dictionary (from `1.json`) and the breadcrumb string. It processes fields like `progress_percentage` (handling different keys and ensuring numeric conversion), formats the `summary` into paragraphs, sets default values for missing fields, adds standard contributor text and the current date (`last_updated`), and includes the raw breadcrumb string.
- `process_breadcrumbs(breadcrumbs_str)`: Parses the raw breadcrumb string (e.g., `/category/sub-category`) into a structured list of dictionaries. Each dictionary represents a breadcrumb link with `name` (formatted title) and `url` (path to the parent index, or `None` for the last item). Includes a ‘Home’ link.
- `main()`: Orchestrates the entire process:
  - Parses command-line arguments (`<files_dir>`, `<output_dir>`).
  - Constructs paths to all expected input JSON files and `breadcrumbs.txt`.
  - Loads data from mandatory JSON files (`1.json`, `2.json`) using `read_json_file`.
  - Loads data from optional JSON files (3-9) if they exist, otherwise uses empty dictionaries.
  - Reads the `breadcrumbs.txt` file if it exists.
  - Finds and loads alternative approach files (`alt*.json`), generating preview text for each using `generate_tree_preview_text`.
  - Sets up the Jinja2 environment, loading templates from the `templates` directory and adding `process_bold_text` as a filter and `generate_tree_preview_text` and `enumerate` as globals.
  - Loads the `page-template.html` template.
  - Calls `process_metadata` and `process_breadcrumbs` to prepare metadata and navigation data.
  - Processes data from optional JSONs (timeline, challenges, adoption, ROI, future tech, specs) into structured lists suitable for the template.
  - Builds the final context dictionary containing all processed data.
  - Renders the Jinja2 template with the context.
  - Determines the output HTML filename based on the metadata’s `slug` or `title`, sanitizing it.
  - Creates the output directory if it doesn’t exist.
  - Writes the rendered HTML content to the output file.
  - Includes error handling for file not found, JSON decoding errors, and other exceptions.
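Two of the smaller helpers can be sketched as follows. These are illustrative reconstructions from the descriptions above, not the script's actual code. The regular expression, the title-casing of names, the `/index.html` URL layout, and the URL given to the ‘Home’ link are all assumptions.

```python
import re


def process_bold_text(text):
    """Replace **text** with <strong>text</strong>."""
    # Non-greedy match so several bold spans on one line stay separate.
    return re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)


def process_breadcrumbs(breadcrumbs_str):
    """Turn '/category/sub-category' into a list of name/url dictionaries."""
    parts = [p for p in breadcrumbs_str.strip().split("/") if p]
    # Assumption: the Home link points at the site root index.
    crumbs = [{"name": "Home", "url": "/index.html"}]
    for i, part in enumerate(parts):
        is_last = i == len(parts) - 1
        crumbs.append({
            # Assumption: hyphens become spaces and words are title-cased.
            "name": part.replace("-", " ").title(),
            # Assumption: each level links to an index.html under its path;
            # the last item is the current page and gets no link.
            "url": None if is_last else "/" + "/".join(parts[: i + 1]) + "/index.html",
        })
    return crumbs


print(process_bold_text("A **bold** claim"))
for crumb in process_breadcrumbs("/category/sub-category"):
    print(crumb)
```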
Template Processing
The script utilizes the Jinja2 templating engine to generate the final HTML output:
- Environment Setup: A Jinja2 `Environment` is configured to load templates from the `templates/` directory. Settings like `autoescape=True`, `trim_blocks=True`, and `lstrip_blocks=True` are enabled for security and cleaner template rendering.
- Template Loading: The primary template `page-template.html` is loaded from the environment.
- Custom Filters/Globals:
  - `process_bold`: A custom filter is added to allow `{{ some_text | process_bold }}` syntax in the template for converting Markdown bold to HTML bold.
  - `generate_tree_preview_text`: Made available as a global function if needed directly within the template (though primarily used during alternative processing).
  - `enumerate`: The built-in `enumerate` function is added as a global for easier iteration with indices in the template.
- Context Data: A comprehensive Python dictionary (`context`) is prepared, containing all the processed data extracted and transformed from the input JSON files (metadata, tree structure, timeline entries, challenges, adoption stages, implementation steps, ROI points, benefits, future technologies, specifications, breadcrumbs, and alternative approaches).
- Rendering: The `template.render(context)` method is called to inject the context data into `page-template.html`, producing the final HTML string.
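A self-contained sketch of this setup follows. It assumes the `jinja2` package is installed and swaps in `DictLoader` with a tiny inline template for the `templates/` directory, so the template body and context values are made up. One point worth noting: with `autoescape=True`, a filter that returns a plain string containing `<strong>` has that markup escaped, so this sketch escapes the input first and returns `Markup`. Whether the real script does the same, or relies on `| safe` in the template, is not stated in the source.

```python
import re

from jinja2 import DictLoader, Environment
from markupsafe import Markup, escape


def process_bold(text):
    """Escape text, then turn **bold** into <strong>bold</strong>."""
    escaped = str(escape(text))
    # Returning Markup tells Jinja2 not to escape the <strong> tags again.
    return Markup(re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", escaped))


# DictLoader stands in for a loader pointed at templates/ (assumption),
# and this template body is invented for the demonstration.
env = Environment(
    loader=DictLoader({
        "page-template.html": (
            "<h1>{{ title }}</h1>\n"
            "{% for i, step in enumerate(steps) %}\n"
            "<p>{{ i }}: {{ step | process_bold }}</p>\n"
            "{% endfor %}\n"
        )
    }),
    autoescape=True,
    trim_blocks=True,
    lstrip_blocks=True,
)
env.filters["process_bold"] = process_bold
env.globals["enumerate"] = enumerate

template = env.get_template("page-template.html")
context = {"title": "Demo & test", "steps": ["**Plan** the work", "Ship <it>"]}
html = template.render(context)
print(html)
```

In this sketch the title and the second step are HTML-escaped by autoescaping, while the bold span in the first step comes through as real markup.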
Output
The script generates a single HTML file located in the specified <output_dir>.
- Filename: The name of the HTML file is automatically generated based on the `slug` field found in the processed metadata (`1.json`). If `slug` is not present, it falls back to the `title` field. The name is converted to lowercase, spaces are replaced with hyphens, and characters other than alphanumerics, hyphens, or underscores are removed to ensure it’s filesystem-safe. A default name like `output.html` is used if no suitable title/slug is found.
- Content: The content of the file is the fully rendered HTML page, combining the structure from `page-template.html` with the data processed from all the input JSON files.
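The filename rule above can be expressed as a short function. This is a sketch under assumptions: the name `output_filename` is hypothetical, and "alphanumerics" is interpreted here as ASCII letters and digits, which the source does not specify.

```python
import re


def output_filename(metadata):
    """Build a filesystem-safe HTML filename from the slug or title."""
    raw = metadata.get("slug") or metadata.get("title") or ""
    name = raw.lower().replace(" ", "-")
    # Assumption: only ASCII letters, digits, hyphens, and underscores survive.
    name = re.sub(r"[^a-z0-9_-]", "", name)
    return f"{name}.html" if name else "output.html"


print(output_filename({"title": "Quantum Sensors: An Overview!"}))
print(output_filename({"slug": "my_page", "title": "Ignored Title"}))
print(output_filename({}))
```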