# Industrial Specifications Generation
This script utilizes a large language model (LLM), prompted to act as an industrial engineering specialist, to generate comprehensive industrial and commercial specifications for a user-provided topic. It expects the LLM to return a structured JSON object containing detailed information relevant to professionals in the field.
## Purpose
The primary goal of this script is to automate the generation of detailed industrial specifications, including performance metrics, implementation requirements, relevant standards, key suppliers, and operational considerations for a specific topic.
## Usage
The script is executed from the command line:
```
python specifications-industrial.py <input_json> [output_json] [-saveInputs] [-uuid="UUID"] [-flow_uuid="FLOW-UUID"]
```
- `<input_json>`: (Required) Path to the input JSON file containing the topic and configuration.
- `[output_json]`: (Optional) Path where the output JSON file should be saved. If omitted, a default path is generated.
- `-saveInputs`: (Optional) Flag to save the system and user prompts sent to the LLM into the `flow/<flowUUID>/inputs/` directory.
- `-uuid="UUID"`: (Optional) Custom UUID for the output file metadata.
- `-flow_uuid="FLOW-UUID"`: (Optional) UUID for the flow, used for saving inputs if `-saveInputs` is specified.
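For example, with a hypothetical input file `examples/battery.json`, an invocation that also saves the prompts might look like:

```
python specifications-industrial.py examples/battery.json output/battery-specs.json -saveInputs -flow_uuid="1b2c3d4e-0000-0000-0000-000000000000"
```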
## Input Files
The script requires an input JSON file (`<input_json>`) with the following structure:
```json
{
  "topic": "The specific industrial topic (e.g., High-Volume Lithium-ion Battery Manufacturing)",
  "model": "gemma3", // Or another LLM model identifier
  "parameters": {
    // Optional LLM parameters (e.g., temperature, max_tokens)
  }
}
```
- `topic`: A string describing the subject for which specifications are needed.
- `model`: The identifier for the LLM to use (defaults to `gemma3` if not provided).
- `parameters`: An optional object containing parameters to pass to the LLM during generation.
## Key Functions
- `main()`: Parses command-line arguments using `handle_command_args`, loads the input JSON using `load_json`, orchestrates the specification generation by calling `generate_industrial_specifications`, determines the output path using `get_output_filepath`, creates metadata using `create_output_metadata`, combines metadata with the results, and saves the final output using `save_output`. It also handles setting the global `flowUUID`. (A sketch of this flow follows the list.)
- `generate_industrial_specifications(input_data, save_inputs=False)`: Extracts the `topic`, `model`, and `parameters` from the input data, constructs the system and user prompts for the LLM, and optionally saves these prompts using `saveToFile` (from `utils`). It calls `chat_with_llm` (from `utils`) to interact with the LLM, then processes the LLM response using `extract_json_from_response`.
- `extract_json_from_response(response)`: Attempts to robustly extract and parse a JSON object from the LLM's potentially messy response string. It tries direct parsing, then looks for JSON within markdown code fences (with or without a `json` language tag), and finally searches for content between the first `{` and last `}`. It uses `sanitize_json_string` before attempting to parse.
- `sanitize_json_string(json_str)`: Removes invalid control characters (ASCII 0-31, excluding tab, newline, and carriage return) from a string using a regular expression to prevent JSON parsing errors.
- Utility functions (`utils`): The script relies heavily on shared functions from `utils.py`, including `load_json`, `save_output`, `chat_with_llm`, `create_output_metadata`, `get_output_filepath`, `handle_command_args`, and `saveToFile`.
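As a rough illustration, the orchestration in `main()` might look like the sketch below. The `utils` signatures and return shapes shown are assumptions inferred from the descriptions above, not the actual `utils.py` API.

```python
# Sketch only: utils signatures and return shapes are assumptions.
import time

from utils import (load_json, save_output, create_output_metadata,
                   get_output_filepath, handle_command_args)

flowUUID = None  # module-level flow UUID, set from the -flow_uuid argument

def main():
    global flowUUID
    start_time = time.time()

    # Assumed return shape: (input_path, output_path, save_inputs, uuid, flow_uuid).
    input_path, output_path, save_inputs, custom_uuid, flow_uuid = handle_command_args()
    flowUUID = flow_uuid

    input_data = load_json(input_path)
    specifications = generate_industrial_specifications(input_data, save_inputs)

    # Resolve the output path (a default is generated when none was given),
    # then build the standard process metadata and save the combined result.
    output_path, output_uuid = get_output_filepath("industrial_specifications",
                                                   custom_uuid, output_path)
    metadata = create_output_metadata("Industrial Specifications", start_time, output_uuid)
    save_output({**metadata, "industrial_specifications": specifications}, output_path)

if __name__ == "__main__":
    main()
```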
## LLM Interaction
The script crafts specific prompts to guide the LLM:
- **System Prompt**: Instructs the LLM to act as an AI assistant specialized in industrial engineering and specifications. It requests a comprehensive overview of the given topic, including performance metrics and implementation requirements, and emphasizes precision with numerical values/ranges and practical technical details relevant to professionals.
- **User Prompt**: Provides the specific `topic` from the input file and explicitly requests the output as a JSON object with these top-level keys:
    - `performance_metrics`: Array of objects (name, value/range, description).
    - `implementation_requirements`: Array of objects (name, specification, description).
    - `industry_standards`: Array of relevant standards/certifications.
    - `key_suppliers`: Array of major equipment/technology suppliers.
    - `operational_considerations`: Array of important operational factors.

  The prompt stresses the need for realistic values at industrial/commercial scale and demands only valid JSON output, without any surrounding text or formatting.
- **LLM Call**: The `chat_with_llm` function (from `utils`) sends these prompts to the specified LLM and retrieves the response (see the sketch below).
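A condensed sketch of `generate_industrial_specifications` follows; the prompt text is abridged from the description above, and the `chat_with_llm` and `saveToFile` signatures are assumptions.

```python
from utils import chat_with_llm, saveToFile

def generate_industrial_specifications(input_data, save_inputs=False):
    topic = input_data.get("topic")
    model = input_data.get("model", "gemma3")
    parameters = input_data.get("parameters", {})

    # Abridged paraphrase of the prompts described above.
    system_prompt = (
        "You are an AI assistant specialized in industrial engineering and "
        "specifications. Provide a comprehensive overview with precise "
        "numerical values or ranges and practical technical detail."
    )
    user_prompt = (
        f"Topic: {topic}\n"
        "Return ONLY a valid JSON object with the top-level keys: "
        "performance_metrics, implementation_requirements, industry_standards, "
        "key_suppliers, operational_considerations."
    )

    # Optionally persist the exact prompts under flow/<flowUUID>/inputs/.
    if save_inputs:
        saveToFile(system_prompt, user_prompt, f"flow/{flowUUID}/inputs")

    response = chat_with_llm(model, system_prompt, user_prompt, parameters)
    return extract_json_from_response(response)
```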
## JSON Handling
Robust JSON handling is crucial due to the variability of LLM outputs:
- **Sanitization (`sanitize_json_string`)**: Before any parsing attempt, the raw LLM response is cleaned to remove control characters that are invalid in JSON strings but may be present in LLM output.
- **Extraction (`extract_json_from_response`)**: This function implements a multi-stage strategy to find and parse the JSON:
    1. Try parsing the sanitized response directly.
    2. If that fails, look for markdown code fences (with or without a `json` language tag) and attempt to parse the content within them.
    3. If that fails, find the first opening curly brace `{` and the last closing curly brace `}` and attempt to parse the substring between them.

This approach increases the likelihood of successfully extracting the desired JSON even if the LLM includes introductory text, explanations, or incorrect formatting. These helper functions are identical to those used in `return-analysis.py`.
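The following is a behavior-equivalent sketch of the two helpers as described above; the actual implementation in the script may differ in details such as the exact regex patterns and error reporting.

````python
import json
import re

def sanitize_json_string(json_str):
    # Remove control characters (ASCII 0-31) that are invalid inside JSON
    # strings, keeping tab (\x09), newline (\x0a), and carriage return (\x0d).
    return re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f]', '', json_str)

def extract_json_from_response(response):
    cleaned = sanitize_json_string(response)

    # Stage 1: the response may already be bare JSON.
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass

    # Stage 2: look for a fenced code block, with or without a "json" tag.
    fence = re.search(r'```(?:json)?\s*(.*?)```', cleaned, re.DOTALL)
    if fence:
        try:
            return json.loads(fence.group(1))
        except json.JSONDecodeError:
            pass

    # Stage 3: fall back to the span between the first '{' and the last '}'.
    start, end = cleaned.find('{'), cleaned.rfind('}')
    if start != -1 and end > start:
        try:
            return json.loads(cleaned[start:end + 1])
        except json.JSONDecodeError:
            pass

    return None  # caller treats None as a parse failure
````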
## Output
The script generates a JSON file (at the path determined by `get_output_filepath` or specified by the user) containing:
- `process_metadata`: Standard metadata including script name, start time, duration, and UUID (generated by `create_output_metadata`).
- `industrial_specifications`: The structured JSON object containing the specifications as generated by the LLM and successfully parsed by `extract_json_from_response`.
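An abridged, purely illustrative example of the output shape (the field names inside `process_metadata` are assumptions based on the description above):

```json
{
  "process_metadata": {
    "script": "specifications-industrial.py",
    "start_time": "2025-01-01T12:00:00",
    "duration_seconds": 42.5,
    "uuid": "hypothetical-uuid"
  },
  "industrial_specifications": {
    "performance_metrics": [
      {"name": "...", "value": "...", "description": "..."}
    ],
    "implementation_requirements": [
      {"name": "...", "specification": "...", "description": "..."}
    ],
    "industry_standards": ["..."],
    "key_suppliers": ["..."],
    "operational_considerations": ["..."]
  }
}
```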
If `extract_json_from_response` fails to return a valid JSON object from the LLM response, the script prints an error message, shows the raw response, and exits with a non-zero status code.
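That failure path might look like this minimal sketch (the exact message wording is assumed):

```python
import sys

specifications = extract_json_from_response(response)
if specifications is None:
    # Surface the raw response for debugging, then exit non-zero.
    print("Error: failed to extract valid JSON from the LLM response.")
    print("Raw response:")
    print(response)
    sys.exit(1)
```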