Chatbot for Downtime Data Opens New Paths in Production
From operational disruption to data-driven insights – How an AI chatbot based on Retrieval Augmented Generation is revolutionizing the analysis of machine failures.
Introduction: When Machine Downtime Becomes an Information Problem
In modern manufacturing environments, minimizing downtime is crucial: every minute of machine inactivity incurs significant costs and lost production time. Despite software support through Manufacturing Execution Systems (MES), much of the potential for rapid data analysis remains untapped. Reports on failures are often unstructured and scattered across different systems, making analysis difficult. This is where a prototype based on Large Language Models (LLMs) combined with Retrieval Augmented Generation (RAG) comes in. The goal is to retrieve and interpret downtime data in natural language, immediately and precisely, and to use it for optimizations in daily production operations.
Downtime Data and Its Untapped Treasures
In the specific project example at Hoffmann Neopac AG, a leading manufacturer of high-quality plastic packaging, all plant disruptions are recorded as “Downtime Notes” directly on the line panel. Additionally, performance data such as speed and rejection rates per machine are recorded via sensors. The following challenges arise:
- Downtime data is often unstructured, as it exists in freely formulated notes.
- Differentiated analyses, such as which materials frequently cause downtime or which errors occur more often on certain lines, require solid data analysis skills.
- Many employees do not have the technical know-how or time to create complex evaluations. The result: Valuable insights often remain unused because data preparation seems too time-consuming.
RAG as the Key: How AI Models Incorporate External Knowledge Sources
RAG addresses precisely this problem. While a classic LLM derives its “world” solely from the vast amounts of text in its training dataset, RAG extends the model with an additional knowledge source, which can be a database, a file system, or the internet.

Figure 1 illustrates the RAG workflow (source: [1]).
The process occurs in three steps:
- Retrieve: A query such as “Show me the most common causes of errors in August on Line 1” is sent by the user to a retriever, which searches the knowledge source and returns relevant information.
- Augment: The found entries are combined with the original query and passed to the language model (LLM) as an extended context ({Query + Information}).
- Generate: The LLM creates a contextual and fact-based response.
This process reduces invented content (“hallucinations”) and enables precise answers based on company-specific data, without the model itself having to be retrained.
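To make the three steps concrete, here is a minimal, self-contained Python sketch. The sample notes, the keyword-based retriever, and the call_llm stub are illustrative stand-ins rather than the prototype's actual code (the prototype uses LangChain and Neo4j, described below).

```python
# Minimal retrieve-augment-generate sketch. The sample notes, the keyword
# retriever, and call_llm() are illustrative stand-ins, not the prototype's code.

NOTES = [
    "Line 1: foil jam at the welding station, 12 min downtime",
    "Line 1: sensor fault on the filling unit, 5 min downtime",
    "Line 2: format change took longer than planned",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Retrieve: rank notes by naive keyword overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(NOTES, key=lambda n: -len(words & set(n.lower().split())))
    return scored[:k]

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (e.g. via an OpenAI client)."""
    return f"[LLM answer based on prompt of {len(prompt)} characters]"

query = "Show me the most common causes of errors on Line 1"
context = retrieve(query)                       # 1. Retrieve
prompt = (                                      # 2. Augment: {Query + Information}
    "Answer using only the context below.\n"
    "Context:\n- " + "\n- ".join(context) + f"\n\nQuestion: {query}"
)
print(call_llm(prompt))                         # 3. Generate
```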
Neo4j and Chatbot: Architecture for Structured Data
For the system to understand connections between product, line, and failure category, a graph database approach is used. The prototype employs Neo4j AuraDB [2], where entities such as “Downtime” and “TubeLine” are stored as nodes and relationships such as “OCCURS_ON” (a downtime occurring on a line) as edges. Such a graph model is particularly suitable for LLMs because it maps naturally onto natural language: LLMs recognize linguistic patterns and relationships in the graph structure. Cypher [3], the query language developed specifically for graph databases, is also more intuitive and readable than SQL (Structured Query Language), especially with complex relationships between multiple entities.
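As an illustration, the following sketch creates this data model with the official Neo4j Python driver and runs the kind of Cypher query an LLM might generate. The connection details and node properties are assumptions; the article only names the entities “Downtime” and “TubeLine” and the relationship “OCCURS_ON”.

```python
# Sketch: creating the Downtime/TubeLine graph model with the official
# neo4j Python driver (pip install neo4j). URI, credentials, and the
# node properties shown here are placeholder assumptions.
from neo4j import GraphDatabase

URI = "neo4j+s://<your-auradb-instance>.databases.neo4j.io"
AUTH = ("neo4j", "<password>")

CREATE_DOWNTIME = """
MERGE (l:TubeLine {name: $line})
CREATE (d:Downtime {note: $note, minutes: $minutes})
CREATE (d)-[:OCCURS_ON]->(l)
"""

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    driver.execute_query(
        CREATE_DOWNTIME,
        line="Line 1",
        note="Foil jam at the welding station",
        minutes=12,
        database_="neo4j",
    )
    # The kind of Cypher an LLM might generate for
    # "most common failures on Line 1":
    records, _, _ = driver.execute_query(
        """
        MATCH (d:Downtime)-[:OCCURS_ON]->(l:TubeLine {name: $line})
        RETURN d.note AS note, count(*) AS freq ORDER BY freq DESC
        """,
        line="Line 1",
        database_="neo4j",
    )
    for r in records:
        print(r["note"], r["freq"])
```

In the prototype, such Cypher statements are generated by the LLM itself rather than written by hand, as described in the Cypher Chain section below.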
Figure 2 illustrates the architecture of the chatbot and shows the interaction between the individual components:
- Chatbot Frontend: Via a Streamlit dashboard [4], users submit requests in natural language. The frontend forwards each request to the LangChain Agent, which generates an appropriate query and returns the result as plain text.
- Chatbot API: This is where the actual RAG workflow runs, orchestrated by the LangChain library [5]. Based on the question, the LangChain Agent decides whether to use the Neo4j Cypher Chain (for structured data queries) or the Downtime Vector Chain (for semantic searches). The corresponding queries are forwarded to the Neo4j AuraDB, and the results flow into the answer generation (a sketch of this routing follows the list).
- Neo4j Database: Contains all processed data from the various data sources.
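The routing between the two chains might look roughly like the following sketch built on LangChain's Neo4j integrations. Import paths and class names vary across LangChain versions, and the connection details, index name, and tool descriptions are assumptions rather than the prototype's actual configuration.

```python
# Sketch of the agent routing between the two chains. Import paths follow
# langchain/langchain-community 0.1.x and may differ in other versions;
# connection details, index name, and tool descriptions are placeholders.
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chains import GraphCypherQAChain
from langchain_community.graphs import Neo4jGraph
from langchain_community.vectorstores import Neo4jVector
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

URI, USER, PWD = "neo4j+s://<instance>.databases.neo4j.io", "neo4j", "<password>"
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
graph = Neo4jGraph(url=URI, username=USER, password=PWD)

# Neo4j Cypher Chain: the LLM writes Cypher against the graph schema.
cypher_chain = GraphCypherQAChain.from_llm(
    llm=llm, graph=graph,
    allow_dangerous_requests=True,  # required by recent versions; drop on older ones
)

# Downtime Vector Chain: semantic search over embedded Downtime Notes.
retriever = Neo4jVector.from_existing_index(
    OpenAIEmbeddings(), url=URI, username=USER, password=PWD,
    index_name="downtime_notes",  # assumed index name
).as_retriever()

tools = [
    Tool(name="GraphQuery", func=cypher_chain.run,
         description="Structured questions about lines, products, and downtimes."),
    Tool(name="NoteSearch",
         func=lambda q: "\n".join(d.page_content for d in retriever.invoke(q)),
         description="Semantic search in free-text Downtime Notes."),
]
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
print(agent.run("Show me the most common causes of errors in August on Line 1"))
```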
Cypher Chain and Vector Chain: Two Approaches for Different Data
For structured information such as production data, the Cypher Chain is suitable: the LLM queries the graph database with generated Cypher statements. Downtime Notes, which contain short, similar disruption reports, are handled differently: here, semantic vector search (the Vector Chain) is used. The notes are converted into vectors by an embedding model and stored; such models translate words or sentences into numerical vectors so that semantically similar terms lie close together. This way, semantically equivalent formulations are also recognized. Neo4j can be flexibly extended into a vector database. Queries are likewise converted into embedding vectors and compared using distance metrics to find matching disruption reports, even when the wording differs.
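The following self-contained sketch illustrates why a distance metric matches paraphrased reports. The three-dimensional toy vectors are fabricated for readability; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
# Sketch: why paraphrased downtime notes still match. The tiny 3-D vectors
# below are fabricated for illustration; real embedding models return
# vectors with hundreds or thousands of dimensions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 = same direction, ~0.0 = unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pretend embeddings: two paraphrases of the same fault and one unrelated note.
notes = {
    "foil jam at the welding station": np.array([0.9, 0.1, 0.05]),
    "film blockage near the sealing unit": np.array([0.85, 0.15, 0.1]),
    "scheduled maintenance completed": np.array([0.05, 0.9, 0.3]),
}
query_vec = np.array([0.88, 0.12, 0.08])  # embedding of "Where does foil get stuck?"

# Rank notes by similarity to the query: the paraphrases come out on top.
for note, vec in sorted(notes.items(),
                        key=lambda kv: -cosine_similarity(query_vec, kv[1])):
    print(f"{cosine_similarity(query_vec, vec):.3f}  {note}")
```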
Evaluation: Accuracy, Cost, Inference Time
A central concern was the evaluation of different AI models. In an initial prototype phase, LLMs from OpenAI were tested [6]. The results for 30 test questions (English and German) show:
- gpt-4o-mini proved to be the “sweet spot”: High accuracy (over 86% in English, up to 90% in German) combined with low costs and short response times.
- gpt-4o delivered similarly precise results in some cases, but was many times more expensive.
- gpt-3.5-turbo was affordable but sometimes inaccurate for complex queries.
Especially for companies that direct many questions per week to their chatbot, inference time and cost are decisive criteria. Here, the lighter gpt-4o-mini model offers an attractive balance.
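The following sketch shows one way to measure both metrics per query with the OpenAI Python client. The per-token prices are placeholders to be replaced with current list prices, and the test question is taken from the examples above.

```python
# Sketch: comparing inference time and token cost per query across the
# three evaluated models. Requires OPENAI_API_KEY; the per-million-token
# prices below are placeholders -- substitute the current list prices.
import time
from openai import OpenAI

PRICE_PER_1M = {  # (input, output) in USD; placeholder values, not current prices
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "gpt-3.5-turbo": (0.50, 1.50),
}

client = OpenAI()
question = "Show me the most common causes of errors in August on Line 1"

for model, (p_in, p_out) in PRICE_PER_1M.items():
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": question}]
    )
    elapsed = time.perf_counter() - start
    usage = resp.usage
    cost = (usage.prompt_tokens * p_in + usage.completion_tokens * p_out) / 1e6
    print(f"{model}: {elapsed:.2f}s, ~${cost:.6f} per query")
```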
Conclusion: A New Standard for Data-Based Decisions
The LLM-RAG chatbot demonstrates how AI models can create real added value in industry. Instead of time-consuming research, specialists quickly receive well-founded answers about downtime, causes of errors, and performance figures. The interplay of graph database and Retrieval Augmented Generation provides precise, context-based insights. The use of AI in the production environment has long gone beyond forecasts or predictive maintenance. With RAG solutions, company-specific data sources can be efficiently utilized to make informed decisions. The result is higher efficiency, lower costs, and improved transparency in production – a clear step towards the Smart Factory.
References
[1] A. Kimothi, “Large Language Models and the Need for Retrieval Augmented Generation,” in A Simple Guide to Retrieval Augmented Generation, Manning Publications, 2024, pp. 1-17.
[2] Neo4j Inc., “Neo4j AuraDB: Fully Managed Graph Database,” 2025. [Online]. Available: https://neo4j.com/product/auradb/. [Accessed: 10 March 2025].
[3] Neo4j Inc., “Cypher Manual,” 2025. [Online]. Available: https://neo4j.com/docs/cypher-manual/current/introduction/. [Accessed: 10 March 2025].
[4] Snowflake Inc., “A faster way to build and share data apps,” 2024. [Online]. Available: https://streamlit.io/. [Accessed: 10 March 2025].
[5] LangChain, “Applications that can reason. Powered by LangChain,” 2025. [Online]. Available: https://www.langchain.com/. [Accessed: 10 March 2025].
[6] OpenAI, “OpenAI Platform,” 2025. [Online]. Available: https://platform.openai.com/docs/models. [Accessed: 10 March 2025].
