MongoDB Inc.

06/18/2024 | News release | Distributed by Public on 06/18/2024 08:17

Unified Namespace Implementation with MongoDB and MaestroHub

In the complex world of modern manufacturing, a crucial challenge has long persisted: how to seamlessly connect the physical realm of industrial control systems with the digital landscape of enterprise operations. The International Society of Automation's ISA-95 standard, often visualized as the automation pyramid, has emerged as a guiding light. As shown below, this five-level hierarchical model empowers manufacturers to bridge the gap between these worlds, unlocking a path toward smarter, more integrated operations.

Figure 1: In the automation pyramid, data moves up or down one layer at a time, using point-to-point connections.

Manufacturing organizations face a number of challenges when implementing smart manufacturing applications due to the sheer volume and variety of data generated. An average factory produces terabytes of data daily, including time series data from machines stored in process historians and accessed by supervisory control and data acquisition (or SCADA) systems. Additionally, manufacturing execution systems (MES), enterprise resource planning (ERP) systems, and other operations software generate vast amounts of structured and unstructured data. Globally, the manufacturing industry generates an estimated 1.9 petabytes of data annually.

Manufacturing leaders are eager to leverage their data for AI and generative AI projects, but a Workday Global Survey reveals that only 4% of the survey's respondents believe their data is fully accessible for such applications. Data silos are a significant hurdle, with data workers spending an average of 48% of their time on data search and preparation.

A popular approach to making data accessible is consolidating it in a cloud data warehouse and then adding context. However, this can be costly and inefficient, as dumping data without context makes it difficult for AI developers to understand its meaning and origin, especially for operational technology time series data.

Figure 2: Pushing uncontextualized data to a data warehouse and then adding context is expensive and inefficient.

All these issues underscore the need for a new approach: one that not only standardizes data across disparate shop floor systems, but also weaves context into that data. This is where the Unified Namespace (UNS) comes in.

Figure 3: Unified Namespace provides the right data and context to all the applications connected to it.

Unified Namespace is a centralized, real-time repository for all production data. It provides a single, comprehensive view of the business's current state. Using an event-driven architecture, applications publish real-time updates to a central message broker, which subscribers can consume asynchronously. This creates a flexible, decoupled ecosystem where applications can both produce and consume data as needed.
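The publish/subscribe pattern described above can be sketched in a few lines of Python. This is a toy in-process illustration, not a real broker: the `Broker` class, the topic string, and the payload fields are hypothetical names chosen for the example, and delivery here is a direct function call rather than the asynchronous delivery a real message broker provides.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Handler = Callable[[str, Dict[str, Any]], None]


class Broker:
    """Toy in-memory message broker (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Register a consumer; it needs no knowledge of who publishes.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Dict[str, Any]) -> None:
        # Fan the message out to every consumer registered for this topic.
        for handler in self._subscribers[topic]:
            handler(topic, payload)


broker = Broker()
dashboard_feed: List[Dict[str, Any]] = []
maintenance_feed: List[Dict[str, Any]] = []

# Hypothetical topic name for a temperature reading on one machine.
topic = "acme/plant1/packaging/line3/filler/temperature"
broker.subscribe(topic, lambda t, p: dashboard_feed.append(p))
broker.subscribe(topic, lambda t, p: maintenance_feed.append(p))

# The producer publishes once; both consumers receive the same update.
broker.publish(topic, {"value": 71.5, "unit": "degC"})
```

The point of the sketch is the decoupling: the publisher never references the dashboard or the maintenance consumer, so either can be added or removed without touching the producer.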

Figure 4: UNS enables all the enterprise systems to have one centralized location to get the data they need for what they want to accomplish.

MaestroHub and MongoDB: Solving the UNS challenge

Industry 4.0 was first introduced in 2011 at the Hannover Fair of Industrial Technologies. Its core idea is to establish seamless connectivity and interoperability between the disparate systems used in manufacturing, and UNS aims to deliver on that idea.

Over the past five years, we have seen interest in UNS ramping up steadily, and now manufacturers are looking for practical ways to implement it. In particular, a question we're frequently asked is where the UNS actually lives.

To answer that question, we need to look at the popular architecture patterns and the pros and cons of each. The most common pattern is implementing UNS in an MQTT broker. An MQTT broker acts as an intermediary that receives messages published by clients, filters the messages by topic, and distributes them to subscribers. Most manufacturers choose MQTT because it is an open standard that is easy to implement. However, the challenge with using only an MQTT broker is that clients don't get access to historical data, which is required to build analytical and AI applications. Another approach is to dump all the data in a data warehouse and then add context to it. This solves the problem of historical data access, but standardizing messages only after they have landed in a cloud data warehouse is inefficient.
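To make "filters the messages by topic" concrete, here is a minimal sketch of how an MQTT topic filter can be matched against a published topic name, using the standard `+` (single-level) and `#` (multi-level) wildcards. The function name and the example topics are hypothetical, and the sketch leaves out parts of the MQTT specification such as the special handling of topics that start with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a published topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level. It matches
            # the parent level and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter is more specific than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # "+" matches exactly one level; anything else must be equal.
            return False

    # Without a trailing "#", both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# A subscriber interested in every temperature tag on any line in one area:
line_temps = "acme/plant1/packaging/+/filler/temperature"
# A subscriber interested in everything from one site:
whole_site = "acme/plant1/#"
```

A broker applies this kind of match to each incoming message and forwards it to the matching subscribers. By default nothing in this step keeps a queryable history of past messages, which is the gap the paragraph above describes.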

A superior solution for comprehensive, real-time data access is combining a single source of truth (SSoT) Unified Namespace platform like MaestroHub with a flexible multi-cloud data platform like MongoDB Atlas. MaestroHub creates an SSoT for industrial data, resulting in a reduction of up to 80% in integration effort for brownfield facilities.

Figure 5: MaestroHub SSoT creates a unified data integration layer, saving up to 50% of time in data contextualization (Source: MaestroHub).

MaestroHub provides the connectivity layer to all data sources on the factory floor, along with contextualization and data orchestration. This makes it easy to connect the data needed for the UNS, enrich it with more context, and then publish it to consumers using the protocol that works best for them.

Under the hood, MaestroHub stores metadata of connections, instances, and flows, and uses MongoDB as the database to store all this data. MongoDB's flexible data modeling patterns reduce the complexity of mapping and transforming data when it's shared across different clients in the UNS. Additionally, scalable data indexing overcomes performance concerns as the UNS grows over time.

Figure 6: MaestroHub and MongoDB together enable a real-time UNS plus long-term storage.

MongoDB: The foundation for intelligent industrial UNS

In the quest to build a Unified Namespace (UNS) for the modern industrial landscape, the choice of database becomes paramount. So why turn to MongoDB?

  • Scalability and high availability: It scales effortlessly, both vertically and horizontally (sharding), to handle the torrent of data from sensors, machines, and processes. Operational Technology (OT) systems generate vast amounts of data from these sources, and MongoDB ensures seamless management of that information.

  • Document data model: Its adaptable document model accommodates diverse data structures, ensuring a harmonious flow of information. A Unified Namespace (UNS) must handle data from any factory source, accommodating structure variations. MongoDB's flexible schema design allows different data models to coexist in a single database, with schema extensibility at runtime. This flexibility facilitates the seamless integration of new data sources and types into the UNS.

  • Real-time data processing: MongoDB Change Streams and Atlas Device Sync empower third-party applications to access real-time data updates. This is essential for monitoring, alerting, and real-time analysis within a UNS, enabling prompt responses to critical events.

  • Gen AI application development with ease: Atlas Vector Search efficiently performs semantic searches on vector embeddings stored in MongoDB Atlas. This capability integrates with large language models (LLMs) to provide relevant context in retrieval-augmented generation (RAG) systems. Given that the Unified Namespace (UNS) functions as a single source of truth for industrial applications, connecting gen AI apps to retrieve context from the UNS database ensures accurate and reliable information retrieval for these applications.
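As an illustration of the real-time point above, the following sketch shows how an application might follow UNS updates with a change stream. It assumes a pymongo `Collection` connected to a replica set or Atlas cluster, and it assumes each stored message carries a `topic` field. The field name, the function names, and the collection layout are hypothetical and are not taken from MaestroHub.

```python
import re
from typing import Any, Dict, Iterator, List


def uns_change_pipeline(topic_prefix: str) -> List[Dict[str, Any]]:
    """Build a change stream filter for messages under a topic prefix."""
    return [
        {
            "$match": {
                # Only react to new or modified documents.
                "operationType": {"$in": ["insert", "update", "replace"]},
                # Assumption: each message stores its UNS path in `topic`.
                "fullDocument.topic": {"$regex": "^" + re.escape(topic_prefix)},
            }
        }
    ]


def follow_topic(collection: Any, topic_prefix: str) -> Iterator[Dict[str, Any]]:
    """Yield the full document for every matching change as it happens.

    `collection` is expected to behave like a pymongo Collection.
    """
    pipeline = uns_change_pipeline(topic_prefix)
    # full_document="updateLookup" asks the server to attach the current
    # document to update events, not only the changed fields.
    with collection.watch(pipeline, full_document="updateLookup") as stream:
        for change in stream:
            yield change["fullDocument"]
```

A consumer could then loop over `follow_topic(db["uns_messages"], "acme/plant1/")`, where `db` and `uns_messages` are placeholder names, and raise an alert or update a dashboard for each document it receives.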

With the foundational database established, let's explore MaestroHub, a platform designed to leverage the power of MongoDB in industrial settings.

The MaestroHub platform

MaestroHub is a provider of an SSoT for industrial data, specifically tailored for manufacturers. It achieves this through:

  • Data connectors: MaestroHub connects to diverse data sources using 38 different industrial communication protocols, encompassing OT drivers, files, databases (SQL, NoSQL, time series), message brokers, web services, cloud systems, historians, and data warehouses. The bi-directional nature of 90% of these protocols ensures comprehensive data integration, leaving no data siloed.

  • Data contextualization based on ISA-95: Leveraging ISA-95 Part 2, MaestroHub employs a semantic hierarchy and a clear naming convention for easy navigation and understanding of data topics. The contextualization of the payload is not limited to the unit of measure and definition; it also contains Enterprise/Site/Area/Line/Cell details, which are invaluable for analytics studies. Data contextualization is an important feature of a UNS platform.

  • Logic flows/rule engine: Adhering to the UNS principle "Do not make any assumptions on how the data will be consumed," the data should flow flexibly from sources to brokers and from brokers to consumers in terms of rules, frequencies, and multiple endpoints. MaestroHub allows you to set rules such as Always, OnChange, OnTrue, and WhileTrue, where you can dynamically determine the conditions using events and inputs via JavaScript.

  • Insights created by MaestroHub: MaestroHub provides real-time diagnostics of data health by leveraging Prometheus, Elasticsearch, Fluentd, and Kibana. Network problems, changed endpoints, and changed data types are automatically diagnosed and reported as insights. Additionally, MaestroHub uses NATS for queue management and stream analytics, buffering data in the event of a network outage. This allows IT and OT teams to monitor, debug, and audit logs with full data lineage.
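To illustrate two of the ideas in the list above, the sketch below builds an Enterprise/Site/Area/Line/Cell topic path and evaluates the four rule types. It is written in Python for illustration only, since MaestroHub itself uses JavaScript for rule conditions. The semantics are assumptions, not MaestroHub's documented behavior: Always forwards every sample, OnChange forwards when the value differs from the previous one, OnTrue forwards when the condition switches from false to true, and WhileTrue forwards every sample for which the condition holds. All names are hypothetical.

```python
from typing import Any, Callable

_UNSET = object()  # Sentinel meaning "no sample seen yet".


def build_topic(enterprise: str, site: str, area: str,
                line: str, cell: str, tag: str) -> str:
    """Join ISA-95 style hierarchy levels into one topic path."""
    return "/".join([enterprise, site, area, line, cell, tag])


class Trigger:
    """Decides whether a new sample should be forwarded (assumed semantics)."""

    def __init__(self, mode: str,
                 condition: Callable[[Any], bool] = lambda value: True) -> None:
        if mode not in {"Always", "OnChange", "OnTrue", "WhileTrue"}:
            raise ValueError(f"unknown trigger mode: {mode}")
        self.mode = mode
        self.condition = condition
        self._last_value: Any = _UNSET
        self._last_state = False

    def should_publish(self, value: Any) -> bool:
        state = bool(self.condition(value))
        changed = value != self._last_value       # First sample counts as a change.
        rising_edge = state and not self._last_state
        self._last_value, self._last_state = value, state

        if self.mode == "Always":
            return True
        if self.mode == "OnChange":
            return changed
        if self.mode == "OnTrue":
            return rising_edge
        return state  # "WhileTrue"


topic = build_topic("acme", "plant1", "packaging", "line3", "filler", "temperature")
overheat = Trigger("OnTrue", condition=lambda celsius: celsius > 80)
alerts = [v for v in [70, 85, 90, 75, 81] if overheat.should_publish(v)]
```

Under these assumed semantics the OnTrue trigger fires once per excursion above the threshold rather than on every hot sample, which is the kind of distinction that matters when different consumers need different update frequencies.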

Conclusion

The ISA-95 automation pyramid presents significant challenges for the manufacturing industry, including a lack of flexibility, limited scalability, and difficulty integrating new technologies. By adopting a Unified Namespace architecture with MaestroHub and MongoDB, manufacturers can overcome these challenges and achieve real-time visibility and control over their operations, leading to increased efficiency and improved business outcomes.

Read more on how MongoDB enables Unified Namespace via its multi-cloud developer data platform.

We are actively working with our clients on solving Unified Namespace challenges. Take a look at our Manufacturing and Industrial IoT page for more stories or contact us through the web form in the link.