Altair Engineering Inc.

07/16/2024 | News release

Enterprise Knowledge Graphs to the Rescue

Many organizations that rely on effective analysis of data from multiple sources, formats, and structures are familiar with the concepts of data lakes and data fabrics. Such organizations are in various stages of adoption, from awareness to skepticism to appreciation, even to preference. However, many organizations that implemented data lakes realized that pouring their data into a "lake," or otherwise collocating it, doesn't provide the seamless, unified access they once envisioned. The term "data swamp" arose from this disillusionment. The data swamp predicament is, in effect, "iteration two" of the data cloud: loading data into a common store like HDFS or S3, for example, doesn't result in the global, holistic integration of data assets. Now data fabrics and data meshes are in vogue. For the data fabric or mesh to reach its potential, we need a common model through which to see, access, understand, and trust enterprise data.

Enter the knowledge graph, which moves people from working with rows and columns of stored data to an automated approach that models data the way people think.

Knowledge Graphs: An Overview

Knowledge graphs can describe, contextualize, and link data contained in all enterprise data sources, structured and unstructured. They can provide the common model for enterprise data assets so that users and automated processes gain visibility into, access to, and understanding of the information available within the enterprise.

Simply put, the "knowledge" in knowledge graphs refers to the expressive, unambiguous metadata model used to specify the entities, concepts, and salient relationships that exist in the underlying enterprise data sources. In fact, knowledge representation models allow semantically aware software to infer additional information from existing information, a form of machine reasoning that makes implicit information explicit.

For example, one might specify that an "employee" is a kind of "person." The knowledge graph may contain a reference to John Q. Smith as an employee - from there, the system can infer that John Q. Smith is also a person. In another example, the concept of "aspirin" might be represented in multiple ways, but the knowledge graph connects each reference to the same underlying concept. In both cases, queries become more powerful because the system makes inferences and relates things on the user's behalf. In other words, with knowledge graphs, users interact with enterprise information in a more intuitive, more powerful way.
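
To make this concrete, here is a minimal sketch of that subclass inference using the open-source rdflib Python library. The library choice, the example.org namespace, and the names are illustrative assumptions, not a description of any particular product:

```python
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")  # hypothetical namespace for illustration
g = Graph()

# Metadata: an employee is a kind of person.
g.add((EX.Employee, RDFS.subClassOf, EX.Person))
# Data: John Q. Smith is recorded only as an employee.
g.add((EX.JohnQSmith, RDF.type, EX.Employee))

# One forward-chaining pass of the RDFS subclass rule:
# (x type C) and (C subClassOf D)  =>  (x type D)
inferred = [
    (x, RDF.type, d)
    for x, _, c in g.triples((None, RDF.type, None))
    for _, _, d in g.triples((c, RDFS.subClassOf, None))
]
for triple in inferred:
    g.add(triple)

# The implicit fact "John Q. Smith is a person" is now explicit.
print((EX.JohnQSmith, RDF.type, EX.Person) in g)  # True
```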

Diving Deeper into Knowledge Graphs

Self-Describing

Knowledge graphs are self-describing, which means that when a client - human or machine - accesses a knowledge graph, the data and its (rich) metadata are encapsulated in one bundle or package. Fortified with both the data and the metadata, the client can process or otherwise understand the bundle without any additional, external assistance.
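
As a rough illustration of that bundling, consider a single (invented) Turtle document that carries the schema alongside the data; using rdflib again, a client can interrogate both the metadata and the data from the very same package:

```python
from rdflib import Graph, Namespace, RDFS

EX = Namespace("http://example.org/")  # hypothetical namespace

# One self-describing bundle: metadata and data travel together.
bundle = """
@prefix ex:   <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Employee a rdfs:Class ;
    rdfs:label "Employee" ;
    rdfs:subClassOf ex:Person .

ex:JohnQSmith a ex:Employee ;
    ex:worksFor ex:AcmeCorp .
"""

g = Graph()
g.parse(data=bundle, format="turtle")

# The same graph answers questions about its own vocabulary - no external schema needed.
print(g.value(EX.Employee, RDFS.label))       # Employee
print(g.value(EX.Employee, RDFS.subClassOf))  # http://example.org/Person
```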

Common Business Meaning

Knowledge graphs are schema-less in the traditional sense, which means there's one simple structure for all enterprise data represented in the knowledge graph. This results in a homogeneous information access layer through which all clients fetch from, and contribute to, the knowledge graph. One important implication of this homogeneity in structure and format is the mitigation of the costly, ever-present extract, transform, load (ETL) process so prevalent in information technology today. Of course, incoming data must still be converted to the knowledge graph form, yet over time the knowledge graph becomes the rule, not the exception. Additionally, clients build interfaces that connect to the knowledge graph, which promotes data standardization, reuse, and modularity. Imagine the cost savings alone in eliminating the endless "couplings" between data and services!
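
Here is a hedged sketch of what that homogeneous layer looks like in practice: a relational row and a document-store record, both reduced to the same kind of facts. The sources, identifiers, and property names are invented for illustration:

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()

# A "row" from a relational HR table becomes plain facts...
hr_row = {"id": "E42", "name": "John Q. Smith", "dept": "Research"}
emp = EX["employee/" + hr_row["id"]]
g.add((emp, RDF.type, EX.Employee))
g.add((emp, EX.name, Literal(hr_row["name"])))
g.add((emp, EX.department, Literal(hr_row["dept"])))

# ...and a record from a JSON document store lands in the very same structure.
doc = {"employee": "E42", "skill": "knowledge graphs"}
g.add((EX["employee/" + doc["employee"]], EX.hasSkill, Literal(doc["skill"])))

# Every client reads and writes one uniform set of facts - no per-source schema.
for s, p, o in g:
    print(s, p, o)
```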

Consistent Representation

A simple, yet powerful structure underlies the self-describing and schema-less characteristics of the knowledge graph. In the parlance, this "molecular" structure is called a fact. A fact consists of three essential elements: a subject (S), predicate (P), and object (O), where predicates are concepts defined in one or more ontologies that give shape to the data.

The combination of the three elements - SPO - is often called a triple, or statement. Mathematically, a collection of triples forms a directed, labeled graph. Conceptually, each triple is a machine-interpretable sentence. For example, {anti-inflammatory drug, reduces, pain} represents a triple. Harking back to our reasoning example, we could assert {aspirin, subclass of, anti-inflammatory drug} and our reasoner would conclude {aspirin, reduces, pain}. While this might be intuitive for humans, software computes it automatically - thus making implicit information explicit.
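
That reasoning step can be written out as a toy forward-chaining rule over plain Python tuples - an illustrative sketch, not a production reasoner:

```python
# The two stated facts, in the article's {subject, predicate, object} style.
facts = {
    ("aspirin", "subclass of", "anti-inflammatory drug"),
    ("anti-inflammatory drug", "reduces", "pain"),
}

# Rule: if A is a subclass of B, and B reduces X, then A reduces X.
inferred = {
    (a, "reduces", x)
    for (a, p1, b) in facts if p1 == "subclass of"
    for (b2, p2, x) in facts if p2 == "reduces" and b2 == b
}

facts |= inferred
print(("aspirin", "reduces", "pain") in facts)  # True - implicit made explicit
```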

Using facts (or triples) as the molecular structure for all data (and metadata) contained in the knowledge graph, one can envision how the knowledge graph provides a consistent, simple representation, much as every drop of water is built from the same H2O molecules. So, unlike the data swamp, the knowledge graph achieves the vision of the data fabric.

Automated

The expressive metadata, common structural model, and other factors described above combine to synthesize or integrate data so humans don't have to. From the user's perspective, the data is connected and organized semantically, so they no longer transact queries among multiple data services to retrieve and then combine data. With contemporary business intelligence (BI) tools, for example, users must develop elaborate ETL scripts to fetch data, model it in a way its consumers will recognize, and then load it into their analytics tools. The knowledge graph does much of this for users. In other words, users navigate and select desired portions of the knowledge graph and use their personalized views as input into their BI tools of choice. No more tedious and brittle ETL schemes over disparate datasets. Let the enterprise knowledge graph platform do that work so humans can focus on more cognitive tasks!
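
In practice, such a "personalized view" can amount to little more than a declarative query over the graph. The sketch below uses rdflib and SPARQL, with invented data standing in for real enterprise sources:

```python
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()
g.add((EX.Aspirin, RDF.type, EX.AntiInflammatoryDrug))
g.add((EX.Ibuprofen, RDF.type, EX.AntiInflammatoryDrug))
g.add((EX.Aspirin, EX.reduces, EX.Pain))
g.add((EX.Ibuprofen, EX.reduces, EX.Inflammation))

# Select a view of the graph with one declarative SPARQL query - no ETL script.
view = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?drug ?symptom
    WHERE {
        ?drug a ex:AntiInflammatoryDrug ;
              ex:reduces ?symptom .
    }
""")

# The resulting rows can be fed straight into an analytics or BI tool.
for drug, symptom in view:
    print(drug, symptom)
```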

Knowledge Graphs Connect

No person or enterprise is an island. Since the advent of data storage mechanisms, namely databases, people have collectively modeled data in an inward-oriented way. In today's world, interconnectedness and interoperability rule the day.

Knowledge graphs are built to connect or link data (and metadata). In the knowledge graph paradigm, federating projects, organizations, partners, ecosystems, and so on is a natural, low-cost activity. This helps enterprises adapt to unforeseen challenges with minimal, if any, disruption.
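
One way to picture this: because every entity is identified by a globally unique URI, combining two independently built graphs is essentially a set union of facts. The sketch below, with hypothetical organizations and identifiers, merges "our" graph with a partner's:

```python
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")               # our identifiers (hypothetical)
PARTNER = Namespace("http://partner.example.com/")  # the partner's identifiers (hypothetical)

# Our organization's graph...
ours = Graph()
ours.add((EX.ProjectX, EX.ledBy, EX.JohnQSmith))

# ...and a partner's graph, built independently but reusing our URI for John.
partner = Graph()
partner.add((EX.JohnQSmith, PARTNER.memberOf, PARTNER.Consortium))

# Federation is little more than a union of facts; shared URIs line up automatically.
combined = Graph()
for triple in ours:
    combined.add(triple)
for triple in partner:
    combined.add(triple)

print(len(combined))  # 2 facts, now navigable as one connected graph
```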

Portable and Future-Proof

Knowledge graphs described and built using open standards are portable across data storage and management systems. In other words, the graphs interoperate across graph stores and analytics tools. In fact, a knowledge graph can even be made to look like a relational database! Portability reduces vendor lock-in and increases flexibility. Knowledge graphs also future-proof data because they're self-describing - meaning future tools can introspect a dataset and make sense of it without outside help.
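
As a quick illustration of that portability, the same small graph can be written out in several open W3C serializations (shown with rdflib, which returns the serialized text directly in recent versions); any standards-compliant store can load any of them:

```python
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()
g.bind("ex", EX)
g.add((EX.Aspirin, EX.reduces, EX.Pain))

# The same facts in three open, vendor-neutral formats.
print(g.serialize(format="turtle"))  # human-friendly Turtle
print(g.serialize(format="nt"))      # line-oriented N-Triples
print(g.serialize(format="xml"))     # RDF/XML
```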

Ask Any Questions

Lastly, knowledge graphs represent all required data in a business-oriented model that is intuitive to users, managers, and other stakeholders. Uniquely, the Altair® Graph Studio™ knowledge graph solution loads knowledge graphs on demand and allows users to ask ad hoc questions of the entire knowledge graph, subject to security policies. Users can answer questions today that they didn't know they had yesterday. The "gap" between technology and business users narrows, and the time from idea to answer shrinks.

Knowledge graphs provide the foundation, or infrastructure, for empowering autonomous artificial intelligence (AI) agents and systems. To realize autonomy at scale, the underlying data infrastructure must become more homogeneous, rich, and intelligent - and this is precisely what knowledge graphs deliver.

To learn more about Graph Studio, visit https://altair.com/altair-graph-studio.