Ontolog Forum
Draft Summit Synthesis
Overview
- LLM challenges align well with ontology capabilities
- Combining the strengths of LLMs and ontologies/knowledge graphs to overcome weaknesses of each
- The Fall Series
- Discussed “hybrid systems”, provided motivation for developing them, and demonstrated applications/sandboxes based on them
- Highlighted need to keep exploring areas of collaboration, and improving both ontology and LLM development and use
- Various architectures/frameworks show different interactions between ontologies and LLMs
- Several with explicit feedback loops
Broader Thoughts
Deborah McGuinness
The Evolving Landscape: Generative AI, Ontologies, and Knowledge Graphs
- AI is changing business landscape
- Necessary to understand strengths and weaknesses of LLMs
- “AI will not replace most knowledge professionals but many knowledge professionals who do not collaborate with generative AI will be replaced”
- “Generative AI explosion provides … a unique opportunity to shine and a time to rethink our methods”
- LLMs are “usefully wrong” – providing information to help you think
Gary Marcus
No AGI (and no Trustworthy AI) without Neurosymbolic AI
- Hypothesis: Scale is all you need
- Has been funded more than any other hypothesis in AI history and made progress
- But it has failed to solve many of the hard problems: AGI, autonomous driving, common sense, bias issues, reliability, trustworthiness, ...
- Tech leaders are starting to back away from this hypothesis
- Hubert Dreyfus: Climbing ever larger trees will not get one to the moon (early 1970s)
- [Deep learning is] a better ladder, but a better ladder doesn't necessarily get you to the moon
- We still desperately need neurosymbolic AI but it won't be enough to get to AGI
- Intelligence is multi-faceted: we should not expect one-size-fits-all solutions
- Looking for a quick win is distracting us from the hard work that we actually need to do
Anatoly Levenchuk
Hybrid Reasoning, the Scope of Knowledge, and What Is Beyond Ontologies?
- A cognitive system/agent is a cognitive architecture with a collection of KGs, LLMs and other knowledge representations
- Cognitive architecture refers to both a theory about the structure of the human mind and to a computational instantiation of such a theory used in the fields of artificial intelligence (AI) and computational cognitive science (https://en.wikipedia.org/wiki/Cognitive_architecture)
- Where KGs are discriminative declarations of “what is in the world” and LLMs are generative
- Both have roles in knowledge evolution
- “Looking at LLMs as chatbots is the same as looking at early computers as calculators. We're seeing an emergence of a whole new computing paradigm, and it is very early.”
John Sowa and Arun Majumdar
Trustworthy Computation: Diagrammatic Reasoning With and About LLMs
- Large language models cannot reason themselves; they find and apply reasoning patterns from their training data
- Important to note that “thinking in language” is only one form of reasoning
- Systems developed by Permion use LLMs for summarization/synthesis
- But restrict responses based on the ontology
- Combine LLMs with a “scaffolding model” (vector-, matrix- and tensor-based) => an ontology and methods of diagrammatic reasoning based on conceptual graphs (CGs)
- Where ontology is derived/tailored to policies, rules, and specifications of the project or business
Fabian Neuhaus
Ontologies in the era of large language models – a perspective
- Argument 1: Attempts to automate ontology development are based on a misunderstanding of what ontology engineers do
- Ontology engineers create consensus
- Argument 2: There is no ontology hidden in the weights of the LLM
- LLMs are very good at navigating ambiguities and different perspectives
- But they do not resolve ambiguities, maintain logical consistency, or hold persistent ontological commitments
John Sowa
Without Ontology, LLMs are clueless
- LLMs are a powerful technology, remarkably similar to one imagined in a joke from 1900.
- Dump books in a machine, turn a crank, and expect a stream of knowledge to flow through the wires.
- The results are sometimes good and sometimes disastrous.
- LLM methods are excellent for translation, useful for search, but unreliable for generating new combinations.
- A lawyer used them to find precedents for a legal case.
- It generated an imaginary precedent and created a citation that seemed to be legitimate.
- But the opposing lawyer found that the citation was false.
- An ontology states criteria for testing the results of LLMs.
- Anything generated by LLMs is just a guess (hypothesis).
- If it's inconsistent with the ontology or with a verified database, it can be rejected as false.
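A minimal sketch of this consistency test, assuming an OWL ontology loadable with owlready2; the ontology IRI and the hypothesis callback are hypothetical placeholders:

```python
# Treat an LLM-generated statement as a hypothesis, assert it into the
# ontology, and reject it if a reasoner finds the result inconsistent.
from owlready2 import (get_ontology, sync_reasoner,
                       OwlReadyInconsistentOntologyError)

onto = get_ontology("http://example.org/legal.owl").load()  # hypothetical IRI

def consistent_with_ontology(assert_hypothesis) -> bool:
    """Assert the hypothesis (a callback that adds axioms), then reason."""
    with onto:
        assert_hypothesis(onto)      # e.g. create an individual or relation
    try:
        sync_reasoner()              # runs HermiT via owlready2
        return True
    except OwlReadyInconsistentOntologyError:
        return False                 # inconsistent => reject the LLM's guess
```

A production version would also retract the asserted axioms after a failed check, and test against verified databases as the talk suggests.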
A look across the industry
Kurt Cagle
Complementary Thinking: Language Models, Ontologies and Knowledge Graphs
- Mapping LLMs to ontologies/KGs
- Matching LLM concepts to KG instances over classes from specific vocabularies such as schema.org or NIEM
- Using a RAG (Retrieval-Augmented Generation) plug-in to communicate with an ontology/KG and add to the node-sets or control output transformation (see the sketch after this list)
- Reading Turtle, RDF/XML and JSON-LD
- Mapping ontologies/KGs to LLMs
- Using URI/IRI references in data and obtaining results with those references
- Adding KG embeddings (vector space representations) to LLM training corpus
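A minimal sketch of the reading + plug-in idea, assuming rdflib; the file name and prompt wording are illustrative, not from the talk:

```python
# Parse a KG serialization (rdflib reads Turtle, RDF/XML and JSON-LD),
# keep the IRI next to each label, and splice the pairs into the prompt
# as grounding context.
from rdflib import Graph
from rdflib.namespace import RDFS

g = Graph()
g.parse("knowledge_graph.ttl", format="turtle")   # or "xml" / "json-ld"

context_lines = [f'{s} rdfs:label "{o}"'
                 for s, _, o in g.triples((None, RDFS.label, None))]

prompt = ("Answer using only the entities below, citing their IRIs:\n"
          + "\n".join(context_lines[:50])   # truncate to fit the context window
          + "\n\nQuestion: Which instances belong to schema.org's Organization class?")
```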
Tony Seale
How Ontologies Can Unlock the Potential of Large Language Models for Business
- LLMs and ontologies form a “reinforcing feedback loop of continuous improvement”
- Using the ontology/KG to place “guardrails” on LLM outputs (sketched after this list)
- Using LLMs to aid in maintenance and extension of ontology
- Information as a continuous stream (~LLMs) or discrete chunks (~KGs)
- Analogy to System 1 (intuitive/instinctual) and System 2 (reasoning based) thinking
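A minimal sketch of the guardrail idea, not Seale's implementation: accept an LLM answer only if every entity it names resolves in the governed KG. The SPARQL endpoint URL is a hypothetical placeholder.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:3030/kg/sparql")   # hypothetical endpoint

def entity_in_kg(label: str) -> bool:
    """ASK whether any node in the KG carries this English label."""
    sparql.setQuery(f'''
        ASK {{ ?s <http://www.w3.org/2000/01/rdf-schema#label> "{label}"@en }}
    ''')
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()["boolean"]

def guardrail(llm_answer: str, named_entities: list) -> str:
    """Reject the answer if it names entities absent from the KG."""
    unknown = [e for e in named_entities if not entity_in_kg(e)]
    return llm_answer if not unknown else f"Rejected: unverified entities {unknown}"
```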
Yuan He
DeepOnto: A Python Package for Ontology Engineering with Deep Learning and Language Models
- DeepOnto
- Python package for ontology engineering with deep learning and LMs
Hamed Babaei Giglou
LLMs4OL: Large Language Models for Ontology Learning
- Results:
- We explored LLMs' potential for OL through our introduced conceptual framework, LLMs4OL.
- Extensive experiments on 11 LLMs across three OL tasks demonstrate the paradigm’s proof of concept.
- The empirical results show that foundation LLMs are not sufficiently suitable for ontology construction, which entails a high degree of reasoning skill and domain expertise.
- When effectively fine-tuned, however, LLMs just might work as suitable assistants for ontology construction, alleviating the knowledge acquisition bottleneck.
- A codebase with detailed results is shared: https://github.com/HamedBabaei/LLMs4OL
- Future:
- Explore more recent LLMs.
- Incorporate more ontologies into the study.
- Build a benchmark dataset that covers more domains.
- Optimize the three LLMs4OL tasks.
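A minimal cloze-style sketch of the term typing task (one of the three LLMs4OL tasks), using a masked LM via Hugging Face transformers; the template is illustrative, not one of the paper's actual prompts (see the repository above for those):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def term_type(term: str) -> str:
    """Ask the masked LM for the ontological type of a term."""
    candidates = fill(f"{term} is a kind of [MASK].")
    return candidates[0]["token_str"]   # highest-scoring candidate type

print(term_type("aspirin"))             # e.g. "drug" (model-dependent)
```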
Demos of hybrid systems
Evren Sirin
Stardog Voicebox: LLM-Powered Question Answering with Knowledge Graphs
- Stardog Voicebox combines LLM and graph database technology to:
- Generate an ontology from a natural language description
- Translate natural language queries into SPARQL
- Provide context for decisions and debug/repair queries
- Built on:
- The open-source foundation model MPT-30B
- Fine-tuned with ~20K SPARQL queries
- Vector embedding and search via MiniLM-L6-v2 language model
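A generic sketch of the natural-language-to-SPARQL step, not Stardog's actual pipeline: embed the question to retrieve the most relevant schema snippets, then build a prompt for a fine-tuned completion model. The schema triples and prompt format are assumptions; the embedding model follows the talk's mention of MiniLM-L6-v2.

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

schema_snippets = [                       # hypothetical schema fragments
    "ex:Supplier ex:ships ex:Part .",
    "ex:Part ex:usedAt ex:Plant .",
    "ex:Plant ex:locatedIn ex:State .",
]
question = "Which suppliers ship parts to plants in Texas?"

# Rank schema fragments by cosine similarity to the question.
scores = util.cos_sim(embedder.encode(question),
                      embedder.encode(schema_snippets))[0]
top = [snip for _, snip in
       sorted(zip(scores.tolist(), schema_snippets), reverse=True)[:2]]

prompt = ("Translate the question into SPARQL using only this schema:\n"
          + "\n".join(top) + f"\nQuestion: {question}\nSPARQL:")
# A fine-tuned model (the talk used MPT-30B tuned on ~20K SPARQL queries)
# would complete `prompt` with the query text.
```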
Prasad Yalamanchik
Harvest Knowledge From Language - Harness the power of Large Language Models and Semantic Technology
- TextDistil
- Inputs: text documents; Outputs: N-Quads files and JSON
- Models trained on domain-specific variables, with training data labeled using a taxonomy
- Ontology for organization/semantics (human defined)
- Natural language queries parsed into ontology concepts and used to generate KG queries
- Triples returned with provenance from ingested documents
- LLM used to summarize response
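A generic sketch of the retrieval-with-provenance step (not TextDistil's code), assuming rdflib and an illustrative N-Quads file where each named graph identifies the source document:

```python
from rdflib import ConjunctiveGraph, URIRef

ds = ConjunctiveGraph()
ds.parse("harvested.nq", format="nquads")      # hypothetical N-Quads output

def retrieve(concept: URIRef):
    """Triples mentioning the concept, paired with their source graph."""
    return [(s, p, o, ctx.identifier)
            for s, p, o, ctx in ds.quads((concept, None, None, None))]

facts = retrieve(URIRef("http://example.org/ontology#Acme"))  # hypothetical IRI
summary_prompt = "Summarize these facts, citing sources:\n" + "\n".join(
    f"{s} {p} {o} (source: {src})" for s, p, o, src in facts)
# summary_prompt is then sent to the LLM for the natural language answer.
```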
Andrea Westerinen
Populating Knowledge Graphs: The Confluence of Ontology and Large Language Models
- Overview of open-source tooling to parse news articles (Deep Narrative Analysis, DNA)
- Create knowledge stores with data from text stored in RDF graphs
- Enabling aggregation of textual information within and across documents
- To efficiently compare and analyze collections of text to understand patterns, trends, …
- Prompts sent to OpenAI chat completion API for:
- Narrative analysis
- Rhetorical devices and viewpoint interpretations
- Sentence analysis
- Linguistics (tense, voice, errors, …), rhetorical devices and mapping to ontology
- LLM JSON responses (already mapped to the ontology) used to generate RDF
- Which is stored in a graph database
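A minimal sketch of the prompt-to-RDF step, assuming the openai Python client and rdflib; the prompt, JSON keys, model name, and namespace are illustrative, not DNA's actual ones:

```python
import json
from openai import OpenAI
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/dna#")       # hypothetical namespace
client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o-mini",                        # illustrative model choice
    messages=[{"role": "user",
               "content": ('Analyze the sentence "Prices soared." '
                           'Reply as JSON: {"tense": ..., "rhetorical_device": ...}')}],
    response_format={"type": "json_object"},
)
analysis = json.loads(resp.choices[0].message.content)

# Map the (ontology-aligned) JSON keys to RDF properties.
g = Graph()
sentence = EX["sentence1"]
g.add((sentence, EX.tense, Literal(analysis["tense"])))
g.add((sentence, EX.rhetoricalDevice, Literal(analysis["rhetorical_device"])))
g.serialize("dna_output.ttl", format="turtle")  # then loaded into the graph database
```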
Deborah McGuinness
- Applications of LLMs at RPI
- Collaborative KG generation by leveraging LLMs for refinement and population (value restrictions and instances) of an existing ontology, in partnership with a human
- Enhancing wine and cheese ontology
- But could also provide concepts that are a starting point for a new ontology, for human consideration
- LLM/KG Fact Checker (ChatBS) “sandbox” with questions submitted (multiple times) to OpenAI completion API and entity linking to Wikidata for validation
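A minimal sketch of the entity-linking step of such a fact checker, using Wikidata's public wbsearchentities API; the accept/reject policy (first hit or nothing) is an assumption:

```python
import requests

def link_to_wikidata(label: str):
    """Return the first matching Wikidata QID for a label, or None."""
    r = requests.get("https://www.wikidata.org/w/api.php", params={
        "action": "wbsearchentities", "search": label,
        "language": "en", "format": "json",
    }, timeout=10)
    hits = r.json().get("search", [])
    return hits[0]["id"] if hits else None

# Entities named in an LLM answer that fail to link are flagged as suspect.
print(link_to_wikidata("Douglas Adams"))   # -> "Q42"
```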
Till Mossakowski
Modular design patterns for neural-symbolic integration: refinement and combination
- Neural networks can extend ontologies of structured objects: from neuro to symbolic
- Ontology pre-training can improve transformer performance: from symbolic to neuro
- We can beat purely symbolic and purely neural baselines
- Design patterns as systematic building blocks => towards a theory of neuro-symbolic engineering
- Future work: Novel neural embeddings for ontologies
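A minimal sketch of the "from symbolic to neuro" direction above: verbalize rdfs:subClassOf axioms as sentences to use as additional pre-training text for a transformer. The template and file names are assumptions, not the patterns from the talk.

```python
from rdflib import Graph
from rdflib.namespace import RDFS

g = Graph()
g.parse("ontology.ttl", format="turtle")       # hypothetical ontology file

def label(node):
    """Prefer rdfs:label; fall back to the IRI fragment."""
    return g.value(node, RDFS.label) or str(node).split("#")[-1]

# "Every X is a Y" sentences from subsumption axioms
sentences = [f"Every {label(s)} is a {label(o)}."
             for s, _, o in g.triples((None, RDFS.subClassOf, None))]

with open("ontology_pretrain.txt", "w") as f:
    f.write("\n".join(sentences))              # fed into a standard MLM pre-training run
```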
Markus J. Buehler
Accelerating Scientific Discovery with Generative Knowledge Extraction, Graph-Based Representation, and Multimodal Intelligent Graph Reasoning
- Navigating generated knowledge graphs can result in new scientific insights
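A minimal sketch of this kind of graph navigation, with an illustrative toy graph in networkx: the shortest path between two distant concepts surfaces intermediate concepts worth examining for a new hypothesis. The nodes and edges are illustrative, not from the talk.

```python
import networkx as nx

kg = nx.Graph()
kg.add_edges_from([
    ("silk", "beta-sheet"), ("beta-sheet", "hierarchical structure"),
    ("hierarchical structure", "fracture toughness"),
    ("fracture toughness", "composite design"),
])

path = nx.shortest_path(kg, "silk", "composite design")
print(" -> ".join(path))   # the chain of concepts to examine for insight
```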