OntologySummit2024/Synthesis

Draft Summit Synthesis

A proposed working outline for a synthesis on the way to a Communiqué:

1. Intro

Neuro-symbolic AI (NeSy) is briefly defined. The approach is not new, but recent advances in machine learning using neural networks suggest a new, hybrid approach to artificial intelligence that combines the strengths of two established approaches.

1.1 The full AI view uses Figure 1, a high-level model depicting three AI components, taken from the 2017 Ontology Summit. It shows some limits of a naive view of the new field.

1.2 Introduce AGI and factors such as emergence

1.3 GoFAI section

The older and arguably more mature symbolic AI, or Good Old-Fashioned AI (GOFAI), approach uses symbols and rules to represent knowledge and reasoning. Compared with current neural architectures, symbolic approaches offer interpretability and more certainty while still allowing flexibility.

  • The idea that AI is changing the business landscape

1.4 Limitations and issues with connectionist approaches

However, there are limitations, such as those noted by machine learning theorist Leslie Valiant (2003), who pointed to a key challenge for what he called intelligent cognitive behavior; this makes explicit some of what was implied in Figure 1’s three-part diagram. Summary of the Marcus and Sowa points. Also Deborah McGuinness on The Evolving Landscape: Generative AI, Ontologies, and Knowledge Graphs:

    • Necessary to understand strengths and weaknesses of LLMs
    • “AI will not replace most knowledge professionals but many knowledge professionals who do not collaborate with generative AI will be replaced”

2. A simpler question is how each approach might help the other without system integration. Currently we can be sure that LLMs can help in finding relevant, published resources in various text forms.

So one question is: can they automatically extract and structure something useful from what they can find and process in texts?

3. Why a Neuro-Symbolic AI Hybrid? (Suggest using a SWOT approach to the writing, and using the AI triangle of learning, reasoning, and knowledge throughout.)

3.1 Hybrid Architectures

There are several possible neuro-symbolic architecture types, as outlined by Kautz.

3.2 Some key benefits of Neuro-Symbolic AI

4. Discussion of potential applications: Markus J. Buehler’s talk on materials science, “Accelerating Scientific Discovery with Generative Knowledge Extraction, Graph-Based Representation, and Multimodal Intelligent Graph Reasoning”, would provide material, as would the others listed in the summary and in Demos of hybrid systems.

5. Risks and Issues

6. Conclusions


Overview

  • LLM challenges align well with Ontology capabilities
    • Combining the strengths of LLMs and ontologies/knowledge graphs to overcome weaknesses of each
  • The Fall Series
    • Discussed “hybrid systems”, provided motivation for developing them, and demonstrated applications/sandboxes based on them
    • Highlighted need to keep exploring areas of collaboration, and improving both ontology and LLM development and use
    • Various architectures/frameworks show different interactions between ontologies and LLMs
    • Several with explicit feedback loops

Broader Thoughts

Deborah McGuinness

The Evolving Landscape: Generative AI, Ontologies, and Knowledge Graphs

  • AI is changing the business landscape
    • Necessary to understand strengths and weaknesses of LLMs
    • “AI will not replace most knowledge professionals but many knowledge professionals who do not collaborate with generative AI will be replaced”
    • “Generative AI explosion provides … a unique opportunity to shine and a time to rethink our methods”
  • LLMs are “usefully wrong” – providing information to help you think

Gary Marcus

No AGI (and no Trustworthy AI) without Neurosymbolic AI

  • Hypothesis: Scale is all you need
    • Has been funded more than any other hypothesis in AI history and made progress
    • But has failed to solve very many problems: AGI, autonomous driving, common sense, bias issues, reliability, trustworthiness, ...
    • Tech leaders are starting to back away from this hypothesis
    • Hubert Dreyfus: Climbing ever larger trees will not get one to the moon (early 1970s)
    • [Deep learning is] a better ladder, but a better ladder doesn't necessarily get you to the moon
  • We still desperately need neurosymbolic AI but it won't be enough to get to AGI
    • Intelligence is multi-faceted: we should not expect one-size-fits-all solutions
    • Looking for a quick win is distracting us from the hard work that we actually need to do

Anatoly Levenchuk

Hybrid Reasoning, the Scope of Knowledge, and What Is Beyond Ontologies?

  • A cognitive system/agent is a cognitive architecture with a collection of KGs, LLMs and other knowledge representations
    • Cognitive architecture refers to both a theory about the structure of the human mind and to a computational instantiation of such a theory used in the fields of artificial intelligence (AI) and computational cognitive science (https://en.wikipedia.org/wiki/Cognitive_architecture)
  • Where KGs are discriminative declarations of “what is in the world” and LLMs are generative
  • Both have roles in knowledge evolution
  • “Looking at LLMs as chatbots is the same as looking at early computers as calculators. We're seeing an emergence of a whole new computing paradigm, and it is very early.”

John Sowa and Arun Majumdar

Trustworthy Computation: Diagrammatic Reasoning With and About LLMs

  • Large language models cannot do reasoning, but find and apply reasoning patterns from training data
  • Important to note that “thinking in language” is only one form of reasoning
  • Systems developed by Permion use LLMs for summarization/synthesis
    • But restrict responses based on the ontology
  • Combine LLMs with a “scaffolding model” (vector, matrix and tensor-based) => ontology and methods of diagrammatic reasoning based on conceptual graphs (CGs)
    • Where ontology is derived/tailored to policies, rules, and specifications of the project or business

Fabian Neuhaus

Ontologies in the era of large language models – a perspective

  • Argument 1: Attempts to automate ontology development are based on a misunderstanding of what ontology engineers do
    • Ontology engineers create consensus
  • Argument 2: There is no ontology hidden in the weights of the LLM
    • Very good at navigating ambiguities and different perspectives
    • But it does not resolve ambiguities, and it lacks logical consistency and persistent ontological commitments

John Sowa

Without Ontology, LLMs are clueless

  • LLMs are a powerful technology, remarkably similar to a joke from 1900.
    • Dump books in a machine, turn a crank, and expect a stream of knowledge to flow through the wires.
  • The results are sometimes good and sometimes disastrous.
    • LLM methods are excellent for translation, useful for search, but unreliable for generating new combinations.
    • A lawyer used them to find precedents for a legal case.
    • It generated an imaginary precedent and created a citation that seemed to be legitimate.
    • But the opposing lawyer found that the citation was false.
  • Ontology states criteria for testing the results of LLMs.
    • Anything generated by LLMs is just a guess (hypothesis).
    • If it's inconsistent with the ontology or with a verified database, it can be rejected as false.
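
A minimal sketch of this filtering idea in Python, assuming rdflib and a verified reference graph in a hypothetical file verified_facts.ttl; an LLM-generated claim is accepted only if the verified graph already contains it.

```python
from rdflib import Graph, URIRef

# Load a verified reference graph (file name is hypothetical).
verified = Graph()
verified.parse("verified_facts.ttl", format="turtle")

def is_supported(subject, predicate, obj) -> bool:
    """Treat an LLM-generated claim as a guess; accept it only if the
    verified graph contains the corresponding triple."""
    return (subject, predicate, obj) in verified

# Example: a citation proposed by an LLM (IRIs are illustrative).
EX = "http://example.org/"
claim = (URIRef(EX + "Case123"), URIRef(EX + "citesPrecedent"), URIRef(EX + "Case456"))
if not is_supported(*claim):
    print("Rejected: not supported by the verified database.")
```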

A look across the industry

Kurt Cagle

Complementary Thinking: Language Models, Ontologies and Knowledge Graphs

  • Mapping LLMs to ontologies/KGs
    • Matching LLM concepts to KG instances over specific classes such as schema.org or NIEM
    • Using a RAG (Retrieval-Augmented Generation) plug-in to communicate with an ontology/KG and add to the node-sets or control output transformation (see the sketch after this list)
    • Reading Turtle, RDF-XML and JSON-LD
  • Mapping ontologies/KGs to LLMs
    • Using URI/IRI references in data and obtaining results with those references
    • Adding KG embeddings (vector space representations) to LLM training corpus
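
A minimal sketch of such a RAG plug-in, assuming rdflib for reading Turtle and a stubbed-out LLM call; the facts retrieved for an entity are prepended to the prompt as grounding context. All file names and IRIs here are illustrative.

```python
from rdflib import Graph

kg = Graph()
kg.parse("domain_kg.ttl", format="turtle")  # hypothetical Turtle file

def retrieve_context(entity_iri: str, limit: int = 10) -> str:
    """Fetch facts about an entity from the KG to ground the LLM's answer."""
    rows = kg.query(
        "SELECT ?p ?o WHERE { <%s> ?p ?o } LIMIT %d" % (entity_iri, limit)
    )
    return "\n".join(f"{p} {o}" for p, o in rows)

def ask_llm(prompt: str) -> str:
    """Stub for a chat-completion call; replace with a real client."""
    raise NotImplementedError

def answer(question: str, entity_iri: str) -> str:
    context = retrieve_context(entity_iri)
    return ask_llm(f"Answer using only these facts:\n{context}\n\nQuestion: {question}")
```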

Tony Seale

How Ontologies Can Unlock the Potential of Large Language Models for Business

  • LLM and ontology “reinforcing feedback loop of continuous improvement”
    • Using ontology/KG to place “guardrails” on LLM outputs (sketched after this list)
    • Using LLMs to aid in maintenance and extension of ontology
  • Information as a continuous stream (~LLMs) or discrete chunks (~KGs)
    • Analogy to System 1 (intuitive/instinctual) and System 2 (reasoning based) thinking
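
A minimal sketch of the guardrail direction, assuming an RDFS ontology loaded with rdflib; an LLM-proposed triple passes only if its predicate is declared and the subject’s type matches the declared rdfs:domain. This is one simple check, not a full validator (SHACL would be the heavier-weight option).

```python
from rdflib import Graph, URIRef
from rdflib.namespace import RDF, RDFS

onto = Graph()
onto.parse("business_ontology.ttl", format="turtle")  # hypothetical file

def passes_guardrails(data: Graph, s: URIRef, p: URIRef, o) -> bool:
    """Reject LLM-proposed triples whose predicate is undeclared or whose
    subject is not typed with the predicate's declared rdfs:domain."""
    if (p, None, None) not in onto:
        return False  # predicate unknown to the ontology
    for domain in onto.objects(p, RDFS.domain):
        if (s, RDF.type, domain) not in data:
            return False  # subject violates the declared domain
    return True
```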

Yuan He

DeepOnto: A Python Package for Ontology Engineering with Deep Learning and Language Models

  • DeepOnto: a Python package for ontology engineering with deep learning and language models

Hamed Babaei Giglou

LLMs4OL: Large Language Models for Ontology Learning

  • Results:
    • We explored LLMs’ potential for OL through our introduced conceptual framework, LLMs4OL.
    • Extensive experiments on 11 LLMs across three OL tasks demonstrate the paradigm’s proof of concept (one task, term typing, is sketched after this section).
    • The empirical results show that foundational LLMs are not sufficiently suitable for ontology construction, which entails a high degree of reasoning skill and domain expertise.
    • When effectively fine-tuned, however, LLMs just might work as suitable assistants for ontology construction, alleviating the knowledge acquisition bottleneck.
    • A codebase with detailed results is shared: https://github.com/HamedBabaei/LLMs4OL
  • Future:
    • Still, we need to explore more recent LLMs.
    • Incorporate more ontologies in this study.
    • Build a benchmark dataset that considers more domains.
    • Optimize three LLMs4OL tasks.
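
As a concrete illustration, a minimal sketch of a zero-shot prompt for one LLMs4OL task, term typing, using the OpenAI chat API as a stand-in for the models evaluated in the paper; the template is illustrative, and the actual prompts are in the linked codebase.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def term_typing(term: str, candidate_types: list[str]) -> str:
    """Zero-shot term typing: ask which type a term belongs to."""
    prompt = (
        f"What is the type of the term '{term}'? "
        f"Answer with one of: {', '.join(candidate_types)}."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in for the 11 LLMs studied
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

print(term_typing("aspirin", ["disease", "drug", "symptom", "organism"]))
```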

Demos of hybrid systems

Evren Sirin

Stardog Voicebox: LLM-Powered Question Answering with Knowledge Graphs

  • Stardog Voicebox combines LLM and graph database technology to:
    • Take a description of an ontology and create it
    • Turn a natural language query into SPARQL (see the sketch after this list)
    • Provide context for decisions and debug/repair queries
  • Built on:
    • Open-source foundational model, MPT-30B
    • Fine-tuned with ~20K SPARQL queries
    • Vector embedding and search via MiniLM-L6-v2 language model
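
A minimal sketch of the natural-language-to-SPARQL step, using a generic chat-completion client rather than Voicebox’s actual fine-tuned MPT-30B; the schema hint and model name are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # stand-in; Voicebox uses its own fine-tuned MPT-30B

SCHEMA_HINT = "Classes: :Customer, :Order. Properties: :placedBy, :orderDate."  # illustrative

def nl_to_sparql(question: str) -> str:
    """Ask the model to translate a question into a SPARQL query."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system",
             "content": "Translate the user's question into SPARQL over this schema:\n"
                        + SCHEMA_HINT + "\nReturn only the query."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(nl_to_sparql("How many orders were placed last month?"))
```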

Prasad Yalamanchik

Harvest Knowledge From Language - Harness the power of Large Language Models and Semantic Technology

  • TextDistil
    • Inputs – text documents; Outputs – NQuad files and JSON
    • Models trained on domain-specific variables, and training data labeled using taxonomy
    • Ontology for organization/semantics (human defined)
    • Query in NL parsed to ontology concepts and used to generate query to KG
    • Triples returned with provenance from ingested documents (see the sketch after this list)
    • LLM used to summarize response
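
A minimal sketch of consuming NQuad output so that provenance survives, assuming rdflib’s Dataset and that each named graph corresponds to a source document (file name is hypothetical; this is not TextDistil’s actual code).

```python
from rdflib import Dataset

ds = Dataset()
ds.parse("textdistil_output.nq", format="nquads")  # hypothetical NQuad file

def triples_with_provenance(subject_iri):
    """Yield (predicate, object, source graph) so each fact keeps a link
    back to the document it was extracted from."""
    for s, p, o, g in ds.quads((subject_iri, None, None, None)):
        yield p, o, g
```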

Andrea Westerinen

Populating Knowledge Graphs: The Confluence of Ontology and Large Language Models

  • Overview of open-source tooling to parse news articles (Deep Narrative Analysis, DNA)
    • Create knowledge stores with data from text stored in RDF graphs
    • Enabling aggregation of textual information within and across documents
    • To efficiently compare and analyze collections of text to understand patterns, trends, …
  • Prompts sent to OpenAI chat completion API for:
    • Narrative analysis
    • Rhetorical devices and viewpoint interpretations
    • Sentence analysis
    • Linguistics (tense, voice, errors, …), rhetorical devices and mapping to ontology
  • LLM JSON responses (already mapped to the ontology) used to generate RDF
    • Which is stored in graph database
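
A minimal sketch of the JSON-to-RDF step, assuming the LLM’s JSON is already keyed to ontology terms; the namespace, JSON shape, and class names are illustrative, not DNA’s actual ones.

```python
import json
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/dna#")  # illustrative namespace

def json_to_rdf(llm_json: str) -> Graph:
    """Turn an LLM JSON response (keyed to ontology terms) into RDF triples."""
    g = Graph()
    for event in json.loads(llm_json)["events"]:
        node = EX[event["id"]]
        g.add((node, RDF.type, EX[event["class"]]))
        g.add((node, EX.sentence, Literal(event["text"])))
    return g  # would then be stored in the graph database

sample = '{"events": [{"id": "e1", "class": "Acquisition", "text": "X bought Y."}]}'
print(json_to_rdf(sample).serialize(format="turtle"))
```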

Deborah McGuinness

  • Applications of LLMs at RPI
    • Collaborative KG generation by leveraging LLMs for refinement and population (value restrictions and instances) of an existing ontology, in partnership with a human
      • Enhancing wine and cheese ontology
      • But could also provide concepts that are a starting point for a new ontology, for human consideration
    • LLM/KG Fact Checker (ChatBS) “sandbox” with questions submitted (multiple times) to OpenAI completion API and entity linking to Wikidata for validation
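
A minimal sketch of the entity-linking step, using Wikidata’s public wbsearchentities API; this shows the pattern, not the actual ChatBS code.

```python
import requests

def link_to_wikidata(label: str):
    """Look up a label in Wikidata and return the top entity ID, or None."""
    resp = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={"action": "wbsearchentities", "search": label,
                "language": "en", "format": "json"},
        timeout=10,
    )
    results = resp.json().get("search", [])
    return results[0]["id"] if results else None

# Entities mentioned in an LLM answer that fail to link, or link to an
# unexpected entity, can be flagged for review.
print(link_to_wikidata("Camembert"))
```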

Till Mossakowski

Modular design patterns for neural-symbolic integration: refinement and combination

  • Neural networks can extend ontologies of structured objects: from neuro to symbolic
  • Ontology pre-training can improve transformer performance: from symbolic to neuro (see the sketch after this list)
  • We can beat purely symbolic and purely neural baselines
  • Design patterns as systematic building blocks => towards a theory of neuro-symbolic engineering
  • Future work: Novel neural embeddings for ontologies
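
One way to realize the “from symbolic to neuro” direction is to verbalize ontology axioms into sentences a transformer can be pre-trained on. A minimal sketch, assuming rdflib and label-based rendering of rdfs:subClassOf axioms (not the authors’ actual pipeline).

```python
from rdflib import Graph
from rdflib.namespace import RDFS

onto = Graph()
onto.parse("ontology.ttl", format="turtle")  # hypothetical ontology file

def verbalize_subclass_axioms(g: Graph):
    """Render 'A rdfs:subClassOf B' axioms as pre-training sentences."""
    for sub, sup in g.subject_objects(RDFS.subClassOf):
        sub_label, sup_label = g.value(sub, RDFS.label), g.value(sup, RDFS.label)
        if sub_label and sup_label:
            yield f"Every {sub_label} is a {sup_label}."

corpus = list(verbalize_subclass_axioms(onto))
```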

Markus J. Buehler

Accelerating Scientific Discovery with Generative Knowledge Extraction, Graph-Based Representation, and Multimodal Intelligent Graph Reasoning

  • Navigating generated knowledge graphs can result in new scientific insights