Ontolog Forum
Session | Track C Session 2 |
---|---|
Duration | 2 hours |
Date/Time | Apr 19 2017 16:30 GMT |
9:30am PDT/12:30pm EDT | |
5:30pm BST/6:30pm CEST | |
Convener | DonnaFritzsche and RamSriram |
Ontology Summit 2017 Track C Session 2
Ontologies and Reasoning
Video Teleconference: https://bluejeans.com/768423137
Meeting ID: 768423137
Chat room: http://bit.ly/2lRq4h5
Please use the chatroom above. Do not use the video teleconference chat, which is only for communicating with the moderator.
When you use the Video Conference URL above, you will be given the choice of using the computer audio or using your own telephone. Some attendees had difficulties when using the computer audio choice. If this happens to you, please leave the meeting and reenter it using the telephone choice. You will be given a telephone number to call along with an access code.
Agenda
- Introduction by Donna Fritzsche and Ram D. Sriram (Slides)
- 12:30 – 1pm
- Title: Combining Statistics and Semantics to Turn Data into Knowledge (Slides)
- Speaker: Dr. Lise Getoor
- Abstract: Addressing inherent uncertainty and exploiting structure are fundamental to turning data into knowledge. Statistical relational learning (SRL) builds on principles from probability theory and statistics to address uncertainty while incorporating tools from logic to represent structure. In this talk I will overview our recent work on probabilistic soft logic (PSL), an SRL framework for collective, probabilistic reasoning in relational domains. PSL is able to reason holistically about both entity attributes and relationships among the entities, along with ontological constraints. The underlying mathematical framework supports extremely efficient inference. Our recent results show that by building on state-of-the-art optimization methods in a distributed implementation, we can solve large-scale knowledge graph extraction problems with millions of random variables orders of magnitude faster than existing approaches.
- Bio: Lise Getoor is a professor in the Computer Science Department at the University of California, Santa Cruz, and an adjunct professor in the Computer Science Department at the University of Maryland, College Park. Her primary research interests are in machine learning and reasoning with uncertainty, applied to graphs and structured data. She also works in data integration, social network analysis and visual analytics. She has multiple best paper awards, an NSF Career Award, and is an Association for the Advancement of Artificial Intelligence (AAAI) Fellow. She has edited a book on statistical relational learning that is a main reference in this domain. She has published many highly cited papers in academic journals and conference proceedings. She has also served as action editor for the Machine Learning Journal, JAIR associate editor, and TKDD associate editor. She is a board member of the International Machine Learning Society, has been a member of the AAAI Executive Council, was PC co-chair of ICML 2011, and has served as senior PC member for conferences including AAAI, ICML, IJCAI, ISWC, KDD, SIGMOD, UAI, VLDB, WSDM and WWW.
- 1pm – 1:30 pm
- Title: Reasoning about Scientific Knowledge with Workflow Constraints: Towards Automated Discovery from Data Repositories (Slides)
- Speaker: Dr. Yolanda Gil
- Abstract: The automation of important aspects of scientific data analysis would significantly accelerate the pace of science and innovation. Although there has been a lot of work done towards that automation, the hypothesize-test-evaluate discovery cycle is still largely carried out by hand by researchers. A great challenge is capturing a wide range of knowledge involved in scientific discovery processes. In this talk, I will describe our ongoing research on capturing scientific knowledge about data and analytic processes to assist scientists in analyzing data systematically and efficiently while providing customized explanations of their findings. I will present our representations of hypotheses, their provenance, their evolution, and the methods to test and reassess them. Our representations combine ontologies and metadata together with constraints and workflows. These representations are used in the DISK framework for automated discovery from data repositories, which tests user-provided hypotheses using expert-grade data analysis strategies and reassesses hypotheses when more data becomes available. We have used DISK to reproduce the findings of a seminal article in cancer multi-omics. I will discuss open avenues of research in intelligent systems for scientific discovery.
- Bio: Dr. Yolanda Gil is Director of Knowledge Technologies and Associate Division Director at the Information Sciences Institute of the University of Southern California, and Research Professor in the Computer Science Department. She received her M.S. and Ph.D. degrees in Computer Science from Carnegie Mellon University, with a focus on artificial intelligence. Her research is on intelligent interfaces for knowledge capture, which she investigates in a variety of projects concerning knowledge-based planning and problem solving, information analysis and assessment of trust, semantic annotation and metadata, and community-wide development of knowledge bases. In recent years, Dr. Gil has collaborated with scientists in different domains on semantic workflows, metadata capture, social knowledge collection, and computer-mediated collaboration. She is a Fellow of the Association for Computing Machinery (ACM), and Past Chair of its Special Interest Group in Artificial Intelligence (SIGAI). She is also a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), and was elected as its 24th President in 2016.
- 1:30pm – 2:00pm
- Title: Applications of Ontologies to Biologically Inspired Design (Slides)
- Speaker: Dr. Spencer Rugaber
- Abstract: Biologically inspired design (BID) is a methodology for designing engineering systems by using analogies from the natural world. Given that knowledge of biological systems is normally provided in the form of textual documents, the question becomes how to extract design knowledge about biological processes from scientific articles. Among the difficulties that arise is relating the vocabulary of the engineer needing ideas with that of the biologist authoring the documents. Ontologies can play an important role in addressing this problem. In this talk, we present IBID, a semi-automated approach for extracting models of biological systems from documents describing them. IBID's use of ontologies and the reasoning approaches it uses, subsumption and analogy, are discussed as well as applications of the IBID approach to other domains.
- Bio: Dr. Spencer Rugaber is a faculty member in the College of Computing at the Georgia Institute of Technology. His research interests are in the area of Software Engineering, specifically reverse engineering and program comprehension, software evolution and maintenance, and software design. Dr. Rugaber has served as Program Director for the Software Engineering and Languages Program at the U.S. National Science Foundation and as Vice-Chairman of the IEEE Technical Committee on Reverse Engineering.
Attendees
- Alan Rector
- Alex Shkotin
- Barry Nouwt
- Benjamin Grosof
- Bill DeSmedt
- Bobbin Teegarden
- Bob Schloss
- Christi Kapp
- Christof Hasse
- David Newman
- Debora Lina Ciriaco
- Donna Fritzsche
- Eric Scott
- Evan Wallace
- Gary Berg-Cross
- Gavin Matthews
- Jae-wook Ahn
- JongHo Shin
- Jose Parente de Oliveira
- Ken Baclawski
- Lavern Pritchard
- Lise Getoor
- Lynne Frederickson
- Maria Chang
- Mark Underwood
- Max Petrenko
- Michael van Bekkum
- Mike Bennett
- Mike Bobak
- Nancy Wiegand
- Ognyan Kulev
- Ram D. Sriram
- Ravi Sharma
- Rebecca Tauber
- Rob Hausam
- Russell Reinsch
- Russ Reinsch
- Spencer Rugaber
- Terry Longstreth
- Todd Schneider
- Torsten Hahmann
- Valerie Charron
- Yolanda Gil
Proceedings
Donna Fritzsche: resource: http://psl.linqs.org/
Donna Fritzsche: Probabilistic Soft Logic tools
TerryLongstreth: Is there a downloadable version of this talk (Dr. Lise Getoor)?
KenBaclawski: @TerryLongstreth: The video recording of the meeting will be available and can be downloaded.
ravisharma: did we lose the sound?
Donna Fritzsche: welcome ravi!
Donna Fritzsche: no
TerryLongstreth: Sorry Ken, I meant the slides. I wanted to investigate the cs.umd url on one of the early pages.
Donna Fritzsche: @ravi - do you have sound back?
ravisharma: yes had to redial
ravisharma: donna thanks
Donna Fritzsche: @ravi - I had trouble last week, I used the downloadable app today and it is better.
Donna Fritzsche: Thank-you Lise, very interesting talk!
MikeBennett: What are Omics?
ravisharma: I guess multiple items like genomics!
David Newman: I do not see any slides being presented.
Maria Chang: yes, genomics, proteomics, etc. I think that term is used like the term "ism" but I'm not sure.
MikeBennett: @Maria @Ravi thanks - she said something just after I posted my question that made me wonder if that's what was meant, but it's good to be clear.
ravisharma: Yolanda - what are current efforts at aligning reasoning process and consensus on at least digital capture of what scientists call knowledge?
ravisharma: your reference list was a bit dated or at least missed out deeper learning algorithms for search of repositories at Worldwidescience.org at DOE
ravisharma: Yolanda - your work shows perhaps that there is a need for upper ontology of reasoning, hypothesis leading to knowledge (KR) at least!
Ram D. Sriram: In my opinion OMICS deals with understanding -- in detail -- the various mechanisms at different levels of abstraction, in a particular domain. I think the OMICS term started with the biology field and has spread to other fields. For example, MATIOMICS (material genome) and SOCIOMICS (Social Network Analysis). Atul Butte, originally at Stanford, wrote an interesting article on this topic.
Maria Chang: @Ram interesting
ravisharma: Yolanda - your hypothesis and workflow shown is a bit deeper than the Deep science repository search engines of doe BUT IT WOULD BE NICE TO SEE HOW THESE REASONING METHODS - YOURS AND doeS INTERACT OR EVEN POSSIBLY INTEROPERATE WITH EACH OTHER TO FIND MORE ACCURATE matches not only at concepts but at exact data sets or knowledge components. Sorry for Capital TYP mistake.
Russ Reinsch: @yolanda - What standards were developed from the OPM
Mark Underwood: Hi Russ
KenBaclawski: @TerryLongstreth: The slides are now available at http://bit.ly/2oPgw7W
ravisharma: how do we get your slides also Yolanda?
TerryLongstreth: @Ken; thanks
Russ Reinsch: Hi Mark
ravisharma: Yolanda - how do you handle interdomain concepts and terms and entities overlap e.g. diagnosis triage using medline etc.
ravisharma: i can be heard
Russ Reinsch: @yolanda - What standards were developed from the OPM
Mark Underwood: Great topic on eScience; I wrote a piece on this general topic for National Geographic via Daily Beast. Wish I'd seen this reference first. It seems to be under-researched. Standards for open data and experimental framework conventions are key. Progress may be, as this talk suggests, domain-specific
Mark Underwood: Yolanda - is there a role for workflow standards like BPMN ?
Gary Berg-Cross: @ken must have been session 2 of track B
EvanWallace: How applicable is Yolanda's work to BPM style workflow (e.g. as represented in BPMN or XPDL)?
MikeBennett: Calling something "ibid" is going to make references interesting...
Gary Berg-Cross: I meant session 1 in March of track B
David Newman: Would like to reach out to Lise Getoor. Is it possible to share her email?
Ram D. Sriram: @david: Check out Lise's URL. The bio and links should be on the Summit Page.
Mark Underwood: Gary, I was thinking of RDA in the context of this previous talk. Relevant?
ravisharma: Spencer - kindly describe how you overcame the UML profiles needed for ontologies as we were encountering these gaps in UML_ODM (Ontology Definition Metamodel) OMG efforts led by Elisa Kendall?
Gary Berg-Cross: Hu
Yolanda Gil: @RussReinsch: The W3C PROV standard was designed for provenance on the web. It had more than 60 implementations at the time we released it (https://www.w3.org/TR/prov-implementations/), so it has had an effect across many areas. For example, I worked with OGC to explore its use for spatial information (https://portal.opengeospatial.org/files/?artifact_id=58967).
Gary Berg-Cross: @Mark, yes Yolanda's work is very relevant to the RDA interest in data management process. We included a small part of this in our early work group effort.
ravisharma: Evan - good question and I thought she componentized workflow and even ranked the possible 3 choices in her example, it would be nice to know if her workflows exceed the BPMN notation?
Yolanda Gil: @RaviSharma: I think science gateways and your Worldwidescience.org is very complementary. We have not focused on discovering data sources, we assume we are given the data sources. We focus on what slice of the data is relevant for the user hypothesis/question. So for example, our system is given TCGA (The Cancer Genome Atlas) as a data source, but has to figure out what slice of all the data it needs (eg for one hypothesis it looks for genomics data about a particular patient population, for another it looks for proteomics data, etc.)
Yolanda Gil: @RaviSharma: You make a good point about the need for aligning ontologies for scientific reasoning processes, and for an upper ontology of reasoning, hypothesis leading to knowledge. I think we are still not at the point of doing that, we need to first explore how to capture such knowledge for specific domains and then generalize. We have collaborators interested in applying DISK for geoscience, that would help us see how generalizable our representations are.
ravisharma: Yolanda - thanks, for automated knowledge extraction, the search and discovery, sorting relevant data and then applying finer workflow and decision support might be an integrated approach, I have no examples of beneficiaries for such a heavy effort!
Yolanda Gil: My colleagues use the term cancer multi-omics to refer to all the kinds of analyses done for cancer in genomics, proteomics, transcriptomics, epigenomics, etc.
Russ Reinsch: @yolanda - thank you
Yolanda Gil: @MarkUnderwood: Could you please post a link to your National Geographic piece? Thanks!
Yolanda Gil: @MarkUnderwood: You asked about workflow standards. I have seen many over the years, and they are all very useful. For us, we want our representation to cover both workflow planning and workflow execution. We want to be aligned with the W3C PROV standard. Also, our workflows are DAGs (Directed Acyclic Graphs), so it is typically a subset of what those standards support. I am sure our representations can be easily mapped to those, as we have shown in our work by using both the PROV and OPM standards.
ravisharma: Spencer - I looked briefly at your and Goel's group's SBF efforts. What came to mind is also the need for aggregated functional simulation - say attempting birds' navigation system to inertial navigation systems? At what aggregation levels are low hanging fruits in your efforts at applying bio-learned knowledge for design of systems?
Mark Underwood: @Yolanda, unfortunately, the editors excised my paragraphs about eScience automation http://thebea.st/1ke06Cm
Mark Underwood: @Yolanda This piece accompanied a National Geo TV 6 part series
ravisharma: how much additional benefit do we get by adding bio aspects
Russ Reinsch: @Dr. R - fascinating presentation
KenBaclawski: @Gary Berg-Cross: It was in Track B session 2: http://ontologforum.org/index.php/ConferenceCall_2017_03_15
ravisharma: Spencer - thanks for the explanation. aggregation levels in applying bio-subsystems is still an open q
Mark Underwood: @Russ We reference PROV-O in NIST Big Data WG version 1, though in passing
Russ Reinsch: @mark - noted.
Mark Underwood: Excellent presentations, thanks all. Ping us on Twitter @ontologysummit !
Russ Reinsch: What is the date of the symposium
KenBaclawski: The Symposium is at NIST on May 15 and 16.
Russ Reinsch: Thanks