Ontolog Forum
Chat Transcript for Ontology Summit 2018 Contexts in the Open Knowledge Network Session 1
(The chat was slightly edited to fix most spelling errors and soaphub artefacts.)
[12:07] David Whitten: @guha Thank you for the great idea of microtheories. I enjoyed your thesis many years ago.
[12:09] Guha: thank you!
[12:09] Ravi Sharma: Gary - Yes great set of speakers.
[12:10] Ravi Sharma: Greetings Ramanathan Guha
[12:11] David Whitten: OKN - Open Knowledge Network - three talks. each 20 minutes followed by questions. Gary Berg-Cross Introducing. OKN - data/knowledge graphs + executables/web calls. Vicki Tardif Holland (google knowledge graphs team) + two more.
[12:12] David Whitten: She will share slides ?
[12:14] DouglasRMiles: https://s3.amazonaws.com/ontologforum/OntologySummit2018/OpenKnowledgeNetwork/Schema.org-and-OKN--VikiTardifHolland_20180214.pdf
[12:15] David Whitten: schema.org focused on many different formats: micro-data or JSON - polymorphisms - Google, Yahoo, Microsoft, Yandex
[12:16] John Sowa: So far, all of this is trivial info. Everybody knows this.
[12:17] DouglasRMiles: btw: I love how David Whitten takes summary notes
[12:17] David Whitten: Some URLs for schema.org (such as for William Shakespeare) are provided by Wikidata
[12:17] DouglasRMiles: (in this chat that is)
[12:18] David Whitten: @doug thanx. I was worried I would distract folks from talking, but it does help me remember what was discussed.
[12:19] David Whitten: schema.org focuses on classes and relations. maybe 1 of 3 reference Schema.org meanings.
[12:20] John Sowa: There is nothing new in this talk. Would somebody please ask her to skip to something new?
[12:20] DouglasRMiles: @JohnSowa it will be interesting to hear how "rule of least power" of the Semantic Web does and doesn't cripple the logics of applied ontology
[12:21] David Whitten: Schema.org - when to pare down, is schema.org simple enough for webmasters to use? Use Core + Extensions philosophy with more websites : ???.schema.org - focus complexities in extensions. simple www.schema.org to the general team, use subject matter experts for extension web sites.
[12:21] ChristiKapp: Well - there is one point about complexity being a good thing. Schema.org has very little for the special purpose vehicles (motorcycles, trikes, snowmobiles) that is useful. Just saying that as the information is more specialized, the schemas need to follow suit with more details
[12:22] DouglasRMiles: @JohnSowa so hearing what Vicki is saying sets the scope of what Semantic Web expects out of their ontologies
[12:22] MikeBennett: Another interesting thing about schema.org that is not always made clear is the polysemic nature of its terms. This makes it not-an-ontology but is often confused with one.
[12:22] Guha: John, we were asked for a talk about Schema.org and that is what Vicki is talking about.
[12:23] David Whitten: External extensions also exist. Pinterest + GS1 notable examples.
[12:23] Jack Ring: @Sowa. How do you know there is nothing new to the 26 participants? Relax.
[12:24] John Sowa: Douglas, I'm not saying that she doesn't know anything. I'm just saying that so far, this talk is a waste of time. It should have been moved to the end, so that we could just hang up.
[12:24] Ravi Sharma: Notes: Vicki said the W3C can use its development experts to create the standards suggested by domain experts?
[12:24] Guha: John, please feel free to hang up now.
[12:25] DouglasRMiles: @Guha: Well Vicki is doing an awesome job in my opinion describing schema.org/RDF :) So.. this is going to make it easier for Guha to contrast what Cycorp did compared to Schema.org :P
[12:25] ChristiKapp: It's interesting to me to revisit this topic - because even her examples are reminding me of how impossible it is to categorize anything related to specialty vehicles anywhere on the internet (regardless of type of classification). To me it feels like specialized topics that have small market presence are analogous to small towns being cut off when superhighways came in. They became irrelevant because they were not accessible. Just like information that cannot be classified, or products that cannot be found b/c schema.org is insufficient, mean that those non-commodity products eventually become irrelevant.
[12:25] Ravi Sharma: Notes: Extensions based on domains such as auto
[12:25] DouglasRMiles: (second sentence was towards JohnSowa)
[12:26] Michael Wessel: i also have a question
[12:26] John Sowa: Ravi, please don't ask such questions. She is just rambling.
[12:27] ChristiKapp: And will we lose knowledge from the world's overall knowledge base if we do not allow/encourage very detailed, specialty knowledge to be classified in a really easy manner for lay-people?
[12:27] David Whitten: Question for Vicki: what is schema.org's plan for history of ideas, who endorses them, version control, and when terms are no longer "valid" ?
[12:27] Michael Wessel: my question is: what is the plan for continuing the RDFS support of schema.org
[12:28] Michael Wessel: it seems that most recent versions are microformats / json ld only?
[12:28] John Sowa: More rambling.
[12:28] Guha: More obnoxiousness from Sowa
[12:29] Ravi Sharma: @Vicki - thanks for answering Q on vocabularies, implying that we can connect subdomain vocabularies from schema.org
[12:29] Jack Ring: This seems to address ways of mining existing data for correlatives. Does it provide a way to discern what's not in the milieu?
[12:30] David Whitten: Andrew Moore and RV Guha are spokespeople for OKN.
[12:30] Ravi Sharma: @Mayank Kejriwal welcome
[12:30] MikeBennett: I think John is understandably concerned about timing, since I lost control of the timing last week (and did some rambling of my own) at the expense of questions. Apologies for that. I have confidence in Gary's time management.
[12:31] David Whitten: RV Guha will focus on high level view with details in questions. Knowledge Graphs are light on inference.
[12:33] Ravi Sharma: @guha -On personal assistants such as Google Home, today there is no capability to connect TV to itself as audio amplifier, but on the other hand sometimes it answers a complex question? When will common usage increase?
[12:33] David Whitten: Knowledge Graphs treated as assets. Broader community is excluded. Guha uses example of Commercial internet changing DARPA Internet. Low upfront costs, no permission structure, no security, no commerce.
[12:33] pfps: Is there a good pointer to the GS1 ontology that is part of schema.org?
[12:34] Vicki Tardif Holland: https://www.gs1.org/gs1-smartsearch
[12:34] BobbinTeegarden: Guha's slides available?
[12:35] Gary Berg-Cross: @PFPS hopefully Vicki is reading the chat and can provide a link for you.
[12:35] David Whitten: OKN has small set of core protocols (DJW:like http?) and small data models
[12:36] Ravi Sharma: @Guha - we have deep data archive searches such as worldwidescience.org.
[12:37] BobSchloss: @Guha Will you discuss any kind of "consistency" or "coherence" services needed for the multi-contributor sourced Open Knowledge Net? Or is the problem of contradictions and relative significance of different assertions in the graph left to the OKN using software?
[12:37] DouglasRMiles: and @pfps https://www.gs1.org/docs/EDI/xml/3.3/GS1_XML_3.3_Publication.zip
[12:37] pfps: @Vicki: Yes, I saw that, but where is there a document that gives us the actual ontology?
[12:38] David Whitten: Curated data - possible but high resource use to setup/standardize. Datasets are driving narrow research. Wikidata vs. Schema.org. - both broad coverage. Centralized control vs. Decentralized.
[12:39] BobSchloss: 50% of US/EU Commerce email using schema.org tags... was this driven by the fact that email services such as GMAIL understand these tags?
[12:39] Patrick Maroney: [OPERATIONAL ISSUE] To the Organizers: a few moments into the second speaker, my system was rendered inoperable due to 100% consumption of CPU by the Bluejeans Application on MAC OSX. Force Quitting BlueJeans restored nominal behavior.
[12:40] Vicki Tardif Holland: @pfps: The browsable version is at https://www.gs1.org/voc/. There are machine-readable versions for download at https://www.gs1.org/gs1-smartsearch/1-6
[12:40] John Sowa: Guha, Thank you for having slides that show some preparation. Any speaker who rambles without any preparation needs to be told that it's not appreciated.
[12:41] David Whitten: Schema.org is too centralized. too web-focused. use web as analogy for knowledge support. How to find/stitch together disparate data, knowledge. Can graph theory (DJW: CG's ?) handle this?
[12:41] pfps: @Douglas: OK, I got that, but when I unzipped it there appears to be a lot of non-product stuff - invoices, orders, transport, etc. Where is the product ontology?
[12:41] Ravi Sharma: @guha - what would n-ary provide in terms of accuracy of search? more parameters to include or something else?
[12:42] Cory Casanave: Merging a global graph without context (including time) would result in some unfortunate conclusions.
[12:43] DouglasRMiles: you have to unzip a little deeper.. for example.. BMS_Package_Transport_Capacity_Requirements_r3p3p0_i1_01Mar2017.zip\Schemas\gs1\shared
[12:44] David Whitten: How to coordinate vocabulary, View as Table, want effort to be on the order of the number of columns in a table versus the number of rows in a table (DJW: doesn't this depend on the schema itself?)
[12:45] pfps: @Vicki: OK, I'm now in a bunch of tiny web pages, all different. Is there a way of getting the ontology in some form that I can see the forest for the trees?
[12:46] Ravi Sharma: @guha - when are we going to handle aggregates or entities above data, i.e. at meaningful data aggregation such as an entity that would allow us to build and interconnect datasets (aggregates) to synthesize knowledge?
[12:46] pfps: @Vicki: As I bounce around the web pages, it looks as if the product ontology is small. For example, there appears to be nothing under Beverage.
[12:46] BobbinTeegarden: Where are Guha's slides?
[12:47] DouglasRMiles: @pfps Admittedly
<xsd:import namespace="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader" schemaLocation="../../sbdh/StandardBusinessDocumentHeader.xsd"/>
<xsd:import namespace="urn:gs1:shared:shared_common:xsd:3" schemaLocation="../shared/SharedCommon.xsd"/>
<xsd:import namespace="urn:gs1:ecom:ecom_common:xsd:3" schemaLocation="eComCommon.xsd"/>
[12:47] DouglasRMiles: One would need to grovel all the sites
[12:47] KenBaclawski: Pat Hayes asks: I can't raise a hand, but question: how does the large network protect itself from hostile attacks of false data?
[12:47] Ravi Sharma: @ken - can we get a pdf of Guha's presentation?
[Added later] KenBaclawski: @Ravi - yes they have been posted on the meeting page.
[12:47] pfps: @Pat: blockchain :-)
[12:48] Gary Berg-Cross: @pat, We will ask you Q after John.
[12:49] Russ Reinsch: Is there an approved way of saving the chatlog, or is it auto saved to a member's local machine?
[12:49] pfps: @Douglas: What's that XML supposed to do?
[12:50] Ravi Sharma: Notes: John asked derive knowledge by reading books? What Google and other companies are doing to make progress?
[12:50] DouglasRMiles: @pfps I've looked around for a while for real world uses of RDF/Metadata.. what I am seeing in GS1 is one of the first actual ideal uses
[12:50] Vicki Tardif Holland: @pfps: I am not sure which pages you are on. The GPC standard is at https://www.gs1.org/gpc/dec-2017
[12:50] David Whitten: The Electronic Health Record system I use (VistA) has over 50,000 columns, not counting vocabularies such as ICD-10 or SNOMED CT. There are a huge number of people formalizing medical data alone.
[12:51] Gary Berg-Cross: @Russ Ken saves the chat and posts it on the session page.
[12:51] Russ Reinsch: Graci
[12:51] pfps: @Douglas: You appear to be saying that there are no more product categories in GS1, i.e., nothing for fruit juice and thus nothing between Beverage and Ocean Spray Medium Pulp Organic Orange Juice 32-count 8oz tetra-pack box.
[12:52] David Whitten: @sowa reminds us of Allen AI which is using word-problems and diagrams in geometry to understand (much as humans do)
[12:52] DouglasRMiles: @pfps i am saying if we loaded all the .xml/rdf/xsd files we would get that connection between Beverage and Ocean Spray
[12:53] Ravi Sharma: Notes: John said that students in geometry interpret the diagram and the question (text) together, and they do pretty well. Watson is also doing it. So we have to understand the text; then there is less likelihood of pattern matching.
[12:53] FrankOlken: Could folks who are not speaking mute their audio to reduce echo?
[Added later] Ken Baclawski: @Frank - I have been muting people as quickly as I can. Sorry about the echos.
[12:53] David Whitten: @sowa is concerned about text-only representations used for shallow pattern matching.
[12:53] DouglasRMiles: @pfps though we won't see an upper ontology in a single file
[12:54] DouglasRMiles: @Vicki (unless i am wrong)
[12:54] DouglasRMiles: (about what i just said)
[12:54] Ravi Sharma: Notes: Guha said schema.org does not do it. but John wants to get to semantics. Guha there is history of AI systems, they have systems that are spectacular but shallow!
[12:54] pfps: I think that John's concern is that schema.org just doesn't have the facilities to support knowledge acquisition.
[12:54] FrankOlken: How does schema.org envision resolving name conflicts?
[12:54] Ravi Sharma: Notes: John said multiple datasets,
[12:56] pfps: @Douglas: Yes, there might be an Ocean Spray page stating that one of their products is a beverage. What I'm missing is the ability to say that that beverage is fruit juice.
[12:56] DouglasRMiles: Though I understand to "us ontologists" we won't get the semantics we wanted to see... but we will get datasets to create semantics
[12:56] David Whitten: Pat Hayes asked about how to protect false-data attacks. Guha warns that "falseness" depends on the observer. The web doesn't protect us, so why should this?
[12:57] Ravi Sharma: Notes: Pat's Q - how do you protect from hostile data? Guha - idea behind is weather based on location, resolution is important?
[12:57] Ravi Sharma: Notes: Cory - how do you account for timeframes imp for context, otherwise inaccurate conclusions?
[12:58] David Whitten: @cory asks - do we have a flat context ? how do we resolve consistency?
[12:58] David Whitten: DJW: I ask how do we handle micro-theories in a global space ?
[12:58] Vicki Tardif Holland: @DouglasRMiles correct. I don't know of a single file containing all of GS1's vocabulary.
[12:59] MikeBennett: The polysemic nature of schema.org also means that the interpretation of a given tag would presumably depend on context - e.g. Loan may be a product, a contract or a draw down.
[12:59] KenBaclawski: Pat Hayes continued with: The difference between the Web and this Net is, the Web is read by human beings. I wasn't worried about honest differences of opinion so much as the kinds of thing we have seen with international hostilities using Facebook and other social media.
[12:59] ToddSchneider: Are there any attempts to 'mine' the labels/identifier/names (used on the web) to 'correlate' (or map) them to 'things' in a foundational/upper ontology?
[13:01] Ravi Sharma: Notes: Guha suggestion is to chip at it a bit at a time?
[13:02] KenBaclawski: Alessandro commented on Pat's question: some people are looking into Blockchain for that problem. @Pat Message from Alessandro Oltramari: https://www.slideshare.net/hedugaro/strategies-for-integrating-semantic-and-blockchain-technologies
[13:02] DouglasRMiles: Question to Guha is it going to be possible for OKN to create a hierarchy between Metadata documents?
[13:02] ToddSchneider: Ken, could you copy the Bluejeans 'Chat' messages into the Soaphub chat?
[Added later] Ken Baclawski: @Todd - Will do.
[13:02] pfps: @Douglas, Vicki: So I finally got to https://www.gs1.org/gpc-food-beverage-tobacco/dec-2017 and downloaded the files there, which at least has more product categories. But then I don't understand how this connects to schema.org. I see a text file and a spreadsheet, but nothing that looks like it can be combined with the schema.org ontology.
[13:03] Russ Reinsch: I guess I need to do more than unmute
[13:03] pfps: I thought I was making a joke about blockchain, but I guess I should have known better.
[13:03] ChristiKapp: Pat Hayes continued: And I agree, central authority is not a good idea, but some kind of social mechanism for removing bad data, along the lines of Wikipedia? It needs some apparatus of policing to intervene at times but is not 'central'
[13:03] KenBaclawski: Pat Hayes replied to Alessandro: And I agree, central authority is not a good idea, but some kind of social mechanism for removing bad data, along the lines of Wikipedia? It needs some apparatus of policing to intervene at times but is not 'central'. Message from Pat Hayes: Thanks, AO
[13:04] Russ Reinsch: My question was about what Guha was saying earlier, that "the problem is not URIs," rather the selection of what extension to prioritize?
[13:04] Ravi Sharma: @guha- when do we extensively use aggregates of meaningful data that integrate not only text but image or video features?
[13:04] Russ Reinsch: I didn't hear the last part accurately
[13:05] DouglasRMiles: I am concerned that Schema.org needs a curation system such as CYC
[13:05] Guha: I can't hear anything
[13:05] Russ Reinsch: Something he said earlier
[13:05] DouglasRMiles: (At least to get things aligned up)
[13:05] Russ Reinsch: Note the part in quotation marks
[13:07] DouglasRMiles: Market = Adoptions SQL Admins
[13:11] David Whitten: Mayank Kejriwal is starting his presentation re "Context-Rich social uses of knowledge graphs"
[13:12] David Whitten: DJW: not sure how knowledge graphs are different from semantic nets.
[13:14] Ravi Sharma: @Guha and @John - thanks for answering, maybe soon items such as vocabularies and filtered visual extractions that are meaningful, together will provide us knowledge terms that perhaps future students use like we use data today? i.e. for relating as in RDBMS and also as in UML models, etc.
[13:16] Ravi Sharma: @Mayank K. - What constitutes "good" and how you filter it?
[13:18] David Whitten: Domain specific search of JSON docs, web sites, etc.
[13:19] David Whitten: used by law-enforcement. Cross links to search for missing people
[13:19] Ravi Sharma: @Mayank - the Law enforcement Graph that you showed, what aspects of that would be covered in NIEM vocabularies or NIEM Conformant Terms?
[13:19] David Whitten: backpage.com used as archive?
[13:20] David Whitten: creates dossier about telephone numbers, pictures, etc.
[13:22] Gary Berg-Cross: @Mayank & @Pat People/groups who want to deceive an application will also know some of the justifications you use to decide on knowledge validity.
[13:22] Ravi Sharma: Notes: Mayank said ontology based on purpose to identify important entities such as suspicious phone numbers?
[13:24] David Whitten: analogy with psycho-analysts : we can't see inference procedure, nor knowledge used to infer conclusions.
[13:25] David Whitten: argues many ontology designers may compromise ability to maintain ontology.
[13:26] David Whitten: Domains include who is the user , what questions do they ask, and what examples exist?
[13:26] Gary Berg-Cross: Does each "domain" have 1 microtheory?
[13:28] David Whitten: what is useful? depends on utility. is "noise" in data similar to "garbage" or "context" - defined by what it is NOT, rather than what it is.
[13:29] John Sowa: What is useful depends entirely on your intentions. All those words are related: useful, goal, purpose, intention.
[13:29] Ravi Sharma: @Mayank - how do we filter noise to give meaningful info, you imply examination of patterns of usage?
[13:29] David Whitten: @gary, I doubt a "domain" has only one microtheory. Even with a knowledge spindle representation, you have at least three microtheories.
[13:30] Ravi Sharma: @John - yes I see what you have been saying in past sessions.
[13:31] David Whitten: The vocabulary microtheory provides constants that need to be defined, the middle microtheory provides general category level rules, the data/instance microtheories provide specific uses of them.
[13:31] David Whitten: all three microtheories are linked by "genlMt" links into a spindle.
[13:32] David Whitten: I reiterate my personal question: how is this knowledge graph from OKN different from a semantic net from yesteryear ?
[13:33] Ravi Sharma: @John - I agree with Intention.
[13:33] MikeBennett: There needs to be a good way of presenting contextual components of e.g. upper ontology based partitions (e.g. when / how to define something as relative v independent) for any effort where multiple people will contribute, otherwise ontology will get muddled.
[13:34] Ravi Sharma: @john - it implies a conscious aware entity and thus also helps us reduce a very large or unbound problem and reduce it to a subdomain related issues and Contexts
[13:35] Russ Reinsch: My audio was wrecked the whole time he was talking about the 2nd type of questions
[13:37] Ravi Sharma: @Russ - do you want to type and we will read your Qs. otherwise you may want to review the recordings?
[13:37] Russ Reinsch: Oh yes I would like to review the recordings.
[13:37] MikeBennett: What we need is a better way to get to graphical / UI based framing of semantic queries that normal people can use.
[13:39] Russ Reinsch: Not sure why my machine is not working well with blue jeans. I've not had video at all.
[13:40] David Whitten: DJW: The issue of provenance is relevant to the issue of lying information sources.
[13:41] Guha: I need to leave for my next meeting in 5 min. So, if there is any question directed at me, I can take that
[13:41] Ravi Sharma: @Ram Sriram - Is there any emerging test site like NIST reflectors were, that validate or confirm the truth of many website postings, at least for specified items such as news?
[13:42] Gary Berg-Cross: @Guha OK, thanks and thanks for being generous with your time and thoughts.
[13:43] Ravi Sharma: @Ram - at least something to repudiate wrong postings?
[13:43] John Sowa: Semantic networks were widely discussed until the semantic web caused all previous knowledge to disappear.
[13:43] Ravi Sharma: @Ram - on Govt sites
[13:44] Ram D. Sriram: @Ravi: Not to the best of my knowledge (at least in the current context)
[13:44] John Sowa: For some history, see http://jfsowa.com/pubs/semnet.pdf
[13:46] Gary Berg-Cross: BTW, Next week's session is "Contexts for Integration and Interoperability Session"
[13:46] MikeBennett: Hmmm, could one run a distributed ledger-type thing that minted annotations with confidence / provenance behind them??
[13:47] Gary Berg-Cross: Session 2 of the OKN track is 28 March 2018. It will consider the role of context. Speakers are:
Charles Klein (or someone from Cycorp) to discuss context & the use of microtheories in Cyc.
Vinh Nguyen (Kno.e.sis Center, Wright State University) Semantic Web foundation on representing, reasoning and traversing Contextual Knowledge Graphs and
Amit Sheth (Kno.e.sis Center): Evolving a Health KG.
[13:50] KenBaclawski: B. Ulicny, K. Baclawski and A. Magnus. New Metrics for Blog Mining. In Proc. SPIE - Volume 6570 Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security, B. Dasarathy (Ed). (April 9, 2007)
[13:51] Ravi Sharma: Notes: Mayank said this can be done manually - that the shown Law enforcement schema can be mapped to NIEM Schema. Or else a new schema between these two can be generated.
[13:52] Ravi Sharma: @mayank - I wanted to know this in the context of identifying false elements in the schema that may be introduced by intruders etc?
[13:53] Ravi Sharma: @Mayank - one would have to identify valid sites and valid postings?
[13:55] TerryLongstreth: Timeliness - another slippery concept. I've most often heard it used in terms of utility of a datum; did I get this in time for it to be useful?
[13:56] John Sowa: For historical info, primary sources are important. For a railroad accident in 1905, a copy of "Engineering News" from 1905 is probably more relevant than a modern source. But the page rank of a 1905 document is likely to be low.
[13:59] TerryLongstreth: (continuing) So, in a context discussion, timeliness is relevant only to the context (and intentions) of the recipient of a datum.
[13:59] Ravi Sharma: Notes: Provenance, credibility, buyers, sensationalism, systems that tell about plausibility, were discussed, to reduce noise, new technologies, for decision support, knowledge representation by inferences, and by putting together with knowledge extraction would be valuable.
[14:00] Mayank: @ravi DIG crawls from the Web using domain discovery, which is a combination of crawling and reinforcement learning. In general, identifying relevant data is very difficult but hopefully OKN can help with that!
[14:00] David Eddy: Aspects of these conversations remind me of the joke of looking for lost keys... under the streetlight. But the keys are actually over in the dark.
[14:00] DouglasRMiles: I keep looking for an ontology of conversational intentions
[14:00] DouglasRMiles: With asserts/queries about constructing such a conversational model
[14:01] DouglasRMiles: I mean the intentions would be to populate a model
[14:01] ChristiKapp: What is the reference to timeliness document that was shown?
[14:01] DouglasRMiles: (both the conversational model and the model being discussed)
[14:03] ChristiKapp: NM - found it
[14:03] Ravi Sharma: Mayank - thanks for your responses and please see more Qs on chat
[14:03] DouglasRMiles: @Christi link
[14:06] DouglasRMiles: n/m found it was in the last link :)
[14:48] David Whitten: I think RV Guha's pdf is: https://www.nitrd.gov/nitrdgroups/images/9/96/OKN_Moore_Guha.pdf