Ontolog Forum
Joint DATA.GOV-ONTOLOG "Big Open Data" Session - Thu 2012_05_10
Session Topic: "Fostering 'Big Open Data' in government through Open Collaboration - invited presentation on NYCFacets and introduction to New York City's 'Open' initiatives"
Session Co-chair: JeanneHolm (Data.gov / NASA-JPL) & PeterYim (Ontolog / CIM3) - slides
Panel Briefings from:
- ChrisMusialek (Data.gov / GSA) - "Empowering City Developers with Federal Data" - slides
- AndrewNicklin (New York City) - "Opening up municipal government data: past, present, and future" - slides
- JoelNatividad (Ontodia) - "Smart Cities and Big Open Data" - slides
Archives
- Abstract
- Agenda
- Prepared presentation material (slides) can be accessed by clicking on each of the title links below:
- [ 0-Holm ] . [ 1-Musialek ] . [ 2-Nicklin ] . [ 3-Natividad ]
- Audio recording of the session [ 1:36:56 ; mp3 ; 11.1 MB ]
- Transcript of the online chat during the session
- Additional Resources
Conference Call Details
- Date: Thursday, 10-May-2012
- Start Time: 9:30am PDT / 12:30pm EDT / 6:30pm CEST / 5:30pm BST / 16:30 UTC
- ref: World Clock
- Expected Call Duration: 1.5~2.0 hours
- Dial-in:
- Phone (US): +1 (206) 402-0100 ... (long distance cost may apply)
- ... [ backup nbr: (415) 671-4335 ]
- when prompted enter PIN: 141184#
- Skype: joinconference (use the PIN above) ... (generally free-of-charge, when connecting from your computer)
- for skype users who have trouble with finding the Skype Dial pad ... it's under the "Call" dropdown menu as "Show Dial pad"
- Shared-screen support (VNC session), if applicable, will be started 5 minutes before the call at: http://vnc2.cim3.net:5800/
- view-only password: "ontolog"
- if you plan to be logging into this shared-screen option (which the speaker may be navigating), and you are not familiar with the process, please try to call in 5 minutes before the start of the session so that we can work out the connection logistics. Help on this will generally not be available once the presentation starts.
- people behind corporate firewalls may have difficulty accessing this. If that is the case, please download the slides above (where applicable) and run them locally. The speaker(s) will prompt you to advance the slides during the talk.
- In-session chat-room url: http://webconf.soaphub.org/conf/room/ontolog_20120510
- instructions: once you have access to the page, click on the "settings" button, and identify yourself (by modifying the Name field from "anonymous" to your real name, like "JaneDoe").
- You can indicate that you want to ask a question verbally by clicking on the "hand" button, and wait for the moderator to call on you; or, type and send your question into the chat window at the bottom of the screen.
- thanks to the soaphub.org folks, one can now use a jabber/xmpp client (e.g. gtalk) to join this chatroom. Just add the room as a buddy - (in our case here) ontolog_20120510@soaphub.org ... Handy for mobile devices!
- Discussions and Q & A:
- Nominally, when a presentation is in progress, the moderator will mute everyone, except for the speaker.
- To un-mute, press "*7" ... To mute, press "*6" (please mute your phone, especially if you are in a noisy surrounding, or if you are introducing noise, echoes, etc. into the conference line.)
- we will usually save all questions and discussions till after all presentations are through. You are encouraged to jot down questions onto the chat-area in the meantime (that way, they get documented; and you might even get some answers in the interim, through the chat.)
- During the Q&A / discussion segment (when everyone is muted), if you want to speak or have questions or remarks to make, please raise your hand (virtually) by clicking on the "hand button" (lower right) on the chat session page. You may speak when acknowledged by the session moderator (again, press "*7" on your phone to un-mute). Test your voice and introduce yourself first before proceeding with your remarks, please. (Please remember to click on the "hand button" again (to lower your hand) and press "*6" on your phone to mute yourself after you are done speaking.)
- Please review our Virtual Session Tips and Ground Rules - see: VirtualSpeakerSessionTips
- RSVP to peter.yim@cim3.com appreciated, ... or simply just by adding yourself to the "Expected Attendee" list below (if you are a member of the team.)
- This session, like all other Ontolog events, is open to the public. Information relating to this session is shared on this wiki page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_05_10
- Please note that this session may be recorded, and if so, the audio archive is expected to be made available as open content, along with the proceedings of the call to our community membership and the public at-large under our prevailing open IPR policy.
Attendees
- Attended (along with registered participants who may have joined but missed the roll call):
- Jeanne Holm (co-chair)
- Peter P. Yim (co-chair)
- JoelNatividad
- AndrewNicklin
- ChrisMusialek
- Sami Baig
- George Thomas
- Jesse Wang
- Joanne Luciano
- Hasan Sayani
- Shiva Saberi
- Deirdre Lee
- Martha Przysucha
- Lisa Hartigan
- Jack Park
- Ken Baclawski
- Bob Schloss
- RichardLee
- Kingsley Idehen
- Mike Bennett
- David Mason
- MarieJeanMeurs
- David Collins
- Matthew Kaufman
- Leo Obrst
- Kyoungsook Kim
- Terry Longstreth
- Frank Olken
- Ed Dodds
- Tom Tinsley
- Jack Ring
- Jerry Smith
- Pavithra Kenjige
- Ali Hashemi
- Anushri Mishra
- BobLojek
- DeborahMcGuinness
- Denise Warzel
- Elizabeth Florescu
- MarkDixon
- Michael Grüninger
- jgabriel
- lisa_h
- sdupd_glenn
- Nancy Wiegand
- Expecting:
- (please add yourself to the list if you are a member of the community, or, rsvp to <peter.yim@cim3.com>)
- Regrets:
- Mike Dean
- Amanda Vizedom (have conflict, but will view/listen to archive later)
- Todd Schneider
- ...
Abstract
Fostering 'Big Open Data' in government through Open Collaboration - invited presentation on NYCFacets and introduction to New York City's 'Open' initiatives
This is the first of two sessions jointly organized by the US federal data.gov initiative and Ontolog. This follows quite naturally from a few very exciting recent events, notably:
- (a) the recent US federal government's thrust toward developing and leveraging 'Big Data'
- (b) the 3-month long Ontology Summit 2012 series of events, which finished a few weeks ago and was focused around the theme 'Ontology for Big Systems' and
- (c) New York City's award of their 'Big Apps 3.0' grand prize to NYCfacets (developed by a member of this community, in an application that leverages ontology and semantic technologies); and the City's 'open data' initiative that followed.
During today's session, we will look at the NYCfacets app, the New York City open data initiative and contemplate how open collaborative community effort can help foster 'Big Open Data'.
Agenda
Fostering 'Big Open Data' in government through Open Collaboration
- Session Format: this is a virtual session conducted over an augmented conference call
- 1. Opening (chair) - Jeanne Holm [10 min.] ... [ slides ]
- 2. Panel briefings [20 min. each]
- ChrisMusialek - "Empowering City Developers with Federal Data"
- AndrewNicklin - "Opening up municipal government data: past, present, and future"
- JoelNatividad - "Smart Cities and Big Open Data"
- 3. Q & A and open discussion [All: ~30 min.] - (moderated by the chair) -- please refer to process above
- 4. Wrap-up / Announcements - (chair)
Proceedings
Please refer to the above
IM Chat Transcript captured during the session
see raw transcript here.
(for better clarity, the version below is a re-organized and lightly edited chat-transcript.)
Participants are welcome to make light edits to their own contributions as they see fit.
-- begin in-session chat-transcript --
Peter P. Yim: Welcome to the
Joint DATA.GOV-ONTOLOG "Big Open Data" Session - Thu 2012-05-10
Session Topic: "Fostering 'Big Open Data' in government through Open Collaboration
- invited presentation on NYCFacets and introduction to New York City's 'Open' initiatives"
Session Co-chair: Jeanne Holm (Data.gov / NASA-JPL) & Peter P. Yim (Ontolog / CIM3)
Panel Briefings:
- ChrisMusialek (Data.gov / GSA) - "Empowering City Developers with Federal Data"
- AndrewNicklin (New York City) - "Opening up municipal government data: past, present, and future"
- JoelNatividad (Ontodia) - "Smart Cities and Big Open Data"
Session page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_05_10
Mute control: *7 to un-mute ... *6 to mute
Can't find Skype Dial pad? ... it's under the "Call" dropdown menu as "Show Dial pad"
Proceedings:
Peter P. Yim: Attention ALL: because of time constraints from some of our panelists, we will have to
start promptly today. ... therefore if you have any logistics questions, please be ready to ask them
as soon as you get online, and before we mute everyone!
anonymous morphed into Ed Dodds
anonymous morphed into AndrewNicklin
Peter P. Yim: Hi Andrew!
anonymous morphed into JoelNatividad
Peter P. Yim: Hi Joel, Hi Terry ... and everyone!
anonymous morphed into jgabriel
jgabriel: Hi everyone!
JoelNatividad: Howdy everyone!
anonymous morphed into Deirdre Lee
anonymous1 morphed into Hasan Sayani
anonymous morphed into Jeanne Holm
anonymous morphed into David Mason & MarieJeanMeurs
anonymous1 morphed into Sami Baig
David Mason & MarieJeanMeurs: hey all
Jesse Wang: Hi, all! Good morning, afternoon, evening!
Sami Baig: Hi all
anonymous morphed into chrismusialek
anonymous morphed into Tom Tinsley
anonymous1 morphed into Ed Dodds
anonymous1 morphed into sdupd_glenn
Jack Park: Hi
Peter P. Yim: == Jeanne Holm presenting the intro slides now ...
anonymous morphed into BobLojek
Peter P. Yim: == ChrisMusialek presenting ...
Jeanne Holm: ChrisMusialek is now presenting Empowering City Developers with Federal Data. Slides can
be downloaded at the session page (above)
Jesse Wang: what is the search engine used in data.gov? did you develop your own text/query
analyzer/parser?
Jeanne Holm: Jesse--we are using Bing as the search engine as part of USA.gov's search capability.
one of the things we all want to improve on Data.gov is the search capability. It's currently
limited by both the complexity of the queries you can build, as well as the fact that it only
searches the metadata of the data tool or dataset. We are moving toward a federated model that would
allow us to search the metadata or other attributes of the tools and data sources that are made
accessible.
Jack Ring: Has Data.gov calibrated the false positives and false negatives achieved with keywords?
Jeanne Holm: Jack--I'll have to check on the false positive and false negative calibration.
anonymous morphed into Denise Warzel
Deirdre Lee: Latest W3C Editor's draft of Data Catalog Vocabulary (dcat) (managed by W3C Government
Linked Data (GLD) WG): http://dvcs.w3.org/hg/gld/raw-file/default/dcat/index.html
anonymous1 morphed into MarkDixon
AndrewNicklin: Will data.gov look at third-party API key & rate management tools instead of rolling
your own?
Jeanne Holm: Andrew--Data.gov is definitely looking at third-party API tools and any emergent
standards in this area. The intent is that we are a connector amongst a set of data of national
interest, even beyond just the federal government.
Peter P. Yim: == AndrewNicklin presenting ...
anonymous1 morphed into Elizabeth Florescu
AndrewNicklin: The site I mentioned for NYC open data standards is http://www.nyc.gov/datastandards
Peter P. Yim: == JoelNatividad presenting ...
Jack Ring: What will be the relevance of Ward Cunningham's expedition into Federating wiki's?
Jack Ring: Jeanne, Pls do clarify the FP-FN results because both are likely to be dismal.
Jeanne Holm: Jack--completely agree. Just need to verify what's been done.
Ed Dodds: I saw a story about crowdfunding movies this a.m. (passer.by) and it got me thinking: are
there any crowdfunded open data efforts anybody has heard of? Any fellowships sponsored by
foundations?
Jeanne Holm: Ed--like a Kickstarter for open data? Interesting...
DeborahMcGuinness: i like this idea and would be happy to point our students to such a call. I would
also be willing to help sell such a message
Ed Dodds: Yes, I've seen a few stories proposing crowdfunding for investigative reporting
recently--haven't seen if they were actually successful
Jack Park: @Jeanne, re federation: I will be giving a talk at a bigdata meetup with these slides
http://www.slideshare.net/jackpark/big-datasciencemeetup-final
Jeanne Holm: Jack--Very interesting presentation. Is the meeting open for others to attend?
Jack Ring: Jeanne, when you are ready to escape the limits of key words and rapidly assay data with
respect to large, complex, interconnected combinatorial networks I will be happy to offer some
insights. Long buried in highly classified systems but now patented and soon to be implemented in a
chip equivalent to 3,600 microprocessors on a grid.
Jack Park: @Jeanne, afaik it's sold out but contact me jackpark[at]topicquests.org -
http://www.meetup.com/Big-Data-Science/events/51766642/
Jeanne Holm: Thanks Jack.
Jack Ring: @JackPark, I think your slides evidence great work. Thank you. Pls consider joining us at
the Symposium in July at San Jose, particularly the Sunday workshop. http://isss.org/world/index.php
Jack Park: @JackRing I would love to attend isss but it's just not in the budget; my friend Judith
Rosen is giving a tutorial I'd really like to attend. Your workshop, if it's open, I'll try to
attend. Many thanks
Jeanne Holm: Are there any questions for the speakers? We are about to go to question and answer...
Peter P. Yim: when we start, we will ask people to click on their "hand" buttons (lower right) ... and
queue folks up for Q&A and remarks ... make sure you test your voices first, and start by telling us
who you are.
Deirdre Lee: I have to head out now, but thanks for lots of interesting presentations
Jeanne Holm: Thanks Deirdre!
Leo Obrst: Thanks all, must leave.
Peter P. Yim: == open Q&A and discussion now ...
anonymous morphed into Pavithra Kenjige
Jeanne Holm: Peter P. Yim: Presentations were fantastic. Congratulations to federal people who started
the movement in opening up government data; to NY developers for providing open data; and to Joel
and everyone who provided technology in helping Joel's app stand out from the crowd.
Jeanne Holm: Peter P. Yim: Next week's discussion will focus on the technical details of implementing
some of these solutions.
Jack Ring: Is anyone concerned with cybersecurity/privacy?
AndrewNicklin: Jack Ring: there are two aspects to our approach to security. First is not letting out
sensitive info (comparatively easy); Second is - potentially - evaluating whether our data, when
combined with outside information poses more risks.
sdupd_glenn: We'd love to see some case studies of municipal opendata in order to pitch to
management the benefits of a public-facing GIS system coupled with ERP data (merged visually with
other public data)
sdupd_glenn: a lot of our staff understands the potential of all this, but are unable to articulate
its benefit to the higher ups who control the purse
Jeanne Holm: Are challenges a good way to get developers to focus on and consume government data? Are
there better ways?
Jeanne Holm: JoelNatividad responded: The first time we submitted to a challenge was just to do
something with our partners. The second time was really to accomplish something. It wasn't about the
money, but the recognition and ability to build something useful was what drew us.
sdupd_glenn: For the private sector, yes. For public agencies, the challenge is how to
incentivize the action of making data public in the first place
Mike Bennett: I have to go now - thanks for great presentations
Jeanne Holm: Thanks Mike!
Ed Dodds: It might be that the start up weekend or hackathon model of drawing everyone together
geographically for 48 hours (though I much prefer virtual innovation clusters such as Ontolog) might
be a tactic, especially if you could find sponsorships from firms who are likely to consume the
data, add their own and make a profit.
Ed Dodds: Nonprofits, community foundations, united way types also stand to benefit and could have
skin in the game
JoelNatividad: And to Ed's point, that is what we want to do at Ontodia. We want to collaborate Open
Data with all kinds of databases, both public and private.
JoelNatividad: And do what Bloomberg did for Finance data, and do it for Open Data.
Jeanne Holm: Ed--The hackathon model is good, but as you point out it's really important to have a
business model that brings those ideas to a sustainable service.
AndrewNicklin: @JeanneHolm, Ed Dodds: in terms of sustainability, we've also (informally,
unofficially) considered tiering access to our services such that the costs of operating open data
platform can be recovered from high-volume commercial users.
Terry Longstreth: @JeanneHolm - I agree that sustainability needs to be considered. Moreover, data
ages quickly, and there's little in today's talks about maintaining data quality and timeliness
JoelNatividad: @TerryLongstreth, in NYCFacets, that's why we derived "extrametadata" to characterize
and score each dataset
AndrewNicklin: @TerryLongstreth that's why automation is really important.
Ed Dodds: @JeanneHolm, all: strenuously agree. Caveat: just because something *should* be valuable
doesn't mean the market has "eyes to see" at the time a product is launched
JoelNatividad: [in our "extrametadata"] we score it along freshness, sparseness, uniqueness, no of
downloads, views, etc., and we plan to make the scoring algorithm transparent and not opaque, so
publishers can respond; and in the future, we do plan to do time series as well, but not yet.
JoelNatividad: @TerryLongstreth, we're actively tracking the wikidata effort and will sync up with
that
JoelNatividad: so "facts" and unstructured free form text are separated
anonymous1: In regards to competitions concerning opengov-- Chicago recently held a contest to
encourage app development and received a total of 60 submissions. Chicago's open data portal
stats include:
328 datasets; 470,000+ embeds; 1000+ user views; 50+ apps
Ed Dodds: Toronto's @buzzdata tries to socialize static data sets (streams too maybe?) marketing,
news gathering, conferences, higher education all could benefit; but I think until we get a mass of
aggregated micropayments for data feeds the challenge to fund will continue
Jack Park: Great conference. Many thanks to the speakers.!
Kingsley Idehen: Please upload the slides to slideshare etc..
Jeanne Holm: @Kingsley the slides are at
http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_05_10 for today's call.
Kingsley Idehen: @JeanneHolm: I've seen the presentations, but just suggesting slideshare for broader
audience etc..
Ed Dodds: + 1 Kingsley
JoelNatividad: We will upload the nycfacets overview to slideshare as well
Ed Dodds: There may be a few up at http://www.slideshare.net/eddodds/ already
Jack Park: A thought: having slides up at slideshare means they can be viewed without downloading.
sdupd_glenn: slideshare, yes please
sdupd_glenn: Can I please suggest that someone or agency put together a series of webinars for
agencies that know opendata is crucial but cannot get C-suite or management approval to get an
opendata program started in the first place (all of us local municipalities and districts)!
Peter P. Yim: thank you all, great session!
JoelNatividad: Thanks everyone! Special mention to Peter for all the great work to make this
possible!
Sami Baig: Thank you all!
anonymous morphed into lisa h
sdupd_glenn: we local municipalities feel like we get to applaud the state and federal efforts but
have no funding or champions to help us get off the ground. We can't participate without your help!
Jeanne Holm: @sdupd_glenn I'm happy to help out with providing discussions on the value of open data
to cities and municipalities. Part of Cities.Data.gov (coming soon!) will be to do that as well.
Feel free to reach out to me at jholm@jpl.nasa.gov
sdupd_glenn: Jeanne Holm: Great I'd love to discuss with your team. We've been in touch with you
already via Barbara Moreno
Peter P. Yim: come back, same time next week, when we will cover the technical aspects of the same
subject next Thursday (May-17)
Peter P. Yim: -- session ended: 11:18am PDT --
-- end of in-session chat-transcript --
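A note on the "extrametadata" scoring mentioned in the chat above: the transcript only names the dimensions (freshness, sparseness, uniqueness, number of downloads, views) and the intent that the scoring be transparent to publishers. The sketch below is purely illustrative and is not the NYCFacets algorithm. The field names, the one-year freshness horizon, the popularity cap and the equal weights are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetStats:
    """Hypothetical per-dataset statistics (field names are assumptions)."""
    name: str
    last_updated: date
    total_cells: int
    empty_cells: int
    total_rows: int
    distinct_rows: int
    downloads: int
    views: int


# Assumed equal weights; a real scheme would publish and tune these.
WEIGHTS = {"freshness": 0.25, "completeness": 0.25,
           "uniqueness": 0.25, "popularity": 0.25}


def component_scores(stats, today, horizon_days=365, popularity_cap=10000):
    """Return each component as a value in [0, 1] (normalisation is assumed)."""
    age_days = (today - stats.last_updated).days
    freshness = max(0.0, min(1.0, 1.0 - age_days / horizon_days))
    # Completeness is the complement of sparseness (share of empty cells).
    completeness = (1.0 - stats.empty_cells / stats.total_cells
                    if stats.total_cells else 0.0)
    uniqueness = (stats.distinct_rows / stats.total_rows
                  if stats.total_rows else 0.0)
    popularity = min(1.0, (stats.downloads + stats.views) / popularity_cap)
    return {"freshness": freshness, "completeness": completeness,
            "uniqueness": uniqueness, "popularity": popularity}


def score(stats, today, weights=WEIGHTS):
    """Weighted sum plus the per-component breakdown, so the result is explainable."""
    parts = component_scores(stats, today)
    total = sum(weights[key] * parts[key] for key in parts)
    return total, parts


if __name__ == "__main__":
    example = DatasetStats("example-dataset", date(2012, 4, 1),
                           total_cells=1000, empty_cells=100,
                           total_rows=100, distinct_rows=95,
                           downloads=300, views=1200)
    total, parts = score(example, today=date(2012, 5, 10))
    print(round(total, 3), parts)
```

Returning the per-component breakdown alongside the total is one simple way to keep a score "transparent and not opaque": a publisher can see which dimension pulled a dataset's score down and respond to it.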
... More Questions
- For those (who are members of the Ontolog community) who have further questions or remarks on the topic, please post them to the [ontolog-forum] so that everyone in the community can benefit from the discourse.
- information about joining the Ontolog community can be found here.
Additional Resources
- the US federal Data.gov initiative - http://data.gov
- Cities.data.gov initiative - http://cities.data.gov
- The 'NYCfacets' app - http://nyc.pediacities.com/facets/
- announcement on NYCFacets winning the Grand Prize at NYCBigApps 3.0
- NYC Open Data - http://nyc.gov/datastandards or http://nycopendata.pediacities.com/wiki/index.php/NYC_Open_Data
- webcast of the White House 'Big Data' event of 29-Mar-2012 - http://www.nsf.gov/news/news_videos.jsp?cntn_id=123607&media_id=72174
- initiatives and related program solicitations highlighted during the above event can be found on the NSF press release at: http://www.nsf.gov/news/news_summ.jsp?cntn_id=123607
- a presentation of the upcoming NITRD 'Big Data Challenge' at the 13-Apr-2012 OntologySummit2012_Symposium - http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2012_Symposium#nid397A
- SMWcon-Spring2012 - http://smwcon.projecthalo.com/index.php/SMWCon_Spring_2012
- OpenOntologyRepository (OOR) - http://oor.net
- Ontology Summit 2012: "Ontology for Big Systems"
- read the OntologySummit2012_Communique - the collaboratively authored work product by the Ontology Summit community that took 3 months in rigorous discourse to develop
- Quick! - the solicitation for endorsement of this communique is still open (it closes by end-of-day 12-May-2012). Therefore, if you can read through it, and send in your endorsement quickly, you will be permanently added to the roster of endorsers for this historical document! ( ... instructions for endorsement are available near the top of the communique page.)
- Join us, same time next week, when we will feature a sequel to today's session. Next Thursday's (2012.05.17) session is entitled: "Implementing 'Big Open Data' in government through Open Collaboration - case examples and possibilities" where we will spend a bit more time on the technical details, and expose the community to some of the state-of-the-art in implementations.
Audio Recording of this Session
- To download the recording of the session, click here
- the playback of the audio files requires the proper setup, and an MP3 compatible player on your computer.
- Conference Date and Time: 10-May-2012 9:34am~11:18am PDT
- Duration of Recording: 1 Hour 37 Minutes
- Recording File Size: 11.1 MB (in mp3 format)
- suggestions:
- it's best that you listen to the session while having the respective presentations opened in front of you. You'll be prompted to advance slides by the speaker.
- Take a look, also, at the rich body of knowledge that this community has built together, over the years, by going through the archives of noteworthy past Ontolog events. (References on how to subscribe to our podcast can also be found there.)
For the record ...
How To Join (while the session is in progress)
- 1. Call in from a phone or from skype: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_05_10#nid3AVO
- 2. Open chat in a new browser window: http://webconf.soaphub.org/conf/room/summit_20120510
- 3. Download presentations for each speaker here: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_05_10#nid3AVE
- or, 3.1 (access our shared-screen vnc server, if you are not behind a corporate firewall)