Archive for the ‘Dataweb’ Category

Phil Windley on XDI

Thursday, August 5th, 2010

Phil Windley, co-founder and CTO of Kynetx (among the many hats he wears), wrote his own rules language, KRL, to “program the Web”. So when Phil writes the following about XDI after he and his team did a two-day deep dive on XDI with XDI4J project founder Markus Sabadello and me, it means a lot.

I haven’t been posting much about XDI because the OASIS XDI Technical Committee (which I co-chair) is still working on the XDI 1.0 technical specs. But since our philosophy has been to code everything in at least one implementation first before committing it to a spec, and since the core XDI graph model and metagraph model are now very solid, by the time the specs come out there will already be multiple operational XDI services.

I hope to finally get time to do many more posts about XDI this fall. In the meantime if you want to learn more, ping me about different ways to get involved.

Joe Andrieu Cuts the Gordian Data Ownership Knot

Thursday, January 21st, 2010

Joe Andrieu has a wonderful way of cutting the Gordian knot on complex socio-technical topics, with clear prose, compelling arguments, and clever illustrations that explain why you should look at something decidedly differently.

Now he wields that knife on the very knotty “problem” of data ownership.

I passionately agree with Joe (and his Kantara Working Group co-chair Iain Henderson) on this subject; I suspect it’s because my perspective on it was long ago warped by the lens of XDI, which itself is a new way of thinking about data.

Turn the telescope to look at personal data from the standpoint of who controls its sharing with whom, and many pieces finally come into focus.

Keep that in mind as we move into an XDI-enabled world.

Personal Data Stores – The Time is Coming

Monday, December 28th, 2009

This entire fall has been intense with work, thus the paucity of posts here. The holidays bring a welcome respite and a chance to catch up with a few key mental threads.

One of them is the growing awareness of the need for what the VRM community calls personal data stores (PDS). The concept is relatively simple: an online store for your own personal data — anything from classic PII (personally identifiable information), such as your identity and contact data, to any other data that you generate or control (files, blog posts, pictures, papers, music, videos, etc.)

Three things have surprised me about PDS:

  1. How generally accepted the notion is by almost anyone who spends much time online, even folks well outside the identity community. It’s a relatively intuitive idea as soon as you understand the basic premise that individual people should have their own data source online.
  2. How many names have been applied to the same general concept. As I indicated, PDS is only the term applied by the VRM community. The same general concept has been called probably a dozen other names. Here’s an excellent blog post by Mark Dixon that calls it a Personal Identity-Persona Service and a Security Identity Bank Vault.
  3. How hard it is to implement. Though there have been several attempts, such as the Mine! Project, nothing has come remotely close to catching on yet.

I have several theses as to why this is so (and yes, the need for an Internet data sharing standard like XDI is high on the list), but I’ll save those for another blog post.

Here, I’ll just conclude with a simple prediction: it’s a threshold problem. Once the first practical solution for PDS starts to take hold, it will catch on and grow just like the first social networks did. The only question is what application will provide that initial traction.

The Permissioned Web: Open Does Not Mean Public Domain

Wednesday, May 13th, 2009

At the Glue Conference this week I’m enjoying a great set of speakers lined up by Eric Norlin on the topic of how everything in the networked universe gets glued together using Web 2.0 tools and beyond. (The talk Mitch Kapor gave this morning was worth the trip all by itself.)

In a few minutes I’ll be on a panel called Implementing the Open Web. In chatting with Lloyd Hilaiel of Yahoo, Kevin Mullins of MIT, and Phil Windley of Kynetx about this topic last night, we hit on one key point that Phil articulated this way: “People tend to conflate ‘open’ with ‘public domain’, i.e., that anything that qualifies as open must be freely available to all.”

It struck me how true this is. It reminds me of the Richard Stallman quote describing open source (cited in the Wikipedia Gratis versus Libre article): “Think free as in free speech, not free beer.”

In terms of data on the Open Web, what this means is that even though a particular pool of data may be available via an open-standard, publicly accessible interface, it does NOT mean this data must be publicly available to anyone. If that were true, the whole concept of a personal data store, a key premise of VRM (Vendor Relationship Management), would not be possible.

So what makes any system or node participating in the Web “open” is not that its data is public, but that the metadata and services for accessing it are available via a publicly discoverable, open-standard interface. The public discovery portion of this is the goal of the XRD work now underway at the XRI Technical Committee at OASIS (based on the original XRDS work – see this blog post by Eran Hammer-Lahav of Yahoo to understand the differences). The open standard portion is the output of IETF, W3C, OASIS, and all the other SSOs (standards-setting organizations) for the net. (The potential of the Open Web Foundation, once it finishes its bootstrap stage, is to make this process of creating open standards even more lightweight and distributed.)

This combination – open discovery of open interfaces accessible over open protocols – is the DNA of the Open Web. And it applies equally to both public and private data. In fact it can finally open up what might be called the Permissioned Web – the Web of all data that any one party has permission from other parties to access.
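To make the distinction concrete, here is a minimal Python sketch of a node that is open but not public. Everything in it is hypothetical (the `OpenNode` class, its `describe`, `grant`, and `read` methods, and the example identifiers); it is not XRD, XDI, or any real API, just an illustration of public discovery sitting on top of permissioned data.

```python
from dataclasses import dataclass, field


@dataclass
class OpenNode:
    """Hypothetical node: its interface is discoverable by anyone, its data is not."""
    owner: str
    data: dict = field(default_factory=dict)
    # Maps each data key to the set of parties the owner has permitted to read it.
    grants: dict = field(default_factory=dict)

    def describe(self) -> dict:
        """Public discovery: anyone may learn what the node offers, not what it holds."""
        return {"owner": self.owner, "services": ["read"], "keys": sorted(self.data)}

    def grant(self, key: str, party: str) -> None:
        """The owner permits one party to read one key."""
        self.grants.setdefault(key, set()).add(party)

    def read(self, key: str, requester: str):
        """Return the value only to the owner or to a party holding a grant."""
        if requester == self.owner or requester in self.grants.get(key, set()):
            return self.data[key]
        raise PermissionError(f"{requester} has no permission to read {key!r}")
```

Calling `describe()` works for any requester, while `read("email", "bob")` succeeds only after the owner has called `grant("email", "bob")`. The interface is open to everyone; the data is open only to those with permission.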

That would lead us to the need for integrating identity and permissions with the data, which brings us to the motivations for XDI as a semantic data sharing format/protocol – but my panel is about to start so that will have to be another post.

Joe Nails it Open

Sunday, July 13th, 2008

Joe Andrieu nails another super post (where DOES he find the time to write/draw all of these???), this time about what it means for a platform to really be open.

My favorite part is that he doesn’t just do it in words — he does it in pictures, deliciously simple and understandable graphics that make it really clear what he means by “open platform”. In short, it’s the protocol, stupid!

Or as Joe puts it:

Level 4 platforms allow developers to build applications anywhere–on a website, on your desktop, even on your cell phone–and those applications can talk to any number of platform providers without restriction, using standard open protocols. Many of us have heard of the most successful protocols: SMTP, POP, HTTP, HTML, TCP/IP, RSS, but most users know these by the applications they enable: email, the World Wide Web, the Internet, blogs.

It’s the perfect message before the VRM Workshop starting tomorrow, and of course it’s exactly what we’re driving towards with XDI. One day I hope Joe can say the same thing about XDI – most users will never have heard of it or the dataweb model of data sharing, but they’ll know the application – VRM!

The Data Sharing Summit: Problems and Solutions

Friday, September 7th, 2007

Certain events scream out for live blogging. The Data Sharing Summit is one of them. So these are my notes from the first half of Day 1. (Then why are they being posted at midnight, you ask? Because there was too damn much to talk about during the second half of the day. More on that tomorrow.)

First, this is the list of problems that attendees want to see addressed:

  • The distributed schema mapping problem – how do you map across zillions of different local schemas?
  • The “Social Web Bill of Rights” or “identity rights agreement” problem – how can you have “Creative Commons licenses for data sharing”?
  • The protocol problem – how do you move social graph data around?
  • The “too many IDs” problem – how can we not require more IDs (even with OpenID there is starting to be a proliferation of IDs)?
  • The directory or “friend discovery” problem – how do you find other people in the social graph (a “People’s Guidestar”)?
  • The addressing problem – how can data be addressed in a consistent manner across distributed locations?
  • The user privacy and control problem (also called the “fear” or “surprise” problem) – how can users not be spooked by the idea of their social graph data “getting loose”; how can they maintain control over portable social graph data?
  • The granular access control problem – how can control be easily brought down to the individual attribute level, e.g., date of birth?
  • The regulation problem – how can social graph portability be accomplished within the bounds of data sharing regulations that currently do not permit certain types of personal data to be shared across certain jurisdictions?
  • The safety problem – how can portable social graphs not be subject to the same spam, phishing, and phraud problems as email and the Web?
  • The political problem – how can we make it “politically necessary” for sites and applications to offer social network graph export?
  • The “friend description problem” – how can we have an interoperable means of providing richer description of “friend” relationships?
  • The calendar sharing problem – of all the different types of social graph data, how specifically can we reach alignment over sharing of calendar data?
  • The adoption problem – what are the compelling uses of social graph portability that will drive large-scale adoption?
  • The internationalization problem – how can attribute sharing work across all world languages?
  • The user experience problem – how can social graph sharing operations be made simple and understandable to everyday Web users?
  • The operational problem – how will large-scale data sharing affect network loads, caching, firewalls, security perimeters, etc.?
  • The “invitation fatigue” problem – how can we stop being overwhelmed by yet another source of messages and “click-to-accept” links?

Second, this is the list of solutions being offered at the DSS:

  • An OpenID interoperability testing service (Marc Canter)
  • A new open source project & community for social data portability using Higgins and Higgins context providers.
  • A community dictionary service for schema mapping (Markus Sabadello, Drummond Reed, Paul Trevithick)
  • Different companies offering the potential to have open APIs for sharing their social graph data (AOL/AIM, Yahoo, Google, Cyworld).
  • OpenID-based attribute exchange (Dick Hardt & Sxip)
  • An open API format for social network portability and sync’ing (Brad Fitzpatrick and David Recordon)
  • A social network export service (Upscoop from Rapleaf)

Third, here are the demos that were shown before lunch:

  • Cloudtripper: Paul Trevithick and Markus Sabadello showed how Higgins in conjunction with Higgins context providers (code chunks that know how to talk to specific data sources) can be used to pull a user’s social graph data together directly to their own desktop client.
  • Community Dictionary Service (CDS): Markus Sabadello and I demo’d a new service contributed to the Identity Schemas Working Group at Identity Commons. Intended to help solve the schema mapping problem for highly distributed data sharing, the CDS is a “Wikipedia for machines” – a way for applications to discover and map elements from different data schemas. (I’ll blog a bunch more about this after the Summit is over, but please do see it for yourself.)
  • FOAF crawler: David Recordon (now back at Six Apart) showed a service that crawls public FOAF, XFN, or other relationship metadata to produce aggregated social graphs.
  • Pownce: Leah Culver demo’d a social network aggregation service that lets users aggregate their own social graph.
  • XRI-based data sharing: Mike Mell showed an implementation of a data sharing solution based on XRI structured identifiers for La Leche League International.

The Golden Spike Meeting of Higgins and XDI

Wednesday, January 17th, 2007

May 3, 2006, mid-afternoon. The second Internet Identity Workshop had just wrapped up. It was so thick with sessions and discussions that Paul Trevithick, Andy Dale, and I just kept passing each other in the halls saying, “We need to talk!” but never having the time to actually do it.

We finally agreed to meet in one of the conference rooms after the main event was over. We migrated to a whiteboard and started drawing pictures to help us answer the key questions that kept coming up over the past two days, “How exactly are Higgins and XDI different? What does Higgins do that XDI doesn’t and vice versa?”

There was great irony in this. Besides heading the Higgins project, Paul is a member of the XDI Technical Committee (TC) at OASIS. Besides being the leading implementer of XDI, Andy has been a member of the Higgins project. And I’ve been working with both of them for several years now. Yet still none of us had a really good answer to this question.

As we kept drawing and redrawing the diagrams on the whiteboard, wrestling with how things lined up, I noticed Kaliya, Doc Searls, Phil Windley (collectively the organizers of IIW), plus several other late-stayers had joined the room and were happily monitoring our progress. They were as interested in the outcome as we were!

I still remember the late afternoon sun streaming in through the second-story window of the Computer History Museum as I pondered that whiteboard. All three of us had the unsettling feeling that there was much more to this story than we were able to divine off this particular diagram at this particular time. And then poof, our fifteen minutes was up and we all had to split for our respective trains, planes, and automobiles.

But the question was NOT answered, and it kept gnawing at the three of us. It was still there at Digital ID World in September, only now it was starting to surface in another direction: the relationship of OpenID 2.0 (which supports XRI) and Higgins. Could Higgins support both OpenID authentication and CardSpace authentication of the same digital subject? If so, was a URL or an XRI the common identifier of the subject? And how would this relate to attribute sharing?

Again the three of us swore we needed to get in a room together to get to the bottom of this and finally answer these questions – for ourselves, and for everyone else that was asking. We even knew of at least one context where we might get that opportunity – a new project from Paul Hawken’s Natural Capital Institute called WISER (World Index for Social and Environmental Responsibility) that will provide an indexing and data sharing platform for the entire international NGO/civil society sector. It looked like an effort on which we could all collaborate.

Still it took until December for the WISER Commons project to gel to the point where we could finally schedule three days together last week to develop a recommended identity and data sharing architecture for WISER.

As planned, the first day we spent understanding the requirements of this groundbreaking project (about which I’ll blog more soon). This gave us just what we wanted: several flagship use cases against which we could compare the Higgins and XDI architectures in detail. The next morning we sequestered ourselves in front of a white board in a conference room at Andy’s offices. We took the first use case and started diagramming it. Step by step by step we worked through how it would be implemented using the Higgins framework and the XDI protocol. But this time, where before we had drawn big boxes and circles and arrows…we started drilling down. Blowing up each box into its subcomponents and drawing the next level of circles and arrows…and when we got stuck, drilling down to yet another level below that.

As expected, there was a boatload of terminology frustration on both sides. Higgins uses “context”, “contextref”, “digital subject”, “subjectref”, and “context-unique ID” or “CUID”. XDI uses “authority”, “type”, “instance”, “i-name”, “i-number”, and “cross-reference”. But as we slowly peeled the onions, we began recognizing intersection points from which we could start mapping the terms.

For example, we knew going into it that both Higgins and XDI were based on schema-independent, context-independent data models, and those models are fundamentally based on RDF subject/predicate/object graphs. But it wasn’t until we peeled the onion all the way down to these core data models and started drawing the RDF graphs that we found ourselves not only on solid ground…

…but common ground. Acres of it. Whole continents of it. In fact, as we used to say at the gold dredging operation where I worked in Alaska, we hit bedrock – and that bedrock extended all the way under the mountain range.

Suddenly for the first time we were no longer looking at each other as “the other way of doing it”. Instead we saw we were both on the same side, building fundamentally the same thing: an interoperable way of sharing data between any two systems and applications.

The next morning when we reconvened with the WISER Commons team we hit upon the perfect analogy: it was exactly like the transcontinental railway projects in the 1800s. Higgins was building from East to West (Paul being from Boston), i.e., from the user-interface and application layer down towards the protocol layer (Paul coming from a background in page layout and desktop publishing). Andy and I and the rest of the XDI TC had been building from West to East (Andy being based in Berkeley and me in Seattle), i.e., from the protocol layer up towards the application and UI layer (Andy coming from the enterprise database and messaging world).

And although we had been building two entirely different railroads for moving data from coast to coast, suddenly here we were, meeting in the middle of the continent. And, to our mutual astonishment, finding that we were both using the same gauge tracks! In other words, with a little work, you could hook the two together and data would flow as smoothly up and down the Higgins/XDI stack and across Higgins/XDI-enabled systems as steam locomotives could move across the interconnected transcontinental railway system.

The secret was the gauge itself – RDF. We had both arrived at it as the common core model for data description. And although the railroads we have respectively built from it have many different features and can go different speeds and handle different types of passengers and freight in different ways, they are fundamentally interoperable.
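For readers who have not worked with RDF, a rough Python sketch may help show why a shared subject/predicate/object model makes two systems interoperable. The `Triple` type, the `merge` and `objects` helpers, and all of the identifiers below are hypothetical illustrations; they are not the Higgins or XDI data models, and real RDF uses URIs rather than the short strings used here.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement in a graph: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def merge(*graphs: set) -> set:
    """Union of triple sets; identical statements collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def objects(graph: set, subject: str, predicate: str) -> list:
    """All objects asserted for a subject and predicate, sorted for stable output."""
    return sorted(t.obj for t in graph if t.subject == subject and t.predicate == predicate)


# Hypothetical data: two independent systems describing the same person.
system_a = {
    Triple("person:alice", "name", "Alice"),
    Triple("person:alice", "email", "alice@example.com"),
}
system_b = {
    Triple("person:alice", "name", "Alice"),
    Triple("person:alice", "memberOf", "org:example"),
}

# Because both sides use the same statement shape, combining them is a plain set union.
combined = merge(system_a, system_b)
```

Neither system needs to know the other's schema in advance: the statement both sides share collapses into one, and the rest simply accumulate in the combined graph.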

So we dubbed this “The Golden Spike Meeting” of Higgins and XDI (Laurie Rae informs me that in Canada it was called “The Last Spike”, but there’s something more romantic about a golden spike). And hopefully it will represent as important a milestone in our progress towards an open interoperable data sharing layer for the net. At a minimum you can know that Paul and Andy and I are committed to bolting these trains together as quickly and efficiently as possible and showing for real how the data can just start moving.

All aboard!!

Jon Udell on the Dataweb

Friday, December 2nd, 2005

Doc Searls gave me a ping that Jon Udell was starting to write about the Dataweb. His article in InfoWorld is titled “The two way data web”, and it talks about how folks like Bill Gates and Adam Bosworth are hinting about using RSS/Atom in both the publish and subscribe direction.

None of this is with XRI or XDI yet (at least that I can see). But the concept of XML linking to XML everywhere forming a structured Web where the links can be active, self-describing, and self-governing is starting to catch. And as developers begin to move into that web-of-data mindset, they are going to find XRI (as an identifier syntax and resolution protocol) and XDI (as a dataweb document format and interchange protocol) to be the cat’s meow.

XRI, XDI, and Web Services

Sunday, October 2nd, 2005

I just returned from an inspiring meeting with a number of Identity Gang folks about the legal, social, and policy foundations of an identity metasystem. One of the most eye-catching presentations was Dick Hardt’s video of his OSCON talk on Identity 2.0. I think he does the best job yet of explaining what user-centric identity is really about: letting users, not systems/directories, control how they want to identify themselves and share data.

In his presentation, however, Dick hits one false note about XRI/XDI: he says “it’s not web services”. As Andy Dale notes in his Tao of XDI blog, XRI (as a syntax and resolution protocol for abstract identifiers) and XDI (as an XRI-based data sharing protocol) are both binding-independent: they can be bound to any transport protocol. Both are starting with HTTP bindings only as a simple matter of expedience – HTTP is ubiquitous and lightweight, so there’s no reason not to use it.

But given the roots of XRI and XDI in XML, a SOAP binding makes just as much sense (and is in fact what some implementors are already using). And, as Kim Cameron keeps reminding me, this is also what’s needed for XRI and XDI to fully integrate with the world of web services. In fact, in the modular WS-* architecture, XRI and XDI should fit well, because they offer nicely compartmentalized functionality: XML-based abstract resource identification and data sharing, respectively.
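Binding independence is easier to see in code than in prose. The Python sketch below is purely illustrative: the `handle`, `plain_binding`, `enveloped_binding`, and `send` names are hypothetical, JSON is only a stand-in payload (it is not the XDI format), and no real networking or SOAP library is involved. The point is only that the protocol logic never needs to know which binding carried the message.

```python
import json
from typing import Callable

# A "binding" here is just a function that carries bytes to a peer and returns the reply.
Binding = Callable[[bytes], bytes]


def handle(request: bytes) -> bytes:
    """Hypothetical protocol logic: knows nothing about how the bytes arrived."""
    message = json.loads(request.decode("utf-8"))
    reply = {"resource": message["resource"], "status": "ok"}
    return json.dumps(reply).encode("utf-8")


def plain_binding(payload: bytes) -> bytes:
    """Stands in for a plain HTTP-style binding: the payload is delivered as-is."""
    return handle(payload)


def enveloped_binding(payload: bytes) -> bytes:
    """Stands in for an envelope-style binding: wrap when sending, unwrap when receiving."""
    prefix, suffix = b"<envelope>", b"</envelope>"
    wire = prefix + payload + suffix              # what would travel over the wire
    inner = wire[len(prefix):len(wire) - len(suffix)]  # receiver strips the envelope
    return handle(inner)


def send(resource: str, binding: Binding) -> dict:
    """Build a request, push it through whichever binding was chosen, decode the reply."""
    request = json.dumps({"resource": resource}).encode("utf-8")
    return json.loads(binding(request).decode("utf-8"))
```

Calling `send("=example", plain_binding)` and `send("=example", enveloped_binding)` returns the same reply, which is the whole idea: the binding is a swappable layer beneath the protocol, not part of it.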

Net net: XRI and XDI are neither POX- nor SOAP-centric. They are resource-centric (that’s “resource” as used in Uniform Resource Identifier: anything that can be identified). (I could go on at great length about how “user-centric” is really “resource-centric”, but that’s another post.) And as resource-centric services, they are ideal for service-oriented architectures (SOAs) of any kind, including web services.

XDI and Homeland Security

Thursday, May 19th, 2005

I haven’t blogged much about XDI yet, in part because it’s not as far along in the standardization process at OASIS as XRI. This will start changing as the first early XDI data sharing applications start surfacing (hint hint). In the meantime, U.S. homeland security blogger W. David Stephenson has recognized its relevance to that topic. In an email he asks:

I thought you might be interested in my “smart mobs for homeland security” concept, and I’d be very interested in your thoughts on how XDI might help make it a reality.

Since one of the prime directives of DHS is information sharing and collaboration, the most obvious relevance of XDI is as an open standard protocol for implementing trusted data sharing at the massive scale DHS requires (linking all security agencies/personnel at the international, federal, state, and local levels). Frankly I don’t know anything other than Dataweb architecture that can accomplish this (but that may just be my XRI/XDI blinders on).

However as David points out in “smart mobs for homeland security”, to leave the public out of this information sharing loop would be as shortsighted as leaving citizens out of local law enforcement. So the simple answer to David’s question is that XDI can help create a flow of authenticated information back and forth between citizens and government agencies that enables the defense mechanisms of the “whole organism”.

I won’t dive into technical details here, referring folks instead to The Social Web paper that made David aware of XDI. But I very much agree with his point that technologies like XDI can build on other networking/communications tools to make us all much more effective participants in our own security.
