Craig Burton

Logs, Links, Life and Lexicon: and Code


Resolving the Data Dilemma



Data Dilemma—Opening for Context Automation

Introduction

One of the interesting things I have learned while making the Context Automation Tutorial series is how to start identifying opportunities for Context Automation Applications.

The Golden Triangle

At Impact, I introduced the golden triangle of context automation. The golden triangle was made up of three key components: the identity selector, the Kynetx Rules Engine (KRE), and the data. Since then, the triangle has been generalized. The following outlines the changes.

  1. The Selector has become the Context Selector—A context selector does not have to be an information card selector. It can be any one of the devices used on the client to manage rules and their relationship to data.
  2. KRE has become the Context Provider—A context provider does not have to be the Kynetx Rules Engine; it can be any cloud-based service capable of executing client-side directives to enable context automation.
  3. The data has become the Context Information Source—The context information source is a consistent way of saying the data.

Each of these golden elements needs to move forward. However, this discussion is about the Context Information Source: the data.

Before defining M&M’s (mismatches and mashups), I want to cover how the term came into being. While creating some tutorials about Kynetx and how to use data, I started noticing what it is about data that calls forth opportunity with context automation. Just as web sites are silos that cry for context to be automated between them, data information sources are silos. And how are they silos? Good thing you asked.

Data is a silo when its naming, schema, or data typing differs from the naming, schema, or data typing of another relevant data source.

Hmmmmm. That pretty much sums up ALL data sources.

For now, this troublesome data threesome—data naming, schema, and typing—will be referred to as the “data dilemma.”
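To make the troublesome threesome concrete, here is a minimal sketch (in Python, with entirely hypothetical records) of two sources describing the same contact while disagreeing on naming, schema, and typing all at once:

```python
# A hypothetical illustration of the "data dilemma": two sources describing
# the same person with mismatched naming, schema, and data typing.

source_a = {
    # Flat schema; phone stored as a single formatted string.
    "FullName": "Ada Lovelace",
    "Phone": "555-0100",
}

source_b = {
    # Nested schema; different field names; phones stored as a list of digits.
    "name": {"first": "Ada", "last": "Lovelace"},
    "phones": ["5550100"],
}

# Same person, yet no field name, structure, or type lines up directly.
# A human sees the match instantly; a program sees two silos.
assert source_a["FullName"].split()[0] == source_b["name"]["first"]
```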

Resolving the Dilemma

Kim Cameron, while speaking at Impact, told the legendary story about the issue of agreement on a very specific data dilemma. He found himself in a meeting amongst key email vendors arguing over the names for data fields in a contact record. This was a complicated and politically charged issue. Complicated not because of technology but because of personalities and perceptions. Not only was vendor turf involved, but countries, governments, and so on.

He didn’t tell you this, but Kim actually took charge and had each vendor commit to not leaving the room until all of the parties present could agree on a naming and schema for contacts. The meeting went on for 18 or 20 hours. Names were called; people stormed out of the room. Some returned, some didn’t. It was ugly. The result is now the vCard specification.

This all happened in 1996. We STILL do not have common agreement on how vCards are used. For example, the Mac OS can export all of its contacts in one vCard file. (It doesn’t use vCard syntax natively.) Outlook, on the other hand, only deals with one contact at a time per vCard. It gets even messier. I won’t go into it.
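To see how shallow the gap is and yet how real, here is a sketch (in Python, using a hypothetical two-contact export; real exports carry many more properties) of bridging a multi-contact vCard file into the one-contact-per-card form a consumer like Outlook expects:

```python
# Sketch: split a multi-contact vCard export into individual cards.
# The sample data below is hypothetical and minimal.

combined = """BEGIN:VCARD
VERSION:3.0
FN:Ada Lovelace
END:VCARD
BEGIN:VCARD
VERSION:3.0
FN:Alan Turing
END:VCARD"""

def split_vcards(text):
    """Return each BEGIN:VCARD...END:VCARD block as its own string."""
    cards, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "END:VCARD":
            cards.append("\n".join(current))
            current = []
    return cards

cards = split_vcards(combined)
# cards now holds two standalone vCards, one contact each.
```

The point is not that the fix is hard; it is that every pair of vendors needs some glue like this, which is exactly the kind of background work context automation should absorb.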

This is one of the few success stories of the industry coming to agreement on naming and schema. It will probably not happen again soon. Even so, not every email vendor has adopted it. The Google contact database does not match the Yahoo contact database, and so on. On goes the data dilemma. And here I predict it will never be resolved.

In Lieu of Agreement—XML and JSON

Since then, the industry turned to using a markup specification in an attempt to self-describe naming, schema, and typing. This effort is known as the Extensible Markup Language, or XML. XML has come a long way in helping systems interoperate with the ability to “discover” how to address the data dilemma.

Note: XML and its large family of companion specifications do NOT resolve the data dilemma. Why? The reason is the same as always: complexity and politics. The definers of XML are arguing schema and naming amongst themselves. With XML, the names of the XML definition languages are still in discussion and remain ambiguous.

Further, XML is a document-centric specification, not a data-centric one. This document does not go into the distinctions between document-centrism and data-centrism. That will be saved for later.

Enter JSON (JavaScript Object Notation). JSON is yet another structured notation specific to defining data structures, schema, typing, and naming. (Interestingly, JSON, despite its name, doesn’t require JavaScript, and there are no real “objects” in JSON.)
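As a quick illustration of that parenthetical, a JSON record can be consumed in any language. Python, for instance, parses it into plain dictionaries and native types with no JavaScript and no objects in sight (the contact record below is hypothetical):

```python
import json

# A hypothetical contact record expressed in JSON.
record = json.loads(
    '{"name": "Ada Lovelace", "emails": ["ada@example.com"], "active": true}'
)

# JSON's types map straight onto native Python types: the result is a
# plain dict holding a str, a list, and a bool, not an "object".
```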

JSON is usually lumped in with AJAX (Asynchronous JavaScript and XML). (This is a funny note about naming: all the benefits of AJAX can be derived in a program without JavaScript, XML, or being asynchronous. Go figure.)

For now, KRL is JSON-centric and cannot interpret an XML-based data source. This is about to change, and that is good. An example of why this is good, and of its benefits, is also for another document.

Now that we have taken a brief look at the data dilemma, let’s look at how it applies to Kynetx.

Mining for Gold

You would think that opportunities for context automation would be abundant and that they would just start flowing from our fledgling community. Here is the problem: our minds are so used to dealing with context in the background that we have a very hard time bringing it to the foreground.

Programming context is not a metaphor we yet live by.

Delving into the definition of M&M’s is the exercise of bringing forth that metaphor.

Definition of Mismatches and Mashups

Mismatch—a mismatch of data occurs when a naming, schema, or data type from one source doesn’t match the naming, schema, or data type of another source. A rule that resolves a mismatch aligns the data for use.

I know this needs an example. My example is going to be one that is not about the Internet. I do this because I want to stretch the thinking about what your brain is doing to automatically deal with mismatches all of the time.

I was watching an English movie the other day about young kids in England who were struggling with poverty and the temptation to take their anger out on foreigners for taking their jobs. (Sound familiar?) I found myself really struggling to follow the story line. I got the big picture, that was easy, but the details were really tough. The reason it was so tough was specific words. The problem was word pronunciation and word meaning. The mismatch was subtle; they were speaking English. It’s just that, because of the mismatch, I had only an idea of what they were talking about. As a result, I would have to watch the movie several times and probably research or ask somebody what certain words or phrases meant.

This is a non-Internet example of a mismatch that needs to be handled in the background. In a location-centric web experience, the victim is left to connect the dots in the background of a seemingly sequential stream of events.
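Back on the Internet side, a rule that resolves a mismatch can be sketched very simply. The sketch below is in Python rather than KRL, and every field name and record in it is hypothetical; the point is only the shape of the operation, mapping one source's names onto the names a second source expects:

```python
# A minimal mismatch-resolving "rule": rename one source's field names to
# the target schema's names so the two data sets can be aligned for use.
# FIELD_MAP and the records are hypothetical examples.

FIELD_MAP = {"DisplayName": "name", "EmailAddress": "email"}

def align(record, field_map=FIELD_MAP):
    """Rename mismatched field names to the target schema's names."""
    return {field_map.get(key, key): value for key, value in record.items()}

source_record = {"DisplayName": "Ada Lovelace", "EmailAddress": "ada@example.com"}
aligned = align(source_record)
# aligned now uses the target naming and can be combined with data
# from the second source.
```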

Data alignment doesn’t automatically result in an interesting rule. The question still has to be answered: “What is your target audience ‘up to’ that can benefit from a data matching?” (Remember that “what are you ‘up to’” is a purpose-centric question, in distinct contrast to the internet metaphor we know now, “where are you ‘going to go’”, which is of course a location-centric question.)

The point here is that recognizing a mismatch and considering resolving the mismatch is one method of mining for a rule opportunity.

Mashup—a mashup of data occurs when a common data set from two or more data sources is combined. Most of the current rules are mashups of some sort. The AAA rule takes the data from the query “Who offers AAA discounts?” and any Google query you can come up with and produces a result from that mashup.
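In the spirit of the AAA rule, here is a sketch (in Python, with hypothetical discount and search-result data) of a mashup that joins two sources on a shared element, the business name:

```python
# A minimal data mashup: annotate results from one source with matching
# data from a second source, joined on a common key. All data is
# hypothetical.

aaa_discounts = {"Hotel Alpha": "10%", "Diner Beta": "5%"}

search_results = [
    {"name": "Hotel Alpha", "rating": 4.5},
    {"name": "Cafe Gamma", "rating": 3.9},
]

def mashup(results, discounts):
    """Attach any matching AAA discount to each search result."""
    return [
        {**result, "aaa_discount": discounts.get(result["name"])}
        for result in results
    ]

combined = mashup(search_results, aaa_discounts)
# Results with a matching discount carry it; the rest carry None.
```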

I posit that we haven’t begun to understand the possibilities of mashups.

In similar fashion as with mismatches, a mashup doesn’t always result in an interesting rule.

A rule wrangler still must answer the question “what is my target audience ‘up to’ that can benefit from this mashup?”

Again the point here is that recognizing a mashup requirement and using that mashup is one way to identify a cool rule opening.

A cool mashup example is shown here with the beta code developed by Familylink.

Mashup of Familylink and Facebook

Does it end with M&M’s?

No.

There are probably other data dilemma considerations that can be used to identify a cool rule opening.

M&M’s is just a tricky term to bring your attention to the process of synthesizing context automation. This document introduces the practice of bringing the process of leveraging M&M’s into conscious thinking and turning that thinking into cool rules.

Summary

I wanted to document how to bring this thinking to the forefront.

When it catches on, KaBOOM, watch out! Spontaneous thermonuclear explosion.
