When structured markup (SGML/XML) was invented, in the previous millennium, the elevator pitch was “Separate content from formatting.” In those barbaric times, burly men would go into the woods, kill trees, grind them into paste, roll out the paste in sheets, and press ink onto the sheets, as a means of distributing information. Content was static; all it had was appearance or “formatting,” and separating content from formatting was all that was needed to make content usable for different purposes.
When we killed trees to distribute information, separating content from formatting was sufficient.
In our enlightened times, information is more frequently distributed on the Web. Information on the Web not only has formatting, it also has behavior.
There are a number of ways in which content can have behavior:
- User-selectable behaviors, such as linking, selecting and sorting, and manipulation of the displayed content.
- Dynamic presentation behaviors, such as personalization and adaptive design.
- Backend synthesis behaviors, such as selecting, ordering, merging, and mashing up source content to use in different ways.
Unfortunately, the way we create and structure content today either does not support these behaviors, or supports only one form of the behavior hard coded by the author as they write. This limits the uses we can make of our content.
Just as it is desirable to separate content from formatting, it is also desirable to separate content from behavior so we can give our content different behavior in different contexts.
Separating content from formatting
Let’s review what separating content from formatting is all about.
In a traditional word processor or DTP application, writers write content and format it at the same time:
The problem with this approach is that it only supports one formatting of the content. If you want the content formatted differently, you have to go back into the authoring tool and change the formatting. If you deliver a lot of content to different places that require different formatting, this is uneconomical.
The solution is to separate writing and formatting into two distinct processes:
To enable formatting to occur later, of course, you have to add structure to the writing so that the formatting tools have something to work with.
Once writing and formatting are separated, it becomes possible to format the content differently without having to go back and edit the content source. You can write once, format many:
You can also format existing content in a new way later if requirements change — write now, format later:
That’s fine as long as formatting is all you have to worry about. But what if your content also has behavior?
Separating content from behavior
Unfortunately, the current approach to structured content has not changed much since the dead-tree days. Some tools separate content from formatting, but they don’t separate content from behavior. To the extent that they support content behavior at all, they do so by enabling the user to directly specify behavior in their source files. Apparently forgetting their founding principles, these systems separate content from formatting but embed behavior directly in the content.
This creates exactly the same problem as we had when content was not separated from formatting: if you want different behaviors in different places, you have to go back and change the behavior in the source.
The most obvious case of behavior that you might want to change is linking. Different resources may be available in different contexts, requiring different links, and you may have business reasons for applying a different linking strategy in different contexts. This becomes very difficult when the linking behavior is specified in the source content.
The solution to this is no different from the solution of the formatting problem: make writing content and assigning behavior to content into separate processes:
As the diagram shows, the behavior process generally fits in before the formatting process, since the behavioral elements of the text, or the text that results from behavior being applied to it, then needs to be formatted for display.
The means by which we separate behavior from content is exactly the same as that by which we separate content from formatting: at the writing stage we add structure to the content which can then be used to apply different behaviors. All that changes is that some additional structure is required to capture the data needed to drive various behaviors. (To be clear, this is not behavior-specific structure, which would not achieve separation at all — it is just content semantic structure that most systems did not bother to capture before.)
With that additional structure in place, you can assign different behaviors to content just as you can assign different formatting:
Some behaviors to consider
Let’s look at some of the types of behavior content can have online. To be clear, when I talk about content having behavior, I am not talking about videos, animations, or apps, though they are part of the total content behavior package as well. I am talking about textual content. Indeed, of all types of online content, textual content is actually the most flexible and dynamic and capable of the richest set of behaviors.
Here are some of the content behaviors that are now either common, emerging, or possible on the web. The list is certainly incomplete, but hopefully it gives some idea of the richness and depth of content behavior, and why we need to start separating content from behavior.
On the web, the boundaries of text flows are fluid. Though designers have often tried to fight against the ability of web content to flow — so that they could design web pages the way they are used to designing paper layouts — browsers and CSS have generally given the user the upper hand in these battles, allowing them to reflow the text by zooming or applying their own style sheets.
The advent of the mobile web has decisively tipped the scales in favor of flowable content, first with mobile browsers having the ability to focus on just the text portion of a page and to reflow the text to fit the phone screen, and, more recently, with the rise of responsive design. Content should no longer contain any expectation about how it will flow on the user’s device.
To separate flowing behavior from semantics, structured markup should avoid dimensionality in source content. This applies most obviously to tables. Tables should not be designed to work at a specific page width or height. In fact, serious consideration should be given to eliminating table markup from source content altogether.
[Tables do not always flow well into mobile device screens.]
Rather than tables, create data sets which can be presented in various ways: tables for paper, lookup widgets or apps for online. Create a structure for labeled lists, rather than using tables for this purpose, and consider treating any other types of table as graphics.
For a simple example, consider an icon list, something found in many technical manuals and commonly formatted as a table (shown in HTML for the sake of familiarity — most contemporary source formats do the same thing with slightly different syntax):
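<table>
<tr><td><img src="open.png"></td><td><p>Open a file.</p></td></tr>
<tr><td><img src="print.png"></td><td><p>Print the current file.</p></td></tr>
</table>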
The above simply describes a table. There is no reliable way for the processing software to do anything with it other than format it as a table. Even if it is flowed into a space too narrow to display the table (more likely with wider, more complex tables) there is no way to apply different behavior to the content to provide the same lookup functionality.
But if we mark the content up to directly express the relationship that the table was designed to convey through layout, we retain options about how to flow the content, and the ability to do a number of other things with it as well.
Here’s what this might look like:
<icon-list>
<icon name="open"><description>Open a file</description></icon>
<icon name="print"><description>Print the current file</description></icon>
</icon-list>
The structure makes no presumptions about presentation or flow. For paper presentation, it can be used to generate a table like the one above. For online viewing, it can be used to drive an alternative presentation, such as a gallery in which the icons are presented in a grid and tapping one brings up the description.
[A possible way of flowing an icon lookup onto a small screen.]
This type of structure enables other kinds of behaviors besides flowing. (This is the point of capturing semantic structure rather than behavior: the semantic structure will support many kinds of behavior.) For instance, by identifying the icon by name, rather than by pointing to an icon file, it preserves the opportunity to choose which icon graphic to show (many icons exist at multiple sizes for different uses). By preserving the fact that this structure is a list of icon descriptions, rather than simply a table, it preserves the opportunity to draw information from this structure wherever else an icon is mentioned in the content (or even in the application, where this same file could be used to show tooltips).
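As a minimal sketch of how one semantic structure can drive several presentations, the Python below parses a small icon list and renders it as a table, as a gallery, and as a lookup. The element and attribute names (`icon-list`, `icon`, `name`, `description`), the icon file naming scheme, and the renderers themselves are assumptions made for illustration, not part of any particular schema or tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical source markup; the element and attribute names are assumptions.
SOURCE = """
<icon-list>
  <icon name="open"><description>Open a file</description></icon>
  <icon name="print"><description>Print the current file</description></icon>
</icon-list>
"""


def load_icons(xml_text):
    """Return (icon name, description) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [(icon.get("name"), icon.findtext("description"))
            for icon in root.findall("icon")]


def icon_file(name, size):
    """Pick a graphic at publishing time; this naming scheme is assumed."""
    return f"icons/{name}-{size}.png"


def render_table(icons):
    """Paper-style presentation: one table row per icon."""
    rows = [f'<tr><td><img src="{icon_file(name, 16)}" alt="{name}"></td>'
            f"<td>{text}</td></tr>" for name, text in icons]
    return "<table>\n" + "\n".join(rows) + "\n</table>"


def render_gallery(icons):
    """Small-screen presentation: a grid of larger icons that carry their
    description as data (the tap handling is left to the page)."""
    cells = [f'<button data-description="{text}">'
             f'<img src="{icon_file(name, 64)}" alt="{name}"></button>'
             for name, text in icons]
    return '<div class="icon-grid">\n' + "\n".join(cells) + "\n</div>"


icons = load_icons(SOURCE)
print(render_table(icons))
print(render_gallery(icons))
print(dict(icons)["print"])  # the same data could feed a tooltip lookup
```

The source never mentions a table, a grid, or a file path, so each of those decisions can change without anyone editing the content.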
Streaming
In his blog post, Stop Publishing Web Pages, Anil Dash argues that people want content in streams rather than pages:
Most users on the web spend most of their time in apps. The most popular of those apps, like Facebook, Twitter, Gmail, Tumblr and others, are primarily focused on a single, simple stream that offers a river of news which users can easily scroll through, skim over, and click on to read in more depth… Users have decided they want streams, but most media companies are insisting on publishing more and more pages. And the systems which publish the web are designed to keep making pages, not to make customized streams.
Even if you think Dash is extreme, or premature, in his call to stop publishing web pages altogether, the point of structured writing is to future-proof your content so that it can be delivered in new ways as demand arises, without having to change the content itself. Thus even if your content isn’t going to be delivered in streams today, it should not have encoded into it the presumption that it will be delivered in pages.
Content links to other content. This is the behavior that defines the Web. Links are the threads that bind pages into a web, and into a World Wide Web. Some links exist as direct references to specific resources on the web, but most links are not actually about linking to a specific resource, but about linking to information on a particular subject. Ideally, a link should link to the best source of information available for the person who is reading it, at the time it is read, and guided by the interests of the company that is providing the content. Since all these factors are variable, the author should not be designating the linked resources at authoring time, any more than they should be designating the font size of the linked text.
To avoid hard coding links in our source content, we must instead record the subject that we are mentioning, without specifying a resource or a location to link to. For instance, if we mention the name of a function, rather than specifying a link to a particular API reference, we can mark it up like this:
<p>You can print data to the screen using the <function-name>printf()</function-name> function.</p>
Rather than hard coding a link to a specific API reference page, this markup simply records the fact that “printf()” is the name of a function. This allows us to later link this mention of printf() to any reference source (or task topic) that describes the printf() function.
For more on this technique, see More Links, Less Time.
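To make the deferred linking step concrete, here is a small Python sketch that resolves the same `<function-name>` mention against different link catalogs. The context names, the catalog contents, and the placeholder URLs are all assumptions for illustration, and the regular expression is a shortcut that a real pipeline would likely replace with a proper XML transform.

```python
import re

# Hypothetical link catalogs, one per delivery context; targets are placeholders.
CATALOGS = {
    "public-site": {"printf()": "https://example.com/api/printf"},
    "offline-help": {"printf()": "reference/printf.html"},
    "print": {},  # no link targets exist on paper
}

SOURCE = ("<p>You can print data to the screen using the "
          "<function-name>printf()</function-name> function.</p>")

MENTION = re.compile(r"<function-name>(.*?)</function-name>")


def apply_links(markup, context):
    """Replace each function mention with a link chosen for this context."""
    catalog = CATALOGS[context]

    def resolve(match):
        name = match.group(1)
        target = catalog.get(name)
        if target is None:
            return f"<code>{name}</code>"  # no resource here: format, don't link
        return f'<a href="{target}"><code>{name}</code></a>'

    return MENTION.sub(resolve, markup)


for context in CATALOGS:
    print(context, "->", apply_links(SOURCE, context))
```

Changing the linking strategy then means changing a catalog, not reopening every topic that mentions the function.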
Progressive disclosure is the behavior of presenting only a portion of the content to the reader and of progressively revealing more information in response to the reader’s actions. One of the more common places to see progressive disclosure is in Microsoft help systems, where the details of a procedure are often hidden behind a “more…” link.
Tom Johnson has recently written a blog post on progressive disclosure that raises a lot of interesting questions about the role of progressive disclosure in both navigation schemas and content itself. Progressive disclosure could be an important tool for enabling users to navigate large bodies of information.
Progressive disclosure can be as simple as the unfolding of a hierarchical table of contents or a hierarchical document, but the navigation of hierarchies has severe limits for information finding, and dynamic progressive disclosure based on multiple independent factors could be a much more powerful navigation tool.
That kind of progressive disclosure, however, cannot be written into the content by the author as they write. It has to be done based on structure and metadata in the content, acted on dynamically by the engine behind the presentation view. If we want to enable sophisticated progressive disclosure, we have to separate the content from the behavior and provide the structure necessary to drive it.
To accomplish this, we should label each piece of information in our content by type, so that we can then choose to initially hide and progressively reveal certain types of information. (Note that we should not simply assign the types we plan to use immediately; that is little better than just marking up folds directly. We should mark up types generally, so that the type information can be used for different kinds of progressive disclosure, and for other purposes.)
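A rough sketch of the idea in Python: the content blocks carry only a general information type, and a separate disclosure policy decides which types start out folded. The type names, the sample text, and the choice of the HTML `details` element as the folding mechanism are assumptions for illustration.

```python
from html import escape

# Hypothetical topic content: each block is labeled with a general
# information type, not with instructions about folding.
BLOCKS = [
    ("step", "Choose File > Print."),
    ("background", "Printing uses the page settings stored with the document."),
    ("step", "Click OK."),
    ("troubleshooting", "If nothing prints, check that a printer is selected."),
]


def render(blocks, collapsed_types):
    """Show every block, folding the types this presentation hides at first."""
    parts = []
    for info_type, text in blocks:
        paragraph = f"<p>{escape(text)}</p>"
        if info_type in collapsed_types:
            paragraph = (f"<details><summary>{escape(info_type)}</summary>"
                         f"{paragraph}</details>")
        parts.append(paragraph)
    return "\n".join(parts)


# Two disclosure policies applied to the same source.
quick_reference = render(BLOCKS, {"background", "troubleshooting"})
full_view = render(BLOCKS, set())
print(quick_reference)
```

Because the policy is just a set of types passed in at presentation time, it could equally be computed from the reader's role or history rather than fixed in advance.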
Selecting, ordering, and filtering
In Too Big to Know, David Weinberger suggests that readers are losing their taste for pre-filtered content.
We seem to be making the cultural choice—with our new infrastructure’s thumb heavily on the scale—to prefer to start with abundance rather than curation. Include it all. Filter it afterward.
We used to assume that it was the author’s job to select, order, and filter content for the reader. In a paper world, there was little choice: the reader had no easy access to the author’s sources and could not select, order, and filter for themselves. But today, while there still may be times when some would prefer an author to do all the work for them, more and more readers would rather have access to a broad selection of materials and select, order, and filter it for themselves.
This, after all, is what they are doing when they choose to do a Google search rather than take a trip to the library. We also see this same behavior on social networks such as Twitter and Facebook, where each reader chooses the content they will follow, and on shopping sites such as Amazon, or in news apps where readers create filters to show them the content they are interested in. Everywhere readers are saying, make everything available and let me select and order it for myself.
Of course, this requires content that is adequately structured and identified so that the filters that matter to people can be applied to it automatically. That is generally not the case for most of the content we produce today, even the structured content.
But this is not just about readers selecting, ordering, and filtering as they read, important as that may be. It is also about the ability of a company to select, order, and filter its own content for different uses.
We hear a lot about reusing content today, but most content reuse is done manually, with authors traversing collections of loosely structured content and collating reusable chunks by hand. In many cases those chunks are too small to be meaningful to the reader by themselves and can only be reused by being assembled by writers into larger pieces. Our reuse is focused on writers creating static publications, not on organizations dynamically selecting, ordering, and filtering content for their readers, or offering their readers the ability to select, order, and filter for themselves.
The key to enabling the dynamic selection, ordering, and filtering of content is to create content units that can be queried reliably. This means that they must adhere to their type (so that if you ask for information of a particular type, you will get information of that type), that they must be adequately identified, so that they can be selected reliably, and that they must function independently for the reader; they must, in other words, be Every Page is Page One topics.
A great deal of technical and commercial content is what we might call narrated data. That is, at heart the content is a set of data fields from some structured data source. Because presenting structured data fields is not very user friendly, especially if the reader is not familiar with the meaning and structure of the data, we turn the data into a narrative.
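The sketch below shows, under simple assumptions, what reliable querying of content units might look like in Python. The `Topic` fields, the sample topics, and the `select` helper are hypothetical; the point is only that typed, identified, self-contained units can be selected and ordered by a program rather than collated by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """A content unit with the metadata needed to query it (fields are assumed)."""
    id: str
    type: str
    product: str
    title: str


TOPICS = [
    Topic("t1", "task", "editor", "Printing a file"),
    Topic("t2", "concept", "editor", "How page layout works"),
    Topic("t3", "task", "viewer", "Opening a file"),
    Topic("t4", "task", "editor", "Opening a file"),
]


def select(topics, order_by="title", **criteria):
    """Filter topics on exact metadata matches, then order the result."""
    matches = [t for t in topics
               if all(getattr(t, field) == value
                      for field, value in criteria.items())]
    return sorted(matches, key=lambda t: getattr(t, order_by))


for topic in select(TOPICS, type="task", product="editor"):
    print(topic.id, topic.title)
```

The same function could sit behind a company's publishing build or behind a filter control offered directly to readers.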
We encounter narrated data all the time. Sports stories, corporate annual reports, and stock analyst reports are full of narrated data — scores, standings, stats, profit and loss, technical analysis, gainers and losers, etc., all interpreted into narrative. When Amazon tells you that “Customers who bought ‘Clever Polly and the Stupid Wolf’ also bought…” it is narrating data. Most technical references include narrated data.
The question is, who creates the narrative, a writer or a machine? The company Narrative Science has been in the news lately for its software that turns data into sports stories and annual reports. This has some wondering about its effect on technical communication jobs. I would suggest that rather than fearing it, automating the narrating of data is something technical writers should embrace and master.
The narrated data produced by Narrative Science is intended to pass a Turing test; that is, in their words, “… transform data into stories that are indistinguishable from those authored by people”. We do not always have to aim so high. No one doubts that Amazon’s narrated data is generated by a machine, but it is still effective at helping us understand useful data in a human way. Much of the reference material that we produce today is narrated data, and we could greatly enhance our productivity if we simply captured the data and did not write the narrative over and over again.
The second great virtue of this approach to narrated data is that by capturing the data itself, it makes it possible to narrate it in other ways. Reusing the underlying data to create different narrations is much more powerful than merely attempting to reuse the same narrative in different contexts.
Here the prescription for separating content from behavior is simple: store the data as data; have an algorithm produce the narrative.
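As a deliberately modest illustration (nowhere near the Turing-test ambitions mentioned above), the following Python stores a game result as data and narrates it two ways. The teams, scores, and wording rules are invented for the example.

```python
# Hypothetical game record; the teams and scores are invented for illustration.
GAME = {"home": "Herons", "away": "Otters", "home_score": 4, "away_score": 2}


def ranked(game):
    """Return ((team, score), (team, score)) with the higher score first."""
    sides = [(game["home"], game["home_score"]),
             (game["away"], game["away_score"])]
    return tuple(sorted(sides, key=lambda side: side[1], reverse=True))


def sentence(game):
    """Narrate the result as a sentence for a news-style summary."""
    (first, high), (second, low) = ranked(game)
    if high == low:
        return f"{first} and {second} tied {high}-{low}."
    verb = "edged" if high - low == 1 else "beat"
    return f"{first} {verb} {second} {high}-{low}."


def ticker(game):
    """Narrate the same data as a terse score line."""
    return (f'{game["away"]} {game["away_score"]} @ '
            f'{game["home"]} {game["home_score"]}')


print(sentence(GAME))
print(ticker(GAME))
```

Neither narration is stored anywhere; both are regenerated from the record, so correcting a score corrects every narration at once.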
Extracting and merging
One of the interesting properties of technical communication is that it frequently mixes narrated data and true narrative in the same article. A good example of this is an API reference, which includes narrated data about return types and arguments with a narrative description of the purpose of the function.
Narrated data shows up all over the place in technical communications. For instance, the navigation instructions for a GUI are narrated data. The location of every menu item, wizard, tab, dialog, and button is data. Yet writers routinely trace and record the path through the interface by hand over and over again. This is both expensive and error prone, both because the writer may make an error, and because the map may change during development without the writer being informed. Suppose instead that the map of the GUI was maintained as data and that writers simply identified the GUI element to be acted on and allowed the system to generate the navigation instructions.
There are all sorts of data in code which technical writers either narrate, or wish they could narrate but don’t because they don’t have the time. If they had tools to extract this information from the code, they could produce more and better reference material. However, to really make this approach work, you need a way to merge narrated data extracted from sources with true narrative created by writers.
Such systems exist. The most common examples are code documentation systems like JavaDoc or NaturalDocs which parse the function signatures of API functions and merge them with specially marked-up code comments to generate an API reference. The downside of these systems is that their markup tends to be unstructured and unsophisticated, and they don’t make their content easily accessible to other forms of linking and merging. Also, by forcing the narrative content into code comments, they tend to force all authoring to be done by the API programmer, with mixed results.
A more robust and general approach is to extract data from source and then merge it with authored content maintained separately in XML. You may have trouble trying to sell this idea to programmers for their API docs, but for many other kinds of reference material for which data can be extracted from source, or from other datasets, locally or across the web, this paradigm makes a lot of sense. Not only does it allow you to create, manage, and merge the narrative content without having to change how the source data is stored, it allows you to produce an XML output to which further behavior and formatting can then be applied.
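Here is one way the GUI-map idea might be sketched in Python. The map format (each element pointing to its parent), the control names, and the phrasing of the generated instruction are all assumptions; a real map would presumably be exported from the application itself.

```python
# Hypothetical GUI map: each element records its parent and what kind of
# control it is. In practice this might be exported from the application.
GUI_MAP = {
    "File": (None, "menu"),
    "Page Setup": ("File", "menu item"),
    "Margins": ("Page Setup", "tab"),
    "Left": ("Margins", "field"),
}


def path_to(element):
    """Walk up the parent links and return the path from the top level down."""
    path = []
    while element is not None:
        parent, kind = GUI_MAP[element]
        path.append((element, kind))
        element = parent
    return path[::-1]


def navigation_instructions(element):
    """Generate the route to an element the writer has only named."""
    *route, (name, kind) = path_to(element)
    if not route:
        return f"Go to the {name} {kind}."
    trail = " > ".join(step for step, _ in route)
    return f"Choose {trail}, then go to the {name} {kind}."


print(navigation_instructions("Left"))
```

If the Margins tab moves to a different dialog during development, only the map changes; every instruction that mentions the Left field is regenerated correctly.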
In this case, it is not the format of the extracted data we have to worry about, but the format of the XML data that will be merged with that extracted data. For this, two things are necessary:
- The format must include some unique identifier also found in the extracted data, so that you can merge the right narrative with the right data.
- The format must separate the narrative portions into distinct pieces so that they can be merged correctly with the elements of the extracted data.
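The two requirements above can be sketched in a few lines of Python. Here `EXTRACTED` stands in for whatever a code parser would produce, and the narrative markup (`function-notes`, `purpose`, `return-value`) and the output format (`api-reference`, `entry`) are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# Stand-in for data extracted from source code, keyed by function name.
EXTRACTED = {
    "printf": {"returns": "int", "args": ["const char *format", "..."]},
}

# Authored narrative kept in its own file (markup names are assumptions).
NARRATIVE = """
<function-notes>
  <function id="printf">
    <purpose>Writes formatted output to standard output.</purpose>
    <return-value>The number of characters written, or a negative value on error.</return-value>
  </function>
</function-notes>
"""


def merge(narrative_xml, extracted):
    """Join each authored entry to its extracted data on the shared id."""
    reference = ET.Element("api-reference")
    for notes in ET.fromstring(narrative_xml).findall("function"):
        name = notes.get("id")
        data = extracted[name]  # a KeyError flags narrative with no matching data
        entry = ET.SubElement(reference, "entry", id=name)
        signature = f'{data["returns"]} {name}({", ".join(data["args"])})'
        ET.SubElement(entry, "signature").text = signature
        ET.SubElement(entry, "purpose").text = notes.findtext("purpose")
        returns = ET.SubElement(entry, "returns", type=data["returns"])
        returns.text = notes.findtext("return-value")
    return reference


print(ET.tostring(merge(NARRATIVE, EXTRACTED), encoding="unicode"))
```

The `id` attribute satisfies the first requirement, and keeping `purpose` and `return-value` as separate elements satisfies the second: each narrative piece lands next to the extracted fact it explains.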
Mobile devices have brought zoomable interfaces into the mainstream, and the recent popularity of Prezi has introduced the world to zoomable content. It is not until you try Prezi that you realize that, through all of our previous experiments with information design, we have assumed that all information had to be presented at the same scale. Once you try zoomable content, you realize that we have all sorts of metaphors in content that relate to scale and zooming — the 50,000 foot level view, the drill down structure — but that we have always represented these relationships as moves sideways, or as folding and unfolding, rather than zooming in and out.
I have no sense yet of where zoomable content is going. It is certainly a form of progressive disclosure and one that seems, metaphorically at least, to be fruitful. But is it workable outside what Prezi is doing with it? It’s too new an idea, and one of the things I have realized from the Prezis I have seen is that many people are still instinctively panning sideways when the information relationship they are expressing is actually a zoom into greater detail. We don’t have our heads around this yet.
Will zoomable content become a big thing? I’m not sure. But zooming into content is definitely behavior, and we don’t want to end up writing all of our zooms by hand, the way you do in Prezi today. We want to separate the behavior of zooming from the writing of content.
Content can be embedded. Embedded, that is, in applications. Embedded help does not pop up on command to explain the interface; it is integrated into and integral to the interface. If that does not sound like a behavior, consider this: embedded content must respond to the user’s actions in an application — not merely to their specific requests for help, but to their actions as they go about their work. That is behavior, the behavior of responding to context, which is actually one of the more sophisticated behaviors in this list.
Substantial embedded help is still somewhat rare, especially outside of web-based applications, and where it exists, it seems largely to have been specifically programmed into the application, rather than having been integrated algorithmically from external content. But embedded help makes more and more sense today. On the web, separating content from function makes no sense, and on mobile, there is no room for the separation. Content needs to be embeddable as well as publishable, and that means it needs to be free of any presuppositions about whether it will be embedded or published, and semantically rich enough to respond correctly to context when it is embedded.
Increasingly, people are building applications by tying together multiple resources from around the Web into hybrid applications often dubbed mashups. One of the most familiar types of mashup is the overlaying of company data sets over Google Maps. For instance, I have an app on my phone that can show me where the next bus scheduled to arrive at a particular stop is currently located. It does this by mashing up data from Google Maps with current GPS coordinates from the bus fleet.
There is no reason for content not to be mashed up in similar ways. Indeed, this is already happening, since that same bus app also mashes in timetable content and route maps. By the very nature of mashups (they happen after the fact using published data), you cannot mash up your content at authoring time. The only way to make your content mashable is to make it accessible in ways that can be reliably queried by mashup applications.
One of the key things about supporting mashups, though, is that you are far more likely to be supporting access to your structured source than to your published HTML or PDFs. It is your structured source that has the structure that the mashup needs in order to extract the data it needs from the content.
One way to do this, certainly, is to provide content in chunks with metadata attached to the chunks that the mashup can query. The problem with this is that it only allows the mashup to access the chunk as a whole, not get inside and extract the relevant pieces. That can leave the mash … well … lumpy. To really support mashups, you want to structure content so that the content itself can be reliably queried like a database.
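To illustrate the difference between querying a chunk and querying inside it, the Python sketch below pulls individual facts out of a structured timetable topic using the limited XPath support in the standard library. The markup, stop identifiers, and times are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured source for a timetable topic.
TOPIC = """
<route route-number="7">
  <title>Route 7 weekday timetable</title>
  <stop id="main-st"><name>Main Street</name>
    <departure>08:05</departure><departure>08:35</departure></stop>
  <stop id="harbour-rd"><name>Harbour Road</name>
    <departure>08:12</departure></stop>
</route>
"""


def departures(topic_xml, stop_id):
    """Pull just the departure times for one stop out of the topic."""
    root = ET.fromstring(topic_xml)
    return [d.text for d in root.findall(f"./stop[@id='{stop_id}']/departure")]


def stop_names(topic_xml):
    """Map stop ids to display names, for example to label map markers."""
    root = ET.fromstring(topic_xml)
    return {stop.get("id"): stop.findtext("name")
            for stop in root.findall("stop")}


print(departures(TOPIC, "main-st"))
print(stop_names(TOPIC))
```

A mashup that only had chunk-level metadata could fetch the whole timetable topic; with structure inside the chunk, it can take the two departure times it needs and leave the rest.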
Supporting the separation of content from behavior
The main structured writing systems used in technical communications today, DocBook and DITA (Darwin Information Typing Architecture), do not have good support for separating content from behavior. Both embed linking information in the source rather than separating it. Neither captures sufficient semantic structure to support many of the behaviors discussed here.
This is not surprising when we remember that DocBook was developed before content behavior was an issue, and that DITA, which was originally created for online help (back in the days when online meant what off-line means today), comes from a time when content behavior was in its infancy. Neither was created for the behavior-rich content environment in which we now live and work. Naturally enough, neither supports the kind of separation we need between content and behavior.
It is time to begin rethinking structured writing for an age in which content has behavior.
What do you think?