DITA Evolution and the Effect on Content Management Systems

By Joe Gelb, Suite Solutions, special to TheContentWrangler

As the Darwin Information Typing Architecture (DITA) gains wider acceptance as an XML standard for technical documentation, there has been increased activity towards interoperability with other standards, such as S1000D and SCORM, specialization for uses in specific industries, and adaptation for different use cases. Noteworthy is the current effort to adapt DITA for monolithic business documents other than technical documents. This adaptation opens the possibility for techdoc groups to implement DITA without requiring the paradigm shift to topic-based authoring. While Content Management System (CMS) tools with strong Component Content Management (CCM) capability are important for managing topic-based documentation, CCM remains a niche market. Several top Enterprise Content Management (ECM) vendors have invested in native XML capabilities, and third-party solution providers have built CCM functionality on top of various ECM platforms, so it seems unlikely that ECM vendors will seek to purchase CCM technology to augment their own solutions.

Evolution of DITA

The DITA standard is gaining wide acceptance as a general XML DTD and architecture for technical documentation. As DITA continues to prove its staying power, and as tool vendors invest in supporting the standard, there has been increased activity towards interoperability with a number of other disciplines and their corresponding standards. Interoperability is a key indicator of the direction in which DITA is evolving.

First we will discuss ways in which DITA can be made interoperable with other types of content and related standards. The sections following will briefly discuss current activity underway towards interoperability.

Implementing Interoperability

Interoperability with other standards can be implemented in several ways.

  1. Development of specializations to the DITA DTD within the framework of the DITA standard.
  2. Development of plug-ins to the DITA Open Source Toolkit (DITA-OT) to provide custom publishing capabilities within the framework of the DITA-OT architecture.
  3. Evolution of other standards to provide interoperability with DITA.
  4. Development of commercial tools to provide proprietary interoperability programmatically, essentially extending usage of DITA with other related standards.
  5. Evolution of the DITA standard itself as a response to the need for interoperability.

The method chosen depends on several factors, including the maturity of the corresponding content standard and the degree of interoperability desired. It may also depend on who is driving the efforts. For example, if the drive is coming from within the DITA community, there would be a greater tendency to use DITA constructs and tools, such as specialization and DITA-OT plug-ins, to provide for the interoperability. If the drive is coming from end-user demands for an immediate solution, a vendor may develop a software solution and then find ways to fit it into the standard.
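To make the first method more concrete, here is a minimal sketch of why specialization keeps content interoperable with generic DITA tools: every DITA element carries a class attribute that lists its ancestry, so a processor that does not know a specialized element can fall back to the base type. The element names semiTopic and semiBody and the module name semi-d are invented for illustration and are not taken from any published specialization. In real DITA the class values are normally supplied as DTD defaults; they are written inline here only so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical specialized topic: <semiTopic>, <semiBody> and the "semi-d"
# module are invented names used only for this illustration.
SAMPLE = """
<semiTopic id="reg_map" class="- topic/topic semi-d/semiTopic ">
  <title class="- topic/title ">Register map</title>
  <semiBody class="- topic/body semi-d/semiBody ">
    <p class="- topic/p ">Describes the control registers.</p>
  </semiBody>
</semiTopic>
""".strip()


def ancestry(element):
    """Return (module, name) pairs from a DITA class attribute, base type first."""
    tokens = element.get("class", "").split()
    # The first token is the "-" or "+" prefix; the rest are module/name pairs.
    return [tuple(token.split("/", 1)) for token in tokens[1:]]


def base_name(element):
    """Name of the most general ancestor, which a generic processor can fall back to."""
    pairs = ancestry(element)
    return pairs[0][1] if pairs else element.tag


root = ET.fromstring(SAMPLE)
# Map each element in the sample to the base DITA element it would be treated as.
fallbacks = {el.tag: base_name(el) for el in root.iter()}
print(fallbacks)
```

In this sketch a tool that only understands base DITA would treat semiTopic as topic and semiBody as body, which is the property that lets industry specializations reuse existing publishing pipelines.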

Interoperability with S1000D

S1000D is an SGML/XML standard developed by the Aerospace and Defense Industries Association of Europe (ASD) for documenting military aircraft, and it has since been modified for use with land, sea, and commercial equipment. The standard is currently maintained by the Technical Publications Specification Maintenance Group (TPSMG), which includes representatives of ASD, ATA, AIA and other industry bodies from the countries implementing the standard.

S1000D is structured similarly to DITA in that documentation is managed in discrete modules, comparable to the DITA topic-based architecture. However, it has a much more highly developed metadata and classification system, and module types highly specialized for the military and aircraft industries. While specializing DITA to achieve the level of detail required by S1000D would be prohibitively difficult, there have been efforts in the S1000D community to find ways of using DITA for less complex documentation, such as for software and interfaces. It has also been suggested that DITA can be a stepping stone in a gradual process of migrating unstructured legacy content to S1000D.

Interoperability with Training Content

The primary standard currently being implemented for training content is SCORM. This standard is sponsored by the ADL (Advanced Distributed Learning) Initiative which was established under the auspices of the US Department of Defense, but it is rapidly being adopted for other technical training content.

The OASIS DITA Learning and Training Content Specialization Subcommittee has been working towards support for processing DITA learning content for delivery with standards-based learning, specifically SCORM and the Question and Test Interoperability (QTI) standard. In particular, DITA processing would be extended to support basic SCORM sequencing, interactions, and required SCORM LMS runtime behaviors via a DITA-OT plug-in.

In parallel to efforts coming from the DITA community, there are also efforts from the direction of …

Semiconductor Industry

The OASIS DITA Semiconductor Information Design Subcommittee is working to create DITA specializations for better information design and interoperability throughout the semiconductor industry. The subcommittee represents a community of interest within various semiconductor companies who believe there is value in developing such specializations for the industry.

Finance

As an example of DITA adaptation being driven by customer needs for a solution, the Financial Accounting Standards Board (FASB) migrated its U.S. Generally Accepted Accounting Principles (GAAP) standards to specialized DITA-based content. The GAAP standards are provided on the Accounting Standards Codification website as dynamically published content using an application built on DITA-OT.

Adapting DITA for Enterprise Business Documents

Perhaps the most noteworthy path of DITA evolution is the effort to adapt DITA for enterprise business documents that do not fall within the category of technical content. The OASIS DITA for Enterprise Business Documents Subcommittee has been organized to develop and recommend an enterprise business document meta-model, and to develop and recommend an approach to harmonize this meta-model with the DITA standard.

This effort will effectively open up a channel for using DITA globally for a wide corpus of business documents and a much wider potential for using DITA tools—open source such as DITA-OT as well as commercially developed authoring, publishing and management tools. On the other hand, since enterprise business documents are generally not developed or managed based on a topic architecture, this also opens the door for ECM systems to support DITA in a more limited way.

Subcommittee co-Chair Ann Rockley lists three main types of business documents:

  • Transactional Business Documents: data-oriented documents such as Purchase Orders, Bills of Lading, Drug Form Listings, etc.
  • Correspondent Business Documents: short documents that support a business process and usually augment other documents. Examples include Memos, Faxes, Letters, emails, etc.
  • Narrative Business Documents: longer, often multi-section documents that may contain mixed content types. Examples include various types of Study Reports, Contracts, RFP Responses, etc.

One of the biggest obstacles to DITA implementation, even for technical documentation, has not necessarily been technical, but rather the paradigm shift of working with discrete content topics as opposed to monolithic documents. A topic-based architecture often yields its own benefits. Still, given the possibility of using DITA for multi-section monolithic documents, it is our opinion that many organizations may opt to use DITA without migrating to a strict topic-based architecture and still reap other benefits of DITA: structured documents and reduced authoring costs, a platform for single-source multi-purpose publishing, and lower localization and production costs.

The logical result of this decision from a tools perspective is that purchasing a CMS which provides component content management (CCM) will no longer be necessary for successfully managing DITA content, but rather a classic ECM solution may be sufficient.

Effect of DITA on CMS

Over the past several years, CCM functionality has become more highly developed as part of CMS offerings geared towards technical publications. Because robust CCM is important to fully reap the benefits of documentation using the DITA topic-based architecture, the wider adoption of DITA means a growing market for CMS vendors that have robust CCM support.

Given the growing trend towards DITA implementation, a question to be asked is how ECM vendors will respond to the need for CCM. Since ECM systems are designed to manage individual unstructured documents of all types and sizes, they are by default ill-equipped to support robust CCM required for DITA topic-based documentation.

We will take a look at a sampling of key ECM vendors and attempt to identify a pattern with respect to their strategy concerning structured document support and CCM functionality. More specifically, we will attempt to determine whether ECM vendors are likely to acquire CMS technologies which have robust CCM functionality.

EMC Documentum

Documentum is one of the leading ECM solutions today, with strong functionality and a large market presence. In mid-2007, EMC acquired X-Hive, a powerful XML repository which serves as the backbone for some of the most robust and well-placed CCM CMS offerings. With this acquisition, EMC has the wherewithal to expand its support for structured documents and CCM in its Documentum platform.

Even before the acquisition of X-Hive, several professional services firms with wide experience with Documentum, such as Flatirons Solutions, developed solutions which provide strong CCM functionality on top of the Documentum platform.

Because of these two factors, the acquisition of an XML repository technology and the existence of CCM solutions based on the Documentum platform, it is our opinion that Documentum would not further expand its CCM offerings by purchasing other CCM CMS technology.

Oracle

In the opposite direction, Oracle, a company with significant native XML capability, purchased the Stellent ECM in late 2006. As Stellent had already implemented the IXIASOFT XML repository technology in its offerings, Oracle should be well placed to provide a wide-ranging document and component management platform without having to look further to outside CCM technology.

Interwoven

As far back as 2002, Interwoven recognized the need for robust CCM and announced its Team XML offering, which was to provide XML-based content management, content reuse and multi-channel delivery, all central features of today's high-end CCM CMS tools. As part of this offering, Interwoven teamed up with XyEnterprise, a leading CCM CMS vendor. However, Interwoven has stepped back from this drive and removed Team XML from its product line, and seems in no hurry now to provide CCM functionality otherwise. This case may be representative of the reality that CCM, however much it is being embraced in the techdoc and training fields, is nonetheless a niche market with relatively small market potential as compared with the growing business need for ECM.

Microsoft SharePoint

SharePoint is an ECM with relatively basic functionality compared with the offerings of the other large ECM vendors, but it enjoys a large market share. Because of its excellent interoperability with Microsoft Office products such as InfoPath and Word, and with the .NET development platform, several tool vendors in the techdoc space have developed solutions that extend SharePoint towards providing CCM. Examples are DITA Exchange and In.Vision Xpress Author.

SDL

A variation on this theme is the recent investment in the Trisoft InfoShare CMS product by SDL. Although some would look at this as an example of investment in CCM technology, the opposite seems to be the case. The relatively small investment seems to be directed at helping Trisoft improve its support for the SDL localization management platform, thus helping to drive more business to SDL through the sale of the Trisoft CMS product.

Conclusion

Although XML is being adopted more and more for documentation, CCM remains a niche market that does not seem to have gained the critical mass needed to prompt the large ECM vendors to provide robust native CCM solutions. They have been content to provide native XML functionality along with strong document management technology, or to let third-party tool vendors and professional services groups provide CCM solutions on top of their platforms.

With the nascent efforts to move DITA within reach of monolithic documentation as opposed to being solely within the purview of topic-based technical documentation groups, we are of the opinion that there will be even less reason for ECM vendors to invest in CCM technologies. Rather, we predict they will continue to solidify their native XML capabilities and leave it to third-party vendors and service groups to provide CCM solutions for the niche techdoc market.

About Joe Gelb

Joe Gelb is the founder and president of Suite Solutions. Joe has overseen the successful implementation of documentation conversion, single-source publishing and content management integration projects. He has designed and built numerous utilities and applications to aid in the development, management, localization and multi-purpose publishing of technical documentation. He has broad experience in topic-oriented content development, information architecture, and application of tools for successfully optimizing creation, management, localization and delivery of technical information. Joe has developed innovative strategies for information modeling designed specifically for technical documentation and training.

After serving as CTO of Live Linx, overseeing technology design, development and content management implementation for over 10 years, Joe established Suite Solutions. The firm is based on Joe’s belief that every organization deserves responsive and professional service, independent of software vendors. His customer-oriented approach results in pragmatic process and technology solutions that meet requirements, are based on accepted standards and best practices, and make good business sense.

About Suite Solutions

Suite Solutions offers a broad range of professional services to technical documentation groups of any size implementing DITA, S1000D, SCORM, IETM and other standards. Services include training, consulting, CMS and tool selection and implementation, XSL-FO style sheet and template development, and custom integration and support. Suite Solutions brings together experienced professionals with extensive knowledge of industry standards and best practices, making available a collective broad experience in XML-based documentation, information architecture, component-oriented content development, and application of tools for successful optimization of authoring, management, localization and multi-purpose publishing. Suite Solutions distinguishes itself with its responsive support, uncompromising professionalism, and ability to think out of the box to find creative solutions to the individual problems of each organization.

3 Responses to “DITA Evolution and the Effect on Content Management Systems”

  1. Ann Rockley, February 27, 2008 at 9:15 pm

    You provided a well articulated overview of the evolution of DITA and its relationship with Component Content Management. Unfortunately, I disagree with the conclusion that there will be less reason for ECM vendors to invest in CCM. I started to respond to the article via the comment but found I couldn’t say everything that I wanted to in a comment so I’ve blogged my response http://rockley.com/blog/?p=48#more-48.

  2. Susan Self, March 4, 2008 at 4:12 pm

    This first page of the article cannot be printed. I tried the print link and the File->Print Preview method, and both showed a blank first page.

  3. Fabrice Talbot, March 4, 2008 at 10:57 pm

    Just wanted to bring to your attention another company: SDL Tridion.

    They have a powerful XML-based CMS that has been used by documentation companies to manage their docs and translation (even though the product targets WCM).