By Joel Amoussou, special to TheContentWrangler.com
The subject of interoperability between S1000D and the Darwin Information Typing Architecture (DITA) has received significant attention within the technical documentation community recently. This article discusses the following issues:
- Should we create DITA specializations for S1000D data modules?
- How can we facilitate interoperability between DITA and S1000D, for example to enable round-trip transforms?
- Is the DITA specialization mechanism the best way to make S1000D extensible?
- How can users leverage the strengths of both DITA and S1000D without introducing complexity?
In general, our approach to designing technical solutions is based on the following principles:
- Always use the right tool for the job at hand
- Do not reinvent the wheel
- Adhere to best practices in modern information design
Let’s look at the issue from the perspective of two important stakeholders: the XML Vocabulary User and the Information Architect.
The Right Tool for the Job at Hand
You are a technical writer, content publisher, or a publications manager. You work for an aircraft manufacturer. Your mandate is to produce maintenance and operation documentation for a new aircraft project. Should you use S1000D, DITA, or both?
Use S1000D to document the airframe, engine, components, and support equipment. Do not try to create DITA specializations for these subjects. Like DITA, S1000D supports the following concepts:
- Topic-driven information design in the form of data modules
- Extensive metadata facility based on the Dublin Core standard and the Identification and Status (IDSTATUS) section of the data module
- Information maps defined in publication modules
- Content filtering or personalization based on criteria such as skill level, security classification, product applicability, and configuration
- Content reuse through a Common Source Database (CSDB)
The design of S1000D is also well informed by over a century of efforts to document aviation and military hardware. The only significant difference between S1000D and DITA from the markup language design perspective is that S1000D lacks a specialization mechanism (more on that later).
If you also need to document the user interface (UI) and application programming interface (API) of on-board software applications, use DITA. Note that this is subject to contract data requirements, particularly for defense contracts, which sometimes dictate the XML vocabulary to be used for documentation deliverables. DITA already has specializations for the software, user interface, and programming domains. This by no means implies that DITA is good for software documentation only.
Although the S1000D specification provides a Standard Numbering System (SNS) for software projects (see chapter 8.3.8 of the S1000D specification), S1000D is not the most elegant solution for documenting software products. If you think that DITA alone is not enough to support the documentation needs of the on-board software, consider creating a DITA specialization that uses S1000D element names and semantics. This will greatly simplify round-trip transforms should they become necessary.
Your S1000D publication modules and data modules can link to DITA maps and topics using the element. DITA should also support the ability to link from DITA topics and maps to S1000D data modules and publication modules.
Another idea floating around is to use DITA to create chunks of reusable data that can be assembled into S1000D data modules as well as other forms of content, such as SCORM-compliant learning objects. While this technique could work in certain situations, it should be approached with caution. Both S1000D and DITA can be intellectually challenging to deploy, and authoring S1000D data modules by assembling DITA chunks could add an unnecessary layer of complexity to your project.
Use S1000D for domains that are already well served by the specification, such as aerospace, defense, automotive, oil and gas, power generation, and the machine industry in general.
Best Practices in Information Design
You are an Information Architect. Your job is to research, develop, and implement best practices in information design. These best practices should be selected in light of new W3C recommendations such as XPath 2.0, XSLT 2.0, and XQuery. Some of the most recognized best practices in information design include object-oriented concepts such as inheritance and polymorphism, which are well supported by the XML Schema language.
Document designers should also recognize that it is impossible to create an XML vocabulary that satisfies all current and future requirements of its users. The vocabulary should provide for controlled extensibility so that any party can add the extensions it needs.
The main strength of DITA is its extensibility. One of the benefits of DITA specialization is that it allows the reuse of processing code (e.g. XSLT stylesheets) across specializations through a fallback mechanism to base types. The lack of specialization in S1000D is cited as one of its main drawbacks. However, the combination of XML Schema, XPath 2.0, and XSLT 2.0 can provide a very robust specialization mechanism for S1000D.
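The fallback idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from any DITA toolkit: the `dmConcept` element and the handler table are hypothetical, while the layout of the `class` attribute (a leading `-` or `+`, then module/element tokens running from most general to most specific) follows the DITA convention. A processor that knows nothing about the specialized element reads the tokens from right to left and uses the first one it has a rule for.

```python
import xml.etree.ElementTree as ET

# Hypothetical specialization: <dmConcept> is derived from the DITA concept
# topic and is NOT a real DITA or S1000D element.
SAMPLE = """
<dmConcept class="- topic/topic concept/concept dmConcept/dmConcept ">
  <title class="- topic/title ">Hydraulic pump</title>
  <conbody class="- topic/body concept/conbody ">Pump description.</conbody>
</dmConcept>
"""

# Processing rules exist only for base types (hypothetical rule names).
HANDLERS = {
    "topic/topic": "generic topic rule",
    "topic/title": "title rule",
    "topic/body": "body rule",
    "concept/concept": "concept rule",
}


def class_tokens(element):
    """Return the module/element tokens, most specific first."""
    tokens = element.get("class", "").split()[1:]  # drop the "-" or "+" marker
    return list(reversed(tokens))


def resolve_handler(element):
    """Return the most specific token that has a rule, or None."""
    for token in class_tokens(element):
        if token in HANDLERS:
            return token
    return None


root = ET.fromstring(SAMPLE)
for element in root.iter():
    print(element.tag, "->", resolve_handler(element))
```

Here the unknown `dmConcept` element is processed with the rule written for `concept/concept`, and `conbody`, which has no rule of its own, falls back to `topic/body`. No stylesheet change is needed to get usable output for the new element.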
DITA Specialization Mechanism Revisited
Extensibility can be achieved in XML Schema using a variety of techniques including wildcards, element substitution, and subtype polymorphism via extension or restriction of a base type.
Despite the criticism it has attracted, XML Schema has one clear benefit as a modeling language for XML vocabularies (as opposed to DTDs or even RELAX NG): its type system is used by most of the current W3C recommendations, such as XForms, XPath 2.0, XSLT 2.0, XQuery 1.0, and WSDL. James Clark, the brain behind some of the most important innovations in SGML and XML, asked in a recent blog post, “Why do we still have cruft like processing instructions and DTDs?” One problem with the DITA specialization mechanism is that it is based on DTD syntax.
The adoption of a specialization mechanism based on XML Schema’s substitution and inheritance mechanism could require a refactoring of both DITA and S1000D content models. However, the combination of XML Schema and schema-aware XSLT 2.0 can provide a simple and robust alternative to DITA’s complex and elaborate scheme for achieving the benefits of specialization with DTDs and XSLT 1.0.
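To make the alternative concrete, here is a minimal Python sketch of what type-based dispatch provides. In schema-aware XSLT 2.0, a template that matches `element(*, SomeType)` also applies to elements whose schema type is derived from `SomeType`. The sketch imitates that lookup with plain dictionaries. The type names (`acme:TorqueStepType` and the others) and the rule name are invented for illustration and are not taken from the S1000D or DITA schemas.

```python
# Hypothetical derivations, as xs:extension or xs:restriction would declare
# them in a schema: derived type -> base type.
BASE_OF = {
    "acme:TorqueStepType": "s1000d:ProceduralStepType",
    "s1000d:ProceduralStepType": "s1000d:StepType",
}

# Processing rules written once, against the base type only (hypothetical).
RULES = {
    "s1000d:StepType": "render-step",
}


def derivation_chain(type_name):
    """Return the type followed by its base types, nearest first."""
    chain = [type_name]
    while chain[-1] in BASE_OF:
        chain.append(BASE_OF[chain[-1]])
    return chain


def select_rule(type_name):
    """Return the rule for the nearest type in the chain that has one."""
    for candidate in derivation_chain(type_name):
        if candidate in RULES:
            return RULES[candidate]
    return None


print(derivation_chain("acme:TorqueStepType"))
print(select_rule("acme:TorqueStepType"))
```

An organization that adds `acme:TorqueStepType` by extension gets the existing step rule without touching the stylesheet, which is the same benefit DITA obtains through its `class` attribute. In real XSLT, import precedence and template priority decide which matching template applies, so the nearest-base-type choice used here is a simplification.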
S1000D should learn from DITA’s experience and success by providing a specialization mechanism that allows any party to add extensions that are needed to satisfy their unique requirements. An S1000D extensibility framework will allow organizations and communities of interest to adopt S1000D without “polluting” the core specification. This design pattern has been successfully adopted by the National Information Exchange Model (NIEM) sponsored by the US Department of Homeland Security and Department of Justice as the standard for data exchange between government agencies.
Reengineering S1000D into DITA?
Some have proposed the creation of an interoperability bridge between DITA and S1000D. This would be a massive undertaking for which it is hard to imagine a valid business case. The S1000D specification runs to 2588 pages and is likely to grow in version 3.0 and beyond. What is needed instead is a set of high-level guidelines for creating interoperable DITA specializations for S1000D. Users can then follow those guidelines to create specializations on an as-needed basis.
A Shared Specialization Framework for DITA and S1000D
The good thing that came out of the DITA vs. S1000D debate is the recognition that S1000D needs extensibility. A specialization framework based on modern technologies such as XSLT 2.0 and XML Schema could be shared by both DITA and S1000D. The framework could be developed under the auspices of OASIS as a standalone specification.
About the author
Joel Amoussou is the Founder and CEO of Efasoft, a company specializing in content management, enterprise portal, and enterprise learning solutions. Joel started his career as a Flight Engineer of Civil Aviation and later became an expert in markup technologies. His current focus is on improving the way organizations design and manage content through the use of emerging technologies such as XQuery, XSLT 2.0, XForms, ISO Schematron, and S1000D™.