DITA Metrics: Developing Cost Metrics

November 19, 2008 Blog 12 Comments

By Mark Lewis, special to The Content Wrangler

Table of Contents

Introduction

Cost Metrics Overview

Cost of a Project

Cost of a Topic

Cost of User Guides Without Topic Reuse

Cost of User Guides With Topic Reuse

Cost Comparison: User Guides With and Without Topic Reuse

Cost of a Reusable Master Topic

Cost of User Guides With Reusable Master Topics

Cost Comparison: Topic Reuse Versus Reusable Master Topics

Conclusion

About the author

Introduction

You’ve read all the papers (and attended all the webinars) on return on investment (ROI) for XML and you get it. You’ve already concluded that moving to the Darwin Information Typing Architecture (DITA) will likely save you tons of time and money. But management says, “Prove it!” This paper helps you determine the cost portion of the ROI calculation. What are my costs now? What will my new costs be with DITA? And what is the difference, that is, my savings? This white paper is the first in the DITA Metrics series, which will discuss cost metrics, reuse metrics, and a reuse strategy. It describes one model for calculating the cost of a DITA project. After doing some content analysis on your own documentation set, you can customize this cost model to suit your documentation project needs. In the end, you should be able to speak the financial language of managers and prove to them in dollar signs the value of moving to DITA.

To benefit from this article, you should have at least an intermediate-level understanding of DITA, including topic structure, elements, conrefs, child maps, and filtering/conditional processing.

For your convenience, we’ve provided a downloadable PDF of this article.

Cost Metrics Overview

The cost to develop content and the percentage of content that is reused are standard components in many ROI calculations. We need both values for a DITA project to determine DITA ROI.

This paper focuses on the cost of content creation and introduces various levels of reuse into the model. We’ll begin with a deep dive into the cost of creating DITA topics and then incorporate the cost of unique content, identical content and similar content.

Over the course of the series, we will discuss the following components of our cost model:

  • Cost of content creation (in this paper)
  • Cost of content analysis and inventory
  • Cost of review and project management
  • Cost of filtering
  • Cost of publishing
  • Cost of content maintenance
  • Cost of converting legacy content to DITA
  • Cost of translation

Cost of a Project

Let’s show how our model is used to determine the cost of creating user guides for three models of a fictitious personal digital assistant (PDA). PDA One has a base set of features. PDA Two has all the features of PDA One, plus several additional features. PDA Three has all the features of PDA One and PDA Two, plus several features unique to PDA Three. This documentation project contains lots of identical content and unique content.

The first version of our documentation project is relatively simple so that the associated cost model is also simple. Later, we’ll introduce more of the content reuse and conditional reuse features of DITA, and show how to incorporate these into the model. Gradually, we’ll increase the complexity of our project and our cost model.

Cost of a Topic

The first step is to design the content creation component of the model. Traditional cost metrics focus on the cost of a page [1]. Since pages are similar to topics, we will start our discussion by determining the cost of creating DITA topics. Costs are expressed in terms of content creator labor hours.

Table 1 shows the cost of creating Task and Concept topics. For simplicity, we exclude Reference and other topic types, but you can easily customize the model to include them if needed. The scope of these estimates is a topic and does not include time for project/publication level activities such as designing the document outline, user task analysis, project management, implementing context-sensitive help, testing, status meetings or design meetings.

Table 1 (Cost of a Topic)

image

image

image

*Includes time to learn the product through interviews or research.

** Screen shots/images: Includes time to create sample data that would be shown in screen shot, capture the image, convert the format, name the image using naming conventions, and store it in the repository.

*** The elements listed here are optional. For example, Concept topic > screen shot, or Concept topic > feature description.

Now we have an approximate cost range in hours for creating Task and Concept topics that we can use in our model.
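As a rough illustration of how a per-topic range like this can be built up, the sketch below sums low and high hour estimates across the elements of a topic. The element names and hour values are hypothetical placeholders chosen for the example; they are not the figures from Table 1, so substitute your own estimates.

```python
# Hypothetical per-element effort in labor hours, as (low, high) estimates.
# These are placeholder numbers, NOT the values from Table 1.
TASK_ELEMENTS = {
    "research": (1.0, 2.0),      # learning the product (assumed)
    "steps": (1.0, 3.0),         # writing the procedure (assumed)
    "screen_shot": (0.5, 1.0),   # capturing and storing an image (assumed)
}


def topic_cost_range(elements):
    """Sum the low and high hour estimates across a topic's elements."""
    low = sum(lo for lo, _ in elements.values())
    high = sum(hi for _, hi in elements.values())
    return low, high


low, high = topic_cost_range(TASK_ELEMENTS)
print(f"Task topic: {low} to {high} hours")  # Task topic: 2.5 to 6.0 hours
```

Optional elements can be handled by leaving them out of the dictionary for topics that do not need them.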

Cost of User Guides Without Topic Reuse

Table 2 through Table 5 show the cost of developing a user guide for each of the three PDAs in DITA without taking advantage of topic reuse.

Table 2 (Cost of PDA One User Guide)

image

Table 3 (Cost of PDA Two User Guide)

image

Table 4 (Cost of PDA Three User Guide)

image

Table 5 (Total Cost All User Guides Without Topic Reuse)

image

This is a worst-case scenario and not a realistic one, because moving to DITA without reusing any content is highly unlikely. But this simple project is a good starting point for the model and a base to which we can add the reuse features of DITA. As you will see, the reuse features that you incorporate can be different for each documentation project.

Cost of User Guides With Topic Reuse

Now we’ll incorporate the cost of reusable topics (reusable in more than one user guide) in our cost model. Reusing topics is nothing new. This feature has been available in help authoring tools for more than 10 years.

Table 6 through Table 10 show the cost of developing a user guide for each of the three PDAs taking advantage of reusable topics. Some topics for the PDA One user guide may be reused verbatim in the PDA Two and PDA Three user guides because the topics are identical.

Table 6 (Cost of Reusable Topics)

image

Table 7 (Cost of Topics Unique to PDA One User Guide)

image

Table 8 (Cost of Topics Unique to PDA Two User Guide)

image

Table 9 (Cost of Topics Unique to PDA Three User Guide)

image

Table 10 (Total Cost All User Guides With Topic Reuse)

image

Cost Comparison: User Guides With and Without Topic Reuse

The scenario without topic reuse is not realistic, but to be thorough, Table 11 through Table 13 show a cost comparison.

Table 11 (Total Cost All User Guides Without Topic Reuse)

image

Table 12 (Total Cost All User Guides With Topic Reuse)

image

Table 13 (Savings)

image

For over a decade, significant savings have been achieved reusing topics in multiple publications.
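The comparison can be sketched in a few lines of code. Every number below is a hypothetical placeholder, not a value from Tables 2 through 13, and the sketch makes two simplifying assumptions: all topics cost the same average number of hours, and every reusable topic is shared by all three guides.

```python
# All figures are hypothetical placeholders, NOT values from the tables.
HOURS_PER_TOPIC = 4.0   # assumed average labor hours to create one topic
SHARED_TOPICS = 20      # assumed topics reused verbatim in all three guides
UNIQUE_TOPICS = {"PDA One": 5, "PDA Two": 8, "PDA Three": 10}  # assumed


def cost_without_reuse(shared, unique, hours):
    """Every guide pays for its own copy of the shared topics."""
    return sum((shared + count) * hours for count in unique.values())


def cost_with_reuse(shared, unique, hours):
    """Shared topics are written once; unique topics are written per guide."""
    return (shared + sum(unique.values())) * hours


without = cost_without_reuse(SHARED_TOPICS, UNIQUE_TOPICS, HOURS_PER_TOPIC)
with_reuse = cost_with_reuse(SHARED_TOPICS, UNIQUE_TOPICS, HOURS_PER_TOPIC)
savings = without - with_reuse

print(f"Without reuse: {without} hours")    # Without reuse: 332.0 hours
print(f"With reuse:    {with_reuse} hours") # With reuse:    172.0 hours
print(f"Savings:       {savings} hours")    # Savings:       160.0 hours
```

Under these assumptions the savings equal the shared topic count times the number of additional guides times the hours per topic, which is why the benefit grows with each publication that reuses the same topics.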

Cost of a Reusable Master Topic

The project is simple when topics can be reused verbatim in multiple publications. But what happens to our model when there is sufficient variation in our products that we cannot write a single topic to describe a given feature? Perhaps the product screen shots are different, an extra note or warning is needed, or a button has a different label. For all three user guides, some topics are similar. Most of a similar topic is the same for each PDA and can be shared. So, if all three versions of that content are included in one topic, then all versions of the user guides may be published from this topic. Using filtering metadata, content that is unique is marked as belonging to a specific PDA. When the user guide for a specific PDA is published, content that is specific to the other PDAs is filtered out. The filtering feature in DITA is also known as conditional processing, and it is what allows us to create and use reusable master topics.
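To make the idea concrete, here is a minimal sketch of filtering a master topic, written in Python with the standard library XML parser. It is an illustration only: real DITA publishing tools drive filtering from a DITAVAL file with include and exclude rules, whereas this sketch simply drops any element whose product attribute does not list the target product. The topic content and the product values pda1, pda2, and pda3 are invented for the example.

```python
import xml.etree.ElementTree as ET

# An invented master Task topic. Steps marked with the DITA "product"
# filtering attribute apply only to the listed (hypothetical) products.
MASTER_TOPIC = """
<task id="set_alarm">
  <title>Setting an alarm</title>
  <taskbody>
    <steps>
      <step><cmd>Open the Clock application.</cmd></step>
      <step product="pda2 pda3"><cmd>Choose a ring tone.</cmd></step>
      <step product="pda3"><cmd>Choose a vibration pattern.</cmd></step>
      <step><cmd>Tap Save.</cmd></step>
    </steps>
  </taskbody>
</task>
"""


def filter_for_product(xml_text, product):
    """Parse the topic and remove elements marked for other products."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            marked = child.get("product")
            # Unmarked content is shared; marked content must list the product.
            if marked is not None and product not in marked.split():
                parent.remove(child)
    return root


def commands(root):
    """Collect the text of every step command that survived filtering."""
    return [cmd.text for cmd in root.iter("cmd")]


for product in ("pda1", "pda2", "pda3"):
    print(product, commands(filter_for_product(MASTER_TOPIC, product)))
```

Run against this sample, the PDA One guide gets two steps, the PDA Two guide gets three, and the PDA Three guide gets all four, all from a single source topic.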

Table 14 shows the cost of creating reusable master Task and Concept topics.

Table 14 (Cost of a reusable master topic)

image

image

image

image

It would complicate our model to incorporate conrefs to a large variety of content types. Therefore, we are limiting the incorporation of conrefs in our model. We’ll discuss conrefs later, but for now a simple example is to reuse feature descriptions or screen shots that were created in Task topics by conref’ing them in your Concept topics. This would reduce the cost of the Concept topic by several hours. You will be able to customize the model to incorporate your use of conrefs in your project.

Now we have the estimated cost for reusable master Task and Concept topics that we can use in our model.

Cost of User Guides With Reusable Master Topics

Now we’ll incorporate the cost of reusable master topics into our cost model.

Table 15 and Table 16 show the cost of developing a user guide for each of the three PDAs taking advantage of reusable master topics.

Table 15 (Cost of Reusable Master Topics)

image

Table 16 (Total Cost All User Guides With Topic Reuse and Reusable Master Topics)

image

Table 17 compares the cost of creating unique topics to reusable master topics.

Table 17 (Cost of Unique versus Reusable Master Topics)

image

Reusable master topics cost significantly more to create, but with proper planning they can be used in multiple publications such that the overall cost of content creation for your publications drops dramatically.
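A small break-even sketch shows why the higher cost can still pay off. The hour values are hypothetical, not the figures from Table 17, and the sketch assumes the master topic costs the same regardless of how many product variants it covers, which is a simplification.

```python
# Hypothetical labor hours, NOT the values from Table 17.
UNIQUE_TOPIC_HOURS = 4.0   # assumed cost of one product-specific topic
MASTER_TOPIC_HOURS = 6.0   # assumed cost of one reusable master topic


def master_topic_savings(variants, unique_hours, master_hours):
    """Hours saved by one master topic versus one unique topic per variant."""
    return variants * unique_hours - master_hours


for variants in (1, 2, 3):
    saved = master_topic_savings(variants, UNIQUE_TOPIC_HOURS, MASTER_TOPIC_HOURS)
    print(f"{variants} variant(s): {saved:+.1f} hours")
# 1 variant(s): -2.0 hours
# 2 variant(s): +2.0 hours
# 3 variant(s): +6.0 hours
```

With these assumed numbers, a master topic loses time if it serves only one publication and starts saving time once it replaces two or more similar topics.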

Cost Comparison: Topic Reuse Versus Reusable Master Topics

Table 18 through Table 20 compare the cost of a project that takes advantage of reusable topics with the cost of a project that takes advantage of both reusable topics and reusable master topics.

When filtering and reusable master topics are used, the number of unique topics and reusable topics that have to be created is drastically reduced. This is where the cost savings of using DITA really leaps ahead of conventional help authoring tools and technology.

Table 18 (Total Cost All User Guides With Topic Reuse and Reusable Master Topics)

image

Table 19 (Total Cost All User Guides With Topic Reuse)

image

Table 20 (Savings)

image

Conclusion

“What is the cost of DITA?” is the question that is the focus of this paper.

Chances are that if you are reading this paper, then you’ve read the papers on the promise of XML, the ROI of DITA, the clear savings in translation costs alone, and you get it. However, you are being asked to justify the cost. How much is it going to cost to develop and publish content using DITA? Knowing percent reuse is important, but equally important is the upfront cost. And the savings are in more areas than just translation. There are savings in both content creation and content maintenance due to content reuse and conditional reuse.

The cost to develop content and percent reuse are often discussed separately. But to know the cost of a DITA-based documentation project, both of these values need to be accurately incorporated into a cost model. In this paper, we focused on cost metrics. Later in this series we will focus on reuse metrics.

For a given project, we must know the cost to create content without reuse and the cost with reuse. The difference is the savings. The cost to create content with reuse is equal to the cost to create the unique content and the reused content. That’s the upfront cost. Knowing this will allow us to more accurately calculate the cost of translation as well. What must be translated is reduced to unique content and reused content.

In this paper, Part 1 of the series, we covered the following cost metrics:

  • Cost of content creation

    • unique content
    • reused content
  • Cost of a Task topic
  • Cost of a Concept topic
  • Cost of user guides without topic reuse
  • Cost of user guides with topic reuse
  • Cost of a reusable master topic
  • Cost of user guides with reusable master topics

The most important thing we accomplished in this paper was determining the cost of creating a DITA-based topic rather than a traditional page. We then incorporated conditional reuse/filtering and determined the average cost of creating a reusable master topic. We observed that the biggest savings result when reusable master topics are incorporated. The flexibility and diversity of conditional reuse in DITA differentiate it from typical help authoring tool technologies and offer greater savings in not only content creation, but also content maintenance.

The ultimate question.

image

In Part 2 of this series (coming soon), we will look at other costs that need to be accounted for in a DITA project:

  • Cost of content analysis and inventory
  • Cost of review and project management
  • Cost of filtering
  • Cost of publishing
  • Cost of content maintenance
  • Cost of converting legacy content to DITA
  • Cost of translation

Other papers in the DITA Metrics series (coming soon!):

  • DITA Metrics: Cost Metrics – part 2
  • DITA Metrics: Reuse Metrics
  • DITA Metrics: Reuse Strategies for Minimizing Cost

Join the DITA Metrics Group

Interested in learning more about DITA metrics? Join the DITA Metrics group on The Content Wrangler Community. The group is led by the author of this paper, Mark Lewis.

About the author

Mark Lewis works for YOU, the tech writer, and the information architect. Mark has received STC awards for Distinguished Chapter Service and the Florida Technical Communications Competition. Currently, he is the DITA Product Manager for Usability and a product Evangelist for Quark. He provides product direction and user experience designs for Quark’s structured authoring products allowing everyday authors to create reusable content without knowing the details of XML syntax. Mark presents at conferences and other industry events on topics including: object oriented design methodologies (for non-programmers to help jumpstart their understanding of structured authoring), information architecture, and the promise of DITA and XML. Mark manages The Content Wrangler Community Groups – WritingOBJECTively and DITA Metrics. Send questions or feedback to mlewis@quark.com or hyperwriters@hotmail.com

————

1. For more information on traditional cost metrics, see Lasecke, Joyce. “Stop Guesstimating, Start Estimating.” Intercom, February 2006, Society for Technical Communication.

Currently there are "12 comments" on this Article:

  1. meg miranda says:

    So for all these cost estimates, the assumption is that I can’t get reuse without going to DITA, yes? 

    How about assuming that I’ve got a fantastic reuse system going with unstructured FrameMaker and text inset usage? In that scenario, what sort of ROI could I expect to achieve after converting to DITA?

    thanks.

  2. Marcus Carr says:

    I agree with Meg. DITA is often presented as the only way to re-use data effectively, but that’s simply not the case and it has annoyed me from the start. The entire article is about the cost benefits of re-using data – the fact that it may be tagged as DITA contributes next to nothing, as far as I can see.

  3. Bradley shoebottom says:

    Just tried to print this article off, but the text stops after page 2 (after Table 1). Same problem when printing to PDF.

    Can you please fix? Thanks

  4. [...] This is the second installment of the DITA Metrics series which examines the cost and reuse values for a DITA project to determine DITA ROI. The concepts and ideas discussed are based on the cost model introduced in the first paper, DITA Metrics: Cost Metrics – Part 1. [...]

  5. Yes, you can reuse text using unstructured Frame’s text inset feature. In fact, for those using unstructured Frame, I encourage using that to get some level of reuse. However, here are some of the things that make DITA’s reuse mechanism more appealing:

    - You can centralize the content that you are reusing so there’s a single source. Makes it easier to manage and find. It also eliminates the problem of reusing the same content from multiple sources which can lead to synchronization problems.

    - Architected reuse. You can make a library of reusable content available to all so that reuse isn’t ad hoc. It’s planned for and folks can nominate candidates for reuse when it makes sense to do so.

    - Planned correctly, you can target content in a base topic, but have it be product-specific in the final output by having a different resolution at build time. Using the keyref feature of DITA 1.2 makes this even easier to accomplish and extends filtering/reuse possibilities significantly.

    - Findability. Because DITA source is XML, it’s easy to build queries that search through content and find places where content is reused or find something that may meet your needs when deciding to reuse content. This is difficult to do with binary files.

    - Modularity. Because you’re not designing for a monolith to begin with, you tend to write differently and those differences make reuse easier. There’s no need to rewrite to fit the current context because you’ve already accounted for the fact that the topic, or piece that you’re reusing will be used in multiple contexts and you write that way, which you may not do when writing in a narrative style.

    - Metadata. Something that you cannot get in any unstructured environment is the ability to describe the information to a degree that will help you filter information in the final output or allow your user to obtain precisely the information that fits their use case.

    One final note. When converting unstructured Frame with text insets to DITA, you wind up with multiple copies of the same content, effectively breaking the reuse scenario. The best way to go from Frame to DITA is to start with structured Frame but that’s not even a guarantee.

  6. Dan Dube says:

    Hi Meg and Marcus,

    I’m someone who has been helping customers for over 20 years to move from proprietary-based systems to standards based systems (DITA, XML, and prior to that SGML). You both make some very valid points, and they represent a theme that I have heard many times during my career.

    In fact, the arguments that you make are the exact same ones that I heard from Interleaf users in the late 1980s-early 1990s. Unfortunately, many of the people who stayed with their proprietary/closed Interleaf system ran into a lot of trouble later on when the technology evaporated, customer expectations for delivery formats changed, and they had to completely redo all of their hard work.

    This is the biggest concern I have when I hear people praising the virtues of unstructured FrameMaker. I agree that it has proven to be a very effective tool for creating reusable topics that can be shared, and it is quite capable as a publishing system as well. I applaud you for putting the thought into your content structure to be able to leverage FrameMaker’s capabilities to reuse content.

    However, the core problem still lies with the fact that you are completely dependent on the technology of a particular vendor in order to accomplish this. If, like Interleaf, the day comes that FrameMaker stops being supported, what will you do?

    The single biggest advantage of moving to DITA is that it is an open standard. It is completely independent of any single vendor’s technology. If one product disappears, in theory you can take all of your content and start working (almost) seamlessly with another tool, as long as it supports the same standard. This helps to ensure that all of the thought and work that you put into designing your content to maximize reuse is not lost.

    Other advantages come into play as a side benefit of migrating your content to DITA, including:

    - Your content is now format-neutral, not proprietary. This greatly simplifies the process of rendering this content to another output format as market needs change. (EPUB or Kindle, anyone?)

    - It also greatly simplifies (and lowers the cost of) localization of content into other languages, if this is a requirement for you. FrameMaker files have to be converted to a format that a translator can work with in their translation management system (Trados or whatever). Translation vendors are slowly coming around to the benefits of XML/XLIFF as a superior means to make the localization process more efficient and cost-effective.

    - I don’t know what your situations are, but I have seen many technical writers/authors/editors get bogged down in formatting issues, trying to make the page “look” right. This tends to go away once people start authoring in DITA/XML (formatting is done as a post-process at publish time), and it frees the authors to spend more time doing what they do best: write! You can use this extra time to beef up the attributes you are associating with the content (to improve the quality of search/query), as an example.

    There are many more that I could list, but these are the main ones.

    -Dan

  7. If you are reusing your content across several guides with the same format, standard Frame can be a great solution. I’ve used it myself that way. (I’ve also implemented successful reuse schemes with RoboHelp/Word, custom XML, custom SGML, as well as DITA.) So, there might not be a good financial argument to move to DITA.

    On the other hand, if you are using that content to output to three different guides, each branded differently (with a different format), then standard Frame is not so effective any more. Applying and reapplying formats in file sets with text insets can be time-consuming to get right. DITA, with the separation of content and format inherent in XML, is definitely more effective in that scenario. For example, I am working on a project where individual topics can be reused in upwards of 6 different guides, each published in multiple formats, plus two different flavors of HTML. DITA is the right solution for us.

  8. Interesting article. I’m looking forward to future installments.

    That said, having gone through major conversions several times, I can understand where Meg and Marcus are coming from. The cost of moving to a new environment can be very high, and I would argue that simply moving from an unstructured environment to an XML environment with all other elements remaining equal (including, and maybe especially, the level of reuse) will likely not be cost effective.

    To make the case, I think you need to consider additional benefits. For example:

    1) Additional reuse: does the technology help you reuse more content? Also, how extensive is your current reuse strategy? Many organizations are unknowingly duplicating more information than they think they are, and unstructured formats make that duplication harder to discover.
    2) Localization cost: does the technology help you reduce localization costs?
    3) Single sourcing: do you need to expand the number of output formats, or can you reduce the cost of producing the ones you already produce?
    4) Cost of tools: can you reduce the cost of your tool chain?
    5) Productivity: does the technology allow you to reduce the cost of developing and maintaining your content?

    Against this, you need to balance the costs that Mark will address in the next part.

    In some cases, and Meg and Marcus’s cases may be among them, the balance will remain in favor of unstructured content. However, I suspect that for a significant number of organizations, especially those that are currently not seriously reusing content, XML is the right way to go. I also suspect, but cannot quantify (maybe Mark can consider this in a future article), that the larger your organization and the more content you have, the more you will save with a well-designed XML strategy.

    One last thought. While DITA is cool, it is not the only XML game in town, and I believe, as Steve comments, that you can have a good reuse strategy in non-DITA environments.

  9. Richard’s correct. DITA isn’t the only XML game in town, nor is it necessarily the solution to everything. It was designed, however, with reuse, filtering, and other features that may not exist in other XML solutions. That said, it’s up to an organization to examine its requirements and make the choice as to whether or not any XML solution is right for them. With the growing numbers of acquisitions and partnerships I think you’ll see more and more companies looking for a common denominator for information interchange and reuse. Those requirements will push the decisions more than anything else.

  10. Mark Lewis says:

    Thanks to everyone for the great feedback. The first DITA Metrics article was originally written in the Winter of 2007 and I responded to your questions back then, but the Internet seems to have eaten them. So, I’ll repeat.

    Yes, the cost model I propose is DITA specific, but I do not mean to imply that DITA is the only solution. There are many reasons to use DITA and just as many to not, depending on your content and publishing requirements. In the use case where you have several products that are similar, you may have a high opportunity for reuse and DITA might make sense. In other use cases DITA (and other XML) may not make sense. The cost to implement may be greater than the savings. So, perform your content audit, understand your content requirements and use this cost model to help you determine if you should make the move to DITA. If DITA doesn’t make sense, there are probably many other wonderful technologies and methodologies that will.

    At several conferences I’ve been approached and told that my cost model could be used for XML schemas that support content filtering and/or content referencing. True. Change the content elements from DITA to the elements in your schema and update the cost per element. My topic-based cost model is based upon traditional cost models, the cost of a page. So, take this model, customize it to your content or your XML schema and please share your results. I’d love to hear them. As a community, the more we share, the easier it will be to justify the tools and technology we need and prove the business case and value of content management.

  11. Looks like the pdf download does not work ;-((((

    Marie-Louise

  12. Mark Lewis says:

    DITA Metrics white papers can be found at http://dita.xml.org/search/node/%22dita+metrics%22 and through the DITA Metrics Linked In group. Join to receive additional information, case studies and ideas. Join and share your own models, ideas and case studies.

    When the TCW Community relocated to Linked In, it broke much of this posting. So, we’ll see you at Linked In or DITA.XML.org
    Mark Lewis
