XML and Standards
XML is an open standard from the World Wide Web Consortium (W3C). The W3C's member organizations and individuals define the rules and recommendations for XML structure, XML languages, and supporting technologies (such as XSL-FO and XSLT).
Vendors create tools or platforms that conform to these standards. For example, Firefox, Internet Explorer, and Google Chrome are all made by different companies but can display HTML pages. XMetaL, oXygen, and FrameMaker are all made by different companies but can be used to author valid XML files that conform to the standard. Tool vendors support widely used standards so that you can leverage the work done for that standard.
Beyond the W3C standards, other organizations have developed XML language standards that describe sets of elements and processes for specific industries. Standards allow companies and individuals to share information across organizations.
Some common standards used in the technical documentation industry include:
- DocBook
- S1000D
- DITA
These standards are described briefly in this section.
DocBook
DocBook is an XML language maintained by OASIS that originally came from the software and hardware documentation industries. It includes elements that describe the structure of books. The root element of a DocBook book is book, and other elements include chapter, preface, appendix, and so on.
For smaller documents, DocBook provides a simplified DTD for creating white papers or articles.
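The book structure described above can be sketched as follows. This is a minimal illustration that assumes the DocBook XML V4.5 DTD; the titles and paragraph text are made up for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<!-- book is the root element; the titles and text are placeholders. -->
<book>
  <title>Bike Owner's Guide</title>
  <preface>
    <title>Preface</title>
    <para>Who this guide is for and how it is organized.</para>
  </preface>
  <chapter>
    <title>Maintenance</title>
    <para>How to keep the bike in working order.</para>
  </chapter>
  <appendix>
    <title>Parts List</title>
    <para>Replacement parts and their part numbers.</para>
  </appendix>
</book>
```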
For more information, see http://docbook.org/.
S1000D
S1000D is an international specification for technical publications that use a Common Source Database. The XML language specification is maintained by the S1000D Council and the S1000D Steering Committee. It originally came from the aviation industry, where it was used to describe airplane maintenance and specifications.
Documentation is created as a series of reusable modules. As described on the S1000D website, a data module is defined as a stand-alone information unit that contains descriptive, procedural, or operational data for a platform, system, or component. It is produced in such a form that it can be stored in and retrieved from a Common Source Database by using the data module code as the identifier.
Each module contains two sections: one holds identifying information (codes) about the module, and the other holds the content. Modules are maintained in a database.
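The two-section layout can be sketched roughly as follows. This is a deliberately incomplete illustration, not a schema-valid data module: the element and attribute names are assumed from S1000D Issue 4.x, the code values are made-up placeholders, and the status information and schema reference that a real module needs are left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative skeleton only; it will not validate against the S1000D schemas. -->
<dmodule>
  <!-- Section 1: identifying information, including the data module code. -->
  <identAndStatusSection>
    <dmAddress>
      <dmIdent>
        <!-- All attribute values below are hypothetical placeholders. -->
        <dmCode modelIdentCode="EXAMPLEBIKE" systemDiffCode="A"
                systemCode="D00" subSystemCode="0" subSubSystemCode="0"
                assyCode="00" disassyCode="00" disassyCodeVariant="A"
                infoCode="040" infoCodeVariant="A" itemLocationCode="A"/>
      </dmIdent>
    </dmAddress>
    <!-- Status information omitted for brevity. -->
  </identAndStatusSection>
  <!-- Section 2: the content of the module. -->
  <content>
    <description>
      <para>Descriptive text about the component goes here.</para>
    </description>
  </content>
</dmodule>
```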
For more information, see http://public.s1000d.org/Pages/Home.aspx.
DITA
DITA is an XML standard maintained by the OASIS DITA Technical Committee. DITA stands for:
- Darwin (reference to Charles Darwin because DITA is based on a framework of inheritance and specialization)
- Information Typing (you divide information into topics based on the type of information)
- Architecture (framework for creating topics and publishing)
DITA was originally created by IBM for use in technical publications. Subsequently, IBM turned it over to OASIS to become a standard. DITA is designed to support topic-based writing. Writers create individual, self-contained topics and combine these topics into maps to create deliverables.
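As a minimal sketch of how a map combines topics into a deliverable, the following DITA map references three topic files. The map title and file names are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
  <title>Bike Owner's Guide</title>
  <!-- Each topicref points to one self-contained topic file (hypothetical names). -->
  <topicref href="gear_shifter.dita"/>
  <topicref href="changing_a_tire.dita"/>
  <topicref href="bike_parts.dita"/>
</map>
```

Because the map only points at the topics, the same topic files could be referenced from a second map to build a different deliverable.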
For more information, see http://dita.xml.org/.
Why are standards important?
In this class, each person created their own definition of a resume. Because each person developed their own, student A couldn’t send the XML file for his or her resume to student B and have student B successfully use the transformations and DTDs that he or she developed on student A’s resume. The underlying structure of each resume is different.
If the instructor had created a standard resume DTD and students had all created resumes using that standard DTD, students could trade resumes and still be able to use their individual transforms to create output.
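To make the idea concrete, here is a hedged sketch of what a shared resume DTD might look like, written as an internal DTD subset followed by one resume that conforms to it. Every element name here is a hypothetical choice for illustration, not the DTD used in class.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE resume [
  <!-- Hypothetical shared structure that every student's resume would follow. -->
  <!ELEMENT resume (name, contact, experience+, education+)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT contact (email, phone?)>
  <!ELEMENT email (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT experience (employer, role)>
  <!ELEMENT employer (#PCDATA)>
  <!ELEMENT role (#PCDATA)>
  <!ELEMENT education (school, degree)>
  <!ELEMENT school (#PCDATA)>
  <!ELEMENT degree (#PCDATA)>
]>
<resume>
  <name>Student A</name>
  <contact>
    <email>student.a@example.com</email>
  </contact>
  <experience>
    <employer>Example Corp</employer>
    <role>Technical Writer</role>
  </experience>
  <education>
    <school>Example University</school>
    <degree>B.A., English</degree>
  </education>
</resume>
```

Any transform written against these element names would work on every resume that validates against the same DTD, regardless of who wrote it.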
Introduction to topic-based, structured authoring
Structured authoring is the process of:
- Segregating blocks of information into independent, stand-alone topics, each of which conveys a particular type of information (a topic type)
- Focusing on separating descriptive information from task-oriented information
- Separating content from format (that is, introducing automation into formatting decisions)
- Enabling assembly of individual topics into well-structured deliverables
- Enabling reuse of topics (for example, in multiple deliverables, or across multiple product lines)
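Separating content from format means that formatting decisions live in a stylesheet rather than in the topic itself. As a minimal sketch (the match patterns and the HTML output are illustrative assumptions, not taken from any standard's stylesheets), an XSLT 1.0 stylesheet could decide once how every title and paragraph is rendered:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Every title element becomes a level-1 HTML heading. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <!-- Every p element becomes an HTML paragraph. -->
  <xsl:template match="p">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

Changing the heading style for all topics then means editing one template instead of every file.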
How does this differ from the way technical manuals were written in the past?
- Writers no longer write long chapters that consist of sections and subsections within a single file.
- Writers no longer assume that readers read the information in a sequential order.
- Writers no longer make formatting decisions manually.
What is information typing?
Information typing is the practice of identifying types of topics that contain distinct kinds of information, such as concepts, tasks, and reference information.
A topic’s information type reflects the basic type of information that the topic communicates to the audience.
Advantages to topic-based authoring
Topic-based authoring allows writers to:
- Work in groups on the same information set
- Share topics with other writers
- Identify common or reusable subjects
- Reorganize information more easily and consistently
- Have reviewers review smaller chunks of information
- Focus on tasks
- Eliminate unimportant or redundant information
- Factor out supporting concepts and reference information into other topics, where they can be read if required and ignored if not
- Improve information quality
Task-oriented documentation
The task is the center of the user documentation world. All references and concepts within a deliverable should support completing required tasks.
Task-oriented documentation:
- Focuses on user goals
- Is written from the user’s point of view
- Targets the appropriate audience
- Tells users why they should perform a task
DITA topics
DITA is designed to support topic-based writing. Each topic should be self-contained and cover a single subject, such as:
- How do I change a tire on a bike?
- What is a gear shifter on a bike?
- What part do I need to fix my bike?
DITA contains several predefined topic types to help you structure information. The most common of these include:
Type | Description
Task | Provides step-by-step instructions in a procedural format. Answers this question: “How do I use it?”
Concept | Provides background information that the user needs to know or understand to complete tasks successfully (for example, overview information). Ideally, each concept topic supports a task topic. Answers these questions: “What is it? What does it do? How does it work? Why is it important?”
Reference | Provides detailed information in a structured lookup table or list format (for example, messages or a parameter list). Answers this question: “What value or data should I use (lookup information)?”
What is a task topic?
A task topic provides sequential, step-by-step instructions in a procedural format. Use it when you want to describe how to do something.
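As a minimal sketch of the procedural format, the following task topic uses the standard DITA task elements. The id, title, and step text are invented for illustration and are not taken from the DITA toolkit samples.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
<task id="changing_a_tire">
  <title>Changing a tire on a bike</title>
  <shortdesc>Replace a worn or punctured tire.</shortdesc>
  <taskbody>
    <prereq>You need tire levers and a replacement tire.</prereq>
    <!-- steps holds the ordered procedure; each step starts with a command. -->
    <steps>
      <step><cmd>Remove the wheel from the frame.</cmd></step>
      <step><cmd>Pry the old tire off the rim with the tire levers.</cmd></step>
      <step><cmd>Seat the new tire on the rim and inflate it.</cmd></step>
    </steps>
    <result>The wheel is ready to be reinstalled.</result>
  </taskbody>
</task>
```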
Example from DITA toolkit:
What is a concept topic?
Concept topics should provide readers with clear and accurate background information that they must know before they can successfully understand and use the product or service discussed in the document. A concept is written as descriptive text and answers these questions: What is it? What does it do? How does it work? Why is it important? Use concept topics to support task topics.
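A concept topic might be sketched as follows, using the standard DITA concept elements. The id and the text are invented for illustration and are not taken from the DITA toolkit samples.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="gear_shifter">
  <title>Gear shifter</title>
  <shortdesc>The gear shifter lets the rider change gears while riding.</shortdesc>
  <!-- conbody holds descriptive text rather than numbered steps. -->
  <conbody>
    <p>The shifter pulls or releases a cable that moves the chain from one
       gear to another.</p>
    <p>Knowing how the shifter works helps when you adjust or replace it.</p>
  </conbody>
</concept>
```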
Example from DITA toolkit:
What is a reference topic?
A reference topic provides detailed explanatory information in a structured lookup table or list format, and enables readers to quickly find what they need. It is used to find a particular fact, such as:
- A list of parameters
- A list of fields
- A list of parts
- Syntax for a command
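A reference topic that presents a list of parts as a lookup table might be sketched as follows, using the standard DITA reference elements. The id, part names, and part numbers are invented for illustration and are not taken from the DITA toolkit samples.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE reference PUBLIC "-//OASIS//DTD DITA Reference//EN" "reference.dtd">
<reference id="bike_parts">
  <title>Bike parts</title>
  <shortdesc>Replacement parts and their part numbers.</shortdesc>
  <refbody>
    <!-- A simple lookup table: one header row, then one row per part. -->
    <simpletable>
      <sthead>
        <stentry>Part</stentry>
        <stentry>Part number</stentry>
      </sthead>
      <strow>
        <stentry>Tire</stentry>
        <stentry>EX-1001</stentry>
      </strow>
      <strow>
        <stentry>Gear shifter</stentry>
        <stentry>EX-2002</stentry>
      </strow>
    </simpletable>
  </refbody>
</reference>
```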
Example from DITA toolkit: