Archive for the 'ODF' Category

OpenOffice .odt Opened Up – Part 3a: Styles/font-face-decls

Wednesday, April 4th, 2007

Overview

In my last article, OpenOffice .odt Opened Up – Part 2: Meta and Settings, I discussed two of the four top-level subdocument elements, office:document-meta and office:document-settings. In this article, I will be taking a closer look at the office:document-styles element, in particular the office:font-face-decls sub-element. As before, my test cases were produced with the following software:

  • SuSE Linux 10.1
  • OpenOffice 2.0.2.7.1
  • zip 2.31 (March 8th 2005)

The Relax-NG schema language is used to define elements of the specification. The original source document can be downloaded here: oo_part1.odt, and the subdocument under observation can be downloaded here: styles.xml.

The office:document-styles element

The office:document-styles root element contains all font face declarations, named styles, automatic styles and master styles needed for the document.

office:document-styles schema

<define name="office-document-styles">
  <element name="office:document-styles">
    <ref name="office-document-common-attrs" />
    <ref name="office-font-face-decls" />
    <ref name="office-styles" />
    <ref name="office-automatic-styles" />
    <ref name="office-master-styles" />
  </element>
</define>
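To see this structure in an actual file, here is a minimal Python sketch that opens an .odt as a zip archive and lists the direct children of the office:document-styles root in styles.xml. The helper names styles_children and make_sample_odt are my own, and the sample archive is a made-up skeleton rather than the test document from this series.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def styles_children(odt_file):
    """Return the local names of the direct children of office:document-styles."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("styles.xml"))
    if root.tag != f"{{{OFFICE_NS}}}document-styles":
        raise ValueError(f"unexpected root element: {root.tag}")
    # ElementTree reports tags as "{namespace}local"; keep only the local part.
    return [child.tag.split("}", 1)[1] for child in root]


def make_sample_odt():
    """Build a tiny in-memory archive holding only a skeleton styles.xml (demo data)."""
    xml = (
        f'<office:document-styles xmlns:office="{OFFICE_NS}">'
        "<office:font-face-decls/><office:styles/>"
        "<office:automatic-styles/><office:master-styles/>"
        "</office:document-styles>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("styles.xml", xml)
    buffer.seek(0)
    return buffer


print(styles_children(make_sample_odt()))
```

Passing a path such as oo_part1.odt instead of the in-memory sample should work the same way, since zipfile.ZipFile accepts either a file name or a file-like object.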

Next let us explore the office:font-face-decls sub-element.

The office:font-face-decls element

This element is actually duplicated in the top-level office:document-content element. A few simple tests indicate that, if differences exist between the two sub-elements, declarations missing entirely from one are populated from the other, and where two declarations differ in content the definition in office:document-styles takes precedence, though this behavior is not defined explicitly in the specification.
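The behavior I observed can be modeled in a few lines of Python. This is only a sketch of the observation above, not of anything the specification mandates: the function names and the sample font values are invented for illustration, and real documents carry more attributes than the two read here.

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "svg": "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0",
}
STYLE_NAME = "{%s}name" % NS["style"]
SVG_FAMILY = "{%s}font-family" % NS["svg"]


def font_faces(xml_text):
    """Map style:name to svg:font-family for each declared style:font-face."""
    root = ET.fromstring(xml_text)
    return {
        face.get(STYLE_NAME): face.get(SVG_FAMILY)
        for face in root.findall("office:font-face-decls/style:font-face", NS)
    }


def merge_font_faces(content_xml, styles_xml):
    """Union of both declaration sets; styles.xml wins when a name is in both."""
    merged = font_faces(content_xml)
    merged.update(font_faces(styles_xml))
    return merged


def sample(root_name, faces):
    """Build a skeleton subdocument; the font values passed in are made up."""
    xmlns = " ".join(f'xmlns:{prefix}="{uri}"' for prefix, uri in NS.items())
    decls = "".join(
        f'<style:font-face style:name="{name}" svg:font-family="{family}"/>'
        for name, family in faces
    )
    return (
        f"<office:{root_name} {xmlns}>"
        f"<office:font-face-decls>{decls}</office:font-face-decls>"
        f"</office:{root_name}>"
    )


content_xml = sample(
    "document-content",
    [("Lucidasans1", "Lucidasans"), ("FFWLFJ+CMR10", "FFWLFJ+CMR10")],
)
styles_xml = sample("document-styles", [("Lucidasans1", "Albany AMT")])
merged = merge_font_faces(content_xml, styles_xml)
print(merged)
```

The declaration that only exists in content.xml survives the merge, while the conflicting one takes its value from styles.xml.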

The office:font-face-decls element consists of style:font-face elements. If you remember, we generated our test document by selecting text from a PDF and pasting that text into an .odt. This generated style:font-face elements such as the following:

<style:font-face style:name="EIDQUI+CMSLTT10"
                 svg:font-family="EIDQUI+CMSLTT10"/>

<style:font-face style:name="FFWLFJ+CMR10"
                 svg:font-family="FFWLFJ+CMR10"/>

<style:font-face style:name="GRVNVC+CMTT9"
                 svg:font-family="GRVNVC+CMTT9"/>

<style:font-face style:name="HJCZVV+CMTT8"
                 svg:font-family="HJCZVV+CMTT8"/>

<style:font-face style:name="Lucidasans1"
                 svg:font-family="Lucidasans"/>

With the exception of the last element, this looks pretty ugly. The following is a sample of style:font-face elements taken from a newly created document.

<style:font-face style:name="HG Mincho Light J"
                 svg:font-family="'HG Mincho Light J'"
                 style:font-pitch="variable"/>

<style:font-face style:name="Lucidasans"
                 svg:font-family="Lucidasans"
                 style:font-pitch="variable"/>

<style:font-face style:name="Thorndale AMT"
                 svg:font-family="'Thorndale AMT'"
                 style:font-family-generic="roman"
                 style:font-pitch="variable"/>

<style:font-face style:name="Albany AMT"
                 svg:font-family="'Albany AMT'"
                 style:font-family-generic="swiss" />

The reason for this is that OpenDocument font face declarations directly correspond to the @font-face font description of CSS2 and the <font-face> element of SVG, but have two extensions.

  1. OpenDocument font face declarations optionally may have an unique name. This name can be used inside styles as the value of the style:font-name attribute to immediately select a font face declaration. If a font face declaration is referenced this way, the steps described in CSS2 font matching algorithms for selecting a font declaration based on the font-family, font-style, font-variant, font-weight and font-size descriptors will not take place, but the referenced font face declaration is used directly.
  2. Some additional font descriptor attributes may exist.
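The first extension amounts to a plain lookup by name. The following Python sketch resolves every style:font-name reference against the declared style:font-face names, with no font matching involved. The sample document and the resolve_font_names helper are hypothetical, and returning None for an undeclared name is my own choice for the sketch, not something the specification prescribes.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
SVG = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"

FONT_FACE = f"{{{STYLE}}}font-face"
STYLE_NAME = f"{{{STYLE}}}name"
FONT_NAME = f"{{{STYLE}}}font-name"
SVG_FAMILY = f"{{{SVG}}}font-family"

# Made-up sample: one declared font face, one valid and one dangling reference.
SAMPLE = f"""<office:document-styles xmlns:office="{OFFICE}"
    xmlns:style="{STYLE}" xmlns:svg="{SVG}">
  <office:font-face-decls>
    <style:font-face style:name="Lucidasans1" svg:font-family="Lucidasans"/>
  </office:font-face-decls>
  <office:styles>
    <style:style style:name="Standard" style:family="paragraph">
      <style:text-properties style:font-name="Lucidasans1"/>
    </style:style>
    <style:style style:name="Orphan" style:family="paragraph">
      <style:text-properties style:font-name="Missing Font"/>
    </style:style>
  </office:styles>
</office:document-styles>"""


def resolve_font_names(xml_text):
    """Map each referenced style:font-name to its declared svg:font-family."""
    root = ET.fromstring(xml_text)
    declared = {
        face.get(STYLE_NAME): face.get(SVG_FAMILY) for face in root.iter(FONT_FACE)
    }
    resolved = {}
    for element in root.iter():
        name = element.get(FONT_NAME)
        if name is not None:
            # Direct lookup by name; None marks a reference with no declaration.
            resolved[name] = declared.get(name)
    return resolved


resolved = resolve_font_names(SAMPLE)
print(resolved)
```

A None value is a quick way to spot styles that point at a font face the document never declares.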

This basically means that svg:font-family="EIDQUI+CMSLTT10" is resolved through the SVG font matching algorithm and not through the named font. SVG is beyond the scope of this article. Reference material for SVG font declarations can be found here.

Back to the bigger picture. The benefit we can observe from this is that a predefined set of fonts can be applied to an .odt. By doing this we can ensure that documents contain a consistent set of fonts and eliminate potential redundancy or functional overlap. Care must be taken, though: if a style:font-face is replaced, all style:font-name, style:font-name-complex and style:font-name-asian attributes must be examined and replaced as well. While potential size gains are arguably minimal, the gains in consistent look and output are immeasurable.
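As a rough illustration of that bookkeeping, the Python sketch below drops one style:font-face declaration and repoints all three kinds of references to a replacement that is already declared. The replace_font function and the sample document are hypothetical, and the sketch only touches a single parsed subdocument; a real tool would have to apply the same change to both styles.xml and content.xml.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
SVG = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"

FONT_FACE_DECLS = f"{{{OFFICE}}}font-face-decls"
STYLE_NAME = f"{{{STYLE}}}name"
FONT_REFS = [
    f"{{{STYLE}}}{attr}"
    for attr in ("font-name", "font-name-complex", "font-name-asian")
]

# Made-up sample with a redundant "Lucidasans1" declaration.
SAMPLE = f"""<office:document-styles xmlns:office="{OFFICE}"
    xmlns:style="{STYLE}" xmlns:svg="{SVG}">
  <office:font-face-decls>
    <style:font-face style:name="Lucidasans1" svg:font-family="Lucidasans"/>
    <style:font-face style:name="Lucidasans" svg:font-family="Lucidasans"/>
  </office:font-face-decls>
  <office:styles>
    <style:style style:name="Standard" style:family="paragraph">
      <style:text-properties style:font-name="Lucidasans1"
          style:font-name-asian="Lucidasans1"
          style:font-name-complex="Lucidasans"/>
    </style:style>
  </office:styles>
</office:document-styles>"""


def replace_font(root, old, new):
    """Remove the `old` declaration and repoint its references to `new`."""
    decls = root.find(FONT_FACE_DECLS)
    if decls is None:
        raise ValueError("no office:font-face-decls element found")
    if new not in {face.get(STYLE_NAME) for face in decls}:
        raise ValueError(f"replacement font face {new!r} is not declared")
    for face in list(decls):
        if face.get(STYLE_NAME) == old:
            decls.remove(face)
    repointed = 0
    for element in root.iter():
        for attr in FONT_REFS:
            if element.get(attr) == old:
                element.set(attr, new)
                repointed += 1
    return repointed


root = ET.fromstring(SAMPLE)
repointed = replace_font(root, "Lucidasans1", "Lucidasans")
remaining = [face.get(STYLE_NAME) for face in root.find(FONT_FACE_DECLS)]
print(repointed, remaining)
```

Refusing to proceed when the replacement is not declared keeps the sketch from leaving dangling references behind.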

One option OpenOffice gives the user to tackle this issue is the font replacement option. Simply choose Tools -> Options, then OpenOffice.org -> Fonts. You should see a dialog similar to the following:

[Figure: Font Replacement dialog]

The OpenOffice user can simply select which fonts to replace with which, on either an Always or a Screen only basis. This is not always a complete solution, though. A more complete solution will be provided in the final installment of OpenOffice .odt Opened Up – Part 3: Styles, where I will provide an application that optimizes all aspects of the office:document-styles element. Up next is the office:styles element.

Until next time,

-3monkeys

Microsoft Caught Trying to Change Wikipedia Entries

Wednesday, January 24th, 2007

Imagine my surprise when a story I happened to cover yesterday splashed up on my Google Homepage. Google News reports over 200 references to the story. In Microsoft, Office Open XML and A Lie, I reported on Mr. Jelliffe’s offer and blog entry. I’m pleased to see that the story is getting national, top-tier coverage. Here is some of the coverage:

Perhaps this will spark more debate on Microsoft’s motivation for the EOOXML standard.

Update: TechCrunch reported on this today, with more insight than the original AP story.

Until next time-

-3Monkeys

The Open XML Lie

Wednesday, January 17th, 2007

Rob Weir recently posted “How to hire Guillaume Portes”, which also appeared on Slashdot; both are great resources for additional comments and debate. The basic premise of Rob’s article was that the Microsoft Open XML Specification is similar to a job description written to allow for only one qualified respondent. Such a job description might read as follows:

  • 5 years experience with Java, J2EE and web development, PHP, XSLT
  • Fluency in French and Corsican
  • Experience with the Llama farming industry
  • Mole on left shoulder
  • Sister named Bridgette

While the analogy is perhaps a little extreme, he goes on to show that the Open XML Specification is indeed written to accommodate Microsoft products. I will not bore you with all of his examples, but here are a few worth inspecting.

2.15.3.6 autoSpaceLikeWord95 (Emulate Word 95 Full-Width Character Spacing)

This element specifies that applications shall emulate the behavior of a previously existing word processing application (Microsoft Word 95) when determining the spacing between full-width East Asian characters in a document’s content.

[Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]

and

2.15.3.51 suppressTopSpacingWP (Emulate WordPerfect 5.x Line Spacing)

This element specifies that applications shall emulate the behavior of a previously existing word processing application (WordPerfect 5.x) when determining the resulting spacing between lines in a paragraph using the spacing element (§2.3.1.33). This emulation typically results in line spacing which is reduced from its normal size.

[Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]

This gluttony is further illustrated by the sheer complexity of the specification. As many 3Monkey readers know, I’m conducting a series of articles comparing the ODT and DOC formats. With Microsoft Office due to hit consumer shelves at the end of January, I thought I would get a jump on things and download the OOXML specification. To my surprise, the Open XML specification comes in 5 different PDF files with 6 accompanying electronic annexes, in excess of 43 megabytes. For comparison, the ODF specification is a single 11 megabyte PDF with 3 separate XML schemas. The ODF specification weighs in at a mere 722 pages, whereas the largest PDF in the Open XML specification is 5219 pages long.

While I have to wonder at Microsoft’s motivation for producing the Open XML standard, I do not have to guess at the motivation for ODF. Started as early as 1999, ODF was designed as an open and implementation-neutral file format. The open specification process started in 2000 with the foundation of the OpenOffice.org open-source project. An even higher level of openness was established in 2002 with the creation of the OASIS Open Office Technical Committee (TC). ODF gained adoption early: its early adopters included OpenOffice.org 1.0 and StarOffice 6, both introduced in May of 2002, and KOffice adopted the ODF format in August of 2003.

IBM has provided the one voice of reason in this travesty. IBM voted against the certification of Microsoft Office document formats (Open XML) as an international standard at a general assembly of Ecma International in early December 2006. Bob Sutor, IBM’s vice president of standards and open source, confirms Mr. Weir’s sentiment that the ODF standard is of superior quality, versus Open XML, which he considers to be “a vendor-dictated spec that documents proprietary products via XML”.

Open XML has been submitted to the ISO for standardization. I encourage each and every reader to oppose this standardization effort. Further details will be outlined on this blog as they become available.

Until next time-

-3Monkeys