OpenOffice .odt Opened Up – Part 1: Overview

Overview

In the first article in this series, OpenOffice ODF/.odt compared to Microsoft Word .doc, I compared various file types for size efficiency. Of particular interest was the fact that OpenOffice Writer stores .odt files in a zip format, an implementation of PKZip to be exact. With this knowledge and the Open Document Format standard, we can investigate how certain elements of a document affect its size and overall efficiency.
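Because an .odt is an ordinary zip archive, the per-file sizes can be read without unpacking anything. The following is a minimal sketch using Python's standard zipfile module; the function names entry_sizes and format_report are made up for this illustration and are not part of OpenOffice or the ODF specification.

```python
import zipfile


def entry_sizes(odt):
    """Return (name, uncompressed size, compressed size) for each package entry.

    `odt` may be a path or a binary file-like object holding an .odt file.
    """
    rows = []
    with zipfile.ZipFile(odt) as package:
        for info in package.infolist():
            rows.append((info.filename, info.file_size, info.compress_size))
    return rows


def format_report(rows):
    """Format the rows as aligned text, with a compressed/uncompressed ratio."""
    lines = []
    for name, size, packed in rows:
        # Empty entries (such as directory entries) get a ratio of 1.0.
        ratio = packed / size if size else 1.0
        lines.append(f"{name:<28} {size:>8} {packed:>8} {ratio:6.2f}")
    return "\n".join(lines)
```

Calling print(format_report(entry_sizes("oo_part1.odt"))) on the reference document should show which subdocuments dominate the package, which is the kind of question this series is after.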

My test cases were produced with the following software:

  • SuSE Linux 10.1
  • OpenOffice 2.0.2.7.1
  • zip 2.31 (March 8th 2005)

Starting Out

As we previously observed, .odt documents are stored in ZIP format. It is possible to store the document as a single XML file that conforms to the OpenOffice.org document type definition (DTD). It is also possible to store the document as several subdocuments, each with a different document root that represents a particular aspect of the document, such as content or style.
Quoting the Open Document Format for Office Applications (OpenDocument) v1.0 (Second Edition), (ODF Specification):

The OpenDocument format supports the following two ways of document representation:

  • As a single XML document.
  • As a collection of several subdocuments within a package (see section 17), each of which stores part of the complete document. Each subdocument has a different document root and stores a particular aspect of the XML document. For example, one subdocument contains the style information and another subdocument contains the content of the document. All types of documents, for example, text and spreadsheet documents, use the same document and subdocuments definitions.

There are four types of subdocuments, each with different root elements. Additionally, the single XML document has its own root element, for a total of five different supported root elements. The root elements are summarized in the following table:

| Root Element | Subdocument Content | Subdoc. Name in Package |
|---|---|---|
| office:document | Complete office document in a single XML document. | n/a |
| office:document-content | Document content and automatic styles used in the content. | content.xml |
| office:document-styles | Styles used in the document content and automatic styles used in the styles themselves. | styles.xml |
| office:document-meta | Document meta information, such as the author or the time of the last save action. | meta.xml |
| office:document-settings | Application-specific settings, such as the window size or printer information. | settings.xml |
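The mapping in the table can be checked against a real package. Here is a rough sketch, assuming the ODF 1.0 office namespace URI and using only Python's standard library; check_roots and EXPECTED_ROOTS are names invented for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace bound to the "office:" prefix in ODF 1.0 documents.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Subdocument name in the package -> expected local name of its root element.
EXPECTED_ROOTS = {
    "content.xml": "document-content",
    "styles.xml": "document-styles",
    "meta.xml": "document-meta",
    "settings.xml": "document-settings",
}


def check_roots(odt):
    """Map each subdocument name to True/False (root matches the table or not),
    or None when the subdocument is absent from the package."""
    results = {}
    with zipfile.ZipFile(odt) as package:
        names = set(package.namelist())
        for name, local in EXPECTED_ROOTS.items():
            if name not in names:
                results[name] = None
                continue
            root = ET.fromstring(package.read(name))
            # ElementTree reports tags as "{namespace-uri}local-name".
            results[name] = root.tag == f"{{{OFFICE_NS}}}{local}"
    return results
```

For a package that follows the specification, every value in the returned dictionary should be True.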

So, what is in our reference .odt? We will use the Linux-produced document from a prior article (oo_part1.odt) with XML compression disabled, so that the XML is more human-readable. After we unzip the file using the Linux utility unzip, we have the raw files as shown below.

.odt unzipped directory tree

As you can see, all four subdocuments named in the specification are present, as well as several other files. In particular, META-INF/manifest.xml lists the contents of the package, including information such as the full path and media type of each file.
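The manifest can also be read programmatically. The sketch below assumes the ODF 1.0 manifest namespace and the manifest:full-path and manifest:media-type attributes on each manifest:file-entry element; read_manifest is a hypothetical helper name.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace bound to the "manifest:" prefix in ODF 1.0 packages.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def read_manifest(odt):
    """Return a list of (full path, media type) pairs from META-INF/manifest.xml."""
    with zipfile.ZipFile(odt) as package:
        root = ET.fromstring(package.read("META-INF/manifest.xml"))
    entries = []
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path")
        media_type = entry.get(f"{{{MANIFEST_NS}}}media-type")
        entries.append((path, media_type))
    return entries
```

Comparing the returned paths with zipfile.ZipFile(...).namelist() is a quick way to spot files that are in the archive but not declared in the manifest.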

The file Thumbnails/thumbnail.png, although part of the package, is not part of the document. The thumbnail image should conform to the Thumbnail Managing Standard (TMS) at www.freedesktop.org, and therefore should be a 24-bit, non-interlaced PNG image with full alpha transparency. The required size for thumbnails is 128×128 pixels.

Here is the thumbnail from our reference document.

thumbnail.png

Having the thumbnail available in the package allows other applications, such as file managers, to preview the document for the user. With a little creative programming, sites such as Google, Yahoo or Ask could extract this thumbnail and preview the document for users with little difficulty.
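As an illustration of how little work that extraction takes, here is a minimal sketch that pulls the thumbnail out of the package and reads its dimensions from the PNG IHDR chunk. It uses only Python's standard library; thumbnail_info is a made-up name, and the member path is taken from the directory listing above.

```python
import struct
import zipfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def thumbnail_info(odt, member="Thumbnails/thumbnail.png"):
    """Read the thumbnail from the package and describe its PNG header."""
    with zipfile.ZipFile(odt) as package:
        data = package.read(member)  # raises KeyError if there is no thumbnail
    # A PNG starts with an 8-byte signature, then the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", 13 bytes of header data.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("thumbnail is not a valid PNG file")
    width, height, bit_depth, color_type, _compression, _filter, interlace = (
        struct.unpack(">IIBBBBB", data[16:29])
    )
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": color_type,  # 6 means truecolor with alpha in PNG
        "interlaced": interlace != 0,
        "data": data,  # raw PNG bytes, ready to be written to a file
    }
```

The returned width and height can be compared against the 128×128 requirement quoted above, and the data value can be written straight to disk as a .png file.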

Document Elements

The office:document element may contain any of the document elements listed below.

  • office:document-attrs
  • office:document-common-attrs
  • office:meta
  • office:settings
  • office:scripts
  • office:font-face-decls
  • office:styles
  • office:automatic-styles
  • office:master-styles
  • office:body

When the subdocument method is used, however, elements are restricted to certain subdocuments.

Elements in content.xml

  • office:document-content (subdocument root)
  • office:document-common-attrs
  • office:scripts
  • office:font-face-decls
  • office:automatic-styles
  • office:body

Elements in styles.xml

  • office:document-styles (subdocument root)
  • office:document-attrs
  • office:document-common-attrs
  • office:font-face-decls
  • office:styles
  • office:automatic-styles
  • office:master-styles

Elements in meta.xml

  • office:document-meta (subdocument root)
  • office:document-common-attrs
  • office:meta

Elements in settings.xml

  • office:document-settings (subdocument root)
  • office:document-common-attrs
  • office:settings
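The lists above can be compared with what a given package actually contains. The sketch below prints the root element and its direct children for each subdocument; the names top_level_elements and qualified_name are invented for this example, and only the office namespace is mapped back to its prefix. As far as I can tell from the ODF schema, the two *-attrs entries are attribute groups on the root rather than child elements, so they would not be expected to show up in this listing.

```python
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
SUBDOCUMENTS = ("content.xml", "styles.xml", "meta.xml", "settings.xml")


def qualified_name(tag):
    """Turn "{office-namespace}body" into "office:body"; leave other tags alone."""
    prefix = f"{{{OFFICE_NS}}}"
    if tag.startswith(prefix):
        return "office:" + tag[len(prefix):]
    return tag


def top_level_elements(odt):
    """Map each subdocument present in the package to a list holding its
    root element name followed by the names of the root's direct children."""
    result = {}
    with zipfile.ZipFile(odt) as package:
        present = set(package.namelist())
        for name in SUBDOCUMENTS:
            if name not in present:
                continue
            root = ET.fromstring(package.read(name))
            result[name] = [qualified_name(root.tag)] + [
                qualified_name(child.tag) for child in root
            ]
    return result
```

Not every element in the lists has to be present in a particular file, so a shorter listing than the ones above does not by itself indicate a problem.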

What’s Up Next?

At this point we have a clear understanding of the subdocument method that OpenOffice applies to its ODF implementation, and we know which top-level elements are handled by each subdocument.

In the next article, we will ease into the subdocument elements by exploring the office:document-meta and office:document-settings elements. These two elements are rather simple and will not require as much review compared to office:document-content or office:document-styles.
Until next time.

-3Monkeys


20 Responses to “OpenOffice .odt Opened Up – Part 1: Overview”

  4. ralamosm Says:

    Nice article that brings order to something I partially knew some years ago because of a problem I had with an OO document. I was working on a very important OO document when my PC suddenly rebooted for no reason, and the file got corrupted. A friend told me that OO documents were just XML stored in ZIP format, so I could easily fix the file myself just by editing the XML.

    If you use OO for important files, read this article (and the first one). It can save your life.

    Good work 3monkeys :)
