OpenOffice .odt Opened Up – Part 2: Meta and Settings
Overview
In my last article, OpenOffice .odt Opened Up – Part 1: Overview, I discussed the overall package scheme for ODT documents and pointed out that OpenOffice uses the subdocument form. In this article, I will take a closer look at two of the simpler top-level subdocuments of the four included in the specification: the office:document-meta and office:document-settings elements.
As before, my test cases were produced with the following software:
- SuSE Linux 10.1
- OpenOffice 2.0.2.7.1
- zip 2.31 (March 8th 2005)
The original source document can be downloaded here: oo_part1.odt. The two subdocuments under observation can be downloaded here: meta.xml and settings.xml.
The office:document-meta element
The office:document-meta element provides metadata about the document, such as the author, creation time and editing time, among other data. The metadata elements can be either pre-defined or user-defined. Pre-defined elements should be respected and updated by the editing application. User-defined elements provide a more generic way of storing and using metadata. Each user-defined metadata element is composed of a name, a type and a value. Supporting applications can access this information and display it to the user based on its type. Both pre-defined and user-defined elements should be referenceable through appropriate document text fields.
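To make the name/type/value idea concrete, here is a minimal Python sketch that reads user-defined entries from a meta.xml string. It assumes the entries are meta:user-defined elements carrying meta:name and an optional meta:value-type attribute, which is my reading of the specification; the sample document and its values are invented for illustration, and an untyped entry is treated as a string.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by ODF for the office and meta prefixes.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}

# Hypothetical sample; a real meta.xml contains many more elements.
SAMPLE_META = """<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0">
  <office:meta>
    <meta:user-defined meta:name="Reviewer">A. Reader</meta:user-defined>
    <meta:user-defined meta:name="Revision" meta:value-type="float">3</meta:user-defined>
  </office:meta>
</office:document-meta>"""


def user_defined_fields(meta_xml):
    """Return a list of (name, type, value) tuples, one per user-defined entry."""
    root = ET.fromstring(meta_xml)
    name_attr = "{%s}name" % NS["meta"]
    type_attr = "{%s}value-type" % NS["meta"]
    fields = []
    for elem in root.findall("office:meta/meta:user-defined", NS):
        # Assumption: an entry without an explicit type is a plain string.
        value_type = elem.get(type_attr, "string")
        fields.append((elem.get(name_attr), value_type, (elem.text or "").strip()))
    return fields


if __name__ == "__main__":
    for name, value_type, value in user_defined_fields(SAMPLE_META):
        print(name, value_type, value)
```

An application could use the type to decide how to display each value, for example formatting a date differently from a number.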
The pre-defined metadata elements are largely based upon the metadata standards developed by the Dublin Core Metadata Initiative (http://www.dublincore.org), thus many of the elements use the dc namespace.
There are 18 pre-defined metadata elements, listed below:
- meta:generator
- dc:title
- dc:description
- dc:subject
- meta:keyword – Can appear multiple times
- meta:initial-creator
- dc:creator – Last modifier
- meta:printed-by
- meta:creation-date – Format YYYY-MM-DDThh:mm:ss
- dc:date – Last modification date, format YYYY-MM-DDThh:mm:ss
- meta:print-date
- meta:template
- meta:auto-reload
- meta:hyperlink-behaviour
- dc:language – As defined by RFC3066, with ISO 639 language code and ISO 3166 country code
- meta:editing-cycles
- meta:editing-duration – Format PnYnMnDTnHnMnS
- meta:document-statistic – Can appear multiple times, ODT attributes below
  - meta:page-count
  - meta:table-count
  - meta:draw-count
  - meta:image-count
  - meta:ole-object-count
  - meta:paragraph-count
  - meta:word-count
  - meta:character-count
  - meta:row-count
  - meta:frame-count
  - meta:sentence-count
  - meta:syllable-count
  - meta:non-whitespace-character-count
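The date and duration formats noted in the list above are easy to turn into native values. The following Python sketch is one way to do it, not the only one: parse_timestamp and parse_duration are names I made up, and the duration parser only accepts whole numbers in each field (no fractional seconds), which is a simplifying assumption.

```python
import re
from datetime import datetime

# Matches durations of the form PnYnMnDTnHnMnS, where every field is optional.
DURATION_RE = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_timestamp(text):
    """Parse a YYYY-MM-DDThh:mm:ss value (meta:creation-date, dc:date)."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


def parse_duration(text):
    """Split a PnYnMnDTnHnMnS value (meta:editing-duration) into integer fields."""
    match = DURATION_RE.match(text)
    if match is None:
        raise ValueError("not a PnYnMnDTnHnMnS duration: %r" % text)
    # Fields absent from the input are reported as 0.
    return {name: int(value or 0) for name, value in match.groupdict().items()}


if __name__ == "__main__":
    print(parse_timestamp("2006-07-14T10:30:00"))
    print(parse_duration("PT1H23M45S"))
```

The duration is returned as separate fields rather than a single number of seconds because years and months have no fixed length.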
As I suggested regarding the thumbnail image in a prior article, this information could easily be extracted and displayed to users of popular search engines such as Google, Yahoo and Ask. Additionally, these services could allow the user to narrow their search based on certain criteria found in the metadata.
Provided here is meta.pl, an example written in Perl using the XML::Simple package that extracts the last editor, modification date, and page and word counts. If anyone would like to contribute ports of this to another language, feel free. If there is significant interest, I will cover XML::Simple or other XML packages or utilities.
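For readers without Perl, here is a rough Python sketch of the same idea using only the standard library. It is not a line-by-line port of meta.pl; the function names summarize_meta and summarize_odt are my own, and the sample metadata (author, date, counts) is invented. It assumes the package stores the metadata in a member named meta.xml, as described in Part 1.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, trimmed-down meta.xml used only to exercise the code.
SAMPLE_META = """<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <office:meta>
    <dc:creator>Jane Doe</dc:creator>
    <dc:date>2006-07-14T10:30:00</dc:date>
    <meta:document-statistic meta:page-count="2" meta:word-count="512"/>
  </office:meta>
</office:document-meta>"""


def summarize_meta(meta_xml):
    """Extract last editor, modification date, page count and word count."""
    root = ET.fromstring(meta_xml)
    meta = root.find("office:meta", NS)
    if meta is None:
        raise ValueError("no office:meta element found")
    stats = meta.find("meta:document-statistic", NS)

    def count(attribute):
        # Statistics are attributes in the meta namespace; any may be missing.
        raw = None if stats is None else stats.get("{%s}%s" % (NS["meta"], attribute))
        return None if raw is None else int(raw)

    return {
        "last_editor": meta.findtext("dc:creator", namespaces=NS),
        "modified": meta.findtext("dc:date", namespaces=NS),
        "pages": count("page-count"),
        "words": count("word-count"),
    }


def summarize_odt(path_or_file):
    """Open an .odt package (a zip archive) and summarize its meta.xml."""
    with zipfile.ZipFile(path_or_file) as package:
        return summarize_meta(package.read("meta.xml"))


if __name__ == "__main__":
    print(summarize_meta(SAMPLE_META))
```

Calling summarize_odt("oo_part1.odt") on a downloaded copy of the source document should return the same four fields for that file.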
The office:document-settings element
Next we take a look at the office:document-settings element. This element contains application settings that may affect the document; it does not contain a complete set of application settings. Because these are application settings, no particular entries are defined in the ODF specification. An office:document-settings element will contain one or more config:config-item-set elements (wrapped in an office:settings child element), which in turn contain config:config-item, config:config-item-set, config:config-item-map-named or config:config-item-map-indexed elements. The discovery of how each of these elements works with a particular application, such as OpenOffice.org Writer, is left as an exercise for the reader, since they do not directly affect our goal of understanding ODF as it relates to Microsoft’s .doc format. Suffice it to say, the office:document-settings element is of little interest to anyone but an application developer.
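For anyone who does want to start on that exercise, the nesting described above can be flattened with a short recursive walk. The Python sketch below assumes the item sets sit inside an office:settings child and that items carry config:name and config:type attributes, which is my reading of the schema; the sample settings and their names are illustrative, not taken from the test document.

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "config": "urn:oasis:names:tc:opendocument:xmlns:config:1.0",
}
ITEM_TAG = "{%s}config-item" % NS["config"]
NAME_ATTR = "{%s}name" % NS["config"]
TYPE_ATTR = "{%s}type" % NS["config"]

# Hypothetical sample; real settings.xml files are much longer.
SAMPLE_SETTINGS = """<office:document-settings
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0">
  <office:settings>
    <config:config-item-set config:name="view-settings">
      <config:config-item config:name="ViewAreaTop" config:type="int">0</config:config-item>
      <config:config-item-map-indexed config:name="Views">
        <config:config-item-map-entry>
          <config:config-item config:name="ZoomFactor" config:type="short">100</config:config-item>
        </config:config-item-map-entry>
      </config:config-item-map-indexed>
    </config:config-item-set>
  </office:settings>
</office:document-settings>"""


def iter_config_items(element, path=()):
    """Yield (path, type, value) for every config:config-item under element."""
    for index, child in enumerate(element):
        # Unnamed containers (such as indexed map entries) get a positional label.
        name = child.get(NAME_ATTR) or "[%d]" % index
        if child.tag == ITEM_TAG:
            yield ("/".join(path + (name,)), child.get(TYPE_ATTR), (child.text or "").strip())
        else:
            # Item sets, named maps, indexed maps and map entries all just nest.
            yield from iter_config_items(child, path + (name,))


def list_settings(settings_xml):
    root = ET.fromstring(settings_xml)
    settings = root.find("office:settings", NS)
    return [] if settings is None else list(iter_config_items(settings))


if __name__ == "__main__":
    for item_path, item_type, value in list_settings(SAMPLE_SETTINGS):
        print(item_path, item_type, value)
```

Printing the flattened paths for two versions of a document is a quick way to see which settings an application changed.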
What’s up next?
Next up, we will investigate the significantly more interesting office:document-styles element. We will also learn some optimization techniques that we can apply to this element and perhaps discover a little of how it relates to the office:document-content element.
Until next time,
-3Monkeys
January 23rd, 2008 at 12:25 pm
The keyword element is “meta:keyword” and not “dc:keyword”, as there is no Dublin Core element for keywords management.
Great job, by the way.