Abstract
This document describes the registry format used for specifying, storing and updating configuration data for OpenOffice.org (OOo) components and applications. It demonstrates the concepts and facilities of the OOo Registry Format (OOR) and aims to give a quick understanding of how to write configuration schemas and how developers can use them for their components and applications. It is a non-normative description targeted at application developers. The normative description of the format can be found in the appendices of this document, which contain the XML Schema specifications of the OOR format.
Basis Concepts - DataSource Configuration in OOo
The intention is not to reinvent the wheel by providing yet another configuration format instead of reusing an existing one. Rather, the intention is to have a format that is compatible with existing standards or at least allows an easy mapping to existing formats, and that also meets the requirements of the OpenOffice.org applications in their different environments. These requirements include:
The target audience for this document is application developers and architects who need to know how to describe and compile an application configuration schema with OOR. Therefore this document is designed more like a programmer's manual than a functional specification. After providing some background by describing the design principles in the first section, the following sections describe the concepts used by means of an example illustrating most of the facets of the registry document format. The appendices contain the XML Schema definitions for the OOR format(s), which can be viewed as a reference specification.
The schema document is specified in such a way that it can be provided by the developer or author of the schema and be used unchanged in any deployment. All things that vary between products, localizations, installations or users are specified in layer documents, which employ the update format.
There is also a virtual third sub-format, which can be used to describe the results of a merging process applied to a configuration schema and several layers of updates. This format is called the Registry Merged Format. It is virtual in that the XML Schema for it is not documented and there are currently no use cases where this format would be required. But if a means to store or transport merged configuration hierarchies is required, such a format can be constructed from the building blocks the current formats provide.
This section provides a high-level description of the registry object model, which is the basic conceptual model for the registry document format, and of the OOR sub-formats themselves, which are derived from that model.
The following table lists the XML namespaces used by the OOR format and their prefixes.
According to the schema definition for OOR, only root elements MUST have a namespace qualifier. All inner (local) elements don't use a qualifier as they are qualified within their context. Throughout the documentation, elements and attributes specified by OOR are not prefixed with the namespace qualifier. Only elements imported from other namespaces are qualified.
Registry Object Model
The Registry Object Model describes a hierarchical data structure. It offers two kinds of elements. On the one hand there are nodes, which combine related elements into logical groupings. On the other hand there are properties, which hold the actual data values. Nodes are the inner elements of the tree, while properties are the leaves of the tree.
All elements have a well-defined type. The type of a property corresponds directly to the type of data it can hold, while the type of a node depends (recursively) on the types of the child nodes and properties it contains and the way those are combined. A node type may be described in the schema as a separate named entity, called a template, or it may be implicit in the definition of a node. Templates allow reusing a type in multiple places in the schema and dynamic creation of nodes of a given type.
The following kinds of elements can be used to build a configuration schema:
Properties
Properties, specified by <prop> tags, hold the actual data values. They must have one of a number of supported data types. The following basic data-types are supported:
Also supported are lists (ordered sequences) of each of the basic types. [NOTE: This means that the lists must be homogeneous; mixed lists are not supported.] The configuration treats lists as atomic data-types: there is no support for accessing lists element-wise or merging the contents of lists.
In addition, a property may be characterized as nillable (or not nillable). If it is nillable, it may assume the special value NIL, which indicates the absence of any particular value. NIL is different from any legal value. In particular, a string/list/binary property having zero length is not NIL. Nillable properties are sometimes also called optional. By default, each property is nillable.
The schema may also provide a default value for a property. The default value must be of the appropriate data-type and should fulfill the constraints. A default value is not required, even for properties that are not nillable.
An additional data-type
To support internationalization, properties may be marked localized in the schema. Localized properties (and only localized properties) may assume different values for different locales. This is particularly important for text strings intended for UI display, but may apply to other settings as well. The locale is one of the parameters that is used to select a configuration. Only one language-neutral default value can be provided for a localized property. Locale-specific standard values should be provided in an appropriate layer.
Group-Nodes
Group-Nodes - their element tag in the
schema is <group> - are the basic building blocks for organizing related data items. A top-level Component-Node (see below) is a special kind of group-node. A group-node groups together a number of child nodes and a list of properties. The child nodes and properties (sometimes collectively called members) are identified by their names. The type of a group-node is determined by the number of children and properties within the
Member names must be unique within a
A group may be marked as
Set-Nodes
Set-Nodes – using schema element
tag <set> - provide a way to describe dynamic parts of the configuration tree. New nodes can be added to and removed from a set in a layer. Thus a set-node is a container of node items. The type of the node items is provided by a
The items contained in a
A
The node-type attribute for the set is required even when further items are defined and specifies a default type for an item. Items of that type don't need to specify it in the layer where they are added. Items of a type other than the default type need to specify their node type in the layer where they are first added.
Sets may be used to build recursive data structures, if their node-type directly or indirectly contains a
Like group-nodes, sets can be declared as extensible, which allows adding (dynamic) properties in layers.
Generic Nodes [This feature is planned as an extension, but not implemented yet]
Generic Nodes –
using schema element tag
Functionally a generic node can be simulated by a template type which is a set that has itself as item-type and is extensible (allows adding properties). Only the ability to define a skeleton hierarchy with predefined, typed properties in the schema is a new feature. Generic nodes are needed to interoperate with other configuration systems that have less strict rules for schema usage. For that use case it is also helpful to have this type as a built-in feature, rather than having to rely on an explicit template.
Component-Nodes
Within the object model, a component-node is a special, non-extensible group-node. It can act only as root node of the whole component configuration hierarchy. When generic nodes are introduced, there will also be a way to construct generic components, i.e. components whose component-node is generic.
Packages
Components contain the settings for a part of an application. The sum of all component settings is called the application or system registry. Within the registry the components are organized into packages. Each package may contain several components and sub-packages. This concept is well known in other areas like Java and UNO. On a file system, a package might be represented by a folder and a component by a single OOR XML file. Packages form the outer hierarchy of components. A package name and a local subpackage name are combined using a dot ('.') to obtain the full name of the subpackage. A component name and the name of the package it belongs to are similarly joined to form the fully qualified component name (FQCN). For example
Registry Document Types
OOR offers two different kinds of formats, the Registry Component Schema Format (xcs) and the Registry Update Format (xcu). The first format serves as configuration definition format, while the latter one is intended to store instances of configuration data.
Note: 'xcs' and 'xcu' are the default file extensions assigned to the formats.
Registry Component Schema Format
The Registry Component Schema Format is a meta format for defining the configuration hierarchy, reusable node types and key-value pairs called properties for a single component or module. Properties may have default values assigned by the schema. In addition, it is possible to provide documentation for each building block (element) of the hierarchy and to set constraints on properties.
In a layered view of the registry, the component schema can be regarded as an immutable base layer, providing the skeleton of the hierarchy with some default settings. Any kind of dynamic or installation-dependent data, like localized values or items of a set, is not part of the schema. Defaults for such data can be provided in an additional layer document, which uses the Registry Update Format described in the next section.
Registry Update Format
One design goal of OOR is to enable the storage of configuration settings in several layers. A cluster of configuration settings in a layer is also called preferences. Different layers may define global defaults, installation-specific settings, group preferences or policies and individual user preferences. A typical example is the OpenOffice.org application itself, which supports a multi-user installation. All default settings (the default layer) are stored in a shared folder, often on a network server, which is available to all users of the installation. The user preferences are stored in a separate user folder (user layer) in the user's home directory on their workstation.
Such layers are stored using the update format, which contains only those settings which differ, are new or hidden compared to a default configuration. When the application reads settings out of the registry on behalf of a user, the settings from the default and user layers are merged by applying the differences stored in the user layer to the default data obtained from the default layer and the schema. A single merged tree is then passed to the application. Any modifications done by the user to this merged tree are translated into changes to the user layer that affect the merged tree in the desired way. A user is also able to reset settings to default settings, which means that settings are removed from the user layer so that the corresponding default settings are visible again.
This way of storing and merging configuration data as differences is efficient in terms of the overall data size stored for an individual user. Another benefit can be found on the administration side. Changing a setting in the default layer will automatically take effect for all users that have not overwritten the setting with their own preferences.
Another difference compared to the Registry Component Schema Format is the nature of the data stored in the Registry Update Format. While the schema provides a set of type definitions and a fixed data hierarchy, the update format contains dynamic data, which can expand the hierarchy (set items) in each layer. One special use of layers is that they can contain values that are adjusted (e.g. translated) for different locales.
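The difference-based layering described above can be illustrated with a small sketch. This is not the OOo implementation and does not use the OOR XML syntax: layers are modeled as plain nested Python dictionaries, the functions `merge` and `reset_to_default` are hypothetical helpers, and the node and property names (`ConnectionPool`, `EnablePooling`, `Timeout`) are made up for illustration. It only shows the idea that a user layer stores differences, that a change in the default layer shows through wherever the user has not overridden it, and that a reset removes the entry from the user layer.

```python
import copy


def merge(default_tree, layer):
    """Return a new tree: default_tree with the differences in layer applied."""
    merged = copy.deepcopy(default_tree)
    for name, value in layer.items():
        if isinstance(value, dict) and isinstance(merged.get(name), dict):
            # Inner node present in both trees: merge recursively.
            merged[name] = merge(merged[name], value)
        else:
            # Property value (or new entry): the layer overrides the default.
            merged[name] = copy.deepcopy(value)
    return merged


def reset_to_default(layer, path):
    """Remove the setting at path from layer so the default is visible again."""
    node = layer
    for name in path[:-1]:
        node = node.get(name, {})
    node.pop(path[-1], None)


# Hypothetical default layer (shared) and user layer (differences only).
default_layer = {"ConnectionPool": {"EnablePooling": True, "Timeout": 60}}
user_layer = {"ConnectionPool": {"Timeout": 120}}

# The application sees a single merged tree.
merged = merge(default_layer, user_layer)

# An administrator changes a default; the user never overrode this setting.
default_layer["ConnectionPool"]["EnablePooling"] = False
after_admin_change = merge(default_layer, user_layer)

# The user resets Timeout: the entry is dropped from the user layer.
reset_to_default(user_layer, ["ConnectionPool", "Timeout"])
after_reset = merge(default_layer, user_layer)
```

In this sketch the user layer never holds more than the one overridden value, which is the size benefit the text describes; removal of nodes, set items and locale handling are deliberately left out.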
Groups and Properties
The simplest way to define a component schema is to build a configuration hierarchy by using group-nodes and properties. The component hierarchy of the given example knows four kinds of first level child nodes (
If you look at the example above, you can see that the group-node
The second group given in the example is the group-node '
Each structuring element has mandatory attributes. The name of each element (property, group-node, set-node) is required, as this is the identifier of sub-nodes or properties. The name of an element MUST be unique within the list of child elements.
Sets and Templates
In the previous paragraph, a set-node with the name
Templates need to be defined in the
An update tree for
In the example you find four sub-nodes of type
Each sub-entry of
All nodes and properties are identified by their name. This name and the location within the hierarchy allow identifying the equivalent element in the component schema. Therefore the type of a node or property need not be specified in the update format. The update format knows only nodes and their attributes. The root
Another example for the usage of sets is illustrated in the menu
example below. A menu is a recursive structure, where a menu can
contain several popup menus, which themselves may contain menu-items,
menu-separators and/or further sub-menus. This is modeled by a
template which contains a set that allows multiple node-types. One of
those node-types is the template itself, so this template is
recursive. The set specification of the
All possible child element types are listed in the item-elements
of node 'menupopup'. Each listed node-type
must be unique within the item list. The node-type
attribute of the set element identifies the default item for creation
of new properties. If any
Re-Use of Templates
Templates are node fragments that can be reused. We already encountered the most frequent use case for this: A template-node serves as a pattern for the creation of sub-nodes of a
The
Another way to avoid redundancy is to reuse templates across component schema boundaries. OOR allows importing existing component schemas. A prerequisite is that the imported component schema is available at both deployment and runtime to access, instantiate and validate imported template definitions. To import a template from another component, you have to specify two things:
The
Localization of Properties
OOR provides the possibility of storing different property content for different locales. This is often useful when configuration data contains items that are displayed by the user interface. A property must be marked as
To associate specific value settings with a locale, the value element of a property has an
Note: To illustrate the matching algorithm below, we consider that the user has a preferred locale
The section 'Layering and Merging' will provide further information on how localized properties are treated in the merging process.
Extensible Nodes
There are certain situations where clients need to be able to
extend the characteristics of a node during runtime. For example, generic service registries may need to allow storing arbitrary extra information for specific services, which are represented by configuration nodes. OOR provides the possibility of extending the list of properties for nodes, by declaring them as
If you look at the schema snippet given above, you will find the template-node
In addition to the common properties of
Properties added to an extensible node in a layer can be removed at runtime in the layer where they were added. But properties that are specified by the schema are mandatory, which means that they cannot be removed. This holds for both extensible and non-extensible nodes. The section about layering and merging will provide further information about extensible nodes.
Property Constraints
In the above component schema for the
A constraint on a property is an additional restriction on the basic data-type of the property. OOR knows the following constraint facets from the XML Schema Datatypes standard:
Annotations
Within the component schema, it is possible to add information for documentation purposes. This is done using the
Annotations are useful for application developers that want to use the configuration data in their application. But they should also be usable for generic registry viewing and editing tools that present the configuration data to the user or administrator in a meaningful way. Typically such tools will show the
Versioning
Like applications, configuration schemas may evolve over time, so that different versions of a schema are created. To identify the versions of configuration schemas and to support an update process, which migrates configuration data from an older to a newer version, the configuration schema should contain versioning information. The OOR format has basic support for schema versioning. Each component schema can provide a
[This section may be augmented greatly in the future, when an overarching framework for tagging and handling schema versions is developed.]
Layering and Merging
To reduce the size of configuration data stored for a single user and to enable certain administrability features, OOR provides a mechanism to layer configuration settings and to store in each layer only differences compared to a default tree.
A natural default layer within OOR is the schema layer. A component schema provides a default configuration tree for a single component which includes default settings for some properties. This tree is considered read-only, as the schema itself is immutable. Any changes compared to this default have to be placed into another layer. When fetching the configuration data on behalf of an application, both layers are read and merged before the data is offered to the client.
It is also possible and common to use more than two layers. In this case the merge algorithm is applied repeatedly. For example you can have three layers A, B and C, where A is the schema layer, B adds additional settings for a group of users and C holds data for a specific user. In this case B contains only the differences compared to A and C contains only the differences compared to A merged with B. In other words A provides the default tree for B and this tree merged with B serves as default tree for C.
Only the schema layer, which provides the original default tree, employs the OOR Schema Document Format. All other layers contain differences and use the OOR Update Document Format.
Differences are represented in the update format as operations to be executed on the elements of the configuration tree during the merge process. There are four operations, as follows:
For future evolution an extension to the list of existing operations is considered:
Merging update layers
OOR allows merging two update layers A and B without taking into account the data (and especially the schema) D with which the two update layers shall ultimately be merged. Generally, the result of first merging A and B in this way into a new update layer and then merging D with that new update layer, and the result of first merging D with A and then merging the result with B (i.e., the usual way of merging) should be identical. There is one problematic case where the results are not identical, however (see below).
The following table describes how both nodes and properties are merged when merging two update layers A and B (in OOo, which only supports the operations
The problematic case (marked with a “*”) is a
Localized properties
Localized properties may contain a list of values, where each value is specified for a certain locale by the
The
Omitting static information
Some information items allowed by the OOR format can be redundant. The corresponding attributes are optional. It may depend on the actual schema or on data from preceding layers whether an omission is possible. It is recommended that such information is in fact omitted where it can be determined that this is safely possible. If redundant data is not omitted, processors should treat it as an error if the given value is not consistent with the implied one. Below, you find a list of these optional information items:
The following attributes can be specified only on schema elements. For nodes in a layer the applicable value can be found only by reference to the underlying schema.
A merging example
This example contains a sequence of merging steps. All merging steps are based on the reduced component schema given below, which contains settings for the
The component schema acts as the default layer for the following merging steps. Each step will use the merge result of the previous step and apply additional changes. Each step contains, therefore, two fragments. The first part is the update layer containing the differences and the second part is the result of applying the differences on the previous merge result. The merge result is specified in a fictitious 'Merged format', which is used for illustration purposes. There is actually no such format in OOR. Typically an application uses an internal representation of the merge result.
Modified nodes
The first step describes the simplest way of modification, a change of a single preference setting. The difference fragment contains the modification of the
The result tree contains the changed driver precedence. All other nodes of the component schema are still untouched and are for this reason visible in the result tree.
Note: The modify operation can be omitted in the update format, as this is the default operation for elements. This is actually done for the surrounding nodes
Note: Both the fragment and the result tree do not contain information like property data types or set item types that can be recovered by reference to the component schema.
Inserted nodes
The next step involves the set-node
The fragment adds two new nodes into the set-node 'DriverSettings'. Both are available from the node-type 'DriverPooling', which is a template specified in the component schema. The fragment does not fully populate the two new nodes. Only one property 'Timeout' appears with the value 60. The other property 'Enable' isn't overridden and therefore doesn't appear in the layer. The value is inherited from the template definition during the merging process.
Removed nodes
The removal of nodes can happen in two ways depending on the layer where the nodes are introduced. If the node does belong to a default layer, the node will be marked by a '
Replaced nodes
Replacing nodes is pretty much the same as adding a new node. Both the added and the replaced node are introduced by the '
Access Control
OOR provides a simple mechanism for access control, which makes use of the layering concept. By default, each node and property can be modified or extended depending on the semantics of the node. A group-node allows altering its children and changing properties; a set-node allows adding new items and modifying, replacing or removing existing items. A property allows changing its value(s).
In 'default' layers, which may apply to user-groups or the entire installation, it may be desirable to restrict the possibility of overriding nodes and their properties in a subsequent layer. To this end it is possible to finalize a node. Finalizing a node always affects the node itself, its properties, its children and all its descendant nodes recursively. It is not possible to selectively change a subset of a finalized subtree back to overridable. A
Below, you find an example which illustrates the meaning of finalizing a node. If you think of the two layers involved in the merging process as a user-group and a user, a group administrator might decide to finalize the node 'DriverSettings'. In this case, the group administrator can still apply changes to that node, but users who are members of that user-group can't change that node or any of its sub-nodes. Therefore, the node 'DriverSettings' is marked as
This is the basic schema of the OOR format. It is included for all sub-formats of OOR.