
7. The SMIL 2.0 Media Object Modules

Editor
Rob Lanphier (robla@real.com), RealNetworks

7.1 Introduction

This section defines the SMIL media object modules, which are composed of a BasicMedia module and five modules with additional functionality that build on top of the BasicMedia module: the MediaClipping, MediaClipMarkers, MediaParam, MediaAccessibility, and MediaDescription modules. These modules contain elements and attributes used to describe media objects. Additionally, the BrushMedia module provides a brush element which can be used as a media object. Since these elements and attributes are defined in a series of modules, designers of other markup languages can reuse the SMIL media modules when they need to include media objects in their language.

Changes with respect to the media object elements in SMIL 1.0 provide additional functionality that was identified as a requirement by the Working Group. These differences are explained in Appendix A and Appendix B.

7.2 Definitions

Continuous Media
Audio file, video file or other media for which there is a measurable and well-understood duration. For example, a five second audio clip is continuous media, because it has a well-understood duration of five seconds. Opposite of "discrete media".
Discrete Media
Image file, text file or other media which has no obvious duration. For example, a JPEG image is generally considered discrete media, because there's nothing in the file indicating how long the JPEG should be displayed. Opposite of "continuous media".
Intrinsic Duration
The duration of a referenced item without any explicit timing markup. Continuous media has an intrinsic duration defined by the media, and discrete media has no intrinsic duration. (In SMIL, discrete media is assigned an intrinsic duration of zero.)

7.3 SMIL BasicMedia Module

This module defines the baseline functionality of a SMIL player. It is very close in functionality to the media object specification in SMIL 1.0.

7.3.1 Media Object Elements - ref, animation, audio, img, text, textstream and video

The media object elements allow the inclusion of media objects into a SMIL presentation. Media objects are included by reference (using a URI). The following media elements are defined in this section:

ref
Generic media reference
animation
Animated vector graphics or other animated format
audio
Audio clip
img
Still image, such as PNG or JPEG
text
Text reference
textstream
Streaming text
video
Video clip

All of these media elements are semantically identical. When playing back a media object, the player must not derive the exact type of the media object from the name of the media object element. Instead, it must rely solely on other sources about the type, such as type information contained in the type attribute, or the type information communicated by a server or the operating system.

Authors, however, should make sure that the group into which the media object falls (animation, audio, img, video, text or textstream) is reflected in the element name. This increases the readability of the SMIL document. When in doubt about the group of a media object, authors should use the generic "ref" element.

The animation element defined here should not be confused with the elements defined in the SMIL 2.0 Animation Module. The animation element defined in this module is used to include an animation (such as a vector graphics animation) by reference. This is in contrast to the elements defined in the Animation module, which provide an in-line syntax for the animation of attributes and properties of other elements.

Anchors and links can be attached to visual media objects, i.e. media objects rendered on a visual abstract rendering surface.

Attributes Definitions

Languages implementing the SMIL BasicMedia Module must define which attributes may be attached to media object elements. In all languages implementing the SMIL BasicMedia module, media object elements can have the following attributes:

src
The value of the src attribute is the [URI] of the media element, used for locating and fetching the associated media. Note that this attribute is not required. A media object with no src attribute has an intrinsic duration of zero, and participates in timing just as any other media element. No media will be fetched by the SMIL implementation for a media element without a src attribute.
type
Content type of the media object referenced by the src attribute. The usage of this attribute depends on the protocol of the src attribute.
RTSP [RTSP]
The type attribute is used for purposes of content selection and when the type of the referenced media is not otherwise available. It may be overridden by the contents of the RTSP DESCRIBE response or by the static RTP payload number.
HTTP [HTTP], FTP [FTP], and local file playback [URL] (mainly the URL specification, which also describes the "file:" URL scheme)
The type attribute value takes precedence over other possible sources of the media type (for instance, the "Content-type" field in an HTTP exchange, or the file extension).

When the content represented by a URL is available in many data formats, implementations MAY use the type value to influence which of the multiple formats is used. For instance, on a server implementing HTTP content negotiation, the client may use the type attribute to order the preferences in the negotiation. 

For protocols not enumerated in this specification, implementations should use the following rules: When the media is encapsulated in a media file and delivered intact to the SMIL user agent via a protocol designed for delivery as a complete file, the type attribute value should take precedence over other possible sources of the media type. For protocols which deliver the media in a media-aware fashion, such as those delivering media in a manner using or dependent upon the specific type of media, the application of the type attribute is not defined by this specification.
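
The precedence rules above can be summarized in a few lines of code. The following Python sketch is illustrative only and is not part of this specification: the function name effective_media_type and its parameters are hypothetical, it assumes that an RTSP implementation chooses to let the transport override the attribute (the text above only says it "may"), and it assumes that any protocol not listed delivers the media as a complete file.

```python
from typing import Optional


def effective_media_type(scheme: str,
                         type_attr: Optional[str],
                         transport_type: Optional[str]) -> Optional[str]:
    """Pick a media type for a media object (illustrative sketch).

    scheme         -- URI scheme of the src attribute, e.g. "http"
    type_attr      -- value of the type attribute, or None if absent
    transport_type -- type reported by the server, the protocol or the
                      file extension, or None if unknown
    """
    if scheme.lower() == "rtsp":
        # RTSP: the DESCRIBE response or static RTP payload number may
        # override the attribute, so the attribute is only a fallback.
        # Assumption: this player always applies the override.
        return transport_type or type_attr
    # HTTP, FTP and local files: the attribute takes precedence.
    # Assumption: other protocols deliver a complete file, for which
    # the attribute should also take precedence.
    return type_attr or transport_type


print(effective_media_type("http", "image/png", "image/jpeg"))
```

For protocols that deliver media in a media-aware fashion, the application of the type attribute is not defined, so a real implementation would need a separate, protocol-specific rule there.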

Element Content

Languages utilizing the SMIL BasicMedia module must define the complete set of elements which may act as children of media object elements. There are currently no required children of a media object defined in the BasicMedia Module, but languages utilizing the BasicMedia module may impose requirements beyond this specification.

7.3.2 Integration Requirements

If the including profile supports the XMLBase functionality [XMLBase], the values of the src and longdesc attributes on the media object elements must be interpreted in the context of the relevant XMLBase URI prefix.

7.4 SMIL MediaParam Module

This section defines the elements and attributes that make up the SMIL MediaParam Module definition. Languages implementing elements and attributes found in the MediaParam module must implement all elements and attributes defined below, as well as BasicMedia.

7.4.1 Media object initialization: the param element

param elements specify a set of values that may be required by a media object at run-time. Any number of param elements may appear in the content of a media object element, in any order, but must be placed at the start of the content of the enclosing media object element.

The syntax of names and values is assumed to be understood by the object's implementation. This document does not specify how user agents should retrieve name/value pairs nor how they should interpret parameter names that appear twice.

Attribute definitions
name
(CDATA) This attribute defines the name of a run-time parameter, assumed to be known by the inserted object. Whether the property name is case-sensitive depends on the specific object implementation.
value
(CDATA) This attribute specifies the value of a run-time parameter specified by name. Property values have no meaning to SMIL; their meaning is determined by the object in question.
valuetype
["data"|"ref"|"object"] This attribute specifies the type of the value attribute. Possible values:
  • data: This is the default value for the attribute. It means that the value specified by value will be evaluated and passed to the object's implementation as a string.
  • ref: The value specified by value is a URI [URI] that designates a resource where run-time values are stored. This allows support tools to identify URIs given as parameters. The URI must be passed to the object as is, i.e., unresolved.
  • object: The value specified by value is an identifier that refers to a media object declaration in the same document. The identifier must be the value of the id attribute set for the declared media object element.
type
This attribute specifies the content type of the resource designated by the value attribute only in the case where valuetype is set to "ref". This attribute thus specifies for the user agent, the type of values that will be found at the URI designated by value. See 6.7 Content Type in [HTML4] for more information.

Example

To illustrate the use of param: suppose that we have a facial animation plug-in that is able to accept different moods and accessories associated with characters. These could be defined in the following way:
<ref src="http://www.example.com/herbert.face">
  <param name="mood" value="surly" valuetype="data"/>
  <param name="accessories" value="baseball-cap,nose-ring" valuetype="data"/>
</ref>
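
As a rough illustration of how a user agent might gather these name/value pairs before handing them to the plug-in, the following Python sketch reads the param children of the example with the standard xml.etree.ElementTree module. The helper name collect_params is hypothetical, the fragment carries no namespace, and the valuetype attribute is deliberately left off the second param to show the "data" default. Because this document does not specify how repeated names are interpreted, the sketch simply keeps every pair in document order.

```python
import xml.etree.ElementTree as ET

SMIL_FRAGMENT = """
<ref src="http://www.example.com/herbert.face">
  <param name="mood" value="surly" valuetype="data"/>
  <param name="accessories" value="baseball-cap,nose-ring"/>
</ref>
"""


def collect_params(media_element):
    """Return (name, value, valuetype) tuples in document order."""
    params = []
    for child in media_element.findall("param"):
        params.append((
            child.get("name"),
            child.get("value"),
            child.get("valuetype", "data"),  # "data" is the default
        ))
    return params


ref = ET.fromstring(SMIL_FRAGMENT)
for name, value, valuetype in collect_params(ref):
    print(name, value, valuetype)
```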

7.4.2 Element Attributes for All Media Objects

In addition to the element attributes defined in BasicMedia, media object elements can have the attributes and attribute extensions defined below. The inclusion or exclusion of these attributes is left as an option in the language profile.

erase
Controls the behavior of the media object after the effects of any timing are complete. For example, when SMIL Timing is applied to a media element, erase controls the display of the media once the active duration of the element and the freeze period defined by the fill attribute are complete (see SMIL Timing and Synchronization). Possible values for erase are never and whenDone.

erase="whenDone" is the default value. When this is specified (or implied) the media removal occurs at the end of any applied timing.

erase="never" is defined to keep the last state of the media displayed until the display area is reused (or if the display area is already being used by another media object). Any profile that integrates this element must define what is meant by "display area" and further define the interaction. Intrinsic hyperlinks (e.g., Flash, HTML) and explicit hyperlinks (e.g., area, a) stay active as long as the hyperlink is displayed. If timing is reapplied to an element, the effect of the erase=never is cleared. For example, when an element is restarted according to the SMIL Timing and Synchronization module, the element is cleared immediately before it restarts.

Example:

<par>
  <seq>
        <par>
            <img src="image1.jpg" region="foo1" fill="freeze" erase="never" .../>
            <audio src="audio1.au"/>        
        </par>
        <par>
            <img src="image2.jpg" region="foo2" fill="freeze" erase="never" .../>
            <audio src="audio2.au"/>        
        </par>
         ...
        <par>
            <img src="imageN.jpg" region="fooN" fill="freeze" erase="never" .../>
            <audio src="audioN.au"/>        
        </par>
  </seq>
</par>

In this example, each image is successively displayed and remains displayed until the end of the presentation.

mediaRepeat
Used to strip the intrinsic repeat value of the underlying media object. The interpretation of this attribute is specific to the media type of the media object, and is only applicable to those media types for which there is a definition of a repeat value found in the media type format specification. Media type viewers used in SMIL implementations will need to expose an interface for controlling the repeat value of the media for this attribute to be applied. For all media types where there is an expectation of interoperability between SMIL implementations, there should be a formal specification of the exact repeat value to which the mediaRepeat attribute applies.

Values:

strip
Strip the intrinsic repeat value of the media object.
preserve (default)
Leave the intrinsic repeat value of the media object intact.

As an example of how this would be used, many animated GIFs intrinsically repeat indefinitely. The application of mediaRepeat="strip" allows an author to remove the intrinsic repeat behavior of an animated GIF on a per-reference basis, causing the animation to display only once, regardless of the repeat value embedded in the GIF.

When mediaRepeat is used in conjunction with SMIL Timing Module attributes, this attribute is applied first, so that the repeat behavior can then be controlled with the SMIL Timing Module attributes such as repeatCount and repeatDur.

sensitivity
Used to provide author control over the sensitivity of media to user interface selection events, such as the SMIL 2.0 activateEvent, and hyperlink activation. If the media is sensitive at the event location, it captures the event, and will not pass the event through to underlying media objects.  If not, it allows the event to be passed through to any media objects lower in the display hierarchy.

Values:

opaque
The media is sensitive to user interface selection events over the entire area of the media.  This is the default.
transparent
The media is not sensitive to user interface selection events over the entire area of the media. Any user interface selection events will be "passed through" to any underlying media.
percentage-value
The media sensitivity to user interface selection events is dependent upon the opacity of the media at the location of the event (the alpha channel value). If rendered media supports an alpha channel and the opacity of the media is less than the given percentage value at the event location, the behavior will be transparent as specified above. Otherwise the behavior will be as opaque. Valid values are non-negative CSS2 percentage values.
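
The three kinds of value can be read as a single decision about whether the media captures a selection event at a given point. The Python sketch below is one possible reading and not a normative algorithm. The function name captures_event is hypothetical, opacity is assumed to be reported as a number between 0.0 and 1.0, and None is used to mean that the rendered media has no alpha channel.

```python
from typing import Optional


def captures_event(sensitivity: str, opacity: Optional[float]) -> bool:
    """Return True if the media captures a selection event.

    sensitivity -- "opaque", "transparent" or a percentage such as "50%"
    opacity     -- opacity at the event location in the range 0.0-1.0,
                   or None if the media has no alpha channel
    """
    if sensitivity == "opaque":
        return True
    if sensitivity == "transparent":
        return False
    if not sensitivity.endswith("%"):
        raise ValueError(f"unsupported sensitivity value: {sensitivity!r}")
    threshold = float(sensitivity[:-1])
    if threshold < 0:
        raise ValueError("percentage must be non-negative")
    if opacity is None:
        # No alpha channel: the behavior is as for "opaque".
        return True
    # Below the threshold the event is passed through to underlying media.
    return opacity * 100.0 >= threshold


print(captures_event("50%", 0.25))
```

With a value of "50%", a click on a pixel that is 25% opaque is passed through to underlying media, while a click on a pixel that is 50% or more opaque is captured.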

7.4.3 Integration Requirements

Any profile that integrates the erase attribute must define what is meant by "display area" and further define the interaction. See the definition of erase for more details.

The supported uses of the type and valuetype attributes on the param element must be specified by the integrating profile. If a profile does not specify this, the type and valuetype attributes will be ignored in that profile.

7.5 SMIL MediaClipping Module

This section defines the attributes that make up the SMIL MediaClipping Module definition. Languages implementing the attributes found in the MediaClipping module must implement the attributes defined below, as well as BasicMedia.

7.5.1 MediaClipping Attributes

clipBegin (clip-begin)
The clipBegin attribute specifies the beginning of a sub-clip of a continuous media object as offset from the start of the media object. This offset is measured in normal media playback time from the beginning of the media.
Values in the clipBegin attribute have the following syntax:
Clip-value-MediaClipping ::= [ Metric "=" ] ( Clock-val | Smpte-val )
Metric            ::= Smpte-type | "npt" 
Smpte-type        ::= "smpte" | "smpte-30-drop" | "smpte-25"
Smpte-val         ::= Hours ":" Minutes ":" Seconds 
                      [ ":" Frames [ "." Subframes ]]
Hours             ::= Digit+ 
                  /* see XML 1.0 for a definition of 'Digit' */
Minutes           ::= Digit Digit; range from 00 to 59
Seconds           ::= Digit Digit; range from 00 to 59

Frames            ::= Digit Digit; smpte range = 00-29, smpte-30-drop range = 00-29, smpte-25 range = 00-24
Subframes         ::= Digit Digit; smpte range = 00-01, smpte-30-drop range = 00-01, smpte-25 range = 00-01
      

Note: additional BNF for level 1 extensions defined later

The value of this attribute consists of a metric specifier, followed by a time value whose syntax and semantics depend on the metric specifier. The following formats are allowed:

SMPTE Timestamp
SMPTE time codes [SMPTE] can be used for frame-level access accuracy. The metric specifier can have the following values:
smpte
smpte-30-drop
These values indicate the use of the "SMPTE 30 drop" format (approximately 29.97 frames per second), as defined in the SMPTE specification (also referred to as "NTSC drop frame"). The "frames" field in the time value can assume the values 0 through 29. The difference between 30 and 29.97 frames per second is handled by dropping the first two frame indices (values 00 and 01) of every minute, except every tenth minute.
smpte-25
The "frames" field in the time specification can assume the values 0 through 24. This corresponds to the PAL standard as noted in [SMPTE].

The time value has the format hours:minutes:seconds:frames.subframes. If the subframe value is zero, it may be omitted. Subframes are measured in one-hundredths of a frame.
Examples:
clipBegin="smpte=10:12:33:20"

Normal Play Time
Normal Play Time expresses time in terms of SMIL clock values. The metric specifier is "npt", and the syntax of the time value is identical to the syntax of SMIL clock values.
Examples:
clipBegin="npt=123.45s"
clipBegin="npt=12:05:35.3"
Marker
Not defined in this module. See clipBegin Media Marker attribute extension in the MediaClipMarkers module.

If no metric specifier is given, then a default of "npt=" is presumed.
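
To make the grammar above concrete, the following Python sketch splits a clip value into its metric and time value. It is a minimal illustration under stated assumptions rather than a reference parser: the name parse_clip_value is hypothetical, "npt" clock values are returned unparsed because their syntax is defined by SMIL clock values, the subframe range is not checked, and the "marker" metric is rejected because it is not defined in this module.

```python
import re

# Highest frame index allowed for each SMPTE metric.
SMPTE_MAX_FRAME = {"smpte": 29, "smpte-30-drop": 29, "smpte-25": 24}

# Hours ":" Minutes ":" Seconds [ ":" Frames [ "." Subframes ]]
SMPTE_RE = re.compile(r"^(\d+):(\d\d):(\d\d)(?::(\d\d)(?:\.(\d\d))?)?$")


def parse_clip_value(text):
    """Split a clip value into (metric, value).

    For SMPTE metrics the value is a tuple
    (hours, minutes, seconds, frames, subframes). For "npt" the clock
    value is returned as an unparsed string.
    """
    text = text.strip()
    metric, sep, rest = text.partition("=")
    if not sep:
        # No metric specifier: "npt=" is presumed.
        return ("npt", text)
    if metric == "npt":
        return ("npt", rest)
    if metric not in SMPTE_MAX_FRAME:
        raise ValueError(f"unknown metric: {metric!r}")
    match = SMPTE_RE.match(rest)
    if match is None:
        raise ValueError(f"malformed SMPTE value: {rest!r}")
    # Omitted frames and subframes default to zero.
    hours, minutes, seconds, frames, subframes = (
        int(group) if group is not None else 0 for group in match.groups()
    )
    if minutes > 59 or seconds > 59 or frames > SMPTE_MAX_FRAME[metric]:
        raise ValueError(f"field out of range: {rest!r}")
    return (metric, (hours, minutes, seconds, frames, subframes))


print(parse_clip_value("smpte=10:12:33:20"))
print(parse_clip_value("10s"))
```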

When used in conjunction with the timing attributes from the SMIL Timing Module, this attribute is applied before any SMIL Timing Module attributes.

clipBegin may also be expressed as clip-begin for compatibility with SMIL 1.0. Software supporting the SMIL 2.0 Language Profile must be able to handle both clipBegin and clip-begin, whereas software supporting only the SMIL MediaClipping module only needs to support clipBegin. If an element contains both a clipBegin and a clip-begin attribute, then clipBegin takes precedence over clip-begin.

Example:

<audio src="radio.wav" clip-begin="5s" clipBegin="10s" />

The clip begins at second 10 of the audio, and not at second 5, since the clip-begin attribute is ignored. A strict SMIL 1.0 implementation will start the clip at second 5 of the audio, since the clipBegin attribute will not be recognized by that implementation. See Changes to SMIL 1.0 Media Object Attributes for more discussion on this topic.
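
The precedence rule can be expressed as a simple lookup with a fallback. The Python sketch below applies it to the element from the example, using the standard xml.etree.ElementTree module. The helper name effective_clip_attr is hypothetical, and the sketch models a player that supports both spellings.

```python
import xml.etree.ElementTree as ET


def effective_clip_attr(element, new_name, old_name):
    """Return the clip value to use, or None if neither is present."""
    value = element.get(new_name)
    if value is not None:
        return value  # clipBegin / clipEnd takes precedence
    return element.get(old_name)  # fall back to the SMIL 1.0 spelling


audio = ET.fromstring(
    '<audio src="radio.wav" clip-begin="5s" clipBegin="10s" />'
)
print(effective_clip_attr(audio, "clipBegin", "clip-begin"))  # prints 10s
```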

clipEnd (clip-end)
The clipEnd attribute specifies the end of a sub-clip of a continuous media object as offset from the start of the media object. This offset is measured in normal media playback time from the beginning of the media. It uses the same attribute value syntax as the clipBegin attribute.
If the value of the clipEnd attribute exceeds the duration of the media object, the value is ignored, and the clip end is set equal to the effective end of the media object. clipEnd may also be expressed as clip-end for compatibility with SMIL 1.0. Software supporting the SMIL 2.0 Language Profile must be able to handle both clipEnd and clip-end, whereas software supporting only the SMIL MediaClipping module only needs to support clipEnd. If an element contains both a clipEnd and a clip-end attribute, then clipEnd takes precedence over clip-end. When used in conjunction with the timing attributes from the SMIL Timing Module, this attribute is applied before any SMIL Timing Module attributes.

See Changes to SMIL 1.0 Media Object Attributes for more discussion on this topic.

7.6 SMIL MediaClipMarkers Module

This section defines the attribute extensions that make up the SMIL MediaClipMarkers Module definition. Languages implementing elements and attributes found in the MediaClipMarkers module must implement all elements and attributes defined below, as well as BasicMedia and MediaClipping.

7.6.1 MediaClipMarkers Attribute Extensions

clipBegin Media Marker attribute extension
Used to define a clip using named time points in a media object, rather than using clock values or SMPTE values. The metric specifier is "marker", and the marker value is a URI (see [URI] ). The URI is relative to the src attribute, rather than to the document root or the XML base of the SMIL document.

Clip-value-MediaClipMarkers ::= Clip-value-MediaClipping |
                      "marker" "=" URI-reference
   /* "URI-reference" is defined in  [URI]  */

Example: Assume that a recorded radio transmission consists of a sequence of songs, which are separated by announcements by a disk jockey. The audio format supports marked time points, and the beginning of each song or announcement with number X is marked as songX or djX respectively. To extract the first song using the "marker" metric, the following audio media element can be used:

<audio clipBegin="marker=#song1" clipEnd="marker=#dj1" />
clipEnd Media Marker attribute extension
clipEnd media markers use the same attribute value syntax as the clipBegin media marker attribute extension. For the complete description, see clipBegin media marker attribute extension.
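
Because a marker URI is relative to the src attribute, resolving it amounts to ordinary relative URI resolution with the src value as the base. The Python sketch below shows this with urljoin from the standard urllib.parse module. The function name resolve_marker and the src URI are hypothetical (the audio example above does not show a src value), and how the resolved marker is then located inside the media is format-specific and not modeled here.

```python
from urllib.parse import urljoin


def resolve_marker(src, clip_value):
    """Resolve a "marker=" clip value against the src URI."""
    metric, sep, reference = clip_value.partition("=")
    if not sep or metric != "marker":
        raise ValueError(f"not a marker clip value: {clip_value!r}")
    return urljoin(src, reference)


# Hypothetical src URI, chosen only for illustration.
print(resolve_marker("http://www.example.com/radio.rm", "marker=#song1"))
```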

7.7 SMIL BrushMedia Module

This section defines the elements and attributes that make up the SMIL BrushMedia Module definition. Languages implementing elements and attributes found in the BrushMedia module must implement all elements and attributes defined below.

7.7.1 The brush element

The brush element is a lightweight media object element which allows an author to paint a solid color or other pattern in place of a media object. Thus, all attributes associated with media objects may also be applied to brush. Since all information about the media object is specified in the attributes of the element itself, the src attribute is ignored, and thus is not required.

Attribute definitions
color
The use and definition of this attribute are identical to the "background-color" property in the CSS2 specification.

7.7.2 Integration Requirements

Profiles including the BrushMedia module must provide semantics for using a color attribute value of inherit on the brush element. Because inherit doesn't make sense in all contexts, a profile may choose to prohibit the use of this value. The value of inherit is prohibited on the color attribute of the brush element for profiles that do not otherwise define these semantics.

7.8 SMIL MediaAccessibility Module

This section defines the elements and attributes that make up the SMIL MediaAccessibility Module definition. Languages implementing elements and attributes found in the MediaAccessibility module must implement all elements and attributes defined below, as well as MediaDescription.

7.8.1 MediaAccessibility Attributes

Attribute definitions
alt
For user agents that cannot display a particular media object, this attribute specifies alternate text. alt may be displayed in addition to the media, or instead of media when the user has configured the user agent to not display the given media type.

It is strongly recommended that all media object elements have an "alt" attribute with a brief, meaningful description. Authoring tools should ensure that no element can be introduced into a SMIL document without this attribute.

The value of this attribute is a CDATA text string.

longdesc
This attribute specifies a link ( [URI] ) to a long description of a media object. This description should supplement the short description provided using the alt attribute or the abstract attribute. When the media object has associated hyperlinked content, this attribute should provide information about the hyperlinked content.

readIndex
This attribute specifies the position of the current element in the order in which longdesc, title and alt text are read aloud by assistive devices (such as screen readers) for the current document. User agents should ignore leading zeros. The default value is 0.

Elements that contain alt, title or longdesc attributes are read by the assistive technology according to the following rules:

  • Those elements that assign a positive value to the readIndex attribute are read out first. Navigation proceeds from the element with the lowest readIndex value to the element with the highest value. Values need not be sequential nor must they begin with any particular value. Elements that have identical readIndex values should be read out in the order they appear in the character stream of the document.
  • Those elements that assign it a value of "0" are read out in the order they appear in the character stream of the document.
  • Elements in a switch element that have test attributes which evaluate to "false" are not read out.

Example

<par>
  <video id="carvideo" src="car.rm" region="videoregion" title="Car video"
         alt="Illustration of relativistic time dilation and length 
              contraction." 
         longdesc="carvideodesc.html" readIndex="3"/>
  <audio id="caraudio" src="caraudio.rm" region="videoregion" 
         title="Car presentation voiceover" begin="bar.begin"/>
  <animation id="cardiagram" src="car.svg" region="animregion" 
         title="Diagram of the car" readIndex="2"/>
  <img id="scvad" src="scv.png" region="videoregion" 
         title="Advertisement for Sugar Coated Vegetables"
         readIndex="1"/>
</par>

In this example, an assistive device that is presenting titles should present the "scvad" element title first (having the lowest readIndex value of "1"), followed by the "cardiagram" title, followed by the "carvideo" element title, and finally present the "caraudio" element title (having an implicit readIndex value of "0").
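
The ordering rules can be sketched in code as a stable sort. The Python fragment below reproduces the order described for this example, using the standard xml.etree.ElementTree module. It is an illustration only: the function name reading_order is hypothetical, the markup is a trimmed copy of the example, switch elements and test attributes are not modeled, and treating a non-positive readIndex like "0" is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

SMIL_FRAGMENT = """
<par>
  <video id="carvideo" src="car.rm" title="Car video"
         longdesc="carvideodesc.html" readIndex="3"/>
  <audio id="caraudio" src="caraudio.rm" title="Car presentation voiceover"/>
  <animation id="cardiagram" src="car.svg" title="Diagram of the car"
         readIndex="2"/>
  <img id="scvad" src="scv.png"
         title="Advertisement for Sugar Coated Vegetables" readIndex="1"/>
</par>
"""


def read_index(element):
    """Return readIndex as an integer; int() ignores leading zeros."""
    return int(element.get("readIndex", "0"))


def reading_order(root):
    """Return element ids in the order an assistive device reads them."""
    # Only elements with alt, title or longdesc are read; iter() walks
    # the tree in document order.
    described = [
        element for element in root.iter()
        if any(element.get(name) is not None
               for name in ("alt", "title", "longdesc"))
    ]
    positive = [e for e in described if read_index(e) > 0]
    # Assumption: values below 1 are handled like the default of 0.
    rest = [e for e in described if read_index(e) <= 0]
    # sorted() is stable, so equal readIndex values keep document order.
    ordered = sorted(positive, key=read_index) + rest
    return [e.get("id") for e in ordered]


print(reading_order(ET.fromstring(SMIL_FRAGMENT)))
```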

7.9 SMIL MediaDescription Module

This section defines the elements and attributes that make up the SMIL MediaDescription Module definition. Languages implementing elements and attributes found in the MediaDescription module must implement all elements and attributes defined below.

7.9.1 MediaDescription Attributes

Attribute definitions
abstract
A brief description of the content contained in the element. Unlike alt, this attribute is generally not displayed as alternate content to the media object. It is typically used as a description when table of contents information is generated from a SMIL presentation, and typically contains more information than would be advisable to put in an alt attribute.

This attribute is deprecated in favor of using appropriate SMIL metadata markup in RDF. For example, this attribute maps well to the "description" attribute as defined by the Dublin Core Metadata Initiative [DC] .

author
The name of the author of the content contained in the element.

The value of this attribute is a CDATA text string.

copyright
The copyright notice of the content contained in the element.

The value of this attribute is a CDATA text string.

title
The title attribute as defined in the SMIL Structure module. It is strongly recommended that all media object elements have a title attribute with a brief, meaningful description. Authoring tools should ensure that no element can be introduced into a SMIL document without this attribute.
xml:lang
Used to identify the natural or formal language for the element. For a complete description, see section 2.12 Language Identification of [XML10].

xml:lang differs from the system-language test attribute in one important respect. xml:lang provides information about the content's language independent of what implementations do with the information, whereas system-language is a test attribute with specific associated behavior (see system-language in SMIL Content Control Module for details).

7.10 Appendices

7.10.1 Appendix A: Changes to SMIL 1.0 Media Object Attributes

clipBegin, clipEnd, clip-begin, clip-end

With regard to the clipBegin/clip-begin and clipEnd/clip-end attributes, SMIL 2.0 defines the following changes to the syntax defined in SMIL 1.0:

Handling of new clipBegin/clipEnd syntax in SMIL 1.0 software

Using attribute names with hyphens such as clip-begin and clip-end is problematic when using a scripting language and the DOM to manipulate these attributes. Therefore, this specification adds the attribute names clipBegin and clipEnd as an equivalent alternative to the SMIL 1.0 clip-begin and clip-end attributes. The attribute names with hyphens are deprecated.

Authors can use two approaches for writing SMIL 2.0 presentations that use the new clipping syntax and functionality ("marker", default metric) defined in this specification but can still be handled by SMIL 1.0 software. First, authors can use non-hyphenated versions of the new attributes that use the new functionality, and add SMIL 1.0 conformant clipping attributes later in the text.

Example:

<audio src="radio.wav" clipBegin="marker=song1" clipEnd="marker=moderator1" 
       clip-begin="npt=0s" clip-end="npt=3:50" />

SMIL 1.0 players implementing the recommended extensibility rules of SMIL 1.0 [SMIL10] will ignore the clip attributes using the new functionality, since they are not part of SMIL 1.0. SMIL 2.0 players, in contrast, will ignore the clip attributes using SMIL 1.0 syntax, since they occur later in the text.

The second approach is to use the following steps:

  1. Add a "system-required" test attribute to media object elements using the new functionality. The value of the "system-required" attribute would correspond to a namespace prefix whose namespace URI ( [URI] ) points to a SMIL specification which integrates the new functionality.
  2. Add an alternative version of the media object element that conforms to SMIL 1.0
  3. Include these two elements in a "switch" element

Example:

<smil xmlns:smil2="http://www.w3.org/2001/SMIL20/Language">
...
<switch>
  <audio src="radio.wav" clipBegin="marker=song1" clipEnd="marker=moderator1"  
   system-required="smil2" />
  <audio src="radio.wav" clip-begin="npt=0s" clip-end="npt=3:50" />
</switch>

New Accessibility Attributes

readIndex
Allows explicit ordering for controlling assistive technology.

New Advanced Media Attributes

mediaRepeat
The mediaRepeat attribute was added to provide better timing control over media with intrinsic repeat behavior (such as animated GIFs).
erase
Provides a way for visual media to remain visible throughout the duration of a presentation by overriding the default erase behavior.

7.10.2 Appendix B: Changes to SMIL 1.0 Media Object Elements

New child elements for media objects

SMIL 1.0 allowed only anchor as a child element of a media element. In addition to anchor (now defined in the Linking module), the param element is now allowed as a child of a SMIL media object. Other new children may also be defined by the host language.

The param element

A new param element provides a generalized mechanism to attach media-specific attributes to media objects.

The brush element

A new brush element allows the specification of solid color media objects with no associated media.

