
Commit 3c2152b

Add formal definition of Overviews construct and multiscales property in CDM-based model
- Introduced the Overviews construct as a hierarchical CDM-based model for representing multiscale data.
- Defined conceptual elements: OverviewSet, OverviewLevel, and optional zoom_level.
- Clarified structural layout using CDM entities (groups, variables, attributes).
- Added formal schema for the multiscales property describing overview hierarchy, levels, and alignment.
- Updated examples to include coordinate variables, auxiliary variables, and parent-level multiscales metadata.
- Improved wording for clarity, precision, and consistency with CDM terminology.
1 parent 711d426 commit 3c2152b


4 files changed: +386 -145 lines changed

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.org/schemas/multiscales.schema.json",
  "title": "Multiscales Schema",
  "description": "Defines the structure of the 'multiscales' attribute for describing multiscale hierarchies in an OverviewSet.",
  "type": "object",
  "required": ["version", "layout"],
  "properties": {
    "version": {
      "type": "string",
      "description": "Version identifier of the multiscales schema (e.g., '1.0')."
    },
    "resampling_method": {
      "type": "string",
      "description": "Default resampling or aggregation method applied across all overview levels.",
      "enum": [
        "nearest",
        "average",
        "bilinear",
        "cubic",
        "cubic_spline",
        "lanczos",
        "mode",
        "max",
        "min",
        "med",
        "sum",
        "q1",
        "q3",
        "rms",
        "gauss"
      ],
      "default": "nearest"
    },
    "tile_matrix_ref": {
      "description": "Reference to an external grid or tiling definition (e.g., OGC Tile Matrix Set identifier or URI).",
      "type": ["string", "object"]
    },
    "layout": {
      "type": "array",
      "description": "Ordered list of Overview Level objects defining the hierarchy from highest to lowest resolution.",
      "minItems": 1,
      "items": {
        "$ref": "#/$defs/overviewLevel"
      }
    }
  },
  "$defs": {
    "overviewLevel": {
      "title": "Overview Level Object",
      "type": "object",
      "required": ["id"],
      "properties": {
        "id": {
          "type": "string",
          "description": "Unique identifier for this overview level (e.g., 'L0', 'L1', 'L2')."
        },
        "path": {
          "type": "string",
          "description": "Logical path identifying the overview level's location within the dataset hierarchy. If omitted, the level is assumed to be a direct child of the OverviewSet, and 'id' is used as the relative path."
        },
        "derived_from": {
          "type": "string",
          "description": "Identifier of another overview level from which this level was derived."
        },
        "factors": {
          "type": "array",
          "description": "Numeric decimation factors per dimension (e.g., [2, 2] for 2× downsampling in X and Y).",
          "items": {
            "type": "number"
          },
          "minItems": 1
        },
        "resampling_method": {
          "type": "string",
          "description": "Resampling or aggregation method specific to this level. If not provided, the global 'resampling_method' applies.",
          "enum": [
            "nearest",
            "average",
            "bilinear",
            "cubic",
            "cubic_spline",
            "lanczos",
            "mode",
            "max",
            "min",
            "med",
            "sum",
            "q1",
            "q3",
            "rms",
            "gauss"
          ]
        }
      },
      "additionalProperties": true
    }
  },
  "additionalProperties": true
}
Lines changed: 235 additions & 0 deletions
@@ -0,0 +1,235 @@
=== Overviews

==== Introduction

*Overviews* are downscaled representations of gridded data designed to optimise visualisation and scalable access to large datasets.
Overviews, also known as *multiscale pyramids*, provide lower-resolution versions of the same variables, enabling rapid display, efficient zooming, and progressive data exploration.
Multiple overview levels may exist, each representing the same data at a coarser spatial resolution.

The *Overviews* construct extends the Common Data Model (CDM) by defining a hierarchical organisation of groups and variables that represent data at multiple scales. The *OverviewSet* is self-described by attributes defined at the parent group level, which declare the relationships between its levels and ensure consistent, interoperable multiscale representation within the CDM framework.

==== Purpose and Scope

The *Overviews* extension enables scalable access to multidimensional gridded data, particularly for geospatial and remote sensing applications. It supports:

- Progressive rendering and visualisation at multiple resolutions
- Efficient data transfer for large datasets
- Multi-resolution analysis in analytical or cloud environments
- Consistent representation of raster and data cube structures across scales

The construct is format-agnostic and may be implemented in any CDM-compliant structure regardless of physical encoding (e.g. Zarr, NetCDF, GeoTIFF), although this specification specifically targets Zarr.

==== Conceptual Model

The Overviews construct defines a multiscale hierarchy applied to the <<variable-group,variable group>>, i.e., the CDM group containing the data variables and their associated metadata, and optionally other related variables that share identical dimensions and coordinate systems.

Example: Typical CDM Group Structure without overviews

```
Group: variable_group/
├── variable1
├── variable2
├── aux_variable
├── coordinates1
├── coordinates2
└── Attributes
```

Each overview level provides a reduced-resolution representation of the same variables. This approach avoids redundancy by describing the hierarchy once for the entire group rather than for individual variables, ensuring consistency and concise metadata.

All overview levels are semantically equivalent, differing only in resolution, array extent, or sampling density.
There is no requirement for a single base or reference level; each level may serve as an entry point depending on the application.

==== Model Components

The *Overviews* construct defines the conceptual elements used to represent multiscale data within the GeoZarr data model. It extends the existing CDM concept to support datasets provided at multiple spatial resolutions.

The construct introduces the following conceptual elements:

[cols="1,3"]
|===
|Element |Definition

|`OverviewSet` |A *group* composed of multiple *OverviewLevels*, each containing equivalent variables defined over the same coordinate system and dimensions but sampled at different spatial resolutions. The *OverviewSet* defines the complete multiscale hierarchy.

|`OverviewLevel` |A single resolution level within an *OverviewSet*. Each level replicates the structure and semantics of the others, differing only in resolution or extent.

|`zoom_level` |An optional ordered identifier used to distinguish overview levels (e.g. `0`, `1`, `2` or symbolic identifiers). The ordering indicates relative resolution but does not imply dependency.
|===

The *OverviewSet* retains the same structure as a nominal variable group, including the associated metadata and auxiliary variables, so that the multiscale hierarchy preserves the complete descriptive context of the original dataset.

====
*Note:* The native-resolution data **MAY** be stored directly in the *OverviewSet* group rather than in a dedicated *OverviewLevel* subgroup.

This layout is permitted for backward compatibility with existing datasets that were later augmented with multiscale metadata.
However, it is **not recommended**, as it may lead to inconsistent hierarchies or interpretation issues in client applications expecting all resolution levels to be represented as explicit subgroups.
====

==== Structural Layout

The *Overviews* construct is expressed within the Common Data Model (CDM) framework, which represents datasets through *groups*, *variables*, and *attributes*.

An *OverviewSet* corresponds to a CDM *group* containing multiple *OverviewLevels*, each representing the same data variables at different spatial resolutions.

Within this structure:

- **Groups** define the hierarchical organisation of the multiscale data.
  The *OverviewSet* acts as the parent group, while each *OverviewLevel* is represented as a child group that contains variables with identical names, dimensions, and coordinate definitions. The *OverviewSet* group may also include auxiliary variables and metadata consistent with the structure of a nominal CDM group.

- **Variables** represent the same physical or derived quantities across resolutions.
  Each level contains the same set of data variables and coordinate variables (for example, `x` and `y`) that describe grid geometry at that resolution.

- **Attributes** describe both the dataset metadata and the relationships between overview levels.
  They may appear at the *OverviewSet* or *OverviewLevel* level and are used to define the structure and interpretation of the hierarchy.

The complete description of the hierarchy is provided by the `multiscales` property, an attribute of the *OverviewSet* group that lists the available overview levels, their identifiers, and any associated information such as resampling methods or grid references.

===== OverviewSet CDM-Based Representation

The following example illustrates the structural organisation of an *OverviewSet* using Common Data Model (CDM) constructs:

```
Group: reflectance/                  # OverviewSet (Group)
├── Attribute: multiscales           # Metadata describing the multiscale hierarchy
├── Attribute: spatial_ref = "EPSG:32633"
├── Auxiliary Variable: quality_flag
├── Group: L0/                       # OverviewLevel (highest or nominal resolution)
│   ├── Variable: b01
│   ├── Variable: b02
│   ├── Variable: b03
│   ├── Coordinate Variable: x
│   └── Coordinate Variable: y
├── Group: L1/                       # OverviewLevel (coarser resolution)
│   ├── Variable: b01
│   ├── Variable: b02
│   ├── Variable: b03
│   ├── Coordinate Variable: x
│   └── Coordinate Variable: y
└── Group: L2/                       # OverviewLevel (coarsest resolution)
    ├── Variable: b01
    ├── Variable: b02
    ├── Variable: b03
    ├── Coordinate Variable: x
    └── Coordinate Variable: y
```

In this representation:

- The **parent group** (`reflectance/`) corresponds to the *OverviewSet* and defines the common spatial, semantic, and organisational context for all levels.
- Each **child group** (`L0`, `L1`, `L2`) represents an *OverviewLevel*, implemented as a CDM *group* containing variables that share the same names, coordinate variables, and metadata conventions.
- **Variables** (`b01`, `b02`, etc.) represent equivalent physical quantities at different spatial resolutions.
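
The requirement that every *OverviewLevel* exposes the same variable names can be checked mechanically. The following is a minimal sketch, assuming the hierarchy has already been read into a plain Python mapping of level identifier to variable names; the `overview_set` structure and the `find_inconsistent_levels` helper are hypothetical and not part of this specification.

```python
# Hypothetical in-memory stand-in for the OverviewSet shown above:
# each OverviewLevel identifier maps to the set of variable names it contains.
overview_set = {
    "L0": {"b01", "b02", "b03", "x", "y"},
    "L1": {"b01", "b02", "b03", "x", "y"},
    "L2": {"b01", "b02", "b03", "x", "y"},
}


def find_inconsistent_levels(levels):
    """Return {level_id: (missing, extra)} for levels whose variable names
    differ from those of the first level, used only as a comparison baseline."""
    if not levels:
        return {}
    reference = next(iter(levels.values()))
    problems = {}
    for level_id, names in levels.items():
        missing = reference - names
        extra = names - reference
        if missing or extra:
            problems[level_id] = (sorted(missing), sorted(extra))
    return problems


print(find_inconsistent_levels(overview_set))
```

An empty result means all levels share the same variable names. The first level serves only as a comparison baseline here; it is not treated as a privileged reference level.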

==== OverviewSet Metadata

The `multiscales` property is an attribute of the *OverviewSet* group that formally defines the organisation of the multiscale hierarchy.
It provides a structured description of all overview levels, their ordering, and the resampling or aggregation relationships between them.

The property SHALL be encoded as a structured object conforming to the JSON Schema available at:
link:../schemas/multiscales.schema.json[Multiscales JSON Schema]

It defines global attributes applying to the entire hierarchy and a `layout` array that lists all overview levels in order of resolution.

===== Multiscales Fields

[cols="1,3"]
|===
|Field |Definition

|`version` |**Type:** string.
Version identifier of the multiscales schema.
This field SHALL be present to indicate the version of the schema used.
Example: `"1.0"`

|`resampling_method` |**Type:** string.
(Optional) Default resampling or aggregation method applied across all levels.
If omitted, resampling may be defined per level.
Allowed values: `"nearest"`, `"average"`, `"bilinear"`, `"cubic"`, `"cubic_spline"`, `"lanczos"`, `"mode"`, `"max"`, `"min"`, `"med"`, `"sum"`, `"q1"`, `"q3"`, `"rms"`, `"gauss"`.
Default: `"nearest"`.

|`tile_matrix_ref` |**Type:** string or object.
(Optional) Reference to an external grid or tiling definition (e.g. an OGC Tile Matrix Set identifier or URI) that describes the spatial structure and scale relationships.

|`layout` |**Type:** array of <<overview-level-object,Overview Level Object>>.
A mandatory array describing each *OverviewLevel* within the hierarchy, ordered from highest to lowest resolution.
Each entry defines the group name and optional derivation information.
|===
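
The root-level constraints in the table above can be approximated with a few standard-library checks. This is a minimal sketch, not a substitute for validating against the JSON Schema itself; the `check_multiscales_root` helper is a hypothetical name introduced for illustration.

```python
# Enumeration of resampling methods, copied from the multiscales schema.
ALLOWED_RESAMPLING = {
    "nearest", "average", "bilinear", "cubic", "cubic_spline", "lanczos",
    "mode", "max", "min", "med", "sum", "q1", "q3", "rms", "gauss",
}


def check_multiscales_root(multiscales):
    """Return a list of human-readable problems found in the root object."""
    problems = []
    if not isinstance(multiscales.get("version"), str):
        problems.append("'version' is required and must be a string")
    layout = multiscales.get("layout")
    if not isinstance(layout, list) or len(layout) < 1:
        problems.append("'layout' is required and must be a non-empty array")
    method = multiscales.get("resampling_method")
    if method is not None:
        if not isinstance(method, str) or method not in ALLOWED_RESAMPLING:
            problems.append(f"unknown resampling_method: {method!r}")
    ref = multiscales.get("tile_matrix_ref")
    if ref is not None and not isinstance(ref, (str, dict)):
        problems.append("'tile_matrix_ref' must be a string or an object")
    return problems


print(check_multiscales_root({"version": "1.0", "layout": [{"id": "L0"}]}))
print(check_multiscales_root({"layout": [], "resampling_method": "median"}))
```

The first call reports no problems; the second reports the missing `version`, the empty `layout`, and the unrecognised method name.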

[[overview-level-object]]
===== Overview Level Object

Each object in the `layout` array describes one *OverviewLevel* within the multiscale hierarchy.
It defines a unique identifier for the level, its location within the dataset hierarchy, and optionally its derivation from another level.

[cols="1,3"]
|===
|Field |Definition

|`id` |**Type:** string.
Required unique identifier for this overview level.
The identifier SHALL be stable within the dataset and MAY be used for reference in other metadata fields.
Example: `"L0"`, `"L1"`, `"L2"`.

|`path` |**Type:** string.
(Optional) Logical path identifying the location of the overview level within the dataset hierarchy.
If omitted, the level is assumed to be located as a *direct child group* of the *OverviewSet* and the `id` value SHALL be used as the default relative path.
Example: `"L0"`, `"overviews/L2"`.

|`derived_from` |**Type:** string.
(Optional) Identifier of another overview level from which this level was derived.
Used to express lineage or dependency relationships between levels.
The value SHALL correspond to an existing `id` entry in the same `layout` array.

|`factors` |**Type:** array of number.
(Optional) Numeric decimation factors per dimension (e.g. `[2, 2]` for a 2× reduction in X and Y).
Used to describe the scaling applied to generate this level from its source.

|`resampling_method` |**Type:** string.
(Optional) Resampling or aggregation method specific to this level.
If not defined, the method specified in the root `multiscales.resampling_method` field applies.
|===
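
The defaulting rules for `path` and `resampling_method` can be illustrated with a short sketch. The `resolve_levels` helper is hypothetical. Falling back to the schema default `"nearest"` when neither the level nor the root declares a method is one possible reading of the field definitions, not a rule stated by this specification.

```python
SCHEMA_DEFAULT_RESAMPLING = "nearest"  # 'default' value declared in the schema


def resolve_levels(multiscales):
    """Return a list of (id, path, resampling_method) with defaults applied."""
    # Assumption: the schema default applies when the root omits the field.
    root_method = multiscales.get("resampling_method", SCHEMA_DEFAULT_RESAMPLING)
    resolved = []
    for level in multiscales["layout"]:
        # 'id' is the default relative path when 'path' is omitted.
        path = level.get("path", level["id"])
        # A per-level method overrides the root-level method.
        method = level.get("resampling_method", root_method)
        resolved.append((level["id"], path, method))
    return resolved


example = {
    "version": "1.0",
    "resampling_method": "average",
    "layout": [
        {"id": "L0"},
        {"id": "L1", "derived_from": "L0", "factors": [2, 2]},
        {"id": "L2", "path": "overviews/L2", "resampling_method": "mode"},
    ],
}
for row in resolve_levels(example):
    print(row)
```

Here `L0` and `L1` resolve to paths equal to their identifiers and inherit `"average"`, while `L2` keeps its explicit path and method.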

// Group and from_group directly reference the data model structure itself. Path provides a clearer and more neutral way to describe these fields, keeping them referential without binding them to data-model-specific structures.

===== Example Representation

The following JSON example conforms to the `multiscales` schema:

```json
{
  "version": "1.0",
  "resampling_method": "average",
  "tile_matrix_ref": "OGC:WMT:1.0:WebMercatorQuad",
  "layout": [
    {
      "id": "L0",
      "path": "L0"
    },
    {
      "id": "L1",
      "path": "L1",
      "derived_from": "L0",
      "factors": [2, 2],
      "resampling_method": "average"
    },
    {
      "id": "L2",
      "path": "L2",
      "derived_from": "L1",
      "factors": [2, 2],
      "resampling_method": "average"
    }
  ]
}
```

**Notes:**

* Each `id` uniquely identifies an overview level.
* `path` points to the logical container for that level (may be omitted if it is a direct child of the `OverviewSet`).
* `derived_from` expresses lineage between levels.
* `factors` defines the downscaling ratios applied to generate a level from its source.
* `resampling_method` can be defined per level or inherited from the global one.
* `tile_matrix_ref` provides context for external referencing.
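
Because `factors` are expressed relative to the level named in `derived_from`, a client that needs the scale of a level relative to the start of its lineage has to multiply the factors along the chain. The sketch below shows one way to do this while also checking that identifiers are unique and that every `derived_from` value refers to an existing `id`. The `cumulative_factors` helper is hypothetical, and rejecting derived levels without `factors` is a choice made for this sketch, since the schema leaves `factors` optional.

```python
def cumulative_factors(layout):
    """Map each level id to its per-dimension factors relative to the level at
    the start of its 'derived_from' chain (None for levels that are not derived)."""
    by_id = {}
    for level in layout:
        if level["id"] in by_id:
            raise ValueError(f"duplicate id: {level['id']!r}")
        by_id[level["id"]] = level

    def resolve(level_id, seen):
        if level_id in seen:
            raise ValueError(f"cyclic derived_from chain at {level_id!r}")
        level = by_id[level_id]
        source = level.get("derived_from")
        if source is None:
            return None  # start of a lineage chain: no scaling to accumulate
        if source not in by_id:
            raise ValueError(f"{level_id!r} is derived from unknown level {source!r}")
        if "factors" not in level:
            raise ValueError(f"{level_id!r} is derived but declares no factors")
        own = level["factors"]
        parent = resolve(source, seen | {level_id})
        if parent is None:
            return list(own)
        if len(parent) != len(own):
            raise ValueError(f"{level_id!r} has a different number of dimensions")
        # Multiply factors dimension by dimension along the chain.
        return [p * f for p, f in zip(parent, own)]

    return {level_id: resolve(level_id, frozenset()) for level_id in by_id}


layout = [
    {"id": "L0", "path": "L0"},
    {"id": "L1", "path": "L1", "derived_from": "L0", "factors": [2, 2]},
    {"id": "L2", "path": "L2", "derived_from": "L1", "factors": [2, 2]},
]
print(cumulative_factors(layout))
```

For the example layout, `L1` is reduced by `[2, 2]` and `L2` by `[4, 4]` relative to `L0`.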
