Commit dfb2300

Merge pull request #1091 from radiantearth/clarify-root
updates to make clear root can be catalog or collection
2 parents 091af68 + e257928 commit dfb2300

5 files changed, +44 -35 lines changed

best-practices.md

Lines changed: 18 additions & 14 deletions
@@ -71,8 +71,8 @@ So to enable all the great web tools (like [stacindex.org](http://stacindex.org)
7171
[Google Cloud Storage](https://cloud.google.com/storage/docs/cross-origin), or [Apache Server](https://enable-cors.org/server_apache.html).
7272
Many more are listed on [enable-cors.org](https://enable-cors.org/server.html). We recommend enabling CORS for all requests ('\*'),
7373
so that diverse online tools can access your data. If you aren't sure if your server has CORS enabled you can use
74-
[test-cors.org](https://www.test-cors.org/). Enter the URL of your STAC root [Catalog](catalog-spec/catalog-spec.md) JSON
75-
and make sure it gets a response.
74+
[test-cors.org](https://www.test-cors.org/). Enter the URL of your STAC root [Catalog](catalog-spec/catalog-spec.md) or
75+
[Collection](collection-spec/collection-spec.md) JSON and make sure it gets a response.
7676

7777
### STAC on the Web
7878

@@ -81,7 +81,7 @@ surprised that there is nothing about HTML in the entire specification. This is
8181
should be on web pages without ending up with very bad looking pages. But the importance of having web-accessible versions
8282
of every STAC Item is paramount.
8383

84-
The main recommendation is to have an HTML page for every single STAC Item and Catalog. They should be visually pleasing,
84+
The main recommendation is to have an HTML page for every single STAC Item, Catalog and Collection. They should be visually pleasing,
8585
crawlable by search engines and ideally interactive. The current best practice is to use a tool in the STAC ecosystem called
8686
[STAC Browser](https://github.com/radiantearth/stac-browser/). It can crawl most any valid STAC implementation and generate unique web
8787
pages for each Item and Catalog (or Collection). While it has a default look and feel, the design can easily be
@@ -393,6 +393,10 @@ file that just has the bands needed for display
393393

394394
## Catalog & Collection Practices
395395

396+
*Note: This section uses the term 'Catalog' (with an uppercase C) to refer to the JSON entity specified in the
397+
[Catalog spec](catalog-spec/catalog-spec.md), and 'catalog' (with a lowercase c) to refer to any full STAC implementation,
398+
which can be any mix of Catalogs, Collections and Items.*
399+
396400
### Static and Dynamic Catalogs
397401

398402
As mentioned in the main [overview](overview.md), there are two main types of catalogs - static
@@ -446,7 +450,7 @@ providers, and users could browse down to both. The leaf Items should just be li
446450

447451
### Catalog Layout
448452

449-
Creating a catalog involves a number of decisions as to what folder structure to use to represent sub-catalogs, items
453+
Creating a catalog involves a number of decisions as to what folder structure to use to represent sub-catalogs, Items
450454
and assets, and how to name them. The specification leaves this totally open, and you can link things as you want. But
451455
it is recommended to be thoughtful about the organization of sub-catalogs, putting them into a structure that a person
452456
might reasonably browse (since they likely will with [STAC on the Web](#stac-on-the-web) recommendations). For example
@@ -463,14 +467,14 @@ if you follow these recommendations.
463467
1. Root documents (Catalogs / Collections) should be at the root of a directory tree containing the static catalog.
464468
2. Catalogs should be named `catalog.json` and Collections should be named `collection.json`.
465469
3. Items should be named `<id>.json`.
466-
4. Sub-Catalogs should be stored in subdirectories of their parent
470+
4. Sub-Catalogs or sub-Collections should be stored in subdirectories of their parent
467471
(and only 1 subdirectory deeper than a document's parent, e.g. `.../sample/sub1/catalog.json`).
468-
5. Items should be stored in subdirectories of their parent Catalog.
469-
This means that each Item and its assets are contained in a unique subdirectory.
470-
6. Limit the number of Items in a Catalog or sub-Catalog, grouping / partitioning as relevant to the dataset.
472+
5. Items should be stored in subdirectories of their parent Catalog or Collection.
473+
This means that each Item and its assets are contained in a unique subdirectory.
474+
6. Limit the number of Items in a Catalog or Collection, grouping / partitioning as relevant to the dataset.
471475
7. Use structural elements (Catalog and Collection) consistently across each 'level' of your hierarchy.
472476
For example, if levels 2 and 4 of the hierarchy only contain Collections,
473-
don't add a Catalog at levels 2 and 4.
477+
don't add a Catalog at levels 2 and 4.
474478

475479
#### Dynamic Catalog Layout
476480

@@ -483,7 +487,7 @@ different sub-catalog organization structures. For example one catalog could div
483487
by providers, and users could browse down to both. The leaf Items should just be linked to in a single canonical location
484488
(or at least use a rel link that indicates the location of the canonical one). It is recommended that dynamic catalogs
485489
provide multiple 'views' to allow users to navigate in a way that makes sense to them, providing multiple 'sub-catalogs'
486-
from the root Catalog that enable different paths to browse (country/state, date/time, constellation/satellite, etc). But the
490+
from the root that enable different paths to browse (country/state, date/time, constellation/satellite, etc). But the
487491
canonical 'rel' link should be used to designate the primary location of the Item to search engine crawlers.
488492

489493
#### Mixing STAC Versions
@@ -608,9 +612,9 @@ implement it.
608612
#### Relative Published Catalog
609613

610614
This is a self-contained catalog as described above, except it includes an absolute `self` link at
611-
the root catalog, to identify its online location. This is designed so that a self-contained catalog (of either type, with its
615+
the root to identify its online location. This is designed so that a self-contained catalog (of either type, with its
612616
assets or just metadata) can be 'published' online
613-
by just adding one field (the self link) to its root catalog. All the other links should remain the same. The resulting catalog
617+
by just adding one field (the self link) to its root (Catalog or Collection). All the other links should remain the same. The resulting catalog
614618
is no longer compliant with the self-contained catalog recommendations, but instead transforms into a 'relative published catalog'.
615619
With this, a client may resolve Item and sub-catalog self links by traversing parent and root links, but requires reading
616620
multiple sources to achieve this.
@@ -632,8 +636,8 @@ a number of the common official relations that are used in production STAC imple
632636
| alternate | It is recommended that STAC Items are also available as HTML, and should use this rel with `"type" : "text/html"` to tell clients where they can get a version of the Item or Collection to view in a browser. See [STAC on the Web in Best Practices](#stac-on-the-web) for more information. |
633637
| canonical | The URL of the [canonical](https://en.wikipedia.org/wiki/Canonical_link_element) version of the Item or Collection. API responses and copies of catalogs should use this to inform users that they are direct copy of another STAC Item, using the canonical rel to refer back to the primary location. |
634638
| via | The URL of the source metadata that this STAC Item or Collection is created from. Used similarly to canonical, but refers back to a non-STAC record (Landsat MTL, Sentinel tileInfo.json, etc) |
635-
| prev | Indicates that the link's context is a part of a series, and that the previous in the series is the link target. Typically used in STAC by API's, to return smaller groups of Items or Catalogs. |
636-
| next | Indicates that the link's context is a part of a series, and that the next in the series is the link target. Typically used in STAC by API's, to return smaller groups of Items or Catalogs. |
639+
| prev | Indicates that the link's context is a part of a series, and that the previous in the series is the link target. Typically used in STAC by API's, to return smaller groups of Items or Catalogs/Collections. |
640+
| next | Indicates that the link's context is a part of a series, and that the next in the series is the link target. Typically used in STAC by API's, to return smaller groups of Items or Catalogs/Collections. |
637641

638642
### Versioning for Catalogs
639643

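The static layout recommendations touched in the `best-practices.md` hunks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `container_path` and `item_path` are hypothetical, and using the Item `id` as the name of the Item's subdirectory is an assumption (the text only says each Item gets its own unique subdirectory).

```python
from pathlib import PurePosixPath

# File names recommended for the two container types.
CONTAINER_FILENAMES = {"Catalog": "catalog.json", "Collection": "collection.json"}


def container_path(parent_dir, child_id, kind):
    """Path of a sub-Catalog or sub-Collection, one directory below its parent."""
    return PurePosixPath(parent_dir) / child_id / CONTAINER_FILENAMES[kind]


def item_path(parent_dir, item_id):
    """Path of an Item named <id>.json inside its own subdirectory.

    Assumption: the subdirectory is named after the Item id.
    """
    return PurePosixPath(parent_dir) / item_id / f"{item_id}.json"


print(container_path("sample", "sub1", "Catalog"))
print(item_path("sample/sub1", "item-a"))
```

The same helpers apply whether the parent is a Catalog or a Collection, which is the point of the wording change in recommendations 4 to 6.
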
catalog-spec/catalog-spec.md

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -36,8 +36,9 @@ and fields to be compliant.
3636
This Catalog specification primarily defines a structure for information to be discoverable. Any use
3737
that is publishing a set of related spatiotemporal assets is strongly recommended to also use the
3838
STAC Collection specification to provide additional information about the set of Items
39-
contained in a Catalog, in order to give contextual information to aid in discovery. Every STAC Collection is
40-
also a valid STAC Catalog.
39+
contained in a Catalog, in order to give contextual information to aid in discovery.
40+
STAC Collections all have the same fields as STAC Catalogs, but with different allowed
41+
values for `type` and `stac_extensions`.
4142

4243
## Catalog fields
4344

@@ -89,11 +90,11 @@ The following types are commonly used as `rel` types in the Link Object of a STA
8990
| ------- | ----------- |
9091
| self | STRONGLY RECOMMENDED. *Absolute* URL to the location that the Catalog file can be found online, if available. This is particularly useful when in a download package that includes metadata, so that the downstream user can know where the data has come from. |
9192
| root | STRONGLY RECOMMENDED. URL to the root STAC Catalog or [Collection](../collection-spec/README.md). Catalogs should include a link to their root, even if it's the root and points to itself. |
92-
| parent | URL to the parent STAC Catalog or Collection. Non-root Catalogs should include a link to their parent. |
93-
| child | URL to a child STAC Catalog or Collection. |
93+
| parent | URL to the parent STAC entity (Catalog or Collection). Non-root Catalogs should include a link to their parent. |
94+
| child | URL to a child STAC entity (Catalog or Collection). |
9495
| item | URL to a STAC Item. |
9596

96-
**Note:** A link to at least one `item` or `child` Catalog is **REQUIRED**.
97+
**Note:** A link to at least one `item` or `child` (Catalog or Collection) is **REQUIRED**.
9798

9899
There are additional `rel` types in the [Using Relation Types](../best-practices.md#using-relation-types) best practice, but as
99100
they are more typically used in Collections, as Catalogs tend to just be used to structure STAC organization, so tend to just use

collection-spec/collection-spec.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -252,9 +252,9 @@ This is done where there is not a clear official option, or where STAC uses an o
252252
| Type | Description |
253253
| ------- | ------------------------------------------------------------ |
254254
| self | STRONGLY RECOMMENDED. *Absolute* URL to the location that the Collection file can be found online, if available. This is particularly useful when in a download package that includes metadata, so that the downstream user can know where the data has come from. |
255-
| root | URL to the root STAC Catalog or Collection. Collections should include a link to their root, even if it's the root and points to itself. |
256-
| parent | URL to the parent STAC Catalog or Collection. Non-root Collections should include a link to their parent. |
257-
| child | URL to a child STAC Catalog or Collection. |
255+
| root | URL to the root STAC entity (Catalog or Collection). Collections should include a link to their root, even if it's the root and points to itself. |
256+
| parent | URL to the parent STAC entity (Catalog or Collection). Non-root Collections should include a link to their parent. |
257+
| child | URL to a child STAC entity (Catalog or Collection). |
258258
| item | URL to a STAC Item. All Items linked from a Collection MUST refer back to its Collection with the [`collection` relation type](../item-spec/item-spec.md#relation-types). |
259259
| license | The license URL(s) for the Collection SHOULD be specified if the `license` field is set to `proprietary` or `various`. If there is no public license URL available, it is RECOMMENDED to put the license text in a separate file and link to this file. |
260260
| derived_from | URL to a STAC Collection that was used as input data in the creation of this Collection. See the note in [STAC Item](../item-spec/item-spec.md#derived_from) for more info. |

item-spec/item-spec.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -84,7 +84,7 @@ It is important that an Item identifier is unique within a Collection, and that
8484
[Collection identifier](../collection-spec/collection-spec.md#id) in turn is unique globally. Then the two can be combined to
8585
give a globally unique identifier. Items are *[strongly recommended](#collections)* to have Collections, and not having one makes
8686
it more difficult to be used in the wider STAC ecosystem.
87-
If an Item does not have a Collection, then the Item identifier should be unique within its root Catalog.
87+
If an Item does not have a Collection, then the Item identifier should be unique within its root Catalog or root Collection.
8888

8989
As most geospatial assets are already uniquely defined by some
9090
identification scheme from the data provider it is recommended to simply use that ID.
@@ -192,8 +192,8 @@ This happens where there is not a clear official option, or where STAC uses an o
192192
| Type | Description |
193193
| ------------ | ------------------------------------------------------------ |
194194
| self | STRONGLY RECOMMENDED. *Absolute* URL to the Item if it is available at a public URL. This is particularly useful when in a download package that includes metadata, so that the downstream user can know where the data has come from. |
195-
| root | URL to the root STAC Catalog or Collection. |
196-
| parent | URL to the parent STAC Catalog or Collection. |
195+
| root | URL to the root STAC entity (Catalog or Collection). |
196+
| parent | URL to the parent STAC entity (Catalog or Collection). |
197197
| collection | STRONGLY RECOMMENDED. URL to a Collection. *Absolute* URLs should be used whenever possible. The referenced Collection is STRONGLY RECOMMENDED to implement the same STAC version as the Item. A link with this `rel` type is *required* if the `collection` field in properties is present. |
198198
| derived_from | URL to a STAC Item that was used as input data in the creation of this Item. |
199199

overview.md

Lines changed: 14 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -53,10 +53,11 @@ A Catalog is a very simple construct - it just provides links to Items or to oth
5353
The closest analog is a folder in a file structure, it is the container for Items, but it can
5454
also hold other containers (folders / catalogs).
5555

56-
The Collection specification shares some fields with the catalog spec but has a number of additional fields:
56+
The Collection entity shares most fields with the Catalog entity but has a number of additional fields:
5757
license, extent (spatial and temporal), providers, keywords and summaries. Every Item in a Collection links
5858
back to their Collection, so clients can easily find fields like the license. Thus every Item implicitly
59-
shares the fields described in their parent Collection.
59+
shares the fields described in their parent Collection. Collection entities can be used just like Catalog
60+
entities to provide structure, as they provide all the same options for linking and organizing.
6061

6162
But what *should* go in a Collection, versus just in a Catalog? A Collection will generally consist of
6263
a set of assets that are defined with the same properties and share higher level metadata. In the
@@ -78,27 +79,30 @@ provide multiple grouping paths, serving as a sort of faceted search.
7879

7980
The second case is used when one wants to represent diverse data in a single place. If an organization
8081
has an internal catalog with Landsat 8, Sentinel 2, NAIP data and several commercial imagery providers
81-
then they'd have a root catalog that would link to a number of different Collections.
82+
then they'd have a root Catalog that would link to a number of different Collections.
8283

83-
So in conclusion it's best to use Collections for what you want a user to find as a starting point, and then
84-
catalogs are just for structuring and grouping the data. Future work includes a mechanism to actually
84+
So in conclusion it's best to use Collections for what you want user to find as starting point, and then
85+
Catalogs are just for structuring and grouping the data. Future work includes a mechanism to actually
8586
search Collection-level data, hopefully in concert with other specifications.
8687

8788
## Catalog Overview
8889

90+
*NOTE: The below examples all say Catalog, but those can all be Collections as well, as it has all the fields necessary to
91+
serve as a Catalog*
92+
8993
There are two required element types of a Catalog: Catalog and Item. A STAC Catalog
9094
points to [STAC Items](item-spec/README.md), or to other STAC catalogs. It provides a simple
9195
linking structure that can be used recursively so that many Items can be included in
9296
a single Catalog, organized however the implementor desires.
9397

94-
STAC makes no formal distinction between a "root" catalog and the "child" catalogs. A root catalog
95-
is simply the top-most catalog -- it has no parent. A nested catalog structure is useful (and
98+
STAC makes no formal distinction between a "root" Catalog and the "child" Catalogs. A root Catalog
99+
is simply the top-most Catalog or Collection -- it has no parent. A nested catalog structure is useful (and
96100
recommended) for breaking up massive numbers of catalog Items into logical groupings. For example,
97101
it might make sense to organize a catalog by date (year, month, day), or geography (continent,
98102
country, state/prov). See the [Catalog Layout](best-practices.md#catalog-layout) best practices
99103
section for more.
100104

101-
A simple Catalog structure might look like this:
105+
A simple STAC structure might look like this:
102106

103107
- catalog (root)
104108
- catalog
@@ -164,8 +168,8 @@ each Item and Catalog, as well as ways to achieve that.
164168

165169
## Collection Overview
166170

167-
A STAC Collection extends the core fields of the Catalog construct to provide additional metadata to describe the set of Items it
168-
contains. The required fields are fairly
171+
A STAC Collection includes the core fields of the Catalog entity and also provides additional metadata to describe
172+
the set of Items it contains. The required fields are fairly
169173
minimal - it includes the 4 required Catalog fields (id, description, stac_version and links), and adds license
170174
and extents. But there are a number of other common fields defined in the spec, and more common fields are also
171175
defined in [STAC extensions](extensions/). These serve as basic metadata, and ideally Collections also link to

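As a rough illustration of what this commit clarifies, the sketch below checks whether a parsed STAC JSON document could serve as a root. It assumes the `type` values `"Catalog"` and `"Collection"`, and relies on two statements from the diffs above: a root has no parent, and at least one `item` or `child` link is required. The function name `could_be_root` is hypothetical and this is not a full validator.

```python
# Either entity type may act as the root of a STAC catalog.
ROOT_TYPES = {"Catalog", "Collection"}


def could_be_root(doc):
    """Return True if the parsed STAC document looks like a usable root."""
    if doc.get("type") not in ROOT_TYPES:
        return False
    rels = {link.get("rel") for link in doc.get("links", [])}
    # A root is the top-most entity, so it must not link to a parent.
    if "parent" in rels:
        return False
    # At least one `item` or `child` link is required.
    return bool(rels & {"item", "child"})


root_collection = {
    "type": "Collection",
    "id": "example",
    "links": [
        {"rel": "root", "href": "./collection.json"},
        {"rel": "item", "href": "./item-a/item-a.json"},
    ],
}
print(could_be_root(root_collection))
```

A Collection passes the same check as a Catalog here because it carries the same linking fields.
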