Statement Mapping

By default, we perform several steps to map Schema-related data into its conventional types. Given the wide range of data sources and generation methods, these mappings establish a minimal baseline on which other, application-specific assumptions can build. The following steps are performed on a best-effort basis, and any unrecognized syntax, types, or values are ignored.

Once processed, the resulting graph may be further analyzed for errors through the validation step. These two steps try to follow Schema recommendations fairly strictly to help publish compatible data for consumers. Additional transformations are available through the normalization step, which makes further corrections and restructures some data into more consistent structures for consumers.

Steps

Canonicalize IRIs

The Schema.org ontology is often used with one of several different base IRIs. Using one form consistently throughout a data platform matters more than which form is chosen, but we use and recommend the secure, apex form. Predicates, object IRIs, and literal datatypes that begin with any of the following prefixes will be converted to https://schema.org/.

Alternate Base IRI
http://schema.org/
http://www.schema.org/
https://www.schema.org/

Extension domains, such as pending.schema.org, are currently ignored.
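
As an illustration only, the prefix rewrite described above could be sketched as follows. The function name canonicalize_iri and the constant names are hypothetical and not part of this tool's API; only the four base IRIs come from the list above.

```python
# Hypothetical sketch: names below are illustrative, not this tool's API.
CANONICAL_BASE = "https://schema.org/"
ALTERNATE_BASES = (
    "http://schema.org/",
    "http://www.schema.org/",
    "https://www.schema.org/",
)


def canonicalize_iri(iri: str) -> str:
    """Rewrite an alternate Schema.org base IRI to the canonical base."""
    for base in ALTERNATE_BASES:
        if iri.startswith(base):
            return CANONICAL_BASE + iri[len(base):]
    # Canonical IRIs, extension domains, and unrelated IRIs pass through.
    return iri


if __name__ == "__main__":
    print(canonicalize_iri("http://www.schema.org/InStock"))
    print(canonicalize_iri("https://pending.schema.org/Example"))
```

In this sketch, an IRI on an extension domain such as pending.schema.org matches none of the listed prefixes and is returned unchanged, which mirrors the behavior noted above.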

Drop Empty Literals

By convention, empty literal statements will be dropped from the graph if:

  1. The datatype is Schema-related or an XSD primitive; and
  2. The lexical form is empty after collapsing any white space.

The following table shows some examples of how objects will be evaluated.

Input | Behavior | Note
--- | --- | ---
"" | Dropped |
" \t " | Dropped | Only white space
""^^schema:Text | Dropped |
""^^example:Type | No change | Unrelated datatype
"0" | No change | Not empty
<> | No change | IRI, non-literal
example: | No change | Prefixed name IRI, non-literal

This has the notable effect that a Schema-related resource cannot have an empty-valued property. However, it helps handle the common practice of publishers including empty property values simply because that is the easiest way to generate templated, structured data.
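
The two conditions above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the Literal class and should_drop function are hypothetical, IRIs are represented as plain strings, a literal without a datatype is treated as xsd:string, and "XSD primitive" is approximated by a check on the XSD namespace.

```python
from dataclasses import dataclass
from typing import Optional

SCHEMA = "https://schema.org/"
XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal: a lexical form plus an optional datatype IRI."""
    lexical: str
    datatype: Optional[str] = None  # assumption: None means xsd:string


def should_drop(obj: object) -> bool:
    """Return True when the object is an empty, Schema- or XSD-typed literal."""
    if not isinstance(obj, Literal):
        return False  # IRIs and other non-literals are never dropped
    datatype = obj.datatype or XSD + "string"
    # Simplification: any datatype in the XSD namespace counts as a primitive.
    if not datatype.startswith((SCHEMA, XSD)):
        return False
    # A lexical form containing only white space is empty once collapsed.
    return obj.lexical.strip() == ""


if __name__ == "__main__":
    print(should_drop(Literal(" \t ")))
    print(should_drop(Literal("", "http://example.com/Type")))
```

A statement whose object satisfies should_drop would be removed from the graph; everything else is left as-is.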

Map Enumerations

For every property that supports an enumeration, the statement object is evaluated and, where recognized, converted to its canonical IRI form. The following table shows examples of evaluating the object of a schema:availability property.

Input | Output | Note
--- | --- | ---
"InStock" | schema:InStock |
"https://schema.org/InStock" | schema:InStock | String to IRI
"http://www.schema.org/InStock" | schema:InStock | String with Alternate Base IRI to IRI
"ExampleUnknown" | "ExampleUnknown" | Unknown enumeration path, no change
"http://example.com/InStock" | "http://example.com/InStock" | Unknown enumeration, no change
schema:InStock | schema:InStock | IRI, no change
schema:False | schema:False | IRI, no change
schema:ExampleUnknown | schema:ExampleUnknown | IRI, no change

As a reminder, this step does not generate any errors for unrecognized enumeration values; these may be handled by the Validator.
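
The evaluation of a schema:availability object might look something like the sketch below. It is an assumption-laden illustration rather than the real implementation: map_enumeration is a hypothetical name, objects are modeled as (kind, value) pairs, and KNOWN_AVAILABILITY is only a small illustrative subset of the enumeration's members.

```python
from typing import Tuple

CANONICAL_BASE = "https://schema.org/"
ALTERNATE_BASES = (
    "http://schema.org/",
    "http://www.schema.org/",
    "https://www.schema.org/",
)
# Illustrative subset only; the real enumeration has more members.
KNOWN_AVAILABILITY = frozenset({"InStock", "OutOfStock", "PreOrder"})


def map_enumeration(kind: str, value: str,
                    known: frozenset = KNOWN_AVAILABILITY) -> Tuple[str, str]:
    """Upgrade a string literal to a canonical IRI when it names a known member."""
    if kind != "literal":
        return kind, value  # IRIs are left unchanged, recognized or not
    name = value
    for base in (CANONICAL_BASE, *ALTERNATE_BASES):
        if value.startswith(base):
            name = value[len(base):]
            break
    if name in known:
        return "iri", CANONICAL_BASE + name
    # Unrecognized values pass through silently; no error is raised here.
    return kind, value


if __name__ == "__main__":
    print(map_enumeration("literal", "http://www.schema.org/InStock"))
    print(map_enumeration("literal", "ExampleUnknown"))
```

Returning the input untouched for unknown values keeps this step error-free and leaves any reporting to a later validation pass.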

Cast Data Types

All objects of Schema properties will be evaluated against their expected data types and updated to their Schema-specific data type, as appropriate.