Types
Typed fields are at the heart of Tessellate.
tess supports all native Java primitive types and their object counterparts (long and Long, etc.), plus additional coercive types.
Java Built-in
Object types are nullable.
Primitive types cannot hold null: numeric primitives coerce null to zero (0), and boolean coerces null to false.
When declaring a field with a type, use one of the following values depending on whether null values are permitted.
- String: null or any String value.
- int: null coerced to 0.
- Integer: null ok.
- long: null coerced to 0.
- Long: null ok.
- float: null coerced to 0.
- Float: null ok.
- double: null coerced to 0.
- Double: null ok.
- boolean: null coerced to false.
- Boolean: null ok.
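The difference between the primitive and object variants can be sketched in plain Java. The class and method names below (NullCoercionSketch, toInt, toInteger, and so on) are hypothetical and exist only for illustration; they are not part of tess, and they simply mirror the null-handling rules listed above for a raw String value that may be null.

```java
// Hypothetical illustration of the null-handling rules above; not tess code.
final class NullCoercionSketch {

    // Primitive int target: a null input is coerced to zero.
    static int toInt(String raw) {
        return raw == null ? 0 : Integer.parseInt(raw.trim());
    }

    // Object Integer target: a null input stays null.
    static Integer toInteger(String raw) {
        return raw == null ? null : Integer.valueOf(raw.trim());
    }

    // Primitive boolean target: a null input is coerced to false.
    static boolean toBoolean(String raw) {
        return raw != null && Boolean.parseBoolean(raw.trim());
    }

    // Object Boolean target: a null input stays null.
    static Boolean toBooleanObject(String raw) {
        return raw == null ? null : Boolean.valueOf(raw.trim());
    }

    public static void main(String[] args) {
        System.out.println(toInt(null));           // primitive: zero
        System.out.println(toInteger(null));       // object: null
        System.out.println(toBoolean(null));       // primitive: false
        System.out.println(toBooleanObject(null)); // object: null
    }
}
```

Choosing int over Integer therefore decides whether a missing value remains distinguishable from a real zero further down the pipeline.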
Coercive
Some types are canonically a custom or special Java type, but also have metadata associated with them.
These types take the form type[|metadata], where the optional metadata is, for the temporal types, a format string.
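As a rough illustration of the type[|metadata] form, the Java sketch below splits a declaration into its type name and optional metadata. TypeDeclarationSketch and its parse method are hypothetical names, not tess APIs, and splitting on only the first | (so the metadata may itself contain that character) is an assumption made for this sketch.

```java
// Hypothetical sketch of splitting a "type[|metadata]" declaration; not tess code.
final class TypeDeclarationSketch {
    final String type;      // e.g. "Instant"
    final String metadata;  // e.g. a format string, or null when absent

    TypeDeclarationSketch(String type, String metadata) {
        this.type = type;
        this.metadata = metadata;
    }

    static TypeDeclarationSketch parse(String declaration) {
        // Assumption: split on the first '|' only, leaving the rest as metadata.
        String[] parts = declaration.split("\\|", 2);
        String metadata = (parts.length == 2 && !parts[1].isEmpty()) ? parts[1] : null;
        return new TypeDeclarationSketch(parts[0].trim(), metadata);
    }

    public static void main(String[] args) {
        TypeDeclarationSketch withFormat = parse("DateTime|yyyy-MM-dd");
        System.out.println(withFormat.type + " / " + withFormat.metadata);

        TypeDeclarationSketch bare = parse("json");
        System.out.println(bare.type + " / " + bare.metadata);
    }
}
```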
- Instant|format
  - Canonically a java.time.Instant.
  - Supports nanosecond precision; the format defaults to the ISO-8601 instant format, e.g. 2011-12-03T10:15:30Z.
  - May also be a Clusterless time interval: Day, Hours, Fourths, Sixth, or Twelfths.
  - The default format may be changed via the TESS_INSTANT_TYPE_FORMAT environment variable.
- DateTime|format
  - Canonically a Long.
  - Format defaults to yyyy-MM-dd HH:mm:ss.SSSSSS z.
  - The default format may be changed via the TESS_DATE_TYPE_FORMAT environment variable.
- json
  - Canonically a com.fasterxml.jackson.databind.JsonNode.
  - May be either a value, nested object, or array.
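To make the two default temporal formats concrete, the sketch below parses sample values with the standard java.time API. TemporalFormatSketch and its methods are hypothetical and do not show how tess parses values internally. The source says only that DateTime is canonically a Long; treating that Long as epoch milliseconds is an assumption of this sketch, as is the way the environment variable override is resolved.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch of the default temporal formats; not tess code.
final class TemporalFormatSketch {
    // Default DateTime pattern, as documented above.
    static final String DEFAULT_DATE_TIME_PATTERN = "yyyy-MM-dd HH:mm:ss.SSSSSS z";

    // Returns the override when it is set and non-empty, otherwise the fallback.
    static String resolvePattern(String override, String fallback) {
        return (override == null || override.isEmpty()) ? fallback : override;
    }

    // ISO-8601 instant text, e.g. 2011-12-03T10:15:30Z, with up to nanosecond precision.
    static Instant parseInstant(String text) {
        return Instant.parse(text);
    }

    // Assumption: the canonical Long is taken to be epoch milliseconds.
    static long parseDateTimeMillis(String text, String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, Locale.US);
        return ZonedDateTime.parse(text, formatter).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        Instant instant = parseInstant("2011-12-03T10:15:30.123456789Z");
        System.out.println(instant.getNano()); // fractional part kept as nanoseconds

        // In this sketch the environment variable, when set, replaces the default pattern.
        String pattern = resolvePattern(System.getenv("TESS_DATE_TYPE_FORMAT"),
                DEFAULT_DATE_TIME_PATTERN);
        System.out.println(pattern);

        long millis = parseDateTimeMillis("2011-12-03 10:15:30.000000 UTC",
                DEFAULT_DATE_TIME_PATTERN);
        System.out.println(millis);
    }
}
```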
When reading source data, fields can be parsed and coerced into specific types, as declared. With formats like Parquet, the type information is inherited from the source instead.
When writing to a sink, the fields and any type metadata will be used to inform how the output is rendered or written.
In the case of Parquet, an Instant will be stored as a native Parquet timestamp with appropriate metadata for specifying the precision.
When a Parquet file is read by tools like AWS Athena, the timestamp metadata will be honored.