Skip to main content

How does Temporal handle application data?

This guide provides an overview of data handling using a Data Converter on the Temporal Platform.

Data Converters in Temporal are SDK components that handle the serialization and encoding of data entering and exiting a Temporal Cluster. Workflow inputs and outputs need to be serialized and deserialized so they can be sent as JSON to a Temporal Cluster.

Data Converter encodes and decodes data

Data Converter encodes and decodes data

The Data Converter encodes data from your application to a Payload before it is sent to the Temporal Cluster in the Client call. When the Temporal Server sends the encoded data back to the Worker, the Data Converter decodes it for processing within your application. This ensures that all your sensitive data exists in its original format only on hosts that you control.

Data Converter steps are followed when data is sent to a Temporal Cluster (as input to a Workflow) and when it is returned from a Workflow (as output). Due to how Temporal provides access to Workflow output, this implementation is asymmetric:

  • Data encoding is performed automatically using the default converter provided by Temporal or your custom Data Converter when passing input to a Temporal Cluster. For example, plain text input is usually serialized into a JSON object.
  • Data decoding may be performed by your application logic during your Workflows or Activities as necessary, but decoded Workflow results are never persisted back to the Temporal Cluster. Instead, they are stored encoded on the Cluster, and you need to provide an additional parameter when using temporal workflow show or when browsing the Web UI to view output.

Each piece of data (like a single argument or return value) is encoded as a Payload, which consists of binary data and key-value metadata.

For details, see the API references:

What is a Payload?

A Payload represents binary data such as input and output from Activities and Workflows. Payloads also contain metadata that describe their data type or other parameters for use by custom encoders/converters.

When processed through the SDK, the default Data Converter serializes your data/value to a Payload before sending it to the Temporal Server. The default Data Converter processes supported type values to Payloads. You can create a custom Payload Converter to apply different conversion steps.

You can additionally apply custom codecs, such as for encryption or compression, on your Payloads.

What is a default Data Converter?

Each Temporal SDK includes and uses a default Data Converter. The default Data Converter converts objects to bytes using a series of Payload Converters and supports binary, Protobufs, and JSON formats. It encodes values in the following order:

  • Null
  • Byte array
  • Protobuf JSON
  • JSON

For example:

  • If a value is an instance of a Protobuf message, it is encoded with proto3 JSON.
  • If a value isn't null, binary, or a Protobuf, it is encoded as JSON. Most common input types — including strings, integers, floating point numbers, and booleans — are serializable as JSON. If any part of it is not serializable as JSON, an error is thrown.

The default Data Converter serializes objects based on their root type, rather than nested types. The JSON serializers of some SDKs cannot process lists with Protobuf children objects without implementing a custom Data Converter.

What is a custom Data Converter?

A custom Data Converter extends the default Data Converter with custom logic for Payload conversion or encoding.

You can create a custom Data Converter to alter formats (for example, using MessagePack instead of JSON) or add compression and encryption.

A Payload Codec encodes and decodes Payloads, with bytes-to-bytes conversion. To use custom encryption or compression logic, create a custom Payload Codec with your encryption/compression logic in the encode function and your decryption/decompression logic in the decode function. To implement a custom Payload Codec, you can override the default Data Converter, or create a customized Data Converter that defines its own Payload Converter.

Custom Data Converters are not applied to all data; for example, Search Attributes are persisted unencoded so they can be indexed for searching.

A customized Data Converter can have the following three components:

For details on how to implement custom encryption and compression in your SDK, see Data Encryption.

What is a Payload Converter?

A Payload Converter serializes data, converting values to bytes and back.

When you initiate a Workflow Execution through a Client and pass data as input, the input is serialized using a Data Converter that runs it through a set of Payload Converters. When your Workflow Execution starts, this data input is deserialized and passed as input to your Workflow.

Composite Data Converters

A Composite Data Converter is used to apply custom, type-specific Payload Converters in a specified order. A Composite Data Converter can be comprised of custom rules that you created, and it can also leverage the default Data Converters built into Temporal. In fact, the default Data Converter logic is implemented internally in the Temporal source as a Composite Data Converter. It defines these rules in this order:

defaultDataConverter = NewCompositeDataConverter(
NewNilPayloadConverter(),
NewByteSlicePayloadConverter(),
NewProtoJSONPayloadConverter(),
NewProtoPayloadConverter(),
NewJSONPayloadConverter(),
)

The order in which the Payload Converters are applied is important. During serialization, the Data Converter tries the Payload Converters in that specific order until a Payload Converter returns a non-nil Payload. A custom PayloadConverter must implement the functions:

  • FromPayload (for a single value) or
  • FromPayloads (for a list of values) to convert to values from a Payload, and
  • ToPayload (for a single value) or
  • ToPayloads (for a list of values) to convert values to a Payload.

Defining a new Composite Data Converter is not necessary to use a Custom Data Converter. To keep your code more manageable, you can override the default Converter to use your Codec instead.

What is a Payload Codec?

A Payload Codec transforms an array of Payloads (for example, a list of Workflow arguments) into another array of Payloads.

When serializing to Payloads, the Payload Converter is applied first to convert your objects to bytes, followed by codecs that convert bytes to bytes. When deserializing from Payloads, codecs are applied first to last to reverse the effect, followed by the Payload Converter.

Use a custom Payload Codec to transform your Payloads; for example, implementing compression and/or encryption on your Workflow Execution data.

Encryption

Using end-to-end encryption in your custom Data Converter ensures that sensitive application data is secure when handled by the Temporal Server.

Apply your encryption logic in a custom Payload Codec and use it locally to encrypt data. You maintain all the encryption keys, and the Temporal Server sees only encrypted data. Refer to What is Key Management? for more guidance.

Your data exists unencrypted only on the Client and the Worker process that is executing the Workflows and Activities, on hosts that you control. For details, see Securing your data.

The following samples use encryption (AES GCM with 256-bit key) in a custom Data Converter:

What is a Failure Converter?

As with input and output, Temporal also uses its own default converter logic for errors that are generated by Workflows. The default Failure Converter copies error messages and stack traces as plain text, and this text output is then directly accessible in the Message field of these Workflow Executions.

This may be undesirable for your application. In some cases, errors could contain privileged or sensitive information that you would need to prevent from leaking or being available via a side channel. Failure messages and stack traces are not encoded as codec-capable Payloads by default; you must explicitly enable encoding these common attributes on failures.

If your errors might contain sensitive information, you can encrypt the message and stack trace by configuring the default Failure Converter to use your encoding. This moves your message and stack_trace fields to a Payload that's run through your codec.

For example, with the Temporal Go SDK, you can do this by adding a FailureConverter parameter to your client.Options{} array when calling client.Dial(). The FailureConverter should override the DefaultFailureConverterOptions{} by setting EncodeCommonAttributes: true like so:

c, err := client.Dial(client.Options{
// Set DataConverter here to ensure that workflow inputs and results are
// encoded as required.
DataConverter: mycustom.DataConverter,
FailureConverter: temporal.NewDefaultFailureConverter(temporal.DefaultFailureConverterOptions{
EncodeCommonAttributes: true,
}),
})

If for some reason you need to specify a different set of Converter logic for your Failures, you can replace the NewDefaultFailureConverter with a custom method. For example, if you are both working with highly sensitive data and using a sophisticated logging/observability implementation, you may need to implement different encryption methods for each of them.

What is remote data encoding?

Remote data encoding is exposing your Payload Codec via HTTP endpoints to support remote encoding and decoding.

Running your encoding remotely allows you to use it with the temporal CLI to encode/decode data for several commands including temporal workflow show and with Temporal Web UI to decode data in your Workflow Execution details view.

To run data encoding/decoding remotely, use a Codec Server. A Codec Server is an HTTP server that uses your custom Codec logic to decode your data remotely. The Codec Server is independent of the Temporal Cluster and decodes your encrypted payloads through predefined endpoints. You create, operate, and manage access to your Codec Server in your own environment. The temporal CLI and the Web UI in turn provide built-in hooks to call the Codec Server to decode encrypted payloads on demand.

Encoding data on the Web UI and CLI

You can perform some operations on your Workflow Execution using temporal and the Web UI, such as starting or sending a Signal to an active Workflow Execution using temporal or canceling a Workflow Execution from the Web UI, which might require inputs that contain sensitive data.

To encode this data, specify your Codec Server endpoints with temporal and configure your Web UI to use the Codec Server endpoints.

Decoding data on the Web UI and CLI

If you use custom encoding, Payload data handled by the Temporal Cluster is stored encoded. Since the Web UI uses the Visibility database to show events and data stored on the Temporal Server, all data in the Workflow Execution History in your Web UI is displayed in the encoded format.

To decode output when using the Web UI and temporal, use a Codec Server.

Note that a remote data encoder is a separate system with access to your encryption keys and exposes APIs to encode and decode any data. Evaluate and ensure that your remote data encoder endpoints are secured and only authorized users have access to them.

Samples:

What is a Codec Server?

A Codec Server is an HTTP/HTTPS server that uses a custom Payload Codec to decode your data remotely through endpoints.

A Codec Server follows the Temporal Codec Server Protocol. It implements two endpoints:

  • /encode
  • /decode

Each endpoint receives and responds with a JSON body that has a payloads property with an array of Payloads. The endpoints run the Payloads through a Payload Codec before returning them.

Most SDKs provide example Codec Server implementation samples, listed here:

Usage

When you apply custom encoding with encryption or compression on your Workflow data, it is stored in the encrypted/compressed format on the Temporal Server. For details on what data is encoded, see Securing your data.

To see decoded data when using the CLI or Web UI to perform some operations on a Workflow Execution, configure the Codec Server endpoint in the Web UI and CLI. When you configure the Codec Server endpoints, the CLI and Web UI send the encoded data to the Codec Server, and display the decoded data received from the Codec Server.

For details on creating your Codec Server, see Codec Server Setup.

What is Key Management?

Key Management is a fundamental part of working with encryption keys.

There are many computational and logistical aspects to generating and rotating keys, and this usually calls for a dedicated application in your stack. Here are some general recommendations for working with encryption keys for Temporal applications:

  • Symmetric Encryption is generally faster and will produce smaller payloads than asymmetric. Normally, an advantage of asymmetric encryption is that it allows you to distribute your encryption and decryption keys separately, but depending on your infrastructure, this might not offer any security benefits with Temporal.

  • AES-based algorithms are hardware accelerated in Go and other languages. AES algorithms are widely vetted and trusted, and there are many different variants that may suit your requirements. Load tests using ALG_AES_256_GCM_HKDF_SHA512_COMMIT_KEY have performed well.

  • Store your encryption keys in the same manner as you store passwords, config details, and other sensitive data. When possible, load the key into your application, so you don't need to make a network call to retrieve it. Separate keys for each environment or namespace as much as possible.

  • Make sure you have a key rotation strategy in place in the event that your keys are compromised or need to be replaced for another reason. Consider using a dedicated secrets engine or a key management system (KMS). Note that when you rotate keys, you may also need to retain old keys to query old Workflows.

Key Rotation

National Institute of Standards and Technology (NIST) guidance recommends periodic rotation of encryption keys. For AES-GCM keys, rotation should occur before approximately 2^32 encryptions have been performed by a key version, following the guidelines of NIST publication 800-38D.

It is recommended that operators estimate the encryption rate of a key and use that to determine a frequency of rotation that prevents the guidance limits from being reached. For example, if one determines that the estimated rate is 40 million operations per day, then rotating a key every three months is sufficient.

Key rotation should generally be transparent to the Temporal Data Converter implementation. Temporal's Encode() and Decode() steps only need to trigger as expected, and Temporal has no knowledge of how or when you are generating your encryption keys.

You should design your Encode and Decode steps to accept all the necessary parameters for your key management, such as the key version, alongside your payloads. Like the Data Converters, keys should be mapped to a Namespace in Temporal.

Using Vault for Key Management

This repository provides a robust and complete example of using Temporal with HashiCorp's Vault secrets engine.