Background: What is a Failure?
A Failure is Temporal's representation of various types of errors that occur in the system.
There are different types of Failures, and each has a different type in the SDKs and different information in the protobuf messages (which are used to communicate with the Temporal Cluster and appear in Event History).
Most SDKs have a base class that the other Failures extend:
- TypeScript: TemporalFailure
- Java: TemporalFailure
- Python: FailureError
The base Failure proto message has these fields:
- string source: The SDK this Failure originated in (for example, "TypeScriptSDK"). In some SDKs, this field is used to rehydrate the stack trace into an exception object.
- Failure cause: The Failure message of the cause of this Failure (if applicable).
- Payload encoded_attributes: Contains the encoded message and stack_trace fields when using a Failure Converter.
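The nesting of Failures through the cause field can be pictured with a small sketch. The Failure dataclass below is a hypothetical stand-in, not the generated proto class or any SDK type; it models encoded_attributes as raw bytes for simplicity, and cause_chain is an illustrative helper name.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Failure:
    """Illustrative stand-in for the Failure proto message (not SDK code)."""

    message: str = ""
    source: str = ""  # for example, "TypeScriptSDK"
    stack_trace: str = ""
    cause: Optional["Failure"] = None  # nested Failure, if applicable
    # Assumption: modeled as bytes here; the real field is a Payload.
    encoded_attributes: Optional[bytes] = None


def cause_chain(failure: Failure) -> Iterator[Failure]:
    """Yield the failure, then each nested cause, outermost first."""
    current: Optional[Failure] = failure
    while current is not None:
        yield current
        current = current.cause


root = Failure(message="connection refused", source="PythonSDK")
outer = Failure(message="activity failed", source="PythonSDK", cause=root)
messages = [f.message for f in cause_chain(outer)]
```

Walking the chain this way is how the innermost cause (often the most specific error) can be reached from the outermost Failure.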
Application Failure
Workflow and Activity code use Application Failures to communicate application-specific failures. This is the only type of Failure created and thrown by user code.
- TypeScript: ApplicationFailure
- Java: ApplicationFailure
- Go: ApplicationError
- Python: ApplicationError
- Proto: ApplicationFailureInfo and Failure
Throw from Workflows
Only Workflow errors that are Temporal Failures cause the Workflow Execution to fail; all other errors cause the Workflow Task to fail and be retried (except in Go, where any error returned from the Workflow fails the Execution, and a panic fails the Task). Most types of Temporal Failures occur automatically, like a Cancelled Failure when the Workflow is Cancelled or an Activity Failure when an Activity fails. You can also explicitly fail the Workflow Execution by throwing (or returning, depending on the SDK) an Application Failure.
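The rule above can be sketched as a small decision function. The class names below are hypothetical stand-ins defined only for this illustration, not SDK imports, and the sketch models the behavior of the non-Go SDKs.

```python
class TemporalFailure(Exception):
    """Hypothetical base class standing in for an SDK's Failure base type."""


class ApplicationFailure(TemporalFailure):
    """Hypothetical Application Failure raised explicitly by user code."""


def outcome_of_workflow_error(error: Exception) -> str:
    """Return which unit of work fails for an error raised in Workflow code."""
    if isinstance(error, TemporalFailure):
        # Temporal Failures fail the whole Workflow Execution.
        return "workflow_execution_failed"
    # Any other error fails only the current Workflow Task, which is retried.
    return "workflow_task_failed_and_retried"
```

Under this model, raising ApplicationFailure is the explicit way to end the Execution, while an unexpected error such as a ValueError leaves the Execution open and retries the Task.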
Throw from Activities
In Activities, you can either throw an Application Failure or another Error to fail the Activity Task. In the latter case, the error is converted to an Application Failure. During conversion, the following Application Failure fields are set:
- type is set to the error's type name.
- message is set to the error message.
- non_retryable is set to false.
- details are left unset.
- cause is a Failure converted from the error's cause.
- stack trace is copied.
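The conversion rules listed above can be sketched in plain Python. ApplicationFailureModel and to_application_failure are hypothetical names for this illustration; the sketch assumes the error's cause is Python's __cause__ attribute and represents unset details as None. It is not the SDK's actual converter.

```python
import traceback
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class ApplicationFailureModel:
    """Illustrative model of an Application Failure (not an SDK class)."""

    type: str
    message: str
    non_retryable: bool = False
    details: Optional[List[Any]] = None  # None stands for "left unset"
    cause: Optional["ApplicationFailureModel"] = None
    stack_trace: str = ""


def to_application_failure(error: BaseException) -> ApplicationFailureModel:
    """Convert an arbitrary error following the rules in the list above."""
    cause = error.__cause__  # assumption: the explicit cause set by `raise ... from`
    return ApplicationFailureModel(
        type=type(error).__name__,
        message=str(error),
        non_retryable=False,
        details=None,
        cause=to_application_failure(cause) if cause is not None else None,
        stack_trace="".join(traceback.format_tb(error.__traceback__)),
    )


try:
    try:
        raise KeyError("user-42")
    except KeyError as inner:
        raise RuntimeError("lookup failed") from inner
except RuntimeError as err:
    converted = to_application_failure(err)
```

Here converted describes the RuntimeError, and its cause describes the KeyError it was raised from.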
When an Activity Execution fails, the Application Failure from the last Activity Task is the cause field of the ActivityFailure thrown in the Workflow.
When an Activity or Workflow throws an Application Failure, the Failure's type field is matched against a Retry Policy's list of non-retryable errors to determine whether to retry the Activity or Workflow.
Activities and Workflows can also avoid retrying by setting an Application Failure's non_retryable flag to true.
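The two ways of preventing a retry can be combined into one check. This is a minimal sketch with hypothetical names (ApplicationFailureModel, is_retryable); it deliberately ignores the other parts of a Retry Policy, such as attempt limits and intervals.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ApplicationFailureModel:
    """Illustrative model holding only the fields the retry decision needs."""

    type: str
    non_retryable: bool = False


def is_retryable(
    failure: ApplicationFailureModel,
    non_retryable_error_types: Sequence[str],
) -> bool:
    """Return False if either the flag or the Retry Policy list forbids a retry."""
    if failure.non_retryable:
        return False
    return failure.type not in non_retryable_error_types


policy_list = ["InvalidInputError"]
transient = ApplicationFailureModel(type="ConnectionError")
listed = ApplicationFailureModel(type="InvalidInputError")
flagged = ApplicationFailureModel(type="ConnectionError", non_retryable=True)
```

The flag lets the code that raises the Failure decide, while the Retry Policy list lets the caller decide without changing the Activity or Workflow code.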
Cancelled Failure
When Cancellation of a Workflow or Activity is requested, SDKs represent the cancellation to the user in language-specific ways. For example, in TypeScript, a Cancelled Failure is sometimes thrown directly by a Workflow API function and is sometimes wrapped in a different Failure. To check for both cases, TypeScript has the isCancellation helper.
When a Workflow or Activity is successfully Cancelled, a Cancelled Failure is the cause field of the Activity Failure or "Workflow failed" error.
- TypeScript: CancelledFailure
- Java: CanceledFailure
- Go: CanceledError
- Python: CancelledError
- Proto: CanceledFailureInfo and Failure
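The idea behind a helper that covers both the direct and the wrapped case can be sketched as follows. The classes and the is_cancellation function here are hypothetical Python stand-ins written for illustration; they are not the TypeScript SDK's implementation, and the sketch assumes the wrapping Failure exposes its cause through Python's __cause__ attribute.

```python
class TemporalFailure(Exception):
    """Hypothetical base class standing in for an SDK's Failure base type."""


class CancelledFailure(TemporalFailure):
    """Hypothetical Cancelled Failure."""


class ActivityFailure(TemporalFailure):
    """Hypothetical Failure that can wrap a Cancelled Failure as its cause."""


def is_cancellation(error: BaseException) -> bool:
    """True if the error is a Cancelled Failure or directly wraps one."""
    if isinstance(error, CancelledFailure):
        return True
    return isinstance(error.__cause__, CancelledFailure)


direct = CancelledFailure("workflow cancelled")
wrapped = ActivityFailure("activity cancelled")
wrapped.__cause__ = CancelledFailure("cancelled")
unrelated = ActivityFailure("activity failed")
unrelated.__cause__ = ValueError("bad input")
```

A single check like this keeps cancellation handling in one place instead of repeating both tests at every call site.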
Activity Failure
An Activity Failure is delivered to the Workflow Execution when an Activity fails.
It contains information about the failure and the Activity Execution; for example, the Activity Type and Activity Id.
The reason for the failure is in the cause field.
For example, if an Activity Execution times out, the cause is a Timeout Failure.
- TypeScript: ActivityFailure
- Java: ActivityFailure
- Go: ActivityError
- Python: ActivityError
- Proto: ActivityFailureInfo and Failure
Child Workflow Failure
A Child Workflow Failure is delivered to the Workflow Execution when a Child Workflow Execution fails.
It contains information about the failure and the Child Workflow Execution; for example, the Workflow Type and Workflow Id.
The reason for the failure is in the cause field.
- TypeScript: ChildWorkflowFailure
- Java: ChildWorkflowFailure
- Go: ChildWorkflowExecutionError
- Python: ChildWorkflowError
- Proto: ChildWorkflowExecutionFailureInfo and Failure
Timeout Failure
A Timeout Failure represents the timeout of an Activity or Workflow.
When an Activity times out, the last Heartbeat details it emitted are attached.
- TypeScript: TimeoutFailure
- Java: TimeoutFailure
- Go: TimeoutError
- Python: TimeoutError
- Proto: TimeoutFailureInfo and Failure
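As an illustration of how a Workflow might use the attached Heartbeat details, the sketch below inspects the cause of an Activity Failure and, when it is a Timeout Failure, reads the last recorded progress value. All class, field, and function names are hypothetical models for this example, not SDK types, and it assumes the Activity heartbeated a single progress value.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class TimeoutFailureModel:
    """Illustrative model of a Timeout Failure (not an SDK class)."""

    timeout_type: str
    last_heartbeat_details: List[Any] = field(default_factory=list)


@dataclass
class ActivityFailureModel:
    """Illustrative model of an Activity Failure; the reason is in `cause`."""

    activity_type: str
    activity_id: str
    cause: Any = None


def last_progress(failure: ActivityFailureModel, default: Any = None) -> Any:
    """Return the first Heartbeat detail if the Activity timed out, else default."""
    cause = failure.cause
    if isinstance(cause, TimeoutFailureModel) and cause.last_heartbeat_details:
        return cause.last_heartbeat_details[0]
    return default


timed_out = ActivityFailureModel(
    activity_type="ProcessFile",
    activity_id="1",
    cause=TimeoutFailureModel(timeout_type="HEARTBEAT", last_heartbeat_details=[42]),
)
other = ActivityFailureModel(activity_type="ProcessFile", activity_id="2", cause="boom")
```

Checking the type of the cause first matters because an Activity Failure can wrap other Failures that carry no Heartbeat details.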
Terminated Failure
A Terminated Failure is used as the cause of an error when a Workflow is terminated, and you receive the error in one of the following locations:
- Inside a Workflow that's waiting for the result of a Child Workflow.
- When waiting for the result of a Workflow on the Client.
In the SDKs:
- TypeScript: TerminatedFailure
- Java: TerminatedFailure
- Go: TerminatedError
- Python: TerminatedError
- Proto: TerminatedFailureInfo and Failure
Server Failure
A Server Failure is used for errors that originate in the Cluster.
- TypeScript: ServerFailure
- Java: ServerFailure
- Go: ServerError
- Python: ServerError
- Proto: ServerFailureInfo and Failure