Failure detection - Go SDK
This page shows how to do the following:
- Handle errors
- Set Workflow timeouts
- Set a Workflow Retry Policy
- Set Activity timeouts
- Set a custom Activity Retry Policy
Error handling
Within a Workflow, an Activity or Child Workflow execution might fail. You can handle errors differently based on the error type.
If the Activity returns an error as errors.New() or fmt.Errorf(), that error is converted into *temporal.ApplicationError.
If the Activity returns an error as temporal.NewNonRetryableApplicationError("error message", details), that error is returned as *temporal.ApplicationError.
There are other types of errors, such as *temporal.TimeoutError, *temporal.CanceledError, and *temporal.PanicError.
Here's an example of handling Activity errors within Workflow code that differentiates between different error types.
err := workflow.ExecuteActivity(ctx, YourActivity, ...).Get(ctx, nil)
if err != nil {
	var applicationErr *temporal.ApplicationError
	if errors.As(err, &applicationErr) {
		// Retrieve the error message.
		workflow.GetLogger(ctx).Info("Application error", "error", applicationErr.Error())
		// Handle Activity errors created with temporal.NewApplicationError().
		// This assumes the Activity returned
		// temporal.NewApplicationError("message", "CustomErrTypeA", "string details").
		var detailMsg string
		applicationErr.Details(&detailMsg) // Extract strongly typed details.
		// Handle Activity errors by type, including errors created without NewApplicationError().
		switch applicationErr.Type() {
		case "CustomErrTypeA":
			// Handle CustomErrTypeA.
		case "CustomErrTypeB":
			// Handle CustomErrTypeB.
		default:
			// A newer version of the Activity could return error types that this Workflow is not aware of.
		}
	}
	var canceledErr *temporal.CanceledError
	if errors.As(err, &canceledErr) {
		// Handle cancellation.
	}
	var timeoutErr *temporal.TimeoutError
	if errors.As(err, &timeoutErr) {
		// Handle the timeout. enumspb is the go.temporal.io/api/enums/v1 package.
		switch timeoutErr.TimeoutType() {
		case enumspb.TIMEOUT_TYPE_SCHEDULE_TO_START:
			// Handle ScheduleToStart timeout.
		case enumspb.TIMEOUT_TYPE_START_TO_CLOSE:
			// Handle StartToClose timeout.
		case enumspb.TIMEOUT_TYPE_HEARTBEAT:
			// Handle Heartbeat timeout.
		default:
		}
	}
	var panicErr *temporal.PanicError
	if errors.As(err, &panicErr) {
		// Handle the panic. The message and call stack are available through
		// panicErr.Error() and panicErr.StackTrace().
	}
}
Panics and deferred functions
In Go, defer schedules cleanup functions and recover() catches panics to prevent them from crashing the program.
This doesn't work the same way in Temporal Workflow code — you cannot recover() from a panic inside a defer.
Deferred functions that try to interact with the Temporal SDK during panic unwinding will re-panic immediately.
Use defer only for local cleanup.
Handle Temporal API cleanup through explicit error checks instead.
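For contrast, this is how defer and recover() behave in ordinary Go code outside a Workflow. The sketch is plain Go with no Temporal SDK calls, and safeDivide is a hypothetical helper. Per the note above, the recover() half of this pattern should not be expected to carry over to Workflow code.

```go
package main

import "fmt"

// safeDivide is a hypothetical helper: it converts a runtime panic
// (integer division by zero) into an ordinary error with defer + recover.
func safeDivide(a, b int) (result int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a / b, nil
}

func main() {
	fmt.Println(safeDivide(10, 2)) // normal path: no panic, nil error
	fmt.Println(safeDivide(1, 0))  // panic path: zero result, non-nil error
}
```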
Workflow timeouts
How to set Workflow timeouts using the Temporal Go SDK
Each Workflow timeout controls the maximum duration of a different aspect of a Workflow Execution.
Workflow timeouts are set when starting the Workflow Execution.
Before we continue, note that we generally do not recommend setting Workflow Timeouts, because Workflows are designed to be long-running and resilient. Setting a Timeout can limit a Workflow's ability to handle unexpected delays or long-running processes. If you need to perform an action inside your Workflow after a specific period of time, we recommend using a Timer.
- Workflow Execution Timeout: restricts the maximum amount of time that a single Workflow Execution can be executed.
- Workflow Run Timeout: restricts the maximum amount of time that a single Workflow Run can last.
- Workflow Task Timeout: restricts the maximum amount of time that a Worker can execute a Workflow Task.
Create an instance of StartWorkflowOptions from the go.temporal.io/sdk/client package, set a timeout, and pass the instance to the ExecuteWorkflow call.
Available timeouts are:
- WorkflowExecutionTimeout
- WorkflowRunTimeout
- WorkflowTaskTimeout
workflowOptions := client.StartWorkflowOptions{
	// ...
	// Set Workflow Timeout duration
	WorkflowExecutionTimeout: 24 * 365 * 10 * time.Hour,
	// WorkflowRunTimeout: 24 * 365 * 10 * time.Hour,
	// WorkflowTaskTimeout: 10 * time.Second,
	// ...
}
workflowRun, err := c.ExecuteWorkflow(context.Background(), workflowOptions, YourWorkflowDefinition)
if err != nil {
	// ...
}
Workflow Retry Policy
How to set a Workflow Retry policy using the Go SDK.
A Retry Policy can work in cooperation with the timeouts to provide fine controls to optimize the execution experience.
Use a Retry Policy to retry a Workflow Execution in the event of a failure.
Workflow Executions do not retry by default, and Retry Policies should be used with Workflow Executions only in certain situations.
Create an instance of a RetryPolicy from the go.temporal.io/sdk/temporal package and provide it as the value to the RetryPolicy field of the instance of StartWorkflowOptions.
- Type: RetryPolicy
- Default: None
retrypolicy := &temporal.RetryPolicy{
	InitialInterval:    time.Second,
	BackoffCoefficient: 2.0,
	MaximumInterval:    time.Second * 100,
}
workflowOptions := client.StartWorkflowOptions{
	RetryPolicy: retrypolicy,
	// ...
}
workflowRun, err := temporalClient.ExecuteWorkflow(context.Background(), workflowOptions, YourWorkflowDefinition)
if err != nil {
	// ...
}
How to set Activity timeouts
How to set Activity timeouts using the Go SDK.
Each Activity timeout controls the maximum duration of a different aspect of an Activity Execution.
The following timeouts are available in the Activity Options.
- Schedule-To-Close Timeout: the maximum amount of time allowed for the overall Activity Execution.
- Start-To-Close Timeout: the maximum time allowed for a single Activity Task Execution.
- Schedule-To-Start Timeout: the maximum amount of time allowed from when an Activity Task is scheduled to when a Worker starts that Activity Task.
An Activity Execution must have either the Start-To-Close or the Schedule-To-Close Timeout set.
To set an Activity Timeout in Go, create an instance of ActivityOptions from the go.temporal.io/sdk/workflow package, set the Activity Timeout field, and then use the WithActivityOptions() API to apply the options to the instance of workflow.Context.
Available timeouts are:
- StartToCloseTimeout
- ScheduleToCloseTimeout
- ScheduleToStartTimeout
activityoptions := workflow.ActivityOptions{
	// Set Activity Timeout duration
	ScheduleToCloseTimeout: 10 * time.Second,
	// StartToCloseTimeout: 10 * time.Second,
	// ScheduleToStartTimeout: 10 * time.Second,
}
ctx = workflow.WithActivityOptions(ctx, activityoptions)
var yourActivityResult YourActivityResult
err = workflow.ExecuteActivity(ctx, YourActivityDefinition, yourActivityParam).Get(ctx, &yourActivityResult)
if err != nil {
	// ...
}
Set a custom Activity Retry Policy
How to set a custom Activity Retry Policy using the Go SDK.
A Retry Policy works in cooperation with the timeouts to provide fine controls to optimize the execution experience.
Activity Executions are automatically associated with a default Retry Policy if a custom one is not provided.
To set a RetryPolicy, create an instance of ActivityOptions from the go.temporal.io/sdk/workflow package, set the RetryPolicy field, and then use the WithActivityOptions() API to apply the options to the instance of workflow.Context.
- Type: RetryPolicy
- Default:
retrypolicy := &temporal.RetryPolicy{
	InitialInterval:        time.Second,
	BackoffCoefficient:     2.0,
	MaximumInterval:        time.Second * 100, // 100 * InitialInterval
	MaximumAttempts:        0,                 // Unlimited
	NonRetryableErrorTypes: []string{},        // Empty
}
Providing a Retry Policy here is a customization, and overwrites individual Field defaults.
retrypolicy := &temporal.RetryPolicy{
	InitialInterval:    time.Second,
	BackoffCoefficient: 2.0,
	MaximumInterval:    time.Second * 100,
}
activityoptions := workflow.ActivityOptions{
	RetryPolicy: retrypolicy,
}
ctx = workflow.WithActivityOptions(ctx, activityoptions)
var yourActivityResult YourActivityResult
err = workflow.ExecuteActivity(ctx, YourActivityDefinition, yourActivityParam).Get(ctx, &yourActivityResult)
if err != nil {
	// ...
}
Overriding the retry interval with Next Retry Delay
You can return an Application Failure with the NextRetryDelay field set.
This value overrides the retry interval that the Retry Policy would otherwise produce.
For example, if in an Activity, you want to base the interval on the number of attempts:
attempt := activity.GetInfo(ctx).Attempt
return temporal.NewApplicationErrorWithOptions(fmt.Sprintf("Something bad happened on attempt %d", attempt), "NextDelay", temporal.ApplicationErrorOptions{
	// Wait 3 seconds per attempt so far: 3s, 6s, 9s, ...
	NextRetryDelay: 3 * time.Second * time.Duration(attempt),
})
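The delay arithmetic in that snippet can be checked in isolation. In this SDK-free sketch, nextRetryDelay is a hypothetical helper that takes the attempt number as an int32 (the type of the Attempt field) and returns the duration that would be assigned to NextRetryDelay, assuming a linear rule of 3 seconds per attempt.

```go
package main

import (
	"fmt"
	"time"
)

// nextRetryDelay is a hypothetical helper: the delay grows linearly,
// 3 seconds multiplied by the attempt number.
func nextRetryDelay(attempt int32) time.Duration {
	return 3 * time.Second * time.Duration(attempt)
}

func main() {
	for attempt := int32(1); attempt <= 3; attempt++ {
		fmt.Printf("attempt %d: next retry in %v\n", attempt, nextRetryDelay(attempt))
	}
}
```

The explicit time.Duration conversion is needed because Go does not multiply a time.Duration by an int32 without it.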
Activity Heartbeats
How to Heartbeat an Activity using the Go SDK.
An Activity Heartbeat is a ping from the Worker Process that is executing the Activity to the Temporal Service. Each Heartbeat informs the Temporal Service that the Activity Execution is making progress and the Worker has not crashed. If the Temporal Service does not receive a Heartbeat within a Heartbeat Timeout time period, the Activity will be considered failed and another Activity Task Execution may be scheduled according to the Retry Policy.
Heartbeats may not always be sent to the Temporal Service—they may be throttled by the Worker.
Activity Cancellations are delivered to Activities from the Temporal Service when they Heartbeat. Activities that don't Heartbeat can't receive a Cancellation. Heartbeat throttling may lead to Cancellation getting delivered later than expected.
Heartbeats can contain a details field describing the Activity's current progress.
If an Activity gets retried, the Activity can access the details from the last Heartbeat that was sent to the Temporal Service.
To Heartbeat in an Activity in Go, use the RecordHeartbeat API.
import (
	"context"
	// ...
	"go.temporal.io/sdk/activity"
	// ...
)
func YourActivityDefinition(ctx context.Context, param YourActivityDefinitionParam) (YourActivityDefinitionResult, error) {
	// ...
	activity.RecordHeartbeat(ctx, details)
	// ...
}
When an Activity Task Execution times out due to a missed Heartbeat, the last value of the details variable above is returned to the calling Workflow in the details field of TimeoutError with TimeoutType set to Heartbeat.
You can also Heartbeat an Activity from an external source:
// The client is a heavyweight object that should be created once per process.
temporalClient, err := client.Dial(client.Options{})
if err != nil {
	// ...
}
// Record heartbeat.
err = temporalClient.RecordActivityHeartbeat(ctx, taskToken, details)
The parameters of the RecordActivityHeartbeat function are:
- taskToken: The value of the binary TaskToken field of the ActivityInfo struct retrieved inside the Activity.
- details: The serializable payload containing progress information.
If an Activity Execution Heartbeats its progress before it failed, the retry attempt will have access to the progress information, so that the Activity Execution can resume from the failed state. Here's an example of how this can be implemented:
func SampleActivity(ctx context.Context, inputArg InputParams) error {
	startIdx := inputArg.StartIndex
	if activity.HasHeartbeatDetails(ctx) {
		// Recover from finished progress.
		var finishedIndex int
		if err := activity.GetHeartbeatDetails(ctx, &finishedIndex); err == nil {
			startIdx = finishedIndex + 1 // Start from next one.
		}
	}
	// Normal Activity logic...
	for i := startIdx; i < inputArg.EndIdx; i++ {
		// Code for processing item i goes here...
		activity.RecordHeartbeat(ctx, i) // Report progress.
	}
	return nil
}
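The resume logic can be exercised without a Temporal Service. The sketch below is a simulation only: progressStore is a hypothetical in-memory stand-in for the Heartbeat details kept by the Temporal Service, and processRange is a hypothetical function that fails once at a chosen index to imitate a failed attempt.

```go
package main

import (
	"errors"
	"fmt"
)

// progressStore is a hypothetical stand-in for the last Heartbeat details.
type progressStore struct {
	hasDetails    bool
	finishedIndex int
}

// processRange handles items in [start, end). If progress was recorded, it
// resumes after the last finished index. It returns an error when it
// reaches failAt (pass -1 to never fail), along with the indexes it
// processed during this call.
func processRange(s *progressStore, start, end, failAt int) ([]int, error) {
	if s.hasDetails {
		start = s.finishedIndex + 1 // Start from the next item.
	}
	var done []int
	for i := start; i < end; i++ {
		if i == failAt {
			return done, errors.New("simulated failure")
		}
		done = append(done, i)
		s.hasDetails, s.finishedIndex = true, i // Record progress.
	}
	return done, nil
}

func main() {
	store := &progressStore{}

	first, err := processRange(store, 0, 5, 3) // first attempt fails at item 3
	fmt.Println(first, err)

	second, err := processRange(store, 0, 5, -1) // retry resumes at item 3
	fmt.Println(second, err)
}
```

The retry is called with the same start index as the first attempt, and only the recorded progress moves the starting point forward, which is the same shape as SampleActivity above.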
Set a Heartbeat Timeout
How to set a Heartbeat Timeout for an Activity using the Go SDK.
A Heartbeat Timeout works in conjunction with Activity Heartbeats.
To set a Heartbeat Timeout, create an instance of ActivityOptions from the go.temporal.io/sdk/workflow package, set the HeartbeatTimeout field, and then use the WithActivityOptions() API to apply the options to the instance of workflow.Context.
activityoptions := workflow.ActivityOptions{
	HeartbeatTimeout: 10 * time.Second,
}
ctx = workflow.WithActivityOptions(ctx, activityoptions)
var yourActivityResult YourActivityResult
err = workflow.ExecuteActivity(ctx, YourActivityDefinition, yourActivityParam).Get(ctx, &yourActivityResult)
if err != nil {
	// ...
}