This guide is meant to be a comprehensive overview of Temporal Activities.
In day-to-day conversations, the term Activity frequently denotes either an Activity Type, an Activity Definition, or an Activity Execution. Temporal documentation aims to be explicit and differentiate between them.
The purpose of an Activity is to execute a single, well-defined action (either short or long running), such as calling another service, transcoding a media file, or sending an email.
Activities calling Activities
For some use cases, having an Activity call another Activity might seem convenient. We generally recommend not doing so. Activities are regular functions, so calling one directly is not seen—and therefore not logged—by the Temporal Server.
Instead, move logic out of the Activities and have the parent Workflow use the result of one Activity to call the other Activity.
Fault-oblivious stateful Workflow code is the core abstraction of Temporal. However, because of deterministic execution requirements, Workflows are not allowed to call any external API directly. Instead, they orchestrate the execution of Activities. In its simplest form, a Temporal Activity is a function or an object method in one of the supported languages. Temporal does not recover Activity state in case of failures. Therefore, an Activity function is allowed to contain any code without restrictions.
Activities are invoked asynchronously through task queues. A task queue is essentially a queue used to store an Activity task until it is picked up by an available worker. The worker processes an Activity by invoking its implementation function. When the function returns, the worker reports the result back to the Temporal service which in turn notifies the Workflow about completion. It is possible to implement an Activity fully asynchronously by completing it from a different process.
- An Activity can be implemented as a synchronous method or fully asynchronously involving multiple processes.
- An Activity can be retried indefinitely according to the provided exponential retry policy.
- If for any reason an Activity is not completed within the specified timeout, an error is reported to the Workflow, which decides how to handle it. The duration of an Activity has no limit.
- Activities support an Activity Heartbeat that helps to identify timeouts faster in case the Activity execution fails.
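The exponential retry behavior mentioned in the list above can be sketched as a small calculation. This is a minimal illustration, not SDK code: the `retryPolicy` struct and its `backoff` method are hypothetical names, and the values in `main` (1 second initial interval, coefficient 2.0, 100 second cap) are assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// retryPolicy is a hypothetical, simplified model of an exponential retry policy.
// The field names and values are assumptions for illustration only.
type retryPolicy struct {
	InitialInterval    time.Duration // wait before the first retry
	BackoffCoefficient float64       // multiplier applied after each retry
	MaximumInterval    time.Duration // upper bound on the wait
}

// backoff returns the wait before the given retry (1 is the first retry).
func (p retryPolicy) backoff(retry int) time.Duration {
	interval := float64(p.InitialInterval)
	for i := 1; i < retry; i++ {
		interval *= p.BackoffCoefficient
		if interval >= float64(p.MaximumInterval) {
			break // no need to keep multiplying once the cap is reached
		}
	}
	if interval > float64(p.MaximumInterval) {
		return p.MaximumInterval
	}
	return time.Duration(interval)
}

func main() {
	p := retryPolicy{
		InitialInterval:    time.Second,
		BackoffCoefficient: 2.0,
		MaximumInterval:    100 * time.Second,
	}
	for retry := 1; retry <= 8; retry++ {
		fmt.Printf("retry %d: wait %v\n", retry, p.backoff(retry))
	}
}
```

Because the wait is capped, an Activity that keeps failing is retried at a steady pace rather than at ever-growing intervals.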
Temporal does not impose any system limit on Activity duration. It is up to the application to choose the timeouts for its execution.
Activities are dispatched to workers through task queues, which are the queues that workers listen on. Task queues are highly dynamic and lightweight, and they don't need to be explicitly registered. It is okay to have one task queue per worker process. It is normal for more than one Activity type to be invoked through a single task queue, and in some cases (like host routing) it is normal to invoke the same Activity type on multiple task queues.
Here are some use cases for employing multiple Activity task queues in a single Workflow:
- Flow control. A worker that consumes from a task queue asks for an Activity task only when it has available capacity. So workers are never overloaded by request spikes. If Activity executions are requested faster than workers can process them, they are backlogged in the task queue.
- Throttling. Each Activity worker can specify the maximum rate at which it is allowed to process Activities on a task queue. It does not exceed this limit even if it has spare capacity. There is also support for global task queue rate limiting. This limit works across all workers for the given task queue. It is frequently used to limit load on a downstream service that an Activity calls into.
- Deploying a set of Activities independently. Think about a service that hosts Activities and can be deployed independently from other Activities and Workflows. To send Activity tasks to this service, a separate task queue is needed.
- Workers with different capabilities. For example, workers on GPU boxes vs non-GPU boxes. Having two separate task queues in this case allows Workflows to pick which one to send an Activity execution request to.
- Routing an Activity to a specific host. For example, in the media encoding case the transform and upload Activities have to run on the same host as the download one.
- Routing an Activity to a specific process. For example, some Activities load large data sets and cache them in the process. The Activities that rely on this data set should be routed to the same process.
- Multiple priorities. Use one task queue per priority, with a worker pool for each priority.
- Versioning. A new backward-incompatible implementation of an Activity might use a different task queue.
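The flow control item in the list above can be modeled in a few lines. This is a rough in-memory sketch, not how the Temporal service is implemented: `taskQueue`, `worker`, and `poll` are hypothetical names, and a real worker long-polls the service rather than reading a slice.

```go
package main

import "fmt"

// taskQueue is a hypothetical in-memory stand-in for a task queue.
type taskQueue struct {
	backlog []string
}

// worker models a worker with a fixed number of execution slots.
type worker struct {
	slots    int // maximum number of concurrent Activity tasks
	inFlight int // tasks currently being executed
}

// poll hands the worker only as many tasks as it has free slots.
// Everything else stays backlogged in the queue.
func (q *taskQueue) poll(w *worker) []string {
	free := w.slots - w.inFlight
	if free <= 0 {
		return nil // the worker has no capacity, so it asks for nothing
	}
	if free > len(q.backlog) {
		free = len(q.backlog)
	}
	tasks := q.backlog[:free]
	q.backlog = q.backlog[free:]
	w.inFlight += free
	return tasks
}

func main() {
	q := &taskQueue{backlog: []string{"a", "b", "c", "d", "e"}}
	w := &worker{slots: 2}

	picked := q.poll(w)
	fmt.Println("picked up:", picked, "backlogged:", len(q.backlog))

	w.inFlight = 0 // the worker finished its tasks and has capacity again
	picked = q.poll(w)
	fmt.Println("picked up:", picked, "backlogged:", len(q.backlog))
}
```

The worker pulls work when it is ready, so a spike in requests lengthens the backlog instead of overloading the worker.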
For long running Activities, we recommend that you specify a relatively short heartbeat timeout and constantly heartbeat. This way worker failures for even very long running Activities can be handled in a timely manner. An Activity that specifies the heartbeat timeout is expected to call the heartbeat method periodically from its implementation.
A heartbeat request can include an application-specific payload. This is useful for saving Activity execution progress. If an Activity times out due to a missed heartbeat, the next attempt to execute it can access that progress and continue its execution from that point.
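The idea of resuming from heartbeat progress can be sketched without any SDK. In this minimal model, `heartbeatStore` is a hypothetical stand-in for the progress payload kept from the most recent heartbeat, and `processItems` is a hypothetical Activity body; the crash is simulated with a parameter.

```go
package main

import (
	"errors"
	"fmt"
)

// heartbeatStore is a hypothetical stand-in for the progress payload
// that is kept from the most recent heartbeat.
type heartbeatStore struct {
	lastIndex int
	recorded  bool
}

// record saves the index of the last item that was fully processed.
func (h *heartbeatStore) record(index int) {
	h.lastIndex = index
	h.recorded = true
}

var errWorkerCrashed = errors.New("simulated worker crash")

// processItems processes items in order and heartbeats the index after each one.
// A retry resumes after the last recorded index instead of starting over.
// crashAt simulates a worker failure just before that index (-1 means never).
func processItems(items []string, hb *heartbeatStore, crashAt int) (processed []string, err error) {
	start := 0
	if hb.recorded {
		start = hb.lastIndex + 1
	}
	for i := start; i < len(items); i++ {
		if i == crashAt {
			return processed, errWorkerCrashed
		}
		processed = append(processed, items[i])
		hb.record(i)
	}
	return processed, nil
}

func main() {
	items := []string{"a", "b", "c", "d"}
	hb := &heartbeatStore{}

	first, err := processItems(items, hb, 2) // first attempt fails before index 2
	fmt.Println(first, err)

	second, err := processItems(items, hb, -1) // the retry resumes from saved progress
	fmt.Println(second, err)
}
```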
Long running Activities can be used as a special case of leader election. Temporal timeouts use second resolution. So it is not a solution for realtime applications. But if it is okay to react to the process failure within a few seconds, then a Temporal heartbeat Activity is a good fit.
One common use case for such leader election is monitoring. An Activity executes an internal loop that periodically polls some API and checks for some condition. It also heartbeats on every iteration. If the condition is satisfied, the Activity completes, which lets its Workflow handle it. If the Activity worker dies, the Activity times out after the heartbeat interval is exceeded and is retried on a different worker. The same pattern works for polling for new files in Amazon S3 buckets or responses in REST or other synchronous APIs.
Note: Cancellations are not immediate
ctx.Done() is only signaled when a heartbeat is sent to the service. Temporal's SDK throttles heartbeats, so one may not be sent to the service until 80% of the heartbeat timeout has elapsed. For example, if your heartbeat timeout is 20 seconds, ctx.Done() will not be signaled until 80% of 20 seconds (16 seconds) has elapsed. To increase or decrease the cancellation delay, modify the heartbeat timeout defined for the Activity context.
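The arithmetic in the note can be written down directly. This is only a sketch of the 80% rule described above; `cancellationDelay` is a hypothetical helper, not an SDK function.

```go
package main

import (
	"fmt"
	"time"
)

// cancellationDelay estimates how long a cancellation may take to reach the
// Activity, assuming heartbeats are throttled to 80% of the heartbeat timeout.
func cancellationDelay(heartbeatTimeout time.Duration) time.Duration {
	return heartbeatTimeout * 4 / 5 // 80%, using integer arithmetic
}

func main() {
	for _, timeout := range []time.Duration{5 * time.Second, 20 * time.Second, time.Minute} {
		fmt.Printf("heartbeat timeout %v: cancellation may take up to %v\n",
			timeout, cancellationDelay(timeout))
	}
}
```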
Asynchronous Activity Completion
Asynchronous Activity Completion occurs when the final result of a computation, started by an Activity, is provided to the Temporal System from an external system.
By default, an Activity is a function or method (depending on the language) that completes as soon as the function or method returns. But in some cases an Activity implementation is asynchronous. For example, the action could be forwarded to an external system through a message queue, and the result could come through a different queue.
To support such use cases, Temporal allows Activity implementations that do not complete when the Activity function returns. A separate API should be used in this case to complete the Activity. This API can be called from any process, even one written in a different programming language from the one that the original Activity worker used.
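The following is a rough model of that flow, assuming a token identifies the open Activity. All names here (`pendingActivities`, `startAsync`, `complete`, `errResultPending`) are hypothetical and stand in for the service and the separate completion API; this is not the SDK's interface.

```go
package main

import (
	"errors"
	"fmt"
)

// errResultPending signals that the function returned but the Activity is not complete.
var errResultPending = errors.New("result pending")

// pendingActivities is a hypothetical stand-in for the service's record of
// open Activities, keyed by a token.
type pendingActivities struct {
	open    map[string]bool
	results map[string]string
}

func newPendingActivities() *pendingActivities {
	return &pendingActivities{open: map[string]bool{}, results: map[string]string{}}
}

// startAsync models an Activity function that hands the work to an external
// system and returns without completing the Activity.
func (p *pendingActivities) startAsync(token string) error {
	p.open[token] = true
	return errResultPending
}

// complete models the separate completion call, which any process that
// knows the token could make.
func (p *pendingActivities) complete(token, result string) error {
	if !p.open[token] {
		return errors.New("unknown or already completed activity: " + token)
	}
	delete(p.open, token)
	p.results[token] = result
	return nil
}

func main() {
	p := newPendingActivities()
	fmt.Println(p.startAsync("token-1"))          // the function returns, the Activity stays open
	fmt.Println(p.complete("token-1", "encoded")) // an external system reports the result later
	fmt.Println(p.results["token-1"])
}
```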
An Activity Definition is the code that defines the constraints of an Activity Task Execution.
The term 'Activity Definition' is used to refer to the full set of primitives in any given language SDK that provides an access point to an Activity Function Definition, which is the method or function that is invoked for an Activity Task Execution. Therefore, the terms Activity Function and Activity Method refer to the source of an instance of an execution.
Activity Definitions are named and referenced in code by their Activity Type.
Activity Definitions are executed as normal functions.
In the event of failure, the function begins at its initial state when retried (except when Activity Heartbeats are established).
Therefore, an Activity Definition has no restrictions on the code it contains.
An Activity Definition can support as many parameters as needed.
All values passed through these parameters are recorded in the Event History of the Workflow Execution. Return values are also captured in the Event History for the calling Workflow Execution.
Activity Definitions work with the following parameters:
- Context: an optional parameter that provides Activity context across multiple APIs.
- Heartbeat: a notification from the Worker to the Temporal Cluster that the Activity Execution is progressing. Cancellations are allowed only if the Activity Definition permits Heartbeating.
- Timeouts: intervals that control the execution and retrying of Activity Task Executions.
Other parameters, such as Retry Policies and return values, can be seen in the implementation guides, listed in the next section.
Implementing Activity Definitions
We strongly recommend that you develop an Activity Definition in a language that has a corresponding Temporal SDK.
- How to develop an Activity Definition in Go
- How to develop an Activity Interface in Java
- How to develop an Activity Interface in PHP
- How to develop an Activity Interface in TypeScript
An Activity Type is the mapping of a name to an Activity Definition.
Activity Types are scoped via Task Queues.
An Activity Execution is the full chain of Activity Task Executions.
A Workflow can request to cancel an Activity Execution.
When an Activity Execution is canceled, or its Workflow Execution has completed or failed, the context passed into its function is canceled, which also sets its channel’s closed state to
An Activity can use that to perform any necessary cleanup and abort its execution.
Cancellation requests are only delivered to Activity Executions that Heartbeat:
- The Heartbeat request fails with a special error indicating that the Activity Execution is canceled. Heartbeats can also fail when the Workflow Execution that spawned it is in a completed state.
- The Activity should perform all necessary cleanup and report when it is done.
- The Workflow can decide if it wants to wait for the Activity cancellation confirmation or proceed without waiting.
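The cancellation steps above can be sketched with Go's standard `context` package. This is a minimal model under stated assumptions: `runUntilCanceled` and `cancelAfter` are hypothetical names, and the `heartbeat` callback stands in for the heartbeat call through which the cancellation is delivered.

```go
package main

import (
	"context"
	"fmt"
)

// runUntilCanceled does work step by step. It heartbeats before each step and
// then checks whether the context was canceled, cleaning up before it returns.
func runUntilCanceled(ctx context.Context, steps int, heartbeat func(step int)) (completed int, cleanedUp bool, err error) {
	for i := 0; i < steps; i++ {
		heartbeat(i)
		select {
		case <-ctx.Done():
			cleanedUp = true // release resources, delete temporary files, and so on
			return completed, cleanedUp, ctx.Err()
		default:
		}
		completed++ // one unit of work
	}
	return completed, false, nil
}

// cancelAfter runs five steps and simulates a cancellation request arriving
// with the heartbeat of the given step.
func cancelAfter(step int) (int, bool, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return runUntilCanceled(ctx, 5, func(i int) {
		if i == step {
			cancel()
		}
	})
}

func main() {
	completed, cleanedUp, err := cancelAfter(3)
	fmt.Println(completed, cleanedUp, err)
}
```

Checking the context once per step keeps the cleanup path in one place and bounds how much work happens after a cancellation arrives.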
An Activity Id is a unique identifier for an Activity Execution. The identifier can be generated by the system, or it can be provided by the Workflow code that spawns the Activity Execution. An Activity Id can be used to complete the Activity asynchronously.
A Schedule-To-Start Timeout is the maximum amount of time that is allowed from when an Activity Task is scheduled (that is, placed in a Task Queue) to when a Worker starts (that is, picks up from the Task Queue) that Activity Task. In other words, it's a limit for how long an Activity Task can be enqueued.
The moment that the Task is picked up by the Worker from the Task Queue is considered to be the start of the Activity Task for the purposes of the Schedule-To-Start Timeout and associated metrics. This definition of "Start" avoids issues that a clock difference between the Temporal Cluster and a Worker might create.
"Schedule" in Schedule-To-Start and Schedule-To-Close has different frequency guarantees.
The Schedule-To-Start Timeout is enforced for each Activity Task, whereas the Schedule-To-Close Timeout is enforced once per Activity Execution. Thus, "Schedule" in Schedule-To-Start refers to the scheduling moment of every Activity Task in the sequence of Activity Tasks that make up the Activity Execution, while "Schedule" in Schedule-To-Close refers to the first Activity Task in that sequence.
A Retry Policy attached to an Activity Execution retries an Activity Task.
The Schedule-To-Start Timeout has two primary use cases:
- Detect whether an individual Worker has crashed.
- Detect whether the fleet of Workers polling the Task Queue is not able to keep up with the rate of Activity Tasks.
The default Schedule-To-Start Timeout is ∞ (infinity).
If this timeout is used, we recommend setting this timeout to the maximum time a Workflow Execution is willing to wait for an Activity Execution in the presence of all possible Worker outages, and have a concrete plan in place to reroute Activity Tasks to a different Task Queue. This timeout does not trigger any retries regardless of the Retry Policy, as a retry would place the Activity Task back into the same Task Queue. We do not recommend using this timeout unless you know what you are doing.
In most cases, we recommend monitoring the temporal_activity_schedule_to_start_latency metric to know when Workers slow down picking up Activity Tasks, instead of setting this timeout.
A Start-To-Close Timeout is the maximum time allowed for a single Activity Task Execution.
The default Start-To-Close Timeout is the same as the default Schedule-To-Close Timeout.
An Activity Execution must have either this timeout (Start-To-Close) or the Schedule-To-Close Timeout set. We recommend always setting this timeout; however, make sure that it is always set to be longer than the maximum possible time for the Activity Execution to take place. For long running Activity Executions, we recommend also using Activity Heartbeats and Heartbeat Timeouts.
The main use case for the Start-To-Close timeout is to detect when a Worker crashes after it has started executing an Activity Task.
A Retry Policy attached to an Activity Execution retries an Activity Task Execution. Thus the Start-To-Close Timeout is applied to each Activity Task Execution within an Activity Execution.
If the first Activity Task Execution returns an error, then the full Activity Execution might look like this:
If this timeout is reached, the following actions occur:
- An ActivityTaskTimedOut Event is written to the Workflow Execution's mutable state.
- If a Retry Policy dictates a retry, the Temporal Cluster schedules another Activity Task.
- The attempt count increments by 1 in the Workflow Execution's mutable state.
- The Start-To-Close Timeout timer is reset.
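The steps listed above can be modeled as a small state update. This sketch is a simplification with hypothetical names (`activityExecution`, `onStartToCloseTimeout`); it ignores retry backoff and treats `MaximumAttempts` of 0 as unlimited, which is an assumption of the example.

```go
package main

import "fmt"

// activityExecution is a hypothetical, simplified record of one Activity Execution.
type activityExecution struct {
	Attempt         int      // current attempt, starting at 1
	MaximumAttempts int      // 0 means unlimited in this sketch
	Events          []string // events recorded so far
	TimerResets     int      // how many times the Start-To-Close timer was restarted
}

// onStartToCloseTimeout applies the timeout steps and reports whether another
// Activity Task was scheduled.
func (e *activityExecution) onStartToCloseTimeout() bool {
	e.Events = append(e.Events, "ActivityTaskTimedOut")
	if e.MaximumAttempts != 0 && e.Attempt >= e.MaximumAttempts {
		return false // the Retry Policy does not allow another attempt
	}
	e.Attempt++     // the attempt count increments by 1
	e.TimerResets++ // the next Activity Task gets a fresh Start-To-Close timer
	return true
}

func main() {
	e := &activityExecution{Attempt: 1, MaximumAttempts: 2}
	fmt.Println("retry scheduled:", e.onStartToCloseTimeout(), "attempt:", e.Attempt)
	fmt.Println("retry scheduled:", e.onStartToCloseTimeout(), "attempt:", e.Attempt)
}
```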
A Schedule-To-Close Timeout is the maximum amount of time allowed for the overall Activity Execution, from when the first Activity Task is scheduled to when the last Activity Task, in the chain of Activity Tasks that make up the Activity Execution, reaches a Closed status.
Example Schedule-To-Close Timeout period for an Activity Execution that has a chain of Activity Task Executions:
The default Schedule-To-Close Timeout is ∞ (infinity).
An Activity Execution must have either this timeout (Schedule-To-Close) or Start-To-Close set. By default, an Activity Execution Retry Policy dictates that retries will occur for up to 10 years. This timeout can be used to control the overall duration of an Activity Execution in the face of failures (repeated Activity Task Executions), without altering the Maximum Attempts field of the Retry Policy.
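The rule that one of the two closing timeouts must be set can be expressed as a simple check. The `activityTimeouts` struct and `validate` method below are hypothetical and exist only to illustrate the rule; a zero duration is assumed to mean "not set".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// activityTimeouts is a hypothetical options struct. A zero value means "not set".
type activityTimeouts struct {
	ScheduleToStart time.Duration
	StartToClose    time.Duration
	ScheduleToClose time.Duration
	Heartbeat       time.Duration
}

var errNoCloseTimeout = errors.New("either Start-To-Close or Schedule-To-Close must be set")

// validate enforces that at least one of the two closing timeouts is set.
func (t activityTimeouts) validate() error {
	if t.StartToClose <= 0 && t.ScheduleToClose <= 0 {
		return errNoCloseTimeout
	}
	return nil
}

func main() {
	unset := activityTimeouts{Heartbeat: 10 * time.Second}
	fmt.Println(unset.validate())

	ok := activityTimeouts{StartToClose: time.Minute, Heartbeat: 10 * time.Second}
	fmt.Println(ok.validate())
}
```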
A Heartbeat Timeout is the maximum time between Activity Heartbeats.
If this timeout is reached, the Activity Task fails and a retry occurs if a Retry Policy dictates it.
An Activity Heartbeat is a ping from the Worker that is executing the Activity to the Temporal Cluster. Each ping informs the Temporal Cluster that the Activity Execution is making progress and the Worker has not crashed.
Activity Heartbeats work in conjunction with a Heartbeat Timeout.
Activity Heartbeats are implemented within the Activity Definition. Custom progress information can be included in the Heartbeat which can then be used by the Activity Execution should a retry occur.
An Activity Heartbeat can be recorded as often as needed (e.g. once a minute or every loop iteration). Temporal SDKs control the rate at which Heartbeats are sent to the Cluster.
Heartbeating is not required in Local Activities, and it has no effect there.
Some Activity Executions are very short-lived and do not need the queuing semantics, flow control, rate limiting, and routing capabilities. For this case, Temporal supports the Local Activity feature.
The main benefit of Local Activities is that they use fewer Temporal service resources (e.g. fewer state transitions) and have much lower latency overhead (because there is no need for a round trip to the Cluster) compared to normal Activity Executions. However, Local Activities are subject to shorter durations and a lack of rate limiting.
Consider using Local Activities for functions that:
- take no longer than a few seconds, inclusive of retries (shorter than the Workflow Task Timeout, which is 10 seconds by default).
- do not require global rate limiting.
- do not require routing to a specific Worker or Worker pool.
- can be implemented in the same binary as the Workflow that calls them.
Using a Local Activity without understanding its limitations can cause various production issues. We recommend using regular Activities unless your use case requires very high throughput and large Activity fan-outs of very short-lived Activities. More guidance on choosing between a Local Activity and a regular Activity is available in our forums.