Skip to main content

Workflows

This guide provides a comprehensive overview of Temporal Workflows.

In day-to-day conversations, the term Workflow frequently denotes either a Workflow TypeLink preview icon

, a Workflow DefinitionLink preview icon, or a Workflow ExecutionLink preview icon. Temporal documentation aims to be explicit and differentiate between them.

Workflow Definition

A Workflow Definition is the code that defines the constraints of a Workflow Execution.

A Workflow Definition is often also referred to as a Workflow Function. In Temporal's documentation, a Workflow Definition refers to the source for the instance of a Workflow Execution, while a Workflow Function refers to the source for the instance of a Workflow Function Execution.

A Workflow Execution effectively executes once to completion, while a Workflow Function Execution occurs many times during the life of a Workflow Execution.

We strongly recommend that you write a Workflow Definition in a language that has a corresponding Temporal SDK.

Deterministic constraints

A critical aspect of developing Workflow Definitions is ensuring they exhibit certain deterministic traits – that is, making sure that the same Commands are emitted in the same sequence, whenever a corresponding Workflow Function Execution (instance of the Function Definition) is re-executed.

The execution semantics of a Workflow Execution include the re-execution of a Workflow Function. The use of Workflow APIs in the function is what generates CommandsLink preview icon

. Commands tell the Cluster which Events to create and add to the Workflow Execution's Event History. When a Workflow Function executes, the Commands that are emitted are compared with the existing Event History. If a corresponding Event already exists within the Event History that maps to the generation of that Command in the same sequence, and some specific metadata of that Command matches with some specific metadata of the Event, then the Function Execution progresses.

For example, using an SDK's "Execute Activity" API generates the ScheduleActivityTask Command. When this API is called upon re-execution, that Command is compared with the Event that is in the same location within the sequence. The Event in the sequence must be an ActivityTaskScheduled Event, where the Activity name is the same as what is in the Command.

If a generated Command doesn't match what it needs to in the existing Event History, then the Workflow Execution returns a non-deterministic error.

The following are the two reasons why a Command might be generated out of sequence or the wrong Command might be generated altogether:

  1. Code changes are made to a Workflow Definition that is in use by a running Workflow Execution.
  2. There is intrinsic non-deterministic logic (such as inline random branching).

Code changes can cause non-deterministic behavior

The Workflow Definition can change in very limited ways once there is a Workflow Execution depending on it. To alleviate non-deterministic issues that arise from code changes, we recommend using Workflow Versioning.

For example, let's say we have a Workflow Definition that defines the following sequence:

  1. Start and wait on a Timer/sleep.
  2. Spawn and wait on an Activity Execution.
  3. Complete.

We start a Worker and spawn a Workflow Execution that uses that Workflow Definition. The Worker would emit the StartTimer Command and the Workflow Execution would become suspended.

Before the Timer is up, we change the Workflow Definition to the following sequence:

  1. Spawn and wait on an Activity Execution.
  2. Start and wait on a Timer/sleep.
  3. Complete.

When the Timer fires, the next Workflow Task will cause the Workflow Function to re-execute. The first Command the Worker sees would be ScheduleActivityTask Command, which wouldn't match up to the expected TimerStarted Event.

The Workflow Execution would fail and return a non-deterministic error.

The following are examples of minor changes that would not result in non-determinism errors when re-executing a History which already contain the Events:

  • Changing the duration of a Timer.
  • Changing the arguments to:
    • The Activity Options in a call to spawn an Activity Execution (local or nonlocal).
    • The Child Workflow Options in a call to spawn a Child Workflow Execution.
    • Call to Signal an External Workflow Execution.

Intrinsic non-deterministic logic

Intrinsic non-determinism is when a Workflow Function Execution might emit a different sequence of Commands on re-execution, regardless of whether all the input parameters are the same.

For example, a Workflow Definition can not have inline logic that branches (emits a different Command sequence) based off a local time setting or a random number. In the representative pseudocode below, the local_clock() function returns the local time, rather than Temporal-defined time:

fn your_workflow() {
if local_clock().is_before("12pm") {
await workflow.sleep(duration_until("12pm"))
} else {
await your_afternoon_activity()
}
}

Each Temporal SDK offers APIs that enable Workflow Definitions to have logic that gets and uses time, random numbers, and data from unreliable resources. When those APIs are used, the results are stored as part of the Event History, which means that a re-executed Workflow Function will issue the same sequence of Commands, even if there is branching involved.

In other words, all operations that do not purely mutate the Workflow Execution's state should occur through a Temporal SDK API.

Workflow Versioning

The Workflow Versioning feature enables the creation of logical branching inside a Workflow Definition based on a developer specified version identifier. This feature is useful for Workflow Definition logic needs to be updated, but there are running Workflow Executions that currently depends on it. It is important to note that a practical way to handle different versions of Workflow Definitions, without using the versioning API, is to run the different versions on separate Task Queues.

Handling unreliable Worker Processes

You do not handle Worker Process failure or restarts in a Workflow Definition.

Workflow Function Executions are completely oblivious to the Worker Process in terms of failures or downtime. The Temporal Platform ensures that the state of a Workflow Execution is recovered and progress resumes if there is an outage of either Worker Processes or the Temporal Cluster itself. The only reason a Workflow Execution might fail is due to the code throwing an error or exception, not because of underlying infrastructure outages.

Workflow Type

A Workflow Type is a name that maps to a Workflow Definition.

  • A single Workflow Type can be instantiated as multiple Workflow Executions.
  • A Workflow Type is scoped by a Task Queue. It is acceptable to have the same Workflow Type name map to different Workflow Definitions if they are using completely different Workers.

Workflow Type cardinality with Workflow Definitions and Workflow Executions

Workflow Type cardinality with Workflow Definitions and Workflow Executions

Workflow Execution

A Temporal Workflow Execution is a durable, reliable, and scalable function execution. It is the main unit of execution of a Temporal ApplicationLink preview icon

.

Each Temporal Workflow Execution has exclusive access to its local state. It executes concurrently to all other Workflow Executions, and communicates with other Workflow Executions through SignalsLink preview icon

and the environment through ActivitiesLink preview icon. While a single Workflow Execution has limits on size and throughput, a Temporal Application can consist of millions to billions of Workflow Executions.

Durability

Durability is the absence of an imposed time limit.

A Workflow Execution is durable because it executes a Temporal Workflow Definition (also called a Temporal Workflow Function), your application code, effectively once and to completion—whether your code executes for seconds or years.

Reliability

Reliability is responsiveness in the presence of failure.

A Workflow Execution is reliable, because it is fully recoverable after a failure. The Temporal Platform ensures the state of the Workflow Execution persists in the face of failures and outages and resumes execution from the latest state.

Scalability

Scalability is responsiveness in the presence of load.

A single Workflow Execution is limited in size and throughput but is scalable because it can Continue-As-NewLink preview icon

in response to load. A Temporal Application is scalable because the Temporal Platform is capable of supporting millions to billions of Workflow Executions executing concurrently, which is realized by the design and nature of the Temporal ClusterLink preview icon and Worker ProcessesLink preview icon.

Commands and awaitables

A Workflow Execution does two things:

  1. Issue CommandsLink preview icon.
  2. Wait on an Awaitables (often called Futures).

Command generation and waiting

Command generation and waiting

Commands are issued and Awaitables are provided by the use of Workflow APIs in the Workflow DefinitionLink preview icon

.

Commands are generated whenever the Workflow Function is executed. The Worker Process supervises the Command generation and makes sure that it maps to the current Event History. (For more information, see Deterministic constraints.) The Worker Process batches the Commands and then suspends progress to send the Commands to the Cluster whenever the Workflow Function reaches a place where it can no longer progress without a result from an Awaitable.

A Workflow Execution may only ever block progress on an Awaitable that is provided through a Temporal SDK API. Awaitables are provided when using APIs for the following:

Status

A Workflow Execution can be either Open or Closed.

Workflow Execution statuses

Workflow Execution statuses

Open

  • Running: The only Open status for a Workflow Execution. When the Workflow Execution is Running, it is either actively progressing or is waiting on something.

Closed

A Closed status means that the Workflow Execution cannot make further progress because of one of the following reasons:

Workflow Execution Chain

A Workflow Execution Chain is a sequence of Workflow Executions that share the same Workflow Id. Each link in the Chain is often called a Workflow Run. Each Workflow Run in the sequence is connected by one of the following:

A Workflow Execution is uniquely identified by its NamespaceLink preview icon

, Workflow IdLink preview icon, and Run IdLink preview icon.

The Workflow Execution TimeoutLink preview icon

applies to a Workflow Execution Chain. The Workflow Run TimeoutLink preview icon applies to a single Workflow Execution (Workflow Run).

Event loop

A Workflow Execution is made up of a sequence of EventsLink preview icon

called an Event HistoryLink preview icon. Events are created by the Temporal Cluster in response to either Commands or actions requested by a Temporal Client (such as a request to spawn a Workflow Execution).

Workflow Execution

Workflow Execution

Time constraints

Is there a limit to how long Workflows can run?

No, there is no time constraint on how long a Workflow Execution can be Running.

However, Workflow Executions intended to run indefinitely should be written with some care. The Temporal Cluster stores the complete Event History for the entire lifecycle of a Workflow Execution. There is a hard limit of 50,000 Events in a Workflow Execution Event History, as well as a hard limit of 50 MB in terms of size. The Temporal Cluster logs a warning at every 10,000 Events. When the Event History reaches 50,000 Events or the size limit of 50 MB, the Workflow Execution is forcefully terminated.

To prevent runaway Workflow Executions, you can use the Workflow Execution Timeout, the Workflow Run Timeout, or both. A Workflow Execution Timeout can be used to limit the duration of Workflow Execution Chain, and a Workflow Run Timeout can be used to limit the duration an individual Workflow Execution (Run).

You can use the Continue-As-NewLink preview icon

feature to close the current Workflow Execution and create a new Workflow Execution in a single atomic operation. The Workflow Execution spawned from Continue-As-New has the same Workflow Id, a new Run Id, and a fresh Event History and is passed all the appropriate parameters. For example, it may be reasonable to use Continue-As-New once per day for a long-running Workflow Execution that is generating a large Event History.

Command

A Command is a requested action issued by a WorkerLink preview icon

to the Temporal ClusterLink preview icon after a Workflow Task ExecutionLink preview icon completes.

The action that the Cluster takes is recorded in the Workflow Execution's Event HistoryLink preview icon

as an EventLink preview icon. The Workflow Execution can await on some of the Events that come as a result from some of the Commands.

Commands are generated by the use of Workflow APIs in your code. During a Workflow Task Execution there may be several Commands that are generated. The Commands are batched and sent to the Cluster as part of the Workflow Task Execution completion request, after the Workflow Task has progressed as far as it can with the Workflow function. There will always be WorkflowTaskStarted and WorkflowTaskCompleted Events in the Event History when there is a Workflow Task Execution completion request.

Commands are generated by the use of Workflow APIs in your code

Commands are generated by the use of Workflow APIs in your code

Commands are described in the Command reference and are defined in the Temporal gRPC API.

Event

Events are created by the Temporal Cluster in response to external occurrences and Commands generated by a Workflow Execution. Each Event corresponds to an enum that is defined in the Server API.

All Events are recorded in the Event HistoryLink preview icon

.

A list of all possible Events that could appear in a Workflow Execution Event History is provided in the Event reference.

Event History

An append-log of EventsLink preview icon

for your application.

  • Event History is durably persisted by the Temporal service, enabling seamless recovery of your application state from crashes or failures.
  • It also serves as an audit log for debugging.

Event History limits The Temporal Cluster stores the complete Event History for the entire lifecycle of a Workflow Execution. There is a hard limit of 50,000 Events in a Workflow Execution Event History, as well as a hard limit of 50 MB in terms of size. The Temporal Cluster logs a warning at every 10,000 Events. When the Event History reaches 50,000 Events or the size limit of 50 MB, the Workflow Execution is forcefully terminated.

Continue-As-New

Continue-As-New is a mechanism by which the latest relevant state is passed to a new Workflow Execution, with a fresh Event History.

As a precautionary measure, the Temporal Platform limits the total Event HistoryLink preview icon

to 50,000 Events or 50 MB, and will warn you every 10,000 Events or 10 MB. To prevent a Workflow Execution Event History from exceeding this limit and failing, use Continue-As-New to start a new Workflow Execution with a fresh Event History.

All values passed to a Workflow Execution through parameters or returned through a result value are recorded into the Event History. A Temporal Cluster stores the full Event History of a Workflow Execution for the duration of a Namespace's retention period. A Workflow Execution that periodically executes many Activities has the potential of hitting the size limit.

A very large Event History can adversely affect the performance of a Workflow Execution. For example, in the case of a Workflow Worker failure, the full Event History must be pulled from the Temporal Cluster and given to another Worker via a Workflow Task. If the Event history is very large, it may take some time to load it.

The Continue-As-New feature enables developers to complete the current Workflow Execution and start a new one atomically.

The new Workflow Execution has the same Workflow Id, but a different Run Id, and has its own Event History.

In the case of Temporal Cron JobsLink preview icon

, Continue-As-New is actually used internally for the same effect.

Run Id

A Run Id is a globally unique, platform-level identifier for a Workflow ExecutionLink preview icon

.

Temporal guarantees that only one Workflow Execution with a given Workflow IdLink preview icon

can be in an Open state at any given time. But when a Workflow Execution reaches a Closed state, it is possible to have another Workflow Execution in an Open state with the same Workflow Id. For example, a Temporal Cron Job is a chain of Workflow Executions that all have the same Workflow Id. Each Workflow Execution within the chain is considered a Run.

A Run Id uniquely identifies a Workflow Execution even if it shares a Workflow Id with other Workflow Executions.

caution

Don't rely on storing the current Run Id or using it for any logical choices. A Workflow Retry changes the Run Id. Because the current Run Id is mutable, relying on it might produce non-determinism issues.

Learn more

For more information, see the following links.

Workflow Id

A Workflow Id is a customizable, application-level identifier for a Workflow ExecutionLink preview icon

that is unique to an Open Workflow Execution within a Namespace.

A Workflow Id is meant to be a business-process identifier such as customer identifier or order identifier.

A Workflow Id Reuse PolicyLink preview icon

can be used to manage whether a Workflow Id can be re-used. The Temporal Platform guarantees uniqueness of the Workflow Id within a NamespaceLink preview icon based on the Workflow Id Reuse Policy.

It is not possible for a new Workflow Execution to spawn with the same Workflow Id as another Open Workflow Execution, regardless of the Workflow Id Reuse Policy. An attempt to spawn a Workflow Execution with a Workflow Id that is the same as the Id of a currently Open Workflow Execution results in a Workflow execution already started error.

A Workflow Execution can be uniquely identified across all Namespaces by its NamespaceLink preview icon

, Workflow Id, and Run IdLink preview icon.

Workflow Id Reuse Policy

A Workflow Id Reuse Policy determines whether a Workflow Execution is allowed to spawn with a particular Workflow Id, if that Workflow Id has been used with a previous, and now Closed, Workflow Execution.

It is not possible for a new Workflow Execution to spawn with the same Workflow Id as another Open Workflow Execution. An attempt to spawn a Workflow Execution with a Workflow Id that is the same as the Id of a currently Open Workflow Execution results in a Workflow Execution already started error.

A Workflow Id Reuse Policy has three possible values:

  • Allow Duplicate The Workflow Execution is allowed to exist regardless of the Closed status of a previous Workflow Execution with the same Workflow Id. This is the default policy, if one is not specified. Use this when it is OK to have a Workflow Execution with the same Workflow Id as a previous, but now Closed, Workflow Execution.
  • Allow Duplicate Failed Only: The Workflow Execution is allowed to exist only if a previous Workflow Execution with the same Workflow Id does not have a Completed status. Use this policy when there is a need to re-execute a Failed, Timed Out, Terminated or Cancelled Workflow Execution and guarantee that the Completed Workflow Execution will not be re-executed.
  • Reject Duplicate: The Workflow Execution cannot exist if a previous Workflow Execution has the same Workflow Id, regardless of the Closed status. Use this when there can only be one Workflow Execution per Workflow Id within a Namespace for the given retention period.

A Workflow Id Reuse Policy applies only if a Closed Workflow Execution with the same Workflow Id exists within the Retention Period of the associated Namespace. For example, if the Namespace's retention period is 30 days, a Workflow Id Reuse Policy can only compare the Workflow Id of the spawning Workflow Execution against the Closed Workflow Executions for the last 30 days.

If there is an attempt to spawn a Workflow Execution with a Workflow Id Reuse Policy that won't allow it the Server will prevent the Workflow Execution from spawning.

Workflow Execution Timeout

A Workflow Execution Timeout is the maximum time that a Workflow Execution can be executing (have an Open status) including retries and any usage of Continue As New.

Workflow Execution Timeout period

Workflow Execution Timeout period

The default value is ∞ (infinite). If this timeout is reached, the Workflow Execution changes to a Timed Out status. This timeout is different from the Workflow Run TimeoutLink preview icon

. This timeout is most commonly used for stopping the execution of a Temporal Cron JobLink preview icon after a certain amount of time has passed.

Workflow Run Timeout

A Workflow Run Timeout is the maximum amount of time that a single Workflow Run is restricted to.

Workflow Run Timeout period

Workflow Run Timeout period

The default is set to the same value as the Workflow Execution TimeoutLink preview icon

. This timeout is most commonly used to limit the execution time of a single Temporal Cron Job ExecutionLink preview icon.

If the Workflow Run Timeout is reached, the Workflow Execution is Terminated.

Workflow Task Timeout

A Workflow Task Timeout is the maximum amount of time allowed for a WorkerLink preview icon

to execute a Workflow TaskLink preview icon after the Worker has pulled that Workflow Task from the Task QueueLink preview icon.

Workflow Task Timeout period

Workflow Task Timeout period

The default value is 10 seconds. This timeout is primarily available to recognize whether a Worker has gone down so that the Workflow Execution can be recovered on a different Worker. The main reason for increasing the default value would be to accommodate a Workflow Execution that has a very long Workflow Execution History that could take longer than 10 seconds for the Worker to load.

Implementation guides:

Memo

A Memo is a non-indexed user-supplied set of Workflow Execution metadata that is displayed with Filtered List results.

Signal

A Signal is an asynchronous request to a Workflow ExecutionLink preview icon

.

A Signal delivers data to a running Workflow Execution. It cannot return data to the caller; to do so, use a Query instead. The Workflow code that handles a Signal can mutate Workflow state. A Signal can be sent from a Temporal Client or a Workflow. When a Signal is sent, it is received by the Cluster and recorded as an Event to the Workflow Execution Event History. A successful response from the Cluster means that the Signal has been persisted and will be delivered at least once to the Workflow Execution.1 The next scheduled Workflow Task will contain the Signal Event.

A Signal must include a destination (Namespace and Workflow Id) and name. It can include a list of arguments.

Signal handlers are Workflow functions that listen for Signals by the Signal name. Signals are delivered in the order they are received by the Cluster. If multiple deliveries of a Signal would be a problem for your Workflow, add idempotency logic to your Signal handler that checks for duplicates.

Query

A Query is a synchronous operation that is used to get the state of a Workflow ExecutionLink preview icon

. The state of a running Workflow Execution is constantly changing. You can use Queries to expose the internal Workflow Execution state to the external world. Queries are available for running or completed Workflows Executions only if the Worker is up and listening on the Task Queue.

Queries are sent from a Temporal Client to a Workflow Execution. The API call is synchronous. The Query is identified at both ends by a Query name. The Workflow must have a Query handler that is developed to handle that Query and provide data that represents the state of the Workflow Execution.

Queries are strongly consistent and are guaranteed to return the most recent state. This means that the data reflects the state of all confirmed Events that came in before the Query was sent. An Event is considered confirmed if the call creating the Event returned success. Events that are created while the Query is outstanding may or may not be reflected in the Workflow state the Query result is based on.

A Query can carry arguments to specify the data it is requesting. And each Workflow can expose data to multiple types of Queries.

A Query must never mutate the state of the Workflow Execution—that is, Queries are read-only and cannot contain any blocking code. This means, for example, that Query handling logic cannot schedule Activity Executions.

Sending Queries to completed Workflow Executions is supported, though Query reject conditions can be configured per Query.

Stack Trace Query

In many SDKs, the Temporal Client exposes a predefined __stack_trace Query that returns the stack trace of all the threads owned by that Workflow Execution. This is a great way to troubleshoot a Workflow Execution in production. For example, if a Workflow Execution has been stuck at a state for longer than an expected period of time, you can send a __stack_trace Query to return the current call stack. The __stack_trace Query name does not require special handling in your Workflow code.

note

Stack Trace Queries are available only for running Workflow Executions.

Side Effect

A Side Effect is a way to execute a short, nondeterministic code snippet, such as generating a UUID, that executes the provided function once and records its result into the Workflow Execution Event History.

A Side Effect does not re-execute upon replay, but instead returns the recorded result.

Do not ever have a Side Effect that could fail, because failure could result in the Side Effect function executing more than once. If there is any chance that the code provided to the Side Effect could fail, use an Activity.

Child Workflow

A Child Workflow Execution is a Workflow ExecutionLink preview icon

that is spawned from within another Workflow.

A Workflow Execution can be both a Parent and a Child Workflow Execution because any Workflow can spawn another Workflow.

Parent and Child Workflow Execution entity relationship

Parent and Child Workflow Execution entity relationship

A Parent Workflow Execution must await on the Child Workflow Execution to spawn. The Parent can optionally await on the result of the Child Workflow Execution. Consider the Child's Parent Close PolicyLink preview icon

if the Parent does not await on the result of the Child, which includes any use of Continue-As-New by the Parent.

When a Parent Workflow Execution reaches a Closed status, the Cluster propagates Cancellation Requests or Terminations to Child Workflow Executions depending on the Child's Parent Close Policy.

If a Child Workflow Execution uses Continue-As-New, from the Parent Workflow Execution's perspective the entire chain of Runs is treated as a single execution.

Parent and Child Workflow Execution entity relationship with Continue As New

Parent and Child Workflow Execution entity relationship with Continue As New

When to use Child Workflows

Consider Workflow Execution Event History size limits.

An individual Workflow Execution has an Event HistoryLink preview icon

size limit, which imposes a couple of considerations for using Child Workflows.

On one hand, because Child Workflow Executions have their own Event Histories, they are often used to partition large workloads into smaller chunks. For example, a single Workflow Execution does not have enough space in its Event History to spawn 100,000 Activity ExecutionsLink preview icon

. But a Parent Workflow Execution can spawn 1,000 Child Workflow Executions that each spawn 1,000 Activity Executions to achieve a total of 1,000,000 Activity Executions.

However, because a Parent Workflow Execution Event History contains EventsLink preview icon

that correspond to the status of the Child Workflow Execution, a single Parent should not spawn more than 1,000 Child Workflow Executions.

In general, however, Child Workflow Executions result in more overall Events recorded in Event Histories than Activities. Because each entry in an Event History is a cost in terms of compute resources, this could become a factor in very large workloads. Therefore, we recommend starting with a single Workflow implementation that uses Activities until there is a clear need for Child Workflows.

Consider each Child Workflow Execution as a separate service.

Because a Child Workflow Execution can be processed by a completely separate set of WorkersLink preview icon

than the Parent Workflow Execution, it can act as an entirely separate service. However, this also means that a Parent Workflow Execution and a Child Workflow Execution do not share any local state. As all Workflow Executions, they can communicate only via asynchronous SignalsLink preview icon.

Consider that a single Child Workflow Execution can represent a single resource.

As all Workflow Executions, a Child Workflow Execution can create a one to one mapping with a resource. For example, a Workflow that manages host upgrades could spawn a Child Workflow Execution per host.

Parent Close Policy

A Parent Close Policy determines what happens to a Child Workflow ExecutionLink preview icon

if its Parent changes to a Closed status (Completed, Failed, or Timed out).

There are three possible values:

  • Abandon: the Child Workflow Execution is not affected.
  • Request Cancel: a Cancellation request is sent to the Child Workflow Execution.
  • Terminate (default): the Child Workflow Execution is forcefully Terminated.

ParentClosePolicy proto definition.

Each Child Workflow Execution may have its own Parent Close Policy. This policy applies only to Child Workflow Executions and has no effect otherwise.

Parent Close Policy entity relationship

Parent Close Policy entity relationship

You can set policies per child, which means you can opt out of propagating terminates / cancels on a per-child basis. This is useful for starting Child Workflows asynchronously (see relevant issue here or the corresponding SDK docs).

Temporal Cron Job

A Temporal Cron Job is the series of Workflow Executions that occur when a Cron Schedule is provided in the call to spawn a Workflow Execution.

Temporal Cron Job timeline

Temporal Cron Job timeline

A Temporal Cron Job is similar to a classic unix cron job. Just as a unix cron job accepts a command and a schedule on which to execute that command, a Cron Schedule can be provided with the call to spawn a Workflow Execution. If a Cron Schedule is provided, the Temporal Server will spawn an execution for the associated Workflow Type per the schedule.

Each Workflow Execution within the series is considered a Run.

  • Each Run receives the same input parameters as the initial Run.
  • Each Run inherits the same Workflow Options as the initial Run.

The Temporal Server spawns the first Workflow Execution in the chain of Runs immediately. However, it calculates and applies a backoff (firstWorkflowTaskBackoff) so that the first Workflow Task of the Workflow Execution does not get placed into a Task Queue until the scheduled time. After each Run Completes, Fails, or reaches the Workflow Run TimeoutLink preview icon

, the same thing happens: the next run will be created immediately with a new firstWorkflowTaskBackoff that is calculated based on the current Server time and the defined Cron Schedule.

The Temporal Server spawns the next Run only after the current Run has Completed, Failed, or has reached the Workflow Run Timeout. This means that, if a Retry Policy has also been provided, and a Run Fails or reaches the Workflow Run Timeout, the Run will first be retried per the Retry Policy until the Run Completes or the Retry Policy has been exhausted. If the next Run, per the Cron Schedule, is due to spawn while the current Run is still Open (including retries), the Server automatically starts the new Run after the current Run completes successfully. The start time for this new Run and the Cron definitions are used to calculate the firstWorkflowTaskBackoff that is applied to the new Run.

A Workflow Execution TimeoutLink preview icon

is used to limit how long a Workflow can be executing (have an Open status), including retries and any usage of Continue As New. The Cron Schedule runs until the Workflow Execution Timeout is reached or you terminate the Workflow.

Temporal Cron Job Run Failure with a Retry Policy

Temporal Cron Job Run Failure with a Retry Policy

Cron Schedules

Cron Schedules are interpreted in UTC time by default.

The Cron Schedule is provided as a string and must follow one of two specifications:

Classic specification

This is what the "classic" specification looks like:

┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of the month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
│ │ │ │ │
* * * * *

For example, 15 8 * * * causes a Workflow Execution to spawn daily at 8:15 AM UTC. Use the crontab guru site to test your cron expressions.

robfig predefined schedules and intervals

You can also pass any of the predefined schedules or intervals described in the robfig/cron documentation.

| Schedules              | Description                                | Equivalent To |
| ---------------------- | ------------------------------------------ | ------------- |
| @yearly (or @annually) | Run once a year, midnight, Jan. 1st | 0 0 1 1 * |
| @monthly | Run once a month, midnight, first of month | 0 0 1 * * |
| @weekly | Run once a week, midnight between Sat/Sun | 0 0 * * 0 |
| @daily (or @midnight) | Run once a day, midnight | 0 0 * * * |
| @hourly | Run once an hour, beginning of hour | 0 * * * * |

For example, "@weekly" causes a Workflow Execution to spawn once a week at midnight between Saturday and Sunday.

Intervals just take a string that can be accepted by time.ParseDuration.

@every <duration>

Time zones

This feature only applies in Temporal 1.15 and up

You can change the time zone that a Cron Schedule is interpreted in by prefixing the specification with CRON_TZ=America/New_York (or your desired time zone from tz). CRON_TZ=America/New_York 15 8 * * * therefore spawns a Workflow Execution every day at 8:15 AM New York time, subject to caveats listed below.

Consider that using time zones in production introduces a surprising amount of complexity and failure modes! If at all possible, we recommend specifying Cron Schedules in UTC (the default).

If you need to use time zones, here are a few edge cases to keep in mind:

  • Beware Daylight Saving Time: If a Temporal Cron Job is scheduled around the time when daylight saving time (DST) begins or ends (for example, 30 2 * * *), it might run zero, one, or two times in a day! The Cron library that we use does not do any special handling of DST transitions. Avoid schedules that include times that fall within DST transition periods.
    • For example, in the US, DST begins at 2 AM. When you "fall back," the clock goes 1:59 … 1:00 … 1:01 … 1:59 … 2:00 … 2:01 AM and any Cron jobs that fall in that 1 AM hour are fired again. The inverse happens when clocks "spring forward" for DST, and Cron jobs that fall in the 2 AM hour are skipped.
    • In other time zones like Chile and Iran, DST "spring forward" is at midnight. 11:59 PM is followed by 1 AM, which means 00:00:00 never happens.
  • Self Hosting note: If you manage your own Temporal Cluster, you are responsible for ensuring that it has access to current tzdata files. The official Docker images are built with tzdata installed (provided by Alpine Linux), but ultimately you should be aware of how tzdata is deployed and updated in your infrastructure.
  • Updating Temporal: If you use the official Docker images, note that an upgrade of the Temporal Cluster may include an update to the tzdata files, which may change the meaning of your Cron Schedule. You should be aware of upcoming changes to the definitions of the time zones you use, particularly around daylight saving time start/end dates.
  • Absolute Time Fixed at Start: The absolute start time of the next Run is computed and stored in the database when the previous Run completes, and is not recomputed. This means that if you have a Cron Schedule that runs very infrequently, and the definition of the time zone changes between one Run and the next, the Run might happen at the wrong time. For example, CRON_TZ=America/Los_Angeles 0 12 11 11 * means "noon in Los Angeles on November 11" (normally not in DST). If at some point the government makes any changes (for example, move the end of DST one week later, or stay on permanent DST year-round), the meaning of that specification changes. In that first year, the Run happens at the wrong time, because it was computed using the older definition.

How to stop a Temporal Cron Job

A Temporal Cron Job does not stop spawning Runs until it has been Terminated or until the Workflow Execution TimeoutLink preview icon

is reached.

A Cancellation Request affects only the current Run.

Use the Workflow Id in any requests to Cancel or Terminate.

Implementation guides:

Schedule

A Schedule contains instructions for starting a Workflow ExecutionLink preview icon

at specific times. Schedules provide a more flexible and user-friendly approach than Temporal Cron JobsLink preview icon.

A Schedule has an identity and is independent of a Workflow Execution. This differs from a Temporal Cron Job, which relies on a cron schedule as a property of the Workflow Execution.

Action

The Action of a Schedule is where the Workflow Execution properties are established, such as Workflow Type, Task Queue, parameters, and timeouts.

Workflow Executions started by a Schedule have the following additional properties:

Spec

The Schedule Spec describes when the Action is taken. There are two kinds of Schedule Spec:

  • A simple interval, like "every 30 minutes" (aligned to start at the Unix epoch, and optionally including a phase offset).
  • A calendar-based expression, similar to the "cron expressions" supported by lots of software, including the older Temporal Cron feature.

These two kinds have multiple representations, depending on the interface or SDK you're using, but they all support the same features.

In tctl, for example, an interval is specified as a string like 45m to mean every 45 minutes, or 6h/5h to mean every 6 hours but at the start of the fifth hour within each period.

In tctl, a calendar expression can be specified as either a traditional cron string with five (or six or seven) positional fields, or as JSON with named fields:

{
"year": "2022",
"month": "Jan,Apr,Jul,Oct",
"dayOfMonth": "1,15",
"hour": "11-14"
}

The following calendar JSON fields are available:

  • year
  • month
  • dayOfMonth
  • dayOfWeek
  • hour
  • minute
  • second
  • comment

Each field can contain a comma-separated list of ranges (or the * wildcard), and each range can include a slash followed by a skip value. The hour, minute, and second fields default to 0 while the others default to *, so you can describe many useful specs with only a few fields.

For month, names of months may be used instead of integers (case-insensitive, abbreviations permitted). For dayOfWeek, day-of-week names may be used.

The comment field is optional and can be used to include a free-form description of the intent of the calendar spec, useful for complicated specs.

No matter which form you supply, calendar and interval specs are converted to canonical representations. What you see when you "describe" or "list" a Schedule might not look exactly like what you entered, but it has the same meaning.

Other Spec features:

Multiple intervals/calendar expressions: A Spec can have combinations of multiple intervals and/or calendar expressions to define a specific Schedule.

Time bounds: Provide an absolute start or end time (or both) with a Spec to ensure that no actions are taken before the start time or after the end time.

Exclusions: A Spec can contain exclusions in the form of zero or more calendar expressions. This can be used to express scheduling like "each Monday at noon except for holidays. You'll have to provide your own set of exclusions and include it in each schedule; there are no pre-defined sets. (This feature isn't currently exposed in tctl or the Temporal Web UI.)

Jitter: If given, a random offset between zero and the maximum jitter is added to each Action time (but bounded by the time until the next scheduled Action).

Time zones: By default, calendar-based expressions are interpreted in UTC. Temporal recommends using UTC to avoid various surprising properties of time zones. If you don't want to use UTC, you can provide the name of a time zone. The time zone definition is loaded on the Temporal Server Worker Service from either disk or the fallback embedded in the binary.

For more operational control, embed the contents of the time zone database file in the Schedule Spec itself. (Note: this isn't currently exposed in tctl or the web UI.)

Pause

A Schedule can be Paused. When a Schedule is Paused, the Spec has no effect. However, you can still force manual actions by using the tctl schedule trigger command.

To assist communication among developers and operators, a “notes” field can be updated on pause or resume to store an explanation for the current state.

Backfill

A Schedule can be Backfilled. When a Schedule is Backfilled, all the Actions that would have been taken over a specified time period are taken now (in parallel if the AllowAll Overlap Policy is used; sequentially if BufferAll is used). You might use this to fill in runs from a time period when the Schedule was paused due to an external condition that's now resolved, or a period before the Schedule was created.

Limit number of Actions

A Schedule can be limited to a certain number of scheduled Actions (that is, not trigger immediately). After that it will act as if it were paused.

Policies

A Schedule supports a set of Policies that enable customizing behavior.

Overlap Policy

The Overlap Policy controls what happens when it is time to start a Workflow Execution but a previously started Workflow Execution is still running. The following options are available:

  • Skip: Default. Nothing happens; the Workflow Execution is not started.
  • BufferOne: Starts the Workflow Execution as soon as the current one completes. The buffer is limited to one. If another Workflow Execution is supposed to start, but one is already in the buffer, only the one in the buffer eventually starts.
  • BufferAll: Allows an unlimited number of Workflows to buffer. They are started sequentially.
  • CancelOther: Cancels the running Workflow Execution, and then starts the new one after the old one completes cancellation.
  • TerminateOther: Terminates the running Workflow Execution and starts the new one immediately.
  • AllowAll Starts any number of concurrent Workflow Executions. With this policy (and only this policy), more than one Workflow Execution, started by the Schedule, can run simultaneously.

Catchup Window

The Temporal Cluster might be down or unavailable at the time when a Schedule should take an Action. When it comes back up, the Catchup Window controls which missed Actions should be taken at that point. The default is one minute, which means that the Schedule attempts to take any Actions that wouldn't be more than one minute late. An outage that lasts longer than the Catchup Window could lead to missed Actions. (But you can always Backfill.)

Pause-on-failure

If this policy is set, a Workflow Execution started by a Schedule that ends with a failure or timeout (but not Cancellation or Termination) causes the Schedule to automatically pause.

Note that with the AllowAll Overlap Policy, this pause might not apply to the next Workflow Execution, because the next Workflow Execution might have started before the failed one finished. It applies only to Workflow Executions that were scheduled to start after the failed one finished.

Last completion result

A Workflow started by a Schedule can obtain the completion result from the most recent successful run. (How you do this depends on the SDK you're using.)

For overlap policies that don't allow overlap, “the most recent successful run” is straightforward to define. For the AllowAll policy, it refers to the run that completed most recently, at the time that the run in question is started. Consider the following overlapping runs:

time -------------------------------------------->
A |----------------------|
B |-------|
C |---------------|
D |--------------T

If D asks for the last completion result at time T, it gets the result of A. Not B, even though B started more recently, because A completed later. And not C, even though C completed after A, because the result for D is captured when D is started, not when it's queried.

Failures and timeouts do not affect the last completion result.

Last failure

A Workflow started by a Schedule can obtain the details of the failure of the most recent run that ended at the time when the Workflow in question was started. Unlike last completion result, a successful run does reset the last failure.

Limitations

Experimental

The Scheduled Workflows feature is available in Temporal Server version 1.18.

Internally, a Schedule is implemented as a Workflow. If you're using Advanced Visibility (Elasticsearch), these Workflow Executions are hidden from normal views. If you're using Standard Visibility, they are visible, though there's no need to interact with them directly.

Native support for Schedules in language SDKs is coming soon. For now, tctl and the web UI are the main interfaces to Schedules. For advanced use, you can also use the gRPC API by getting a WorkflowServiceClient object from the SDK and calling methods such as CreateSchedule.


  1. The Cluster usually deduplicates Signals, but does not guarantee deduplication: During shard migration, two Signal Events (and therefore two deliveries to the Workflow Execution) can be recorded for a single Signal because the deduping info is stored only in memory.