
Application development - Observability

The observability section of the Temporal Application development guide covers the ways to view the current state of your Temporal Application: which Workflow Executions the Temporal Platform is tracking, and the state of any given Workflow Execution, either now or at earlier points in its execution.

WORK IN PROGRESS

This guide is a work in progress. Some sections may be incomplete or missing for some languages. Information may change at any time.

If you can't find what you are looking for in the Application development guide, it could be in older docs for SDKs.

This section covers features related to viewing the state of the application, including Metrics, Tracing, Logging, Visibility, and Replays.

Metrics

Each Temporal SDK is capable of emitting an optional set of metrics from either the Client or the Worker process. For a complete list of metrics capable of being emitted, see the SDK metrics reference.

Metrics can be scraped and stored in time series databases such as Prometheus.

Temporal also provides a dashboard you can integrate with graphing services like Grafana.

To emit metrics from the Temporal Client in Go, set a metrics handler in the Client Options and specify the listen address that Prometheus will scrape.

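// newPrometheusScope is a helper function, not part of the SDK; see the Go sample for metrics.
clientOptions := client.Options{
	MetricsHandler: sdktally.NewMetricsHandler(newPrometheusScope(prometheus.Configuration{
		ListenAddress: "0.0.0.0:9090",
		TimerType:     "histogram",
	})),
}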

The Go SDK currently supports the Tally library; however, Tally offers extensible custom metrics reporting, which is exposed through the WithCustomMetricsReporter API.

For more information, see the Go sample for metrics.

Tracing

Tracing allows you to view the call graph of a Workflow along with its Activities and any child Workflows.

Temporal Web's tracing capabilities mainly track Activity Executions within a Temporal context. If you need custom tracing specific to your use case, use context propagation to add your own tracing logic.

For information about Workflow tracing, see Tracing Temporal Workflows with DataDog.

For information about how to configure exporters and instrument your code, see Tracing Temporal Services with OTEL.

The Go SDK provides support for distributed tracing through OpenTracing.

Tracing can be configured by providing an opentracing.Tracer implementation in ClientOptions during client instantiation.

For more details on how to configure and leverage tracing, see the OpenTracing documentation.

The OpenTracing support has been validated using Jaeger, but other implementations mentioned here should also work.

Tracing functionality utilizes generic context propagation provided by the Client.
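To make the idea of context propagation more concrete, here is a minimal sketch that uses only the standard library. It is not the SDK's propagator API: the names `traceKey`, `inject`, `extract`, and `roundTrip`, and the `trace-id` header key, are all hypothetical. It shows only the general pattern of copying a value out of a context into string headers on one side and restoring it on the other.

```go
package main

import (
	"context"
	"fmt"
)

// traceKey is a hypothetical private context key for the trace ID.
type traceKey struct{}

// inject copies the trace ID from ctx into a string-keyed header carrier.
func inject(ctx context.Context, headers map[string]string) {
	if id, ok := ctx.Value(traceKey{}).(string); ok {
		headers["trace-id"] = id
	}
}

// extract returns a context carrying the trace ID found in headers, if any.
func extract(ctx context.Context, headers map[string]string) context.Context {
	if id, ok := headers["trace-id"]; ok {
		return context.WithValue(ctx, traceKey{}, id)
	}
	return ctx
}

// roundTrip sends a trace ID through inject and extract and returns
// whatever arrives on the receiving side.
func roundTrip(id string) string {
	headers := map[string]string{}
	inject(context.WithValue(context.Background(), traceKey{}, id), headers)
	got, _ := extract(context.Background(), headers).Value(traceKey{}).(string)
	return got
}

func main() {
	fmt.Println(roundTrip("abc-123"))
}
```

A real propagator would be registered with the Client and would read and write Temporal headers rather than a plain map.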

Logging

Send logs and errors to a logging service, so that when things go wrong, you can see what happened.

The SDK core uses WARN for its default logging level.

Custom logging

Use a custom logger for logging.

The Logger field of client.Options sets a custom Logger that is used for all logging actions of that instance of the Temporal Client.

Although the Go SDK does not support most third-party logging solutions natively, our friends at Banzai Cloud built the adapter package logur, which makes it possible to use third-party loggers with minimal overhead. Most of the popular logging solutions have existing adapters in Logur; you can find the full list in the Logur GitHub project.

Here is an example of using Logur to support Logrus:

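package main

import (
	"log"

	"go.temporal.io/sdk/client"

	"github.com/sirupsen/logrus"
	logrusadapter "logur.dev/adapter/logrus"
	"logur.dev/logur"
)

func main() {
	// ...
	// Wrap a Logrus logger so that it accepts key-value pairs.
	logger := logur.LoggerToKV(logrusadapter.New(logrus.New()))
	clientOptions := client.Options{
		Logger: logger,
	}
	temporalClient, err := client.Dial(clientOptions)
	if err != nil {
		log.Fatalln("Unable to create Temporal Client", err)
	}
	defer temporalClient.Close()
	// ...
}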
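If no adapter exists for your logger, a small hand-written wrapper may be enough. The sketch below assumes the SDK's logger interface consists of four methods that each take a message followed by alternating key-value pairs. The interface is redeclared locally so that the example compiles with only the standard library. `stdLogger` and `renderInfo` are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"strings"
)

// Logger mirrors the assumed shape of the SDK's logger interface.
// It is redeclared here only to keep the sketch self-contained.
type Logger interface {
	Debug(msg string, keyvals ...interface{})
	Info(msg string, keyvals ...interface{})
	Warn(msg string, keyvals ...interface{})
	Error(msg string, keyvals ...interface{})
}

// stdLogger is a hypothetical adapter over the standard library logger.
type stdLogger struct {
	out *log.Logger
}

// write renders the level, the message, and each key=value pair on one line.
// An unpaired trailing key is ignored in this sketch.
func (l stdLogger) write(level, msg string, keyvals []interface{}) {
	var b strings.Builder
	b.WriteString(level + " " + msg)
	for i := 0; i+1 < len(keyvals); i += 2 {
		fmt.Fprintf(&b, " %v=%v", keyvals[i], keyvals[i+1])
	}
	l.out.Print(b.String())
}

func (l stdLogger) Debug(msg string, kv ...interface{}) { l.write("DEBUG", msg, kv) }
func (l stdLogger) Info(msg string, kv ...interface{})  { l.write("INFO", msg, kv) }
func (l stdLogger) Warn(msg string, kv ...interface{})  { l.write("WARN", msg, kv) }
func (l stdLogger) Error(msg string, kv ...interface{}) { l.write("ERROR", msg, kv) }

// renderInfo logs one Info line into a buffer and returns it, for demonstration.
func renderInfo(msg string, keyvals ...interface{}) string {
	var buf bytes.Buffer
	var logger Logger = stdLogger{out: log.New(&buf, "", 0)}
	logger.Info(msg, keyvals...)
	return strings.TrimSpace(buf.String())
}

func main() {
	fmt.Println(renderInfo("workflow started", "name", "alice"))
}
```

A value of this kind could then be assigned to the Logger field of client.Options in place of the Logur adapter.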

Log from a Workflow

In Workflow Definitions you can use workflow.GetLogger(ctx) to write logs.

import (
	"context"
	"time"

	"go.temporal.io/sdk/activity"
	"go.temporal.io/sdk/workflow"
)

// Workflow is a standard Workflow Definition.
// Note that the Workflow and Activity don't need to care that
// their inputs/results are being compressed.
func Workflow(ctx workflow.Context, name string) (string, error) {
	// ...

	// WithActivityOptions returns a new context, so keep the result.
	ctx = workflow.WithActivityOptions(ctx, ao)

	// Get the logger from the Workflow context.
	logger := workflow.GetLogger(ctx)
	// Log a message with the key-value pair "name" and the value of name.
	logger.Info("Compressed Payloads workflow started", "name", name)

	info := map[string]string{
		"name": name,
	}

	// ... (elided in the source; result is assumed to be set here, using info)

	logger.Info("Compressed Payloads workflow completed.", "result", result)

	return result, nil
}

Visibility

The term Visibility, within the Temporal Platform, refers to the subsystems and APIs that enable an operator to view Workflow Executions that currently exist within a Cluster.

The typical method of retrieving a Workflow Execution is by its Workflow Id.

However, sometimes you'll want to retrieve one or more Workflow Executions based on another property. For example, imagine you want to get all Workflow Executions of a certain type that have failed within a time range, so that you can start new ones with the same arguments.

You can do this with Search Attributes.

The steps to using custom Search Attributes are:

  • Create a new Search Attribute in your Cluster using tctl or the Cloud UI.
  • Set the value of the Search Attribute for a Workflow Execution:
    • On the Client by including it as an option when starting the Execution.
    • In the Workflow by calling UpsertSearchAttributes.
  • Read the value of the Search Attribute:
    • On the Client by calling DescribeWorkflow.
    • In the Workflow by looking at WorkflowInfo.
  • Query Workflow Executions by the Search Attribute using a List Filter:
    • In tctl.
    • In code by calling ListWorkflowExecutions.

Here is how to query Workflow Executions:
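As a rough illustration of the List Filter step, the sketch below builds a filter string for the scenario described earlier: failed Workflow Executions of one type, closed within a time range. It assumes the default Search Attributes `WorkflowType`, `ExecutionStatus`, and `CloseTime` are available in your Cluster. `failedInRangeFilter` and `exampleFilter` are hypothetical helpers, not SDK functions.

```go
package main

import (
	"fmt"
	"time"
)

// failedInRangeFilter builds a List Filter that matches failed Workflow
// Executions of one Workflow Type that closed between from and to.
// The workflowType value is not escaped, so do not pass untrusted input.
func failedInRangeFilter(workflowType string, from, to time.Time) string {
	return fmt.Sprintf(
		"WorkflowType = '%s' AND ExecutionStatus = 'Failed' AND CloseTime BETWEEN '%s' AND '%s'",
		workflowType,
		from.UTC().Format(time.RFC3339),
		to.UTC().Format(time.RFC3339),
	)
}

// exampleFilter applies the helper to a fixed one-day window.
func exampleFilter() string {
	from := time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)
	to := time.Date(2023, 1, 2, 0, 0, 0, 0, time.UTC)
	return failedInRangeFilter("YourWorkflow", from, to)
}

func main() {
	fmt.Println(exampleFilter())
}
```

The resulting string is the kind of query you would pass to tctl or to a ListWorkflowExecutions call.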

Set custom search attributes

After you've created custom Search Attributes in your Cluster (using tctl or the Cloud UI), you can set the values of the custom Search Attributes when starting a Workflow.

Provide key-value pairs in StartWorkflowOptions.SearchAttributes.

Search Attributes are represented as map[string]interface{}. The values in the map must correspond to the Search Attribute's value type:

  • Bool = bool
  • Datetime = time.Time
  • Double = float64
  • Int = int64
  • Keyword = string
  • Text = string
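The mapping above can be expressed as a small check. This is only a sketch of the table, not SDK behavior: `matchesType` is a hypothetical helper, and the SDK itself may be more lenient than this strict comparison (for example, with a plain `int`).

```go
package main

import (
	"fmt"
	"time"
)

// matchesType reports whether v has the Go type listed for attrType
// in the table above. Unknown attribute types never match.
func matchesType(attrType string, v interface{}) bool {
	switch attrType {
	case "Bool":
		_, ok := v.(bool)
		return ok
	case "Datetime":
		_, ok := v.(time.Time)
		return ok
	case "Double":
		_, ok := v.(float64)
		return ok
	case "Int":
		_, ok := v.(int64)
		return ok
	case "Keyword", "Text":
		_, ok := v.(string)
		return ok
	default:
		return false
	}
}

func main() {
	fmt.Println(matchesType("Int", int64(42))) // true
	fmt.Println(matchesType("Int", 42))        // false: an untyped 42 is an int, not an int64
	fmt.Println(matchesType("Keyword", "abc")) // true
}
```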

If you had custom Search Attributes CustomerId of type Keyword and MiscData of type Text, you would provide string values:

func (c *Client) CallYourWorkflow(ctx context.Context, workflowID string, payload map[string]interface{}) error {
	// ...
	searchAttributes := map[string]interface{}{
		"CustomerId": payload["customer"],
		"MiscData":   payload["miscData"],
	}
	options := client.StartWorkflowOptions{
		SearchAttributes: searchAttributes,
		// ...
	}
	we, err := c.Client.ExecuteWorkflow(ctx, options, app.YourWorkflow, payload)
	// ...
}

Upsert custom search attributes

In advanced cases, you may want to update Search Attributes dynamically as the Workflow progresses. Use UpsertSearchAttributes to add or update Search Attributes from within Workflow code.

UpsertSearchAttributes merges the given attributes into the Workflow's existing map. Consider this example Workflow code:

func YourWorkflow(ctx workflow.Context, input string) error {
	attr1 := map[string]interface{}{
		"CustomIntField":  1,
		"CustomBoolField": true,
	}
	if err := workflow.UpsertSearchAttributes(ctx, attr1); err != nil {
		return err
	}

	attr2 := map[string]interface{}{
		"CustomIntField":     2,
		"CustomKeywordField": "seattle",
	}
	if err := workflow.UpsertSearchAttributes(ctx, attr2); err != nil {
		return err
	}
	return nil
}

After the second call to UpsertSearchAttributes, the map will contain:

map[string]interface{}{
"CustomIntField": 2, // last update wins
"CustomBoolField": true,
"CustomKeywordField": "seattle",
}
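The merge behavior described above can be modeled with plain maps. This is a sketch of the semantics only, using the standard library. `upsert` and `demoAttrs` are hypothetical helpers and do not call the SDK.

```go
package main

import "fmt"

// upsert merges updates into existing in place.
// When a key is present in both maps, the update wins.
func upsert(existing, updates map[string]interface{}) {
	for k, v := range updates {
		existing[k] = v
	}
}

// demoAttrs replays the two upserts from the example Workflow.
func demoAttrs() map[string]interface{} {
	attrs := map[string]interface{}{}
	upsert(attrs, map[string]interface{}{"CustomIntField": 1, "CustomBoolField": true})
	upsert(attrs, map[string]interface{}{"CustomIntField": 2, "CustomKeywordField": "seattle"})
	return attrs
}

func main() {
	fmt.Println(demoAttrs())
	// Prints: map[CustomBoolField:true CustomIntField:2 CustomKeywordField:seattle]
}
```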

Remove search attributes

To remove a Search Attribute that was previously set, set it to an empty array: [].

Where removing a field is not supported, you can achieve a similar effect by setting the field to a placeholder value. For example, you could set CustomKeywordField to impossibleVal. Searching for CustomKeywordField != 'impossibleVal' then matches Workflows whose CustomKeywordField is not equal to impossibleVal, which includes Workflows that never set CustomKeywordField.

Replays

A replay recreates the exact state the Workflow code was in by re-executing the code against its Event History, starting from the beginning of that history.

Replays allow code to resume only if it is compatible from a deterministic point of view.
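The compatibility requirement can be pictured with a toy model. The sketch below is not the SDK's replayer: `replay`, `workflowV1`, `workflowV2`, and the command strings are hypothetical. It treats a history as an ordered list of recorded commands and checks that re-running the code produces the same commands in the same order. The code may produce additional commands beyond the recorded ones, which corresponds to resuming.

```go
package main

import "fmt"

// replay re-runs workflowFn and compares the commands it produces with the
// recorded history, step by step. Extra commands after the end of the
// history are allowed, because the Workflow may continue past that point.
func replay(history []string, workflowFn func() []string) error {
	got := workflowFn()
	for i, want := range history {
		if i >= len(got) {
			return fmt.Errorf("step %d: history recorded %q but the code produced nothing", i, want)
		}
		if got[i] != want {
			return fmt.Errorf("step %d: history recorded %q but the code produced %q", i, want, got[i])
		}
	}
	return nil
}

// workflowV1 is the version of the code that produced the history.
func workflowV1() []string {
	return []string{"ScheduleActivity:Charge", "StartTimer:1h"}
}

// workflowV2 reorders the steps, which makes it incompatible with the history.
func workflowV2() []string {
	return []string{"StartTimer:1h", "ScheduleActivity:Charge"}
}

func main() {
	history := []string{"ScheduleActivity:Charge"}
	fmt.Println(replay(history, workflowV1)) // <nil>
	fmt.Println(replay(history, workflowV2)) // reports a mismatch at step 0
}
```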

Use the worker.WorkflowReplayer to replay an existing Workflow Execution from its Event History and reproduce errors.

For example, the following code retrieves the Event History of a Workflow:

import (
	"context"

	"go.temporal.io/api/enums/v1"
	"go.temporal.io/api/history/v1"
	"go.temporal.io/sdk/client"
)

func GetWorkflowHistory(ctx context.Context, client client.Client, id, runID string) (*history.History, error) {
	var hist history.History
	iter := client.GetWorkflowHistory(ctx, id, runID, false, enums.HISTORY_EVENT_FILTER_TYPE_ALL_EVENT)
	for iter.HasNext() {
		event, err := iter.Next()
		if err != nil {
			return nil, err
		}
		hist.Events = append(hist.Events, event)
	}
	return &hist, nil
}

This history can then be used for a replay. For example, the following code creates a WorkflowReplayer and registers the YourWorkflow Workflow function. It then calls ReplayWorkflowHistory to replay the Event History and returns the resulting error, if any.

import (
	"context"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func ReplayWorkflow(ctx context.Context, client client.Client, id, runID string) error {
	hist, err := GetWorkflowHistory(ctx, client, id, runID)
	if err != nil {
		return err
	}
	replayer := worker.NewWorkflowReplayer()
	replayer.RegisterWorkflow(YourWorkflow)
	return replayer.ReplayWorkflowHistory(nil, hist)
}

The code above re-executes the Workflow's Workflow Function using the original Event History. If the code follows a noticeably different path or deadlocks, the returned error reports it. Replaying a Workflow Execution locally is a good way to see exactly which code path was taken for a given input and set of events.