TL;DR — Quick Summary

Master Temporal for durable workflow orchestration in microservices. Covers architecture, installation, SDKs, saga pattern, and order processing example.

[Architecture diagram] Temporal — durable workflow orchestration. An SDK Client starts, signals, and queries workflows through the Temporal Server's Frontend (gRPC/HTTP), which fans out to History (event sourcing), Matching (task queues), and Persistence (MySQL/Postgres, with namespace isolation). Worker Processes — each running a Workflow Executor and an Activity Executor — scale independently and call out to External Services (APIs, DBs, queues). The Temporal UI at localhost:8080 shows event history. Workers poll task queues — the Temporal Server never pushes to Workers.

Distributed systems fail in partial ways — a payment service times out, a shipment record writes but confirmation never arrives, and now your data is inconsistent across three databases with no way back. Temporal solves this class of problem by making workflows durable by default, using event sourcing to survive crashes, replays to recover state, and structured retry policies to handle transient failures. This guide covers the full Temporal stack: server architecture, installation options, workflow and activity primitives across Go, TypeScript, and Python, advanced patterns including saga and human-in-the-loop, and a complete order processing example.

Prerequisites

  • Docker and Docker Compose installed (for local Temporal Server)
  • Go 1.22+, Node.js 22+, or Python 3.11+ depending on your chosen SDK
  • Basic familiarity with microservices and distributed systems concepts
  • Understanding of async programming patterns (promises, goroutines, or asyncio)

Temporal Architecture

Temporal Server is composed of four internal services that can run as a single binary or independently for scale:

Frontend — the externally-facing gRPC and HTTP gateway. Clients and Workers connect here to start workflows, send signals, run queries, and poll task queues.

History — the core of Temporal. Persists every workflow event to the database and drives workflow execution by replaying event history. Each workflow execution is managed by a single History shard, providing strong ordering guarantees.

Matching — manages task queues. When the History service needs a workflow or activity task executed, it pushes the task to Matching, which holds it until a Worker polls. This pull model means Workers are never overwhelmed.

Internal Worker — runs Temporal’s own system workflows for namespace management, archival, and replication. Not user-facing.

Workers are your application processes — they contain your workflow and activity code. Workers poll named task queues from the Temporal Server, execute the work locally, and return results. Workers are stateless and horizontally scalable. Your workflow code runs inside the Worker, not inside Temporal Server.
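The pull model is easy to internalize with a plain-Go sketch. This is not the Temporal API — just a hypothetical channel-based queue showing why polling Workers can never be overwhelmed: each Worker takes a task only when it is ready for one.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// drainQueue sketches Temporal's pull model: tasks sit in a queue and
// workers receive them only when they poll — nothing is pushed to them.
func drainQueue(taskIDs []int, workers int) int {
	queue := make(chan int, len(taskIDs)) // Matching buffers tasks until polled
	for _, id := range taskIDs {
		queue <- id
	}
	close(queue)

	var processed int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range queue { // pull: a worker takes a task only when ready
				atomic.AddInt64(&processed, 1)
			}
		}()
	}
	wg.Wait()
	return int(processed)
}

func main() {
	// Two workers drain six tasks at their own pace.
	fmt.Println(drainQueue([]int{1, 2, 3, 4, 5, 6}, 2)) // 6
}
```

Adding Workers is just adding more consumers on the same queue — exactly why Temporal Workers scale horizontally without server-side coordination.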

Event Sourcing and Replay: Every workflow maintains a complete ordered event history in the database. If a Worker crashes mid-workflow, a new Worker picks up the task, replays the event history to reconstruct the exact in-memory state (all variables, timers, awaited values), and continues execution from the last durable checkpoint. This is what makes Temporal durable without requiring workflow state to live in a database you manage.
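The replay idea, stripped of the SDK, is just deterministic state reconstruction from an ordered log. A minimal sketch (hypothetical event type and names, not Temporal's actual history schema):

```go
package main

import "fmt"

// event is a simplified stand-in for a Temporal history event (illustrative).
type event struct {
	kind  string // e.g. "ActivityCompleted:ValidatePayment"
	value string
}

// replay rebuilds workflow state by re-applying the recorded history in
// order — the same idea a replacement Worker uses after a crash.
func replay(history []event) map[string]string {
	state := map[string]string{}
	for _, e := range history {
		state[e.kind] = e.value // each event deterministically updates state
	}
	return state
}

func main() {
	history := []event{
		{"ActivityCompleted:ValidatePayment", "ok"},
		{"SignalReceived:shipping-update", "TRACK-42"},
	}
	// A crashed Worker's replacement replays the same history and arrives
	// at an identical in-memory state.
	s1 := replay(history)
	s2 := replay(history)
	fmt.Println(s1["SignalReceived:shipping-update"] == s2["SignalReceived:shipping-update"]) // true
}
```

This is also why workflow code must be deterministic: replay only reconstructs the same state if re-executing the code against the same history takes the same branches.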

Installation

Docker Compose (local development)

git clone https://github.com/temporalio/docker-compose.git
cd docker-compose
docker compose up

This starts:

  • temporal — Temporal Server on port 7233 (gRPC)
  • temporal-ui — Web UI on port 8080
  • temporal-admin-tools — container with tctl CLI pre-installed
  • postgresql — persistence backend

# Access tctl via the admin-tools container
docker exec -it temporal-admin-tools tctl namespace register --retention 7 default

# List running workflows
docker exec -it temporal-admin-tools tctl workflow list

Kubernetes with Helm

helm repo add temporal https://go.temporal.io/helm-charts
helm repo update

helm install temporal temporal/temporal \
  --set server.replicaCount=3 \
  --set cassandra.config.cluster_size=3 \
  --set elasticsearch.enabled=true \
  --namespace temporal \
  --create-namespace

Temporal Cloud (managed)

Temporal Cloud eliminates operational overhead. You get a namespace endpoint, mTLS certificates, and usage-based billing. Connect via the SDK with your endpoint and certificates:

# Set in your Worker configuration
TEMPORAL_ADDRESS=<namespace>.tmprl.cloud:7233
TEMPORAL_TLS_CERT=path/to/client.pem
TEMPORAL_TLS_KEY=path/to/client.key

Core Concepts

Workflow — A deterministic function that orchestrates activities, timers, signals, and child workflows. Must be deterministic: no random numbers, no direct system calls, no accessing mutable global state. Temporal guarantees workflows survive any failure.

Activity — A function that performs non-deterministic side effects: HTTP calls, database writes, file I/O, sending emails. Activities run in Workers, have configurable retry policies, and report heartbeats for long-running operations.

Signal — An external event sent to a running workflow. Signals allow external systems to push data into a running workflow (e.g., “payment approved”, “user cancelled order”).

Query — A synchronous read of a workflow’s current state without affecting execution. Queries are answered by the Worker executing the workflow.

Task Queue — A named channel through which Temporal Server dispatches work to Workers. Workers register to poll specific task queues. This decouples which Workers handle which work.

Namespace — An isolation boundary for workflows. Each namespace has independent retention settings, security policies, and search attribute schemas.

Writing Workflows in Go

package order

import (
    "context"
    "time"

    "go.temporal.io/sdk/activity"
    "go.temporal.io/sdk/temporal"
    "go.temporal.io/sdk/workflow"
)

// Activity retry policy — applied per activity call
var defaultRetryPolicy = &temporal.RetryPolicy{
    InitialInterval:    time.Second,
    BackoffCoefficient: 2.0,
    MaximumInterval:    30 * time.Second,
    MaximumAttempts:    5,
    NonRetryableErrorTypes: []string{"InvalidOrderError"},
}

// Workflow function — must be deterministic
func OrderWorkflow(ctx workflow.Context, order OrderInput) (OrderResult, error) {
    ao := workflow.ActivityOptions{
        StartToCloseTimeout: 30 * time.Second,
        RetryPolicy:         defaultRetryPolicy,
    }
    ctx = workflow.WithActivityOptions(ctx, ao)

    // Execute activities sequentially
    var paymentResult PaymentResult
    err := workflow.ExecuteActivity(ctx, ValidatePayment, order).Get(ctx, &paymentResult)
    if err != nil {
        return OrderResult{}, err
    }

    var inventoryResult InventoryResult
    err = workflow.ExecuteActivity(ctx, ReserveInventory, order).Get(ctx, &inventoryResult)
    if err != nil {
        // Compensate: refund payment
        workflow.ExecuteActivity(ctx, RefundPayment, paymentResult).Get(ctx, nil)
        return OrderResult{}, err
    }

    // Wait for shipping signal with timeout
    signalChan := workflow.GetSignalChannel(ctx, "shipping-update")
    var shippingInfo ShippingInfo
    selector := workflow.NewSelector(ctx)
    selector.AddReceive(signalChan, func(c workflow.ReceiveChannel, _ bool) {
        c.Receive(ctx, &shippingInfo)
    })
    // Timer: wait up to 24 hours for shipping confirmation
    timerFuture := workflow.NewTimer(ctx, 24*time.Hour)
    selector.AddFuture(timerFuture, func(f workflow.Future) {
        shippingInfo.Status = "timeout"
    })
    selector.Select(ctx)

    // Sleep is durable — survives Worker restarts
    workflow.Sleep(ctx, 7*24*time.Hour) // Wait 7 days

    workflow.ExecuteActivity(ctx, SendDeliveryConfirmation, order, shippingInfo).Get(ctx, nil)

    return OrderResult{OrderID: order.ID, Status: "completed"}, nil
}

// Activity function — performs the actual work
func ValidatePayment(ctx context.Context, order OrderInput) (PaymentResult, error) {
    // Heartbeat for long-running activities
    activity.RecordHeartbeat(ctx, "validating payment")
    // Call payment provider API — non-deterministic, OK in activity
    return callPaymentAPI(order)
}

Workflow Versioning with GetVersion

When deploying changes to a live workflow, use workflow.GetVersion to safely branch behavior:

func OrderWorkflow(ctx workflow.Context, order OrderInput) (OrderResult, error) {
    // Returns DefaultVersion (−1) for existing executions, 1 for new ones
    v := workflow.GetVersion(ctx, "add-fraud-check", workflow.DefaultVersion, 1)

    if v >= 1 {
        // New code path for new executions
        workflow.ExecuteActivity(ctx, FraudCheck, order).Get(ctx, nil)
    }
    // ... rest of workflow
}

Writing Workflows in TypeScript

import { proxyActivities, sleep, setHandler, defineSignal, defineQuery,
         condition } from '@temporalio/workflow';
import type { Activities } from './activities';

const { validatePayment, reserveInventory, refundPayment,
        sendDeliveryConfirmation } = proxyActivities<Activities>({
  startToCloseTimeout: '30 seconds',
  retry: {
    initialInterval: '1s',
    backoffCoefficient: 2,
    maximumInterval: '30s',
    maximumAttempts: 5,
    nonRetryableErrorTypes: ['InvalidOrderError'],
  },
});

const shippingSignal = defineSignal<[ShippingInfo]>('shipping-update');
const orderStatusQuery = defineQuery<string>('order-status');

export async function orderWorkflow(order: OrderInput): Promise<OrderResult> {
  let currentStatus = 'processing';

  // Register signal handler
  setHandler(shippingSignal, (info: ShippingInfo) => {
    currentStatus = `shipped:${info.trackingId}`;
  });

  // Register query handler — reads state synchronously
  setHandler(orderStatusQuery, () => currentStatus);

  const payment = await validatePayment(order);

  let inventory;
  try {
    inventory = await reserveInventory(order);
  } catch (err) {
    await refundPayment(payment);
    throw err;
  }

  // Wait for signal OR timeout using condition
  const signalReceived = await condition(
    () => currentStatus.startsWith('shipped:'),
    '24 hours'
  );

  if (!signalReceived) {
    currentStatus = 'shipping-timeout';
  }

  await sleep('7 days'); // Durable sleep — survives crashes

  await sendDeliveryConfirmation(order, currentStatus);
  return { orderId: order.id, status: 'completed' };
}

Cancelling Activities with CancellationScope

import { CancellationScope, isCancellation } from '@temporalio/workflow';

export async function processWithTimeout(input: Input): Promise<void> {
  try {
    await CancellationScope.withTimeout('5 minutes', async () => {
      await longRunningActivity(input);
    });
  } catch (err) {
    if (isCancellation(err)) {
      await compensateActivity(input);
    }
    throw err;
  }
}

Writing Workflows in Python

from datetime import timedelta
from temporalio import workflow, activity
from temporalio.common import RetryPolicy

@workflow.defn
class OrderWorkflow:
    def __init__(self) -> None:
        self._status = "processing"
        self._shipping_info: ShippingInfo | None = None

    @workflow.run
    async def run(self, order: OrderInput) -> OrderResult:
        retry_policy = RetryPolicy(
            initial_interval=timedelta(seconds=1),
            backoff_coefficient=2.0,
            maximum_interval=timedelta(seconds=30),
            maximum_attempts=5,
            non_retryable_error_types=["InvalidOrderError"],
        )

        payment = await workflow.execute_activity(
            validate_payment,
            order,
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=retry_policy,
        )

        try:
            inventory = await workflow.execute_activity(
                reserve_inventory,
                order,
                start_to_close_timeout=timedelta(seconds=30),
                retry_policy=retry_policy,
            )
        except Exception:
            await workflow.execute_activity(refund_payment, payment,
                start_to_close_timeout=timedelta(seconds=30))
            raise

        # Wait for signal with timeout — unlike the TypeScript condition(),
        # wait_condition raises TimeoutError if the deadline passes
        try:
            await workflow.wait_condition(
                lambda: self._shipping_info is not None,
                timeout=timedelta(hours=24),
            )
            self._status = "shipped"
        except TimeoutError:
            self._status = "shipping-timeout"

        await workflow.sleep(timedelta(days=7))
        await workflow.execute_activity(send_delivery_confirmation, order,
            start_to_close_timeout=timedelta(seconds=30))

        return OrderResult(order_id=order.id, status="completed")

    @workflow.signal
    def shipping_update(self, info: ShippingInfo) -> None:
        self._shipping_info = info

    @workflow.query
    def order_status(self) -> str:
        return self._status


@activity.defn
async def validate_payment(order: OrderInput) -> PaymentResult:
    activity.heartbeat("validating payment with provider")
    return await call_payment_api(order)

Activity Patterns

Heartbeating for Long Activities

Long-running activities should heartbeat to tell Temporal they are still alive. If a Worker crashes, the heartbeat timeout expires and Temporal reschedules the activity on another Worker (this only takes effect when a heartbeat timeout is configured in the activity options):

func ProcessLargeFile(ctx context.Context, fileURL string) error {
    chunks := fetchChunkList(fileURL) // helper elided for brevity
    for i, chunk := range chunks {
        // Heartbeat with progress — also provides cancellation detection
        activity.RecordHeartbeat(ctx, fmt.Sprintf("chunk %d/%d", i+1, len(chunks)))

        // Check whether the activity was cancelled
        if ctx.Err() != nil {
            return ctx.Err()
        }
        processChunk(chunk)
    }
    return nil
}

Local Activities

Local Activities run in the same Worker process as the Workflow, without round-tripping to Temporal Server. Use them for fast, low-latency operations (under a second) that still need retries:

lao := workflow.LocalActivityOptions{
    StartToCloseTimeout: 5 * time.Second,
}
ctx = workflow.WithLocalActivityOptions(ctx, lao)

var formattedID string
workflow.ExecuteLocalActivity(ctx, FormatOrderID, order).Get(ctx, &formattedID)

Workflow Patterns

Saga Pattern for Distributed Transactions

The Saga pattern models distributed transactions as a sequence of activities with compensating actions:

func OrderSagaWorkflow(ctx workflow.Context, order OrderInput) error {
    var compensations []func(workflow.Context) error

    ao := workflow.ActivityOptions{StartToCloseTimeout: 30 * time.Second}
    ctx = workflow.WithActivityOptions(ctx, ao)

    // Step 1: charge payment
    var payment PaymentResult
    if err := workflow.ExecuteActivity(ctx, ChargePayment, order).Get(ctx, &payment); err != nil {
        return err
    }
    compensations = append(compensations, func(ctx workflow.Context) error {
        return workflow.ExecuteActivity(ctx, RefundPayment, payment).Get(ctx, nil)
    })

    // Step 2: reserve inventory
    var reservation InventoryReservation
    if err := workflow.ExecuteActivity(ctx, ReserveInventory, order).Get(ctx, &reservation); err != nil {
        // Run compensations in reverse
        for i := len(compensations) - 1; i >= 0; i-- {
            compensations[i](ctx) // Best-effort compensation
        }
        return err
    }
    compensations = append(compensations, func(ctx workflow.Context) error {
        return workflow.ExecuteActivity(ctx, ReleaseInventory, reservation).Get(ctx, nil)
    })

    // Step 3: create shipment
    if err := workflow.ExecuteActivity(ctx, CreateShipment, order, reservation).Get(ctx, nil); err != nil {
        for i := len(compensations) - 1; i >= 0; i-- {
            compensations[i](ctx)
        }
        return err
    }

    return nil
}
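Stripped of the SDK, the compensation bookkeeping above is just a LIFO stack that unwinds on the first failure. A plain-Go sketch with hypothetical step names makes the reverse ordering explicit:

```go
package main

import "fmt"

// runSaga executes steps in order; on the first failure it runs the
// accumulated compensations in reverse (LIFO), mirroring the workflow above.
func runSaga(steps []string, failAt string) (done, compensated []string) {
	var compensations []string
	for _, step := range steps {
		if step == failAt {
			for i := len(compensations) - 1; i >= 0; i-- {
				compensated = append(compensated, compensations[i])
			}
			return done, compensated
		}
		done = append(done, step)
		compensations = append(compensations, "undo-"+step)
	}
	return done, nil
}

func main() {
	done, undone := runSaga([]string{"charge", "reserve", "ship"}, "ship")
	fmt.Println(done)   // [charge reserve]
	fmt.Println(undone) // [undo-reserve undo-charge]
}
```

Reverse order matters: releasing inventory before refunding payment keeps each compensation operating on a state its forward step actually produced.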

Scheduled Workflows with CronSchedule

// Start a workflow on a cron schedule
c.ExecuteWorkflow(ctx,
    client.StartWorkflowOptions{
        ID:           "daily-report",
        TaskQueue:    "reporting",
        CronSchedule: "0 9 * * MON-FRI", // Weekdays at 9am UTC
    },
    DailyReportWorkflow,
    ReportInput{ReportType: "sales"},
)

Child Workflows

func ParentWorkflow(ctx workflow.Context, orders []OrderInput) error {
    childCtx := workflow.WithChildOptions(ctx, workflow.ChildWorkflowOptions{
        TaskQueue: "order-processing",
    })
    // Launch child workflows in parallel
    futures := make([]workflow.Future, len(orders))
    for i, order := range orders {
        futures[i] = workflow.ExecuteChildWorkflow(childCtx, OrderWorkflow, order)
    }
    // Wait for all
    for _, f := range futures {
        if err := f.Get(ctx, nil); err != nil {
            return err
        }
    }
    return nil
}

Namespaces and Visibility

# Create a namespace with 30-day retention
tctl namespace register \
  --retention 30 \
  --description "Production order processing" \
  production-orders

# Add custom search attributes for filtering
tctl admin cluster add-search-attributes \
  --name OrderStatus --type Text \
  --name CustomerTier --type Keyword \
  --name OrderAmount --type Double

# List workflows with advanced filter
tctl workflow list \
  --query 'OrderStatus="pending" AND CustomerTier="premium" ORDER BY StartTime DESC'

Custom search attributes let you filter and sort workflow executions by your domain-specific fields in the Temporal UI and via tctl. For advanced visibility (full text search across millions of executions), enable Elasticsearch in your Temporal deployment.

Temporal UI

The Temporal UI at localhost:8080 provides:

  • Workflow List — searchable table of all executions with status, start time, and task queue
  • Execution Detail — full event history showing every state transition with timestamps and payloads
  • Stack Trace — shows what code the workflow is currently blocked on (which Activity, which sleep, which signal wait)
  • Pending Activities — lists activities scheduled but not yet started, useful for debugging worker connectivity

Tool Comparison

Feature           | Temporal                       | Apache Airflow     | AWS Step Functions        | Prefect                     | Inngest                 | Conductor
Primary use       | Durable microservice workflows | Data pipeline DAGs | Serverless state machines | Data workflow orchestration | Event-driven functions  | Microservice orchestration
Execution model   | Long-running durable           | DAG batch runs     | Managed serverless        | Flow runs                   | Serverless steps        | Workflow engine
Code language     | Go, Java, TS, Python, .NET     | Python DAGs        | JSON/YAML DSL             | Python                      | TypeScript              | JSON/Java
Replay/Durability | Full event sourcing replay     | None               | Managed by AWS            | Checkpoint-based            | Limited                 | Limited
Signals/Queries   | Yes — native                   | No                 | Callbacks only            | No                          | Events only             | Signals
Local dev         | Docker Compose                 | Docker Compose     | Requires AWS              | Local server                | Dev server              | Docker
Managed cloud     | Temporal Cloud                 | MWAA               | Native                    | Prefect Cloud               | Yes                     | Conductor Cloud
Best for          | Complex, long-lived workflows  | ETL pipelines      | Simple AWS workflows      | ML/data pipelines           | Serverless event chains | Microservice choreography

Practical Example: Order Processing Saga

This complete example shows a production-ready order workflow with payment, inventory, shipping, and compensation:

// workflows/order-saga.ts
import { proxyActivities, setHandler, defineSignal,
         defineQuery, condition } from '@temporalio/workflow';

const { chargePayment, refundPayment, reserveInventory, releaseInventory,
        createShipment, sendConfirmationEmail } = proxyActivities<Activities>({
  startToCloseTimeout: '60 seconds',
  retry: { maximumAttempts: 3, initialInterval: '2s', backoffCoefficient: 2 },
});

const cancelSignal = defineSignal('cancel-order');
const deliveredSignal = defineSignal('delivered');
const statusQuery = defineQuery<string>('status');

export async function orderSagaWorkflow(order: OrderInput): Promise<OrderResult> {
  let status = 'received';
  let cancelled = false;

  setHandler(cancelSignal, () => { cancelled = true; });
  setHandler(deliveredSignal, () => { status = 'delivered'; });
  setHandler(statusQuery, () => status);

  // Charge payment
  status = 'charging';
  const payment = await chargePayment(order);

  if (cancelled) {
    await refundPayment(payment);
    return { orderId: order.id, status: 'cancelled' };
  }

  // Reserve inventory
  status = 'reserving';
  let inventory;
  try {
    inventory = await reserveInventory(order);
  } catch (err) {
    await refundPayment(payment);
    throw err;
  }

  // Create shipment
  status = 'shipping';
  try {
    await createShipment(order, inventory);
  } catch (err) {
    await releaseInventory(inventory);
    await refundPayment(payment);
    throw err;
  }

  // Wait up to 30 days for delivery confirmation signal
  status = 'awaiting-delivery';
  const delivered = await condition(() => status === 'delivered', '30 days');

  if (!delivered) {
    status = 'delivery-timeout';
  }

  await sendConfirmationEmail(order, status);
  return { orderId: order.id, status };
}

# Start the workflow
tctl workflow start \
  --taskqueue order-processing \
  --workflow_type orderSagaWorkflow \
  --workflow_id "order-12345" \
  --input '{"id":"12345","items":[{"sku":"PROD-001","qty":2}]}'

# Query current status
tctl workflow query --workflow_id order-12345 --query_type status

# Send delivery confirmation signal
tctl workflow signal \
  --workflow_id order-12345 \
  --name delivered \
  --input '{"deliveredAt":"2026-03-23T14:00:00Z"}'

Gotchas and Common Mistakes

Non-determinism bugs are the most common issue in Temporal. Any workflow code that produces different results on replay will corrupt workflow state. Never use time.Now(), rand, UUID generation, or direct API calls inside workflow functions — use workflow.Now() for time, workflow.SideEffect() to record one-off non-deterministic values, and workflow.GetVersion() to branch safely when the workflow code itself changes.

Missing heartbeats on long activities cause the activity to be rescheduled even though it is still running, creating duplicate executions. Always heartbeat in loops and check ctx.Err() after each heartbeat.

Unbounded event history accumulates when a workflow runs indefinitely without checkpointing. Use Continue-As-New for polling loops and long-running processes that accumulate many events.

Task queue mismatch — Workers and workflow starts must use the same task queue name. A typo means the workflow task sits in the queue forever with no worker to pick it up.

Summary

  • Temporal’s event sourcing model makes workflows durable by default — crashes, deploys, and network partitions do not lose workflow state
  • Workers poll task queues — the pull model means Workers are never overwhelmed and scale independently from the server
  • Activities handle all non-deterministic side effects with configurable retry policies, heartbeats, and timeout controls
  • Signals and Queries let external systems interact with running workflows without polling your database
  • The Saga pattern with compensating Activities is the Temporal-native approach to distributed transactions
  • GetVersion enables safe rolling deploys without breaking in-flight workflow executions
  • Use Temporal Cloud for production to eliminate server operations overhead; use Docker Compose for local development