How to design a task to have resilience

Introduction

The TaskCentre Server has the ability to interact with multiple systems effectively linking them together. Because of this, there exists a need to treat all of the steps within a task as a single transaction.

For example, a task that reads data from Database A, writes that data to Database B and sends an email notification to Mail Server C would ideally be handled as a single transaction.

In practice treating this as a single transaction would be difficult to achieve. (Think about it, if the email notification to Mail Server C failed we would have to roll back the update to Database B – not an easy task to accomplish when we don’t have access to the database transactions statements; BEGIN TRANSACTION, COMMIT and ABORT.)

Summary

This article is intended to provide information about how to make tasks resilient by employing a retry mechanism to reprocess failed transactions.

Example Task

Let’s take as an example a task which monitors a sales order system and emails a report containing details of a newly placed order. Our ‘Process Orders’ task contains the following four steps:

  1. Microsoft SQL Server Trigger – Maps the OrderID to a task variable.
  2. Database Query – Extracts additional details relating to the OrderID such as the OrderDate, CustomerName, and OrderAmount
  3. Format HTML Report – Creates a HTML formatted report from the query data.
  4. Send Email – Emails out the report.

Screenshot of example task ‘Process Orders’:

ProcessOrders

The problem with this task is that it is not resilient to failure. The following two failures could occur after the Microsft SQL Server Trigger event step has fired:

  1. The Database Query step times out because the database server is busy.
  2. The Send Email step fails because the SMTP server is unreachable.

Either of these issues will cause the trigger event to be ‘lost’ because no report will ever be sent for that order. To manage this we will build a retry mechanism which will have the following two features:

  1. Events Queue – Places the event to be processed in a queue.
  2. Recovery – Retry when a failure has occurred.

Creating the Events Queue

The principal here is that an event will be written to the EVENTS table at the start of the task run. The status flag will initially be set to a status of ‘Unprocessed’. Only if all the steps succeed and it gets to the final step is the entry updated with a status of ‘processed’

Create an Events table to queue events

  1. On the TaskCentre server machine install a local SQL Server Express instance.
  2. Create a table called EVENTS and add the following columns.
Field NameDataTypeComments
EVENT_IDSet the data type to match that of the external system’s primary keyPrimary key of our EVENTS table. Populated by the primary key from the external system.
TASK_IDintThe task id
TASK_STATUSbit, NOT NULLUnprocessed = 0, Processed = 1

Field NameData TypeCommentsEVENT_IDSet the data type to match that of the external system’s primary keyPrimary key of our EVENTS table. Populated by the primary key from the external system.TASK_IDintThe task idTASK_STATUSbit, NOT NULLUnprocessed = 0, Processed = 1

 

CREATE TABLE [EVENTS]
(
-- Assumes the external system’s primary key is char(20)
EVENT_ID char(20) NOT NULL,

-- The Task ID is found under the 'Advanced' tab of 'Task Options'.
TASK_ID int NULL,

-- Unprocessed = 0, Processed = 1
TASK_STATUS bit NOT NULL

CONSTRAINT PK_EVENTS PRIMARY KEY CLUSTERED (EVENT_ID)
)

Task design changes

Edit the task to add an ‘OLEDB Call Procedure’ step named ‘INSERT_EVENT’ that inserts a row into the EVENTS table and sets the TASK_STATUS field as unprocessed.
ProcessOrders_INSERT_EVENT(1)

INSERT_EVENT step parameter mapping pane:
INSERT_EVENT Parameter mapping(1)

The SQL Stored Procedure that the INSERT_EVENT step calls:

CREATE PROCEDURE INSERT_EVENT
@PRIMARY_KEY datetime,
@TASK_ID int,
AS
BEGIN
INSERT INTO dbo.EVENTS
(EVENT_ID, TASK_ID, TASK_STATUS)

VALUES
(
@PRIMARY_KEY, @TASK_ID, 0 -- Initial state is always 'Unprocessed'
)
END

Processing the Events Queue

Task design changes

Edit the task further with the addition of an OLEDB Call Procedure step named ‘UPDATE_EVENT’ that updates the TASK_STATUS field in the EVENTS table to mark it as processed.
ProcessOrders_UPDATE_EVENT(1)

The UPDATE_EVENT step parameter mapping pane
UPDATE_EVENT Parameter mapping(1)

The SQL Stored Procedure that the UPDATE_EVENT step calls:

CREATE PROCEDURE UPDATE_EVENT
-- If we’re calling this procedure it means that all
 -- of the preceding task steps have completed
 -- and the status can be set to 'Processed'
@PRIMARY_KEY char(20),
AS
BEGIN
UPDATE dbo.EVENTS

SET TASK_STATUS = 1 -- 'Processed'

WHERE EVENT_ID = @PRIMARY_KEY

END

Screenshot of the final task design. (Note the additional INSERT_EVENT and UPDATE_EVENT steps compared to the non-resilient task in the first section.)
ProcessOrders_INSERT_EVENT

Recovery

Lets assume that a failed transaction has occurred. How do we recover from it.

The screenshot below of SQL management studio shows TaskID 1138 has run five times. The fourth time it ran (EventID 23) one of the steps has failed.
SQLManagementStudio

Reprocess failed transactions

To recover from the failure we build a new scheduled task ‘Reprocess Orders’.
ReprocessOrders

This task is the same as the original ‘Process Orders’ task but with the following differences.

  1. The SQL event step has been replaced by a schedule event step set to a regular interval one hour.
  2. The SQL in the query step has been amended to JOIN the Orders table to our Events table with a WHERE clause of ‘TASK_STATUS = 0 so that it only picks up the unprocessed transactions.
    SELECT
    OrderID
    OrderDate
    CustomerName
    
    FROM
    TestDB.dbo.SalesOrders INNER JOIN TestDB.dbo.[EVENTS]
    
    ON
    TestDB.dbo.SalesOrders.OrderID = TestDB.dbo.[EVENTS].EVENT_ID
    
    WHERE
    -- Filter on 'Unrocessed' rows
    TestDB.dbo.[EVENTS].TASK_STATUS = 0
    
  3. In the output Send Email step we add a call stored procedure memory that updates the TASK_STATUS field in the EVENTS table to mark it as processed.

OLEDB_Memory