Oracle Corporation

04/25/2024 | Press release | Distributed by Public on 04/25/2024 19:35

Where is the Complexity? Part 2

Introduction

In my previous blog post, I described how a simple Teller microservice to transfer funds from one account to another account could be implemented as REST services in Spring Boot. What I didn't cover were all the possible ways a simple application like this could fail. I only mentioned a couple that could have been implemented. However, to solve this problem of various failures such as network failure, service failure, database failure, etc., one can choose to use a distributed transaction pattern such as the popular Saga pattern. In this post, I'll examine what it takes to move from the simplistic minimal or no error handling version of these microservices to leveraging Eclipse MicroProfile Long Running Actions. As you'll learn in reading this post, adding Sagas adds considerable complexity to the application code. Ideally using a distributed transaction pattern should simplify the developer's life, not make it more complex. Continue reading to see how much complexity the developer takes on in using Sagas.

To use the Saga pattern with Eclipse MicroProfile Long Running Actions (LRA), the developer must keep track of information that is essentially unrelated to the business problem being solved. The basic flow for the LRA in this use case is:

  1. Teller microservice is called to transfer X from account A in department 1 to account B in department 2
  2. Teller microservice begins an LRA. This causes LRA headers to be included in subsequent calls
  3. Teller microservice calls withdraw on department1 for account A
    1. Under the covers, department 1 sees the LRA headers and joins the LRA by enlisting with the LRA coordinator
    2. Department 1 withdraws X from account A and returns to the teller microservice
  4. Teller microservice calls deposit on department 2 for account B
    1. Under the covers, department 2 sees the LRA headers and joins the LRA by enlisting with the LRA coordinator
    2. Department 2 deposits X into account B and returns to the teller microservice
  5. Teller microservice closes the LRA which causes
    1. LRA Coordinator to call back to the complete endpoint for each joined participant
    2. Each participant's complete endpoint then performs any final steps for the LRA, more on this later
  6. Teller microservice returns success

Handling errors

Now to handle the possible various failures that can occur, the following additional steps may need to be performed:

  1. If the withdraw request fails, teller should cancel the LRA and return failure to its caller
    1. This in turn will cause the LRA coordinator to call back to each enrolled participant at their compensate endpoint, which at this point is only the initiator
    2. The initiator will get called back on its compensate endpoint if defined, at which it has nothing to do. The sample code doesn't define a compensate or completion callback as there is nothing to do in the teller to compensate/complete the LRA as the teller maintains no state.
  2. Likewise, if the deposit request to account B fails, teller should cancel the LRA.
    1. This in turn will cause the LRA coordinator to call back to each enrolled participant at their compensate endpoint, which at this point is likely just the initiator and department 1. It might also include department 2 if department 2 had a chance to join before failing to deposit.
      1. The initiator will get called back on its compensate endpoint if defined, at which it has nothing to do
      2. Department 1 will get called back on its compensate endpoint, at which point it needs to figure out how to compensate the previous withdraw operation. This is where the fun begins. More on this shortly!
      3. Department 2 may also get called back on its compensate endpoint and go through the same drill as department 1.
  3. If both withdraw and deposit succeed, the teller should close the LRA. This causes the LRA coordinator to:
    1. Call the completion endpoint on the teller
      1. The teller really has nothing to do at this point
    2. Call the completion endpoint on department 1
      1. Department 1's completion endpoint needs to clean-up any bookkeeping information it has on the LRA
    3. Call the completion endpoint on department 2
      1. Department 2's completion endpoint needs to clean-up any bookkeeping information it has on the LRA

And we're done! Well, sort of. Let's examine in more detail what was mentioned about the fun beginning and bookkeeping. When a participant is called at its complete or compensate endpoints, all that is passed is the LRA Id in the REST headers. What it means to complete or compensate the transaction is completely up to the application code provided by the developer. So typically, this means creating some sort of log or journal to record the changes that were made as part of the LRA such that the microservice knows what needs to be done to complete or compensate the transaction.

Here is the updated Teller microservice code from TransferResource.java looks like:

    @Autowired
    @Qualifier("MicroTxLRA")
    RestTemplate restTemplate;

    @Value("${departmentOneEndpoint}")
    String departmentOneEndpoint;

    @Value("${departmentTwoEndpoint}")
    String departmentTwoEndpoint;

    @RequestMapping(value = "transfer", method = RequestMethod.POST)
    @LRA(value = LRA.Type.REQUIRES_NEW, end = true, cancelOnFamily = {HttpStatus.Series.CLIENT_ERROR, HttpStatus.Series.SERVER_ERROR})
    public ResponseEntity<?> transfer(@RequestBody Transfer transferDetails,
                                      @RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId) {

        LOG.info("Transfer initiated: {}", transferDetails);
        try {
            ResponseEntity withdrawResponse = withdraw(transferDetails.getFrom(), transferDetails.getAmount());
            if (!withdrawResponse.getStatusCode().is2xxSuccessful()) {
                LOG.error("Withdraw failed: {} Reason: {}", transferDetails, withdrawResponse.getBody());
                return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                        .body(new TransferResponse("Withdraw failed. " + withdrawResponse.getBody()));
            }

            ResponseEntity depositResponse = deposit(transferDetails.getTo(), transferDetails.getAmount());
            if (!depositResponse.getStatusCode().is2xxSuccessful()) {
                LOG.error("Deposit failed: {} Reason: {} ", transferDetails, depositResponse.getBody());
                return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                        .body(new TransferResponse("Deposit failed"));
            }
        } catch (Exception e) {
            LOG.error("Transfer failed with exception {}", e.getLocalizedMessage());
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body(new TransferResponse("Transfer failed. " + e.getLocalizedMessage()));
        }
        LOG.info("Transfer successful: {}", transferDetails);
        return ResponseEntity
                .ok(new TransferResponse("Transfer completed successfully"));
    }

You'll notice that we've switched the RestTemplate to one provided by MicroTx that provides filters to automate things like passing MicroTx transaction headers. We added the @LRA annotation to indicate that a new LRA needs to be started. As well we've removed the redepositWithdrawnAmount method as the LRA will ensure corrective action takes place by eventually calling the department 1's compensate endpoint. Finally, because of the "end = true"option in the LRA annotation, the LRA will automatically be closed (or canceled if the return status is a server or client error, 4xx or 5xx) when the transfer method completes.

So far, the changes are minimal. Let's see what changes need to be made to the department deposit and withdraw services. In the first blog post, these services were contained in the AccountResource.java file. In this post, we've split that class into 3 separate classes (AccountAdminService, DepositService, and WithdrawService) as we want to register completion and/or compensation callbacks that are unique to the withdraw or deposit services.

Let's look at the changes to how the withdraw service is implemented in WithdrawService.java:

@RestController
@RequestMapping("/withdraw")
public class WithdrawService {

    private static final Logger LOG = LoggerFactory.getLogger(WithdrawService.class);

    @Autowired
    IAccountOperationService accountService;

    @Autowired
    JournalRepository journalRepository;

    @Autowired
    AccountTransferDAO accountTransferDAO;

    @Autowired
    IAccountQueryService accountQueryService;

    /**
     * cancelOnFamily attribute in @LRA is set to empty array to avoid cancellation from participant.
     * As per the requirement, only initiator can trigger cancel, while participant returns right HTTP status code to initiator
     */
    @RequestMapping(value = "/{accountId}", method = RequestMethod.POST)
    @LRA(value = LRA.Type.MANDATORY, end = false, cancelOnFamily = {})
    public ResponseEntity<?> withdraw(@RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId,
                                      @PathVariable("accountId") String accountId, @RequestParam("amount") double amount) {
        try {
            this.accountService.withdraw(accountId, amount);
            accountTransferDAO.saveJournal(new Journal(JournalType.WITHDRAW.name(), accountId, amount, lraId, ParticipantStatus.Active.name()));
            LOG.info(amount + " withdrawn from account: " + accountId);
            return ResponseEntity.ok("Amount withdrawn from the account");
        } catch (NotFoundException e) {
            return ResponseEntity.status(HttpStatus.NOT_FOUND).body(e.getMessage());
        } catch (UnprocessableEntityException e) {
            LOG.error(e.getLocalizedMessage());
            return ResponseEntity.status(HttpStatus.UNPROCESSABLE_ENTITY).body(e.getMessage());
        } catch (Exception e) {
            LOG.error(e.getLocalizedMessage());
            return ResponseEntity.internalServerError().body(e.getLocalizedMessage());
        }
    }

As you can see the withdraw service got more complex as we needed to introduce a journal to keep track of changes made by an LRA. We keep this journal so when asked to compensate we'll know how to compensate the request. The withdraw service updates the account balance immediately and tracks the change in its journal.

    /**
     * Update LRA state. Do nothing else.
     */
    @RequestMapping(value = "/complete", method = RequestMethod.PUT)
    @Complete
    public ResponseEntity<?> completeWork(@RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId) {
        LOG.info("withdraw complete called for LRA : " + lraId);
        Journal journal = accountTransferDAO.getJournalForLRAid(lraId, JournalType.WITHDRAW);
        if (journal != null) {
            String lraState = journal.getLraState();
            if (lraState.equals(ParticipantStatus.Completing.name()) ||
                    lraState.equals(ParticipantStatus.Completed.name())) {
                // idempotency : if current LRA stats is already Completed, do nothing
                return ResponseEntity.ok(ParticipantStatus.valueOf(lraState));
            }
            journal.setLraState(ParticipantStatus.Completed.name());
            accountTransferDAO.saveJournal(journal);
        }
        return ResponseEntity.ok(ParticipantStatus.Completed.name());
    }

    /**
     * Read the journal and increase the balance by the previous withdrawal amount before the LRA
     */
    @RequestMapping(value = "/compensate", method = RequestMethod.PUT)
    @Compensate
    public ResponseEntity<?> compensateWork(@RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId) {
        LOG.info("Account withdraw compensate() called for LRA : " + lraId);
        try {
            Journal journal = accountTransferDAO.getJournalForLRAid(lraId, JournalType.WITHDRAW);
            if (journal != null) {
                String lraState = journal.getLraState();
                if (lraState.equals(ParticipantStatus.Compensating.name()) ||
                        lraState.equals(ParticipantStatus.Compensated.name())) {
                    // idempotency : if current LRA stats is already Compensated, do nothing
                    return ResponseEntity.ok(ParticipantStatus.valueOf(lraState));
                }
                accountTransferDAO.doCompensationWork(journal);
            } else {
                LOG.warn("Journal entry does not exist for LRA : {} ", lraId);
            }
            return ResponseEntity.ok(ParticipantStatus.Compensated.name());
        } catch (Exception e) {
            LOG.error("Compensate operation failed : " + e.getMessage());
            return ResponseEntity.ok(ParticipantStatus.FailedToCompensate.name());
        }
    }

    @RequestMapping(value = "/status", method = RequestMethod.GET)
    @Status
    public ResponseEntity<?> status(@RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId,
                                    @RequestHeader(LRA_HTTP_PARENT_CONTEXT_HEADER) String parentLRA) throws Exception {
        return accountTransferDAO.status(lraId, JournalType.WITHDRAW);
    }

    /**
     * Delete journal entry for LRA (or keep for auditing)
     */
    @RequestMapping(value = "/after", method = RequestMethod.PUT)
    @AfterLRA
    public ResponseEntity<?> afterLRA(@RequestHeader(LRA_HTTP_ENDED_CONTEXT_HEADER) String lraId, @RequestBody String status) {
        LOG.info("After LRA called for lraId : {} with status {} ", lraId, status);
        accountTransferDAO.afterLRA(lraId, status, JournalType.WITHDRAW);
        return ResponseEntity.ok().build();
    }

The remaining portion of the WithdrawService class is dealing with completing or compensating the LRA, allowing the department's view of the LRA status, and registering a callback when the LRA is completed or compensated. If the LRA is canceled and the compensation endpoint is called by the coordinator, it redeposits the withdrawn amount and sets its LRA state to compensated. If the LRA is completed, then the completion endpoint is called which has nothing to do as the funds have already been withdrawn and committed, so just its LRA state is updated. When the LRA is finally done, either completed or compensated, the after callback is made by the coordinator. In this case the callback simply deletes the journal entry as it is no longer needed.

Let's look at the deposit service DepositService.java:

@RestController
@RequestMapping("/deposit")
public class DepositService {

    private static final Logger LOG = LoggerFactory.getLogger(DepositService.class);

    @Autowired
    IAccountOperationService accountService;

    @Autowired
    JournalRepository journalRepository;

    @Autowired
    AccountTransferDAO accountTransferDAO;

    /**
     * cancelOnFamily attribute in @LRA is set to empty array to avoid cancellation from participant.
     * As per the requirement, only initiator can trigger cancel, while participant returns right HTTP status code to initiator
     */
    @RequestMapping(value = "/{accountId}", method = RequestMethod.POST)
    @LRA(value = Type.MANDATORY, end = false, cancelOnFamily = {})
    public ResponseEntity<?> deposit(@RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId,
                                     @PathVariable("accountId") String accountId, @RequestParam("amount") double amount) {
        accountTransferDAO.saveJournal(new Journal(JournalType.DEPOSIT.name(), accountId, amount, lraId, ParticipantStatus.Active.name()));
        return ResponseEntity.ok("Amount deposited to the account");
    }

    /**
     * Increase balance amount as recorded in journal during deposit call.
     * Update LRA state to ParticipantStatus.Completed.
     */
    @RequestMapping(value = "/complete", method = RequestMethod.PUT)
    @Complete
    public ResponseEntity<?> completeWork(@RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId) {
        try {
            LOG.info("deposit complete called for LRA : " + lraId);
            Journal journal = accountTransferDAO.getJournalForLRAid(lraId, JournalType.DEPOSIT);
            if (journal != null) {
                String lraState = journal.getLraState();
                if (lraState.equals(ParticipantStatus.Completing.name()) ||
                        lraState.equals(ParticipantStatus.Completed.name())) {
                    // idempotency : if current LRA stats is already Completed, do nothing
                    return ResponseEntity.ok(ParticipantStatus.valueOf(lraState));
                }
                accountTransferDAO.doCompleteWork(journal);
            } else {
                LOG.warn("Journal entry does not exist for LRA : {} ", lraId);
            }
            return ResponseEntity.ok(ParticipantStatus.Completed.name());
        } catch (Exception e) {
            LOG.error("Complete operation failed : " + e.getMessage());
            return ResponseEntity.ok(ParticipantStatus.FailedToComplete.name());
        }
    }

and the doCompleteWork method looks like:

    public void doCompleteWork(Journal journal) throws Exception {
        try {
            Account account = accountQueryService.getAccountDetails(journal.getAccountId());
            account.setAmount(account.getAmount() + journal.getJournalAmount());
            accountService.save(account);
            journal.setLraState(ParticipantStatus.Completed.name());
            journalRepository.save(journal);
        } catch (Exception e) {
            journal.setLraState(ParticipantStatus.FailedToComplete.name());
            journalRepository.save(journal);
            throw new Exception("Failed to complete", e);
        }
    }

which finally adds the amount deposited to the account. If we look at the compensate callback for deposit, it just sets the LRA status in the journal to compensated:

    /**
     * Update LRA state to ParticipantStatus.Compensated.
     */
    @RequestMapping(value = "/compensate", method = RequestMethod.PUT)
    @Compensate
    public ResponseEntity<?> compensateWork(@RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId) {
        LOG.info("Account deposit compensate() called for LRA : " + lraId);
        Journal journal = accountTransferDAO.getJournalForLRAid(lraId, JournalType.DEPOSIT);
        if (journal != null) {
            String lraState = journal.getLraState();
            if (lraState.equals(ParticipantStatus.Compensating.name()) ||
                    lraState.equals(ParticipantStatus.Compensated.name())) {
                // idempotency : if current LRA stats is already Compensated, do nothing
                return ResponseEntity.ok(ParticipantStatus.valueOf(lraState));
            }
            journal.setLraState(ParticipantStatus.Compensated.name());
            accountTransferDAO.saveJournal(journal);
        }
        return ResponseEntity.ok(ParticipantStatus.Compensated.name());
    }

    @RequestMapping(value = "/status", method = RequestMethod.GET)
    @Status
    public ResponseEntity<?> status(@RequestHeader(LRA_HTTP_CONTEXT_HEADER) String lraId,
                                    @RequestHeader(LRA_HTTP_PARENT_CONTEXT_HEADER) String parentLRA) throws Exception {
        return accountTransferDAO.status(lraId, JournalType.DEPOSIT);
    }

Finally the after callback will be triggered and it will delete the associated journal entry by calling the afterLRA method on the accountTransferDAO.

    /**
     * Delete journal entry for LRA (or keep for auditing)
     */
    @RequestMapping(value = "/after", method = RequestMethod.PUT)
    @AfterLRA
    public ResponseEntity<?> afterLRA(@RequestHeader(LRA_HTTP_ENDED_CONTEXT_HEADER) String lraId, @RequestBody String status) {
        LOG.info("After LRA Called : " + lraId);
        accountTransferDAO.afterLRA(lraId, status, JournalType.DEPOSIT);
        return ResponseEntity.ok().build();
    }

The accountTransferDAO afterLRA method is what actually deletes the journal entry:

    public void afterLRA(String lraId, String lraStatus, JournalType journalType){
        Journal journal = getJournalForLRAid(lraId, journalType);
        if (Objects.nonNull(journal) && isLRASuccessfullyEnded(lraStatus)) {
            journalRepository.delete(journal);
        }
    }

As the above code shows, a lot of additional work is required the application developer to ensure data consistency. However even with all these changes, the above code doesn't handle two calls to deposit or withdraw in the same LRA. The journal implementation only records the last update.

In my next post, I'll show the same sample application, but this time using XA to ensure data consistency.