Affected
- Resolved
This has been resolved.
The issue was caused by an unforeseen case in which a previously scheduled campaign, or a campaign scheduled at exactly the same time as another, larger campaign, took much longer than expected to process (i.e. minutes instead of seconds). While resources were allocated to processing the first, larger campaign, separate campaign watchdogs (designed as a failsafe) would detect that a campaign scheduled minutes earlier had not yet been processed and would trigger their own campaign start procedure to ensure the send happened. The watchdogs continued to do this until the first campaign send had completed.
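To make the failure mode concrete, here is a minimal, illustrative sketch of a watchdog that only checks whether a campaign has been sent, alongside one that also respects a "processing" lock. It is not the production code: every name (`Campaign`, `watchdog_tick`, `simulate`), the 60-second tick and grace period, and the 300-second processing time are assumptions chosen for the example, and a real lock would need to be acquired atomically and expire safely, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    name: str
    scheduled_at: int      # scheduled time in seconds (hypothetical unit)
    sent: bool = False
    locked: bool = False   # hypothetical "currently processing" lock
    start_count: int = 0   # how many times the start procedure ran


def start_campaign(campaign: Campaign, use_lock: bool) -> None:
    """Run the start procedure; with use_lock, mark the campaign as processing."""
    if use_lock:
        campaign.locked = True
    campaign.start_count += 1


def watchdog_tick(campaign: Campaign, now: int, grace: int, use_lock: bool) -> bool:
    """Failsafe check: start an overdue campaign that does not look sent."""
    overdue = now - campaign.scheduled_at > grace
    if campaign.sent or not overdue:
        return False
    if use_lock and campaign.locked:
        return False  # already processing, so do not re-trigger
    start_campaign(campaign, use_lock)
    return True


def simulate(use_lock: bool, processing_seconds: int = 300,
             tick_every: int = 60, grace: int = 60) -> int:
    """Return how many times a slow campaign is started before it completes."""
    campaign = Campaign("large-campaign", scheduled_at=0)
    start_campaign(campaign, use_lock)  # the normal scheduled start
    for now in range(tick_every, processing_seconds, tick_every):
        watchdog_tick(campaign, now, grace, use_lock)
    campaign.sent = True
    return campaign.start_count
```

Under these assumptions, `simulate(use_lock=False)` starts the same campaign several times while it is still processing, whereas `simulate(use_lock=True)` starts it exactly once.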
**What we have done to remedy this:**
We have identified the exact cause and replicated the issue in test environments. Using that reproduction, we have implemented a fix that introduces a locking process, so the watchdogs no longer interpret campaigns as 'not sent' when they are in fact still processing and erroneously re-trigger them. We have also reviewed our workflows and added further processes and automated validation to ensure this type of issue does not recur.
- Monitoring: Update
We're writing up the resolution now; it will be available here shortly.
- Monitoring: Update
We continue to monitor the results before marking as resolved. Expected resolution is now Friday morning AEDT.
- Monitoring
We implemented a fix and are currently monitoring the result. We expect resolution within the next few hours.
- Identified
We have identified the issue and are working on an immediate fix. Only a subset of customers are affected.
Customers are welcome to keep using campaigns, but we advise staggering schedule times by 5 minutes for campaigns that exceed 2000 recipients. This is a temporary soft limit while we continue to work towards a resolution.
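For customers who schedule campaigns programmatically, the staggering advice above could be applied along these lines. This is only a sketch under stated assumptions: the `(name, scheduled_at, recipient_count)` tuple layout and the `stagger_schedule` helper are hypothetical and are not part of any product API; only the 5-minute gap and the 2000-recipient threshold come from the advice itself.

```python
from datetime import datetime, timedelta

STAGGER = timedelta(minutes=5)       # gap recommended in the advice above
LARGE_CAMPAIGN_RECIPIENTS = 2000     # campaigns exceeding this are staggered


def stagger_schedule(campaigns):
    """Return {name: adjusted_time} with large campaigns at least 5 minutes apart.

    `campaigns` is a list of (name, scheduled_at, recipient_count) tuples
    (a hypothetical layout). Smaller campaigns keep their original time.
    """
    adjusted = {}
    last_large = None
    # Process in schedule order; sorted() is stable, so ties keep input order.
    for name, when, recipients in sorted(campaigns, key=lambda c: c[1]):
        if recipients > LARGE_CAMPAIGN_RECIPIENTS:
            if last_large is not None and when < last_large + STAGGER:
                when = last_large + STAGGER  # push back to keep the gap
            last_large = when
        adjusted[name] = when
    return adjusted
```

For example, two large campaigns both scheduled for 09:00 would come out at 09:00 and 09:05, while a 500-recipient campaign at 09:00 would be left unchanged.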
A full summary will be provided in due course.
- InvestigatingUpdate
We continue to investigate the cause.
- InvestigatingInvestigating
We are currently investigating this incident.