Heroku Connect Duplicate Record Issues
Last updated December 01, 2022
Table of Contents
Heroku Connect uses separate threads for reading and writing. It’s possible to create duplicate records when a read-write mapping doesn’t have a unique identifier. This article describes this situation and how to prevent it.
Scenario
A duplicate record gets created in this scenario:
- Heroku Connect writes a new record to Salesforce.
- Another thread starts to synchronize the record back to Postgres, before the writing thread writes the new Salesforce ID (
sfid
) back to Postgres. - When synchronizing changes from Salesforce, Heroku Connect tries to match records by their Salesforce ID. Since it’s unavailable in this example, it falls back to using the unique external identifier for matching. If both
sfid
and the unique external identifier aren’t available, Connect creates a new, duplicate record in Postgres. The original record remains without itssfid
set.
Prevent Duplicates By Ensuring Uniqueness
Add a custom field to your object in Salesforce and configure it as the unique identifier. This identifier prevents the creation of duplicate records. Additionally, you can also reliably reference a newly inserted row from a foreign table using the configured unique external ID.
Treat the unique identifier as an alias for a Salesforce ID. Never change or reuse them and choose a unique mechanism for generating them.
Don’t use Postgres integer sequences as they aren’t guaranteed to be unique. Don’t use the ID field in the Heroku Connect database as the unique identifier. These numbers can reset when importing your configuration into a new database or on reloading a table.
Avoid using auto-incrementing fields for unique identifiers.
Populate the unique identifier field with a unique value, such as a globally or universal unique ID (GUID or UUID) when you first INSERT
the record. This strategy ensures the uniqueness of newly created records in Postgres, as well as establish relationships between objects. It’s recommended to use random UUIDs, which can be generated using gen_random_uuid()
.
Using gen_random_uuid()
requires enabling the pgcrypto
extension (CREATE EXTENSION IF NOT EXISTS pgcrypto;
).
Re-inserted Records
Even when using unique identifiers, it’s possible to create duplicate records if you delete records in Salesforce shortly after syncing them via Heroku Connect. While this situation is rare, it’s not recommended to delete records in Salesforce shortly after syncing via Heroku Connect. Wait at least 10 minutes before deleting. See Re-inserted Records for more info.