SmartConnector Diff Checking
Audience: Admins, Developers, Solution Architects
Purpose: Explain what diff checking is, how it works, how to enable and disable it, and when its built-in behavior is not appropriate for a given use case.
Overview
Diff checking is an optional optimization that prevents SmartConnectors from re-processing rows that have not changed since the previous run. In recurring imports, diff checking often speeds up processing by 10x or more, but understanding exactly what it compares, and what it does not, is essential to using it correctly.
What Is Diff Checking?
Diff checking compares each incoming row against the output of the previous successful run before deciding whether to send that row to the load step. If a row's contents are the same as they were in that run, the row is skipped. Only rows whose contents have changed are sent through to the load step.
This is most valuable for recurring imports that ingest complete dataset files on a schedule. When most rows are identical from run to run, diff checking eliminates the overhead of re-processing thousands of unchanged records.
One important detail: diff checking occurs after variable resolution. A hash is computed for each row from its mapped execution variable values, and that hash is what gets compared between runs. If you want a value to be considered by the diff, map it as a variable. This works even if the variable is not used by a later load step: any mapped variable contributes to the diff.
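The following is a minimal sketch of the idea, not the actual SmartConnector implementation. The hashing scheme (SHA-256 over sorted JSON), the row keys, and the names `row_hash` and `rows_to_load` are all assumptions made for illustration. It shows that only mapped variable values feed the hash, and that a row is sent on only when its hash differs from the one recorded for the previous run.

```python
import hashlib
import json


def row_hash(mapped_variables: dict) -> str:
    """Hash the resolved variable values of one row (hypothetical scheme)."""
    # Sort keys so the hash does not depend on the order of the mappings.
    canonical = json.dumps(mapped_variables, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rows_to_load(incoming: dict, previous_hashes: dict) -> list:
    """Return the keys of rows whose hash differs from the previous run."""
    changed = []
    for key, variables in incoming.items():
        if previous_hashes.get(key) != row_hash(variables):
            changed.append(key)
    return changed


# Hashes remembered from the previous run (illustrative data).
previous = {
    "A": row_hash({"email": "a@example.com", "status": "active"}),
    "B": row_hash({"email": "b@example.com", "status": "active"}),
}

incoming = {
    "A": {"email": "a@example.com", "status": "active"},    # unchanged
    "B": {"email": "b@example.com", "status": "inactive"},  # changed
    "C": {"email": "c@example.com", "status": "active"},    # new
}

print(rows_to_load(incoming, previous))  # ['B', 'C']
```

In this sketch, adding another key to a row's dictionary changes its hash, which mirrors the point that any mapped variable contributes to the diff whether or not a load step uses it.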
How To Enable Diff Checking
Diff checking is configured at the SmartConnector level and applies to the entire SmartConnector when enabled. It is not set per output table.
Diff checking can also be toggled directly in the Run GUI at the time of starting a run. Turning it off before a run forces a full re-ingestion regardless of what was processed in previous runs. Every row is sent to the load step as if it were new.
Troubleshooting Tip: If a SmartConnector ran successfully but data is not updating as expected, check two things. First, check the load step's conflict resolution rules (for example, "only update if blank" on a field that is not blank). Second, check whether diff checking skipped the row: the run report shows which rows were skipped, or you can toggle diff checking off and re-run to rule it out.
Behavior and Limitations
Diff check compares the current run's output against the previous successful run's output, not against the current state of Records in Kizen. This distinction matters in any environment where Records can be modified between runs. If the previous run failed, it is not used for comparison; the next run diffs against the most recent successful run before it.
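The baseline rule can be sketched as follows. This is an illustrative model only: the `Run` type and the `diff_baseline` function are hypothetical names, and what the product does when no successful run exists is not covered here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    run_id: int
    succeeded: bool


def diff_baseline(history: List[Run]) -> Optional[Run]:
    """Return the most recent successful run, or None if there is none."""
    for run in reversed(history):  # history is ordered oldest to newest
        if run.succeeded:
            return run
    return None


# Run 3 failed, so the next run is compared against run 2.
history = [Run(1, True), Run(2, True), Run(3, False)]
baseline = diff_baseline(history)
print(baseline.run_id if baseline else None)  # 2
```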
This behavior is by design for straightforward ingestion Workflows, where the source file is the authoritative data source and Kizen Records are not expected to be modified independently. In those cases, skipping unchanged rows is safe and efficient.
The built-in diff check is not the right tool for every use case. If your pipeline needs to detect and correct changes that have been made directly to Kizen Records between runs, diff checking will miss those changes entirely.
Custom Diff Checking
For cases where the built-in diff check is not sufficient, you can use SQL processing to compare incoming data against reference data from Kizen. This allows for more complex comparisons, such as checking against the current state of Records in Kizen rather than the previous run's output.
The most common reason to reach for a custom diff is to handle Records that are missing from the source data. For example, if one run's file contains Records A, B, and C, and the next run's file contains B, C, and D, a custom diff can detect that A is no longer present and take action on it (such as expiring or archiving the Record). The built-in diff check cannot do this, because it only evaluates rows that are in the incoming file.
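As a rough illustration of the A, B, C versus B, C, D example, the sketch below runs an anti-join in an in-memory SQLite database from Python. The table names `reference_records` and `incoming_rows` and the column `record_key` are hypothetical, and the SQL available inside a SmartConnector's SQL processing step may differ from SQLite's dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical tables: reference data from Kizen and the new file.
    CREATE TABLE reference_records (record_key TEXT PRIMARY KEY);
    CREATE TABLE incoming_rows (record_key TEXT PRIMARY KEY);
    INSERT INTO reference_records VALUES ('A'), ('B'), ('C');
    INSERT INTO incoming_rows VALUES ('B'), ('C'), ('D');
    """
)

# Anti-join: keep reference Records that have no matching incoming row.
missing = [
    row[0]
    for row in conn.execute(
        """
        SELECT r.record_key
        FROM reference_records AS r
        LEFT JOIN incoming_rows AS i ON i.record_key = r.record_key
        WHERE i.record_key IS NULL
        ORDER BY r.record_key
        """
    )
]
conn.close()

print(missing)  # ['A']
```

The rows returned by a query like this are the candidates for a follow-up action such as expiring or archiving the Record.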
This is a power-user feature. It requires significant SQL ability to implement correctly, and most SmartConnectors will not need it.
Caution: The built-in diff check has no visibility into changes made directly to Kizen Records between runs. If a user, Agentic Workflow, or other process updates a Record after the last run, the SmartConnector will not detect that change. If the source row itself is unchanged, it will hash to the same value as before and will be skipped on the next run.
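The scenario in this caution can be simulated in a few lines. This is a toy model under stated assumptions: the SHA-256 hashing, the `row_hash` helper, and the dictionary standing in for a Kizen Record are all hypothetical.

```python
import hashlib
import json


def row_hash(variables: dict) -> str:
    """Hash a row's mapped values (hypothetical scheme for illustration)."""
    canonical = json.dumps(variables, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


source_row = {"email": "a@example.com", "status": "active"}

# Run 1: the row is loaded and its hash is remembered.
kizen_record = dict(source_row)
previous_hash = row_hash(source_row)

# Between runs, someone edits the Record directly in Kizen.
kizen_record["status"] = "paused"

# Run 2: the source row is unchanged, so the hashes match and it is skipped.
skipped = row_hash(source_row) == previous_hash
if not skipped:
    kizen_record = dict(source_row)

print(skipped, kizen_record["status"])  # True paused
```

The manual edit survives the second run because the comparison never looks at the Record itself, only at the hash of the incoming row.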
What's Next
With diff checking configured, you have everything you need to run your SmartConnector reliably. Continue to Running a SmartConnector to learn how to activate your SmartConnector, execute a dry run, interpret the XLS output report, and understand what each execution status means.