Export passes

When the sync engine is about to export an object, that object might hold a reference to another object that has not yet been exported. When the engine sees that a reference has no chance of being exported successfully, it removes the reference from the export and puts the object on the list of objects to be retried in a second pass. In the second pass, it comes back to the object and sets the reference with an update. As a result, the same object might be processed several times during a single export run.

Assuming a completely new connector space with staged exports, a common processing sequence would be:

  • First Pass: Add all objects, but do not export references.
  • Second Pass: Update all objects with the references. With all objects being added in the first pass, the second pass should be able to set the references successfully.
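The sequence above can be sketched as a small simulation. This is a simplified model, not the sync engine's actual code: the function name two_pass_export and the dictionary shape of the staged data are assumptions made for illustration.

```python
def two_pass_export(staged):
    """Simulate the add-then-update sequence for a new connector space.

    staged maps each object name to the names it references
    (a hypothetical data shape chosen for this sketch).
    Returns the operations in the order they would be sent.
    """
    operations = []
    # First pass: add every object and hold back all references.
    for name in staged:
        operations.append((1, "add", name, []))
    # Second pass: every target now exists, so references go out as updates.
    for name, references in staged.items():
        if references:
            operations.append((2, "update", name, list(references)))
    return operations


staged = {
    "alice": ["bob"],            # manager reference
    "bob": [],                   # no references
    "staff": ["alice", "bob"],   # group members
}
for operation in two_pass_export(staged):
    print(operation)
```

In this model, objects without references are touched once, while "alice" and "staff" are each touched twice.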

Optimistically exporting references

The introduction of this topic might give the impression that references are always sent as a separate update. In fact, that behavior is controlled by the capability flag NoReferenceValuesInFirstExport. When it is set to "false", which is the most common setting, reference attributes that point to objects exported in an earlier batch are exported already in the first pass.

The notable connector where this is set to "true" is FIM Service. This connector always sends the add in the first pass and the reference attributes in the second pass. The reason is to make troubleshooting and development easier, even if it might sound like the opposite. A connector might end up with multiple passes without the administrator realizing it, so every implementation must handle that scenario. With this behavior the scenario is the norm rather than a surprise, and the FIM Service policies have to consider and support it.
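The effect of the flag on the first pass can be sketched as follows. Only the flag name NoReferenceValuesInFirstExport comes from the text above. The function, its parameters, and the idea of tracking exported objects in a set are assumptions for illustration.

```python
def references_for_first_pass(references, already_exported,
                              no_reference_values_in_first_export):
    """Return the reference values that would be sent in the first pass.

    references: names this object points to (hypothetical shape).
    already_exported: names exported in an earlier batch.
    """
    if no_reference_values_in_first_export:
        # Flag "true": every reference waits for the second pass.
        return []
    # Flag "false": send references whose targets were already exported.
    return [target for target in references if target in already_exported]


members = ["alice", "bob", "carol"]
exported_so_far = {"alice", "bob"}
print(references_for_first_pass(members, exported_so_far, False))
print(references_for_first_pass(members, exported_so_far, True))
```

With the flag set to "false", only "carol" has to wait for the second pass. With the flag set to "true", all three members wait.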

Ordering of objects in an export

As a general rule, an object with many references is assumed to be more likely to reference objects that are not yet exported, which would push the object to the second pass to be retried later in the same export. To avoid this as much as possible, the sync engine sorts all objects by the number of references they have at the start of an export. The chairman of the board (with no manager reference) and groups with no members are exported first. It then continues with all normal users with only the single manager attribute set. Then the groups are processed, with the largest groups coming at the end.
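The ordering can be illustrated with a sort on the reference count. This is a minimal sketch: the object names are invented, and how the sync engine breaks ties between objects with the same count is not stated here, so the stable sort below is an assumption.

```python
def export_order(staged):
    """Order object names by how many references each one holds.

    staged maps an object name to the names it references
    (a hypothetical data shape). Python's sort is stable, so objects
    with equal counts keep their staged order.
    """
    return sorted(staged, key=lambda name: len(staged[name]))


staged = {
    "all-staff": ["chairman", "alice", "bob"],   # largest group
    "alice": ["chairman"],                       # single manager reference
    "chairman": [],                              # no manager
    "project-x": ["alice", "bob"],               # small group
    "bob": ["alice"],                            # single manager reference
    "empty-group": [],                           # no members
}
print(export_order(staged))
```

The objects without references come first, then the users with one manager reference, then the groups from smallest to largest.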

This is why the sync engine feels like it is getting slower and slower during an export. For example, during initial sync with Azure AD Connect, the export is fast in the beginning while only empty groups are being exported. But when it reaches the groups with 40k-50k members, it feels like it is coming to a halt (causing a support call to Microsoft).

ECMA2 and references

During export, a write to the target directory might fail even though the sync engine expected the reference attribute to succeed. In this case, the code should return the object-level error ExportErrorReferenceAttributeFailure. This tells the sync engine that all non-reference attributes were written successfully, but the reference attributes were not. The engine then puts the object on the list to be retried in the next pass. If the object has multiple reference attributes, this error should be returned if at least one of them fails.
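The decision described above can be sketched in a language-neutral way. An ECMA2 connector is written against the .NET API, so the Python below only mimics the logic. The enum, the function export_references, and the write_reference callback are hypothetical. Only the error name ExportErrorReferenceAttributeFailure comes from the text.

```python
from enum import Enum


class ExportResult(Enum):
    # Hypothetical stand-in for the object-level result a connector returns.
    SUCCESS = "Success"
    REFERENCE_FAILURE = "ExportErrorReferenceAttributeFailure"


def export_references(reference_attributes, write_reference):
    """Write each reference attribute and report one object-level result.

    Assumes the non-reference attributes were already written successfully.
    write_reference(attribute, values) is a hypothetical callback that
    returns True when the target directory accepted the write.
    """
    any_failed = False
    for attribute, values in reference_attributes.items():
        if not write_reference(attribute, values):
            any_failed = True   # keep going so the other attributes are tried
    if any_failed:
        return ExportResult.REFERENCE_FAILURE
    return ExportResult.SUCCESS


def reject_member(attribute, values):
    # Pretend the directory rejects writes to "member" only.
    return attribute != "member"


result = export_references({"manager": ["alice"], "member": ["bob"]}, reject_member)
print(result.value)
```

A single failing reference attribute is enough for the object to be reported with the reference error, which matches the rule stated above.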

The four passes

In most cases, the connector only has to run two passes to be successful. But in the worst case, the connector might try the same object up to four times during the same export.

  • The first pass exports the add and update of all non-reference attributes.
  • The second pass exports all reference attributes as a single update.
  • The third pass exports each reference attribute by itself.
  • The fourth pass exports each reference value by itself.

This algorithm allows the sync engine to gradually find out which attribute and value are causing the export problem. Most connected directories cannot provide information on what data is causing the export problem. But the sync engine gradually exports all data that is "good", so that only the problematic data is left for the administrator to investigate.
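The narrowing performed by passes two to four can be sketched for a single object. This is a simplified model under stated assumptions: the function narrow_down and the try_export callback are hypothetical, and the sketch drills into one object at a time, whereas the sync engine works through a whole pass before starting the next one.

```python
def narrow_down(references, try_export):
    """Split reference data into what exports and what fails.

    references maps an attribute name to its list of values.
    try_export(update) is a hypothetical callback that returns True
    when the target directory accepts the given update.
    Returns (exported, failed) as two dictionaries.
    """
    # Second pass: all reference attributes as a single update.
    if try_export(references):
        return {a: list(v) for a, v in references.items()}, {}
    exported, failed = {}, {}
    for attribute, values in references.items():
        # Third pass: each reference attribute by itself.
        if try_export({attribute: values}):
            exported[attribute] = list(values)
            continue
        # Fourth pass: each reference value by itself.
        for value in values:
            bucket = exported if try_export({attribute: [value]}) else failed
            bucket.setdefault(attribute, []).append(value)
    return exported, failed


def directory_accepts(update):
    # Pretend the directory rejects any update containing "ghost".
    return all("ghost" not in values for values in update.values())


good, bad = narrow_down(
    {"manager": ["alice"], "member": ["bob", "ghost", "carol"]},
    directory_accepts,
)
print(good)
print(bad)
```

In this run only the "ghost" member value is left as failed, which is the kind of residue the administrator would then investigate.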

Most connectors, even the built-in ones, do not implement the reference retry error for all passes, so in real life you are unlikely to see all four passes in a single export.

Export and statistics

Since an object might be processed many times during a single export, the export statistics can be confusing. It might look as if many more objects were processed than expected, but that can simply be a result of the multi-pass behavior.
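A small count shows how the numbers can diverge. The log format below is invented for illustration and does not reflect how the sync engine records its statistics.

```python
from collections import Counter

# Hypothetical export log: one (pass number, object name) entry per operation.
export_log = [
    (1, "alice"), (1, "bob"), (1, "staff"),
    (2, "alice"), (2, "staff"),
    (3, "staff"),
]

# Every entry is one processed operation; objects are counted once each.
operations_processed = len(export_log)
times_per_object = Counter(name for _, name in export_log)
distinct_objects = len(times_per_object)

print(operations_processed, distinct_objects)
print(times_per_object.most_common(1))
```

Here three objects account for six processed operations, because "staff" was retried in two additional passes and "alice" in one.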