Matching rules

Batch API applies a number of predefined rules during matching, and these rules govern the match results it produces.

This core set of rules applies to the matching process for all countries:

  • Elements must roughly occur in the expected predefined order (the order is defined in each country's data files).
  • Any numbers appearing before the place elements in an input address should be matched.
  • Numbers should associate correctly with accompanying elements. For example, in a UK address the building number is expected before the street name.
  • At least one place element should be found in the input address.

If these rules are not satisfied, match confidence will be reduced to intermediate unless the rule is specifically suppressed in the particular country's dataset. For example, if a place is not supplied for a UK address, the match confidence will not be reduced if a match can be made by using the supplied postcode. Normally, postal codes represent larger areas, and confidence would be reduced in those circumstances. Regardless of any change in match confidence, the information bits set within the generic match code for a returned address indicate the rules that have failed.
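The interaction between failed rules, per-dataset suppression, and the information bits can be sketched as follows. This is only an illustration: the `CoreRule` flag names and values, the `apply_core_rules` function, and the confidence strings are hypothetical and do not reflect the actual Batch API match code layout.

```python
from enum import IntFlag


class CoreRule(IntFlag):
    # Hypothetical flags, one per core rule; real bit positions are not
    # given in this document.
    ELEMENT_ORDER = 1
    LEADING_NUMBERS = 2
    NUMBER_ASSOCIATION = 4
    PLACE_ELEMENT = 8


def apply_core_rules(failed, suppressed=CoreRule(0), confidence="high"):
    """Return (confidence, info_bits) for a candidate match.

    failed     -- core rules the candidate did not satisfy
    suppressed -- rules the country's dataset has switched off
    """
    # Only failures of rules that are not suppressed reduce confidence.
    enforced_failures = failed & ~suppressed
    if enforced_failures and confidence == "high":
        confidence = "intermediate"
    # The information bits record every failed rule, whether or not
    # confidence changed.
    return confidence, failed


# UK-style case: no place supplied, but the dataset suppresses that rule.
print(apply_core_rules(CoreRule.PLACE_ELEMENT, suppressed=CoreRule.PLACE_ELEMENT))
```

In this sketch, suppression changes only the confidence outcome; the failed rule is still reported in the returned bits, mirroring the behaviour described above.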

The Essential rules state the minimum criteria that a match must satisfy. If any one of these rules is not met, a low-confidence match is returned. These rules include conditions such as:

  • There must be an input element defining the location of the address (a town, a postal code, etc.).
  • There must be an element defining the property information.

Building on the Essential rules, the Preferred rules include criteria that should ideally be satisfied to give a better match. For example:

  • A PO Box number must be matched if one exists in the dataset.
  • There must be some form of match with a street name if one exists in the dataset.
  • The postal code must match if none of the place elements in the dataset has matched.

This is an additional set of rules that affects the confidence of a match to the address under consideration. These rules are roughly as strict as the Preferred rules. For example:

  • Both the PO Box number and the corresponding place must match.
  • At least one of the street descriptors and the locality elements must match.
  • The sub-premises address element must make an exact match.

This is a set of rules applied by Batch API at its address acceptance stage, just before a final address is returned. These rules are defined for only some datasets and specify the strictest final criteria that an address match must satisfy. They are used only when confidence has not already been reduced for any other reason. Example rules are:

  • No more than one character in the supplied street name had to be changed to obtain a match.
  • All of the supplied multiple place and postcode-level elements matched.
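As an illustration of the street name rule, the check below treats "changing a character" as one insertion, deletion, or substitution (Levenshtein distance). That interpretation, and the function names `edit_distance` and `street_name_acceptable`, are assumptions for this sketch; the document does not say how Batch API measures the change.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def street_name_acceptable(supplied: str, matched: str, max_changes: int = 1) -> bool:
    """True if the supplied street name needed at most max_changes edits."""
    return edit_distance(supplied.casefold(), matched.casefold()) <= max_changes


print(street_name_acceptable("Clapam Common North Side", "Clapham Common North Side"))
print(street_name_acceptable("Clpam Common North Side", "Clapham Common North Side"))
```

The first call passes because only one character is missing; the second fails because two characters would need to be inserted.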

An additional set of rules is built into the code and applies to all countries. These rules are independent of the dataset:

  • Elements must occur in a predefined expected order (the order is defined in each dataset).
  • All numbers appearing before the place elements in the input address must be matched.
  • Numbers must associate correctly with the accompanying elements.
  • At least one place element must be found in the input address.

The information bits returned with an address indicate the rules that have been enforced.

Unmatched input text

When using one of the datasets listed above, Batch API will downgrade match success scores and confidence levels if it is unable to match some of the text in the input address. This prevents misleading high-confidence matches to address records that share only part of their text with the input address.

Consider using Batch API with United Kingdom or AddressBase® Premium data with the following input address:

Quick and Speedy Dry Cleaning Ltd, 2-3 Clapham Common North Side, London, SW4 0QL

Part of that input address would match to part of the following, postally correct address:

Experian Ltd, George West House, 2-3 Clapham Common North Side, London, SW4 0QL

When using other datasets, the equivalent result would be a full address match with intermediate confidence. When using one of the datasets listed above, however, the match is not used because of the amount of unmatched input text. Instead, Batch API returns a result similar to this:

George West House
2-3 Clapham Common North Side London
SW4 0QL

The result is a partial address match which only includes elements from the input address that can be matched with high confidence.
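The idea of measuring unmatched input text can be sketched with a simple token comparison. This is not how Batch API works internally: the tokeniser, the function names, and the `max_unmatched_ratio` threshold of 0.25 are all assumptions chosen only to make the example concrete.

```python
import re


def tokens(text: str) -> list[str]:
    """Lower-case alphanumeric tokens, keeping hyphenated ranges such as 2-3."""
    return re.findall(r"[a-z0-9-]+", text.lower())


def unmatched_tokens(input_address: str, candidate: str) -> list[str]:
    """Input tokens that do not appear anywhere in the candidate address."""
    candidate_set = set(tokens(candidate))
    return [t for t in tokens(input_address) if t not in candidate_set]


def should_downgrade(input_address: str, candidate: str,
                     max_unmatched_ratio: float = 0.25) -> bool:
    """True if too much of the input is unaccounted for (hypothetical threshold)."""
    input_tokens = tokens(input_address)
    if not input_tokens:
        return False
    unmatched = unmatched_tokens(input_address, candidate)
    return len(unmatched) / len(input_tokens) > max_unmatched_ratio


supplied = ("Quick and Speedy Dry Cleaning Ltd, 2-3 Clapham Common North Side, "
            "London, SW4 0QL")
record = "Experian Ltd, George West House 2-3 Clapham Common North Side London SW4 0QL"

print(unmatched_tokens(supplied, record))
print(should_downgrade(supplied, record))
```

For the addresses above, the organisation name in the input is almost entirely unmatched, which is the kind of signal that would lead to a partial match instead of a full one.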

Additional dataset-specific matching rules

Because Batch API is a multi-country address matching product, additional rules are tailored to each dataset and embedded in its data.