The Analyze blocking keys section shows you the efficiency and effectiveness of your blocking keys and suggests potential performance improvements.
To begin analyzing your blocking key configuration, click the Analyze button. This process can take several minutes for large datasets.
Once the analysis is complete, you will see a table containing a summary of the performance of each of your blocking keys.
To determine which blocking keys are the best candidates for optimization, we suggest looking at the ratio of cost to matches not found by other keys, starting with the largest values. If a blocking key generates a large number of score pairs but few score pairs that are not found by other blocking keys, it may be worth making that blocking key more specific. Note that the relationship between the number of score pairs and overall processing time is not linear, so reducing the number of score pairs by 50% is unlikely to result in a 50% drop in processing time.
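As a rough illustration of that prioritization, the sketch below ranks keys by such a ratio. It assumes the ratio is the total number of score pairs divided by the number of score pairs not found by any other key. The field names (`score_pairs`, `unique_score_pairs`), the second key, and all of the figures are hypothetical and are not taken from the product's summary table.

```python
def rank_by_cost_ratio(key_stats):
    """Return (description, ratio) pairs sorted by ratio, highest first.

    Assumption: ratio = total score pairs / score pairs not found by
    any other key. The field names are hypothetical.
    """
    ranked = []
    for stats in key_stats:
        unique = stats["unique_score_pairs"]
        # A key that finds nothing unique is all cost and no benefit.
        ratio = stats["score_pairs"] / unique if unique else float("inf")
        ranked.append((stats["description"], ratio))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Made-up figures, for illustration only.
example_stats = [
    {"description": "Surname+DOB", "score_pairs": 40_000, "unique_score_pairs": 2_000},
    {"description": "FullPostcode", "score_pairs": 900_000, "unique_score_pairs": 300},
]

for description, ratio in rank_by_cost_ratio(example_stats):
    print(f"{description}: {ratio:.0f}")
```

With these made-up figures, FullPostcode sorts first, so it would be the first key to consider making more specific.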
The possible improvement section lists changes to your existing blocking keys to make them more efficient. Each blocking key suggestion has a Description column, which describes the change made to the original blocking key, a Definition column, which contains the modified version of the blocking key, and a further four columns, which show the effectiveness and efficiency of the suggested blocking key.
To apply a suggested blocking key change, replace the existing elementSpecifications value for the specified blocking key with the suggestion given in the Definition column. For example, if your blocking keys contain the following:
{
  "description": "FullPostcode",
  "elementSpecifications": [
    {
      "elementType": "POSTCODE",
      "elementModifiers": [
        "STANDARDSPELLING"
      ],
      "includeFromNChars": 5,
      "truncateToNChars": 7
    }
  ]
}
And you want to apply the following suggestion to add surname to your FullPostcode key:
[
  {
    "elementType": "POSTCODE",
    "elementModifiers": [
      "STANDARDSPELLING"
    ],
    "includeFromNChars": 5,
    "truncateToNChars": 7
  },
  {
    "elementType": "SURNAME",
    "algorithm": {
      "name": "DOUBLE_METAPHONE"
    },
    "includeFromNChars": 1,
    "truncateToNChars": 10
  }
]
Then you would replace the value of elementSpecifications in your FullPostcode key with the Definition given and update the description to reflect the change:
{
  "description": "FullPostcode+Surname",
  "elementSpecifications": [
    {
      "elementType": "POSTCODE",
      "elementModifiers": [
        "STANDARDSPELLING"
      ],
      "includeFromNChars": 5,
      "truncateToNChars": 7
    },
    {
      "elementType": "SURNAME",
      "algorithm": {
        "name": "DOUBLE_METAPHONE"
      },
      "includeFromNChars": 1,
      "truncateToNChars": 10
    }
  ]
}
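If you maintain your configuration in a script rather than editing it by hand, the same replacement could be sketched as follows. This assumes your blocking keys are held as a list of objects shaped like the examples above. The helper `apply_suggestion` is hypothetical and is not part of the product.

```python
import copy


def apply_suggestion(blocking_keys, description, definition, new_description=None):
    """Return a copy of blocking_keys with one key's elementSpecifications replaced.

    Hypothetical helper: assumes blocking_keys is a list of dicts, each with
    "description" and "elementSpecifications" entries.
    """
    updated = copy.deepcopy(blocking_keys)
    for key in updated:
        if key.get("description") == description:
            # Swap in the suggested Definition wholesale.
            key["elementSpecifications"] = copy.deepcopy(definition)
            if new_description is not None:
                key["description"] = new_description
            return updated
    raise KeyError(f"No blocking key with description {description!r}")


postcode_spec = {
    "elementType": "POSTCODE",
    "elementModifiers": ["STANDARDSPELLING"],
    "includeFromNChars": 5,
    "truncateToNChars": 7,
}
surname_spec = {
    "elementType": "SURNAME",
    "algorithm": {"name": "DOUBLE_METAPHONE"},
    "includeFromNChars": 1,
    "truncateToNChars": 10,
}

blocking_keys = [
    {"description": "FullPostcode", "elementSpecifications": [postcode_spec]},
]
# The suggestion, as it would be copied from the Definition column.
definition = [postcode_spec, surname_spec]

updated_keys = apply_suggestion(
    blocking_keys, "FullPostcode", definition, new_description="FullPostcode+Surname"
)
```

The helper returns a modified copy and leaves the original list untouched, so you can compare the two before saving the change.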