# Models for Merging Passes and Creating Linked Datasets

Limited Bayesian Model

Set Cutoff Probability = 0.90 or some other high value in the Specify Match dialog. Set Linked Data Sets = 1 on the Merge Passes tab. LinkSolv uses your estimates of total matches, error probabilities, and frequencies of data values in all probability calculations. Every candidate pair with a match probability above the specified cutoff is accepted as a linked pair and assigned Keep Status = LP.

Take All Pairs

Take 1-1 Pairs, Take Max Pairs (new label for Take 1-1 Pairs), Draw 1-1 Pairs, or Take LSAP Pairs

Use one of these methods if you expect most true links to be one-to-one rather than many-to-many. Set Cutoff Probability = 0.01 or some other low value so that almost all true links are over the cutoff. On the Merge Passes tab, set Pairs to Analyze to one of these methods. Set other merge parameters following the guidelines for Take All Pairs. All of these methods group many-to-many pairs into sets of pairs that share records: if two pairs have the same record from table A or the same record from table B, they are assigned to the same set. The methods differ in how one-to-one pairs are selected from each set, but in every case the selected one-to-one pairs are assigned Keep Status = LP and all other pairs are assigned Keep Status = IP.
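The grouping rule described above can be sketched in Python. This is only an illustration, not LinkSolv's own code: the function name `group_pairs_into_sets`, the `(record_a, record_b, probability)` tuple layout, and the sample values are all assumptions made for the example. It uses a union-find structure so that pairs connected through any chain of shared records end up in the same set.

```python
from collections import defaultdict

def group_pairs_into_sets(pairs):
    """Group candidate pairs so that pairs sharing a record from table A
    or from table B fall into the same set (hypothetical helper)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x

    # Tag record IDs with their table so an A record never collides with a B record.
    for rec_a, rec_b, _prob in pairs:
        union(("A", rec_a), ("B", rec_b))

    sets = defaultdict(list)
    for pair in pairs:
        sets[find(("A", pair[0]))].append(pair)
    return list(sets.values())

# Made-up candidate pairs: (record from A, record from B, match probability).
pairs = [
    ("A1", "B1", 0.95),
    ("A1", "B2", 0.40),  # shares A1 with the first pair
    ("A2", "B2", 0.80),  # shares B2 with the second pair
    ("A3", "B3", 0.99),  # shares nothing, so it forms its own set
]
groups = group_pairs_into_sets(pairs)
for group in groups:
    print(group)
```

In this sample the first three pairs form one set because they are linked through A1 and B2, and the last pair is already one-to-one.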

Take Max Pairs includes the pair with maximum probability from each set in the one-to-one linkage and excludes competing pairs, that is, pairs with lower probabilities that share a record with the included pair. The process repeats on the remaining pairs until all pairs in all sets have Keep Status = LP or IP.
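A minimal sketch of this selection for a single set, assuming the same hypothetical `(record_a, record_b, probability)` tuples; the function name `take_max_pairs` and the handling of ties (whichever pair sorts first wins) are assumptions, not documented LinkSolv behavior. Sorting once by descending probability and skipping any pair whose records are already used gives the same result as repeatedly taking the maximum and excluding its competitors.

```python
def take_max_pairs(pair_set):
    """Assign Keep Status to every pair in one set (hypothetical helper).

    Returns a dict mapping (record_a, record_b) to "LP" or "IP".
    """
    status = {}
    used_a, used_b = set(), set()
    # Visit pairs from highest to lowest probability.
    for rec_a, rec_b, _prob in sorted(pair_set, key=lambda p: p[2], reverse=True):
        if rec_a in used_a or rec_b in used_b:
            # A higher-probability pair already claimed one of these records.
            status[(rec_a, rec_b)] = "IP"
        else:
            status[(rec_a, rec_b)] = "LP"
            used_a.add(rec_a)
            used_b.add(rec_b)
    return status

# Made-up set of competing pairs.
pair_set = [
    ("A1", "B1", 0.95),
    ("A1", "B2", 0.40),
    ("A2", "B2", 0.80),
]
status = take_max_pairs(pair_set)
print(status)
```

Here (A1, B1) is taken first, (A2, B2) is taken next because neither of its records is used yet, and (A1, B2) is excluded because it competes with both.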

Draw 1-1 Pairs extends the original Bayesian model so that one-to-one pairs are drawn as part of the overall probability model. This method is preferred by theorists, particularly if you plan to analyze multiple imputations using IVEware or SAS PROC MIANALYZE. LinkSolv calculates the probability of each one-to-one permutation of records in each set. For example, if a set includes records A1 and A2 from table A and records B1 and B2 from table B, then the one-to-one permutations are (A1, B1; A2, B2) and (A1, B2; A2, B1), and either one might be drawn in any given iteration.
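The A1/A2/B1/B2 example can be illustrated with a short sketch. The text does not state how the probability of a permutation is calculated, so the rule used here (weight each permutation by the product of its pair probabilities, then normalize) is purely an assumption for illustration, as are the function names and the pair probabilities. The sketch also assumes the set has the same number of records from each table.

```python
import itertools
import random

def one_to_one_permutations(records_a, records_b):
    """List every one-to-one pairing of records_a with records_b
    (assumes both lists have the same length)."""
    return [tuple(zip(records_a, perm))
            for perm in itertools.permutations(records_b)]

def permutation_probabilities(perms, pair_prob):
    """ASSUMED rule: weight = product of pair probabilities, normalized to sum to 1."""
    weights = []
    for perm in perms:
        weight = 1.0
        for pair in perm:
            weight *= pair_prob[pair]
        weights.append(weight)
    total = sum(weights)
    return [w / total for w in weights]

# Made-up match probabilities for the four candidate pairs in the set.
pair_prob = {
    ("A1", "B1"): 0.9, ("A1", "B2"): 0.2,
    ("A2", "B1"): 0.3, ("A2", "B2"): 0.8,
}
perms = one_to_one_permutations(["A1", "A2"], ["B1", "B2"])
probs = permutation_probabilities(perms, pair_prob)

# Draw one permutation for this iteration; a fixed seed makes the run repeatable.
rng = random.Random(0)
drawn = rng.choices(perms, weights=probs, k=1)[0]
print(perms, probs, drawn)
```

Because the permutation is drawn rather than fixed, repeated iterations can yield different one-to-one linkages, which is what makes the method suitable for producing multiple imputations.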

Take LSAP Pairs treats selecting linked pairs as a Linear Sum Assignment Problem (LSAP). Using the probabilities of each one-to-one permutation calculated for Draw 1-1 Pairs, Take LSAP Pairs takes the one-to-one permutation that maximizes the sum of match weights; this is also the permutation with maximum probability.
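As a rough sketch of the assignment step, the code below finds the permutation with the largest sum of match weights by brute force. The weight matrix is invented for the example, the function name `best_assignment` is hypothetical, and brute force is only practical for small sets; it is not a description of the solver LinkSolv uses. For larger sets a dedicated solver such as `scipy.optimize.linear_sum_assignment` with `maximize=True` would be the usual choice.

```python
import itertools

def best_assignment(weights):
    """Return (perm, total) where perm[i] is the column (table B record)
    assigned to row i (table A record) and total is the maximal weight sum.
    Brute force over all permutations; assumes a square matrix."""
    n = len(weights)
    best_perm, best_total = None, float("-inf")
    for perm in itertools.permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if total > best_total:
            best_perm, best_total = perm, total
    return best_perm, best_total

# Made-up match weights: rows are A1..A3, columns are B1..B3.
weights = [
    [5.0, 4.0, 0.5],
    [4.5, 1.0, 0.2],
    [0.3, 0.4, 3.0],
]
best_perm, best_total = best_assignment(weights)
print(best_perm, best_total)
```

In this made-up matrix the best assignment links A1 to B2, A2 to B1, and A3 to B3. Taking the single largest weight first (A1 to B1) would force A2 onto a much weaker pair, so optimizing the whole set at once can give a different answer from a highest-first selection.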

Best Pairs

LinkSolv ranks pairs in sets by probability from greatest to least and then selects as many pairs from the top of the list as possible given your specified False Positive Rate. Remember that Match Probability = 0.9 means that 9 out of 10 such links are true and 1 out of 10 is false. So, each 0.9 link in the list contributes 0.9 links toward Expected True Positives and 0.1 links toward Expected False Positives, and similarly for other probabilities. For a given sample of pairs from the top, LinkSolv estimates False Positive Rate = Expected False Positives / (Expected False Positives + Expected True Positives).
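The False Positive Rate calculation above can be worked through in a few lines. This is a sketch under stated assumptions: the function name `select_best_pairs`, the input format (a plain list of match probabilities), and the sample values are hypothetical, and the stopping rule (keep the longest ranked prefix whose estimated rate does not exceed the target) is one reading of "as many pairs from the top of the list as possible".

```python
def select_best_pairs(probabilities, max_false_positive_rate):
    """Return the top-ranked probabilities whose estimated False Positive
    Rate stays at or below max_false_positive_rate (hypothetical helper)."""
    ranked = sorted(probabilities, reverse=True)
    expected_tp = 0.0  # Expected True Positives
    expected_fp = 0.0  # Expected False Positives
    n_selected = 0
    for count, prob in enumerate(ranked, start=1):
        expected_tp += prob        # a 0.9 link adds 0.9 expected true links
        expected_fp += 1.0 - prob  # and 0.1 expected false links
        rate = expected_fp / (expected_fp + expected_tp)
        if rate > max_false_positive_rate:
            break  # the rate cannot fall again because probabilities only decrease
        n_selected = count
    return ranked[:n_selected]

# Made-up match probabilities and a 10% target rate.
selected = select_best_pairs([0.6, 0.99, 0.3, 0.95, 0.9], 0.10)
print(selected)
```

With these values the top three pairs give an estimated rate of about 0.16 / 3, roughly 0.053, while adding the 0.6 pair would raise it to 0.56 / 4 = 0.14, so only the first three are selected.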