As I understand it, the BY statement splits the comparison: PROC COMPARE produces a separate comparison for each BY group in the two data sets, much as if you had run a separate PROC COMPARE for each group.
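A minimal sketch of the BY case, assuming two made-up data sets `work.base` and `work.comp` with a grouping variable `region` (all names and values here are hypothetical, chosen only for illustration):

```sas
/* Hypothetical sample data; names and values are invented for this sketch */
data work.base;
   input region $ amount;
   datalines;
East 10
East 20
West 30
;
run;

data work.comp;
   input region $ amount;
   datalines;
East 10
East 25
West 30
;
run;

/* BY-group processing expects both data sets sorted by the BY variable */
proc sort data=work.base; by region; run;
proc sort data=work.comp; by region; run;

/* One comparison is reported per value of region */
proc compare base=work.base compare=work.comp;
   by region;
run;
```

Within each BY group the observations are still paired by position, so this suits data that is grouped but has no unique key.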
The ID statement, on the other hand, matches observations between the two data sets by key and reports how many observations they have in common and how many appear in one data set but not in the other. You would use it if your data sets share a primary key, that is, a combination of variables that uniquely identifies each record, and you want PROC COMPARE to take each matching pair and compare them.
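A minimal sketch of the ID case, assuming a made-up key variable `cust_id` that uniquely identifies each observation (data set names, variable names, and values are hypothetical):

```sas
/* Hypothetical sample data; cust_id is assumed to be a unique key */
data work.base;
   input cust_id amount;
   datalines;
1 10
2 20
3 30
;
run;

data work.comp;
   input cust_id amount;
   datalines;
1 10
2 25
4 40
;
run;

/* Sort both data sets by the ID variable before comparing */
proc sort data=work.base; by cust_id; run;
proc sort data=work.comp; by cust_id; run;

/* Observations are paired on cust_id rather than by position; */
/* LISTALL also lists observations found in only one data set  */
proc compare base=work.base compare=work.comp listall;
   id cust_id;
run;
```

With this data I would expect the report to show a differing `amount` for `cust_id` 2, and to list `cust_id` 3 and `cust_id` 4 as each present in only one of the two data sets.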