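A similarity coefficient, often used in cluster analysis, for data consisting of a series of binary variables. The coefficient is given by $s_{ij} = a/(a + b + c)$, where a, b and c are three of the four frequencies in the 2 × 2 cross-classification of the variable values for subjects i and j: a is the number of variables on which both subjects take the value 1, and b and c are the numbers of variables on which only one of the two subjects does. The fourth frequency, d (the number of variables on which both subjects take the value 0), does not enter the coefficient.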
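
As an illustration, the coefficient can be computed directly from two binary profiles. The following is a minimal sketch rather than a reference implementation. It assumes the variables are coded 0/1, that a counts the variables on which both subjects score 1, and that b and c count the variables on which only one of them does. The function name `binary_similarity`, the example data, and the convention of returning 0 when a + b + c = 0 are all assumptions made for this example.

```python
def binary_similarity(x, y):
    """Return a / (a + b + c) for two equal-length sequences of 0/1 values."""
    if len(x) != len(y):
        raise ValueError("both subjects need the same number of variables")

    # Cells of the 2 x 2 cross-classification (d, the 0/0 cell, is not needed).
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)  # both score 1
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)  # only subject i scores 1
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)  # only subject j scores 1

    if a + b + c == 0:
        # Assumed convention: neither subject scores 1 on any variable.
        return 0.0
    return a / (a + b + c)


# Hypothetical profiles for subjects i and j on six binary variables.
subject_i = [1, 1, 0, 0, 1, 0]
subject_j = [1, 0, 0, 1, 1, 0]

print(binary_similarity(subject_i, subject_j))  # a = 2, b = 1, c = 1, so 0.5
```

Here the two variables on which both subjects score 0 have no effect on the result, which is the practical consequence of leaving the fourth frequency out of the denominator.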