Motivation: Structural knowledge, extracted from the Protein Data Bank (PDB), underlies numerous potential functions and prediction methods. The PDB, however, is highly biased: many proteins have more than one entry, while entire protein families are represented by a single structure, or even not at all. The standard solution to this problem is to limit the studies to non-redundant subsets of the PDB.
View Article and Find Full Text PDF