Added a metric for age brackets (`n_off_accuracy`), as well as
binary/multiclass false-{positive,negative} rates (`binary_fpr_fnr`
and `multiclass_fpr_fnr`). Also added mutual agreement metrics between
models (`agreement_fraction` and `agreement_elements`).
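
For orientation, here is a minimal sketch of what three of these metrics might compute. Only the function names come from this change; the signatures, the integer encoding of age brackets, and the exact definitions (for example, that "n-off" means "within `n` brackets of the true one") are assumptions made for illustration, not the actual implementation.

```python
from typing import Hashable, Sequence, Tuple


def n_off_accuracy(y_true: Sequence[int], y_pred: Sequence[int], n: int = 1) -> float:
    """Fraction of predictions at most `n` brackets away from the truth.

    Assumes brackets are encoded as consecutive integer indices.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    hits = sum(abs(t - p) <= n for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def binary_fpr_fnr(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable], positive: Hashable = 1
) -> Tuple[float, float]:
    """Return (false-positive rate, false-negative rate) for one positive label."""
    pairs = list(zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    negatives = sum(t != positive for t, _ in pairs)
    positives = sum(t == positive for t, _ in pairs)
    # Rates are reported as 0.0 when the denominator is empty (an assumption).
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def agreement_fraction(preds_a: Sequence[Hashable], preds_b: Sequence[Hashable]) -> float:
    """Fraction of samples on which two models predict the same value."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both prediction lists must have the same length")
    if not preds_a:
        return 0.0
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)
```
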
New `possible_capability_values` base method for estimators. The goal
is to return a list of possible values for each non-numerical
capability, as some performance metrics may benefit from this complete
listing.
Additionally, added a new exception to indicate when a capability is
invalid in a given context, e.g., when the model does not support it.
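
A rough sketch of how the base method and the exception could fit together is shown below. Only `possible_capability_values` is a name from this change; the class names `Estimator`, `ExampleEstimator`, and `InvalidCapabilityError`, the `check_capability` helper, the dict-of-lists return shape, and the example capabilities are all hypothetical.

```python
from typing import Dict, List


class InvalidCapabilityError(ValueError):
    """Hypothetical exception: the capability is not valid in this context."""


class Estimator:
    """Hypothetical base class for estimators."""

    def possible_capability_values(self) -> Dict[str, List[str]]:
        # Default: the estimator declares no non-numerical capabilities.
        return {}

    def check_capability(self, capability: str) -> None:
        # Hypothetical helper: reject capabilities the model does not support.
        if capability not in self.possible_capability_values():
            raise InvalidCapabilityError(
                f"{type(self).__name__} does not support capability {capability!r}"
            )


class ExampleEstimator(Estimator):
    """Hypothetical estimator listing every value of each categorical capability."""

    def possible_capability_values(self) -> Dict[str, List[str]]:
        return {
            "gender": ["female", "male"],
            "age_bracket": ["0-17", "18-64", "65+"],
        }
```

With a complete listing like this, a metric can account for classes that never appear in a particular batch of predictions instead of inferring the label set from the data.
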
The rationale for using `csvfile` instead of `pandas` directly was to
avoid a fairly heavy dependency, since we were only reading the CSV
data. Since we now need some fairly convoluted filtering to calculate
the subgroup metrics, it's better to use pandas.
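
As an illustration of why pandas helps here, the sketch below computes a per-subgroup accuracy from CSV text in a few lines. The function name `subgroup_accuracy` and the column names (`gender`, `label`, `prediction`) are hypothetical and do not reflect the actual data layout.

```python
import io
from typing import Dict

import pandas as pd


def subgroup_accuracy(csv_text: str, group_column: str = "gender") -> Dict[str, float]:
    """Accuracy per subgroup, read from CSV text with hypothetical columns."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Mark each row as correct or not, then average within each subgroup.
    df["correct"] = df["label"] == df["prediction"]
    return df.groupby(group_column)["correct"].mean().to_dict()


example_csv = """gender,label,prediction
female,1,1
female,0,1
male,1,1
male,0,0
"""

print(subgroup_accuracy(example_csv))
```

Doing the same with a plain CSV reader would mean hand-writing the grouping and counting loops for every subgroup combination.
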