The issue of distinguishing between the same-source and different-source
hypotheses based on various types of traces is a generic problem in forensic
science. This problem is often tackled with Bayesian approaches, which are able
to provide a likelihood ratio that quantifies the relative strengths of
evidence supporting each of the two competing hypotheses. Here, we focus on
distance-based approaches, which differ markedly in robustness, and in
particular in their capacity to deal with high-dimensional evidence, and which
therefore need to be evaluated and optimized. We present a unified framework
for direct methods, which estimate the likelihoods of the distance between
traces under each of the two competing hypotheses, and indirect methods, which
use logistic regression to discriminate between same-source and
different-source distance distributions. Whilst direct methods are more
flexible, indirect methods are
more robust and quite natural in machine learning. Moreover, indirect methods
also enable the use of a vectorial distance, thus preventing the severe
information loss suffered by scalar distance approaches. Direct and indirect
methods are compared in terms of sensitivity, specificity and robustness, with
and without dimensionality reduction, with and without feature selection, on
the example of hand odor profiles, a novel and challenging type of evidence in
the field of forensics. Empirical evaluations on a large panel of 534 subjects
and their 1690 odor traces show the significant superiority of the indirect
methods, especially without dimensionality reduction, be it with or without
feature selection.
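
The contrast between the two families can be illustrated with a minimal
sketch. Everything in it is an assumption made for illustration: the data are
synthetic (not hand odor profiles), the direct method uses a hand-written
Gaussian kernel density estimate on a Euclidean scalar distance, the indirect
method uses plain gradient-descent logistic regression on the component-wise
absolute difference, and all names and parameter values are hypothetical. It
is not the estimators or settings used in the study.

```python
import math
import random

random.seed(0)
P = 5          # features per trace (hypothetical)
N_PAIRS = 100  # training pairs per hypothesis (hypothetical)

def simulate_pairs(sigma, n):
    """Synthetic vectorial distances: component-wise absolute differences."""
    return [[abs(random.gauss(0.0, sigma)) for _ in range(P)] for _ in range(n)]

same_vecs = simulate_pairs(0.5, N_PAIRS)  # same-source pairs: small differences
diff_vecs = simulate_pairs(2.0, N_PAIRS)  # different-source pairs: larger ones

def scalar_distance(vec):
    """Collapse a vectorial distance into a single Euclidean distance."""
    return math.sqrt(sum(v * v for v in vec))

same_scalars = [scalar_distance(v) for v in same_vecs]
diff_scalars = [scalar_distance(v) for v in diff_vecs]

def kde_pdf(x, samples, bandwidth):
    """Gaussian kernel density estimate of the density at x."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

def direct_log_lr(vec, bandwidth=0.3):
    """Direct method: ratio of the two estimated scalar-distance densities."""
    d = scalar_distance(vec)
    num = kde_pdf(d, same_scalars, bandwidth)
    den = kde_pdf(d, diff_scalars, bandwidth)
    # The tiny constant only guards against log(0) in empty regions.
    return math.log(num + 1e-300) - math.log(den + 1e-300)

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.3, epochs=600):
    """Batch gradient descent on the mean logistic loss; returns (weights, bias)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

weights, bias = fit_logistic(same_vecs + diff_vecs, [1] * N_PAIRS + [0] * N_PAIRS)
# Training prior odds (same : different); zero in log scale for balanced classes.
prior_log_odds = math.log(len(same_vecs) / len(diff_vecs))

def indirect_log_lr(vec):
    """Indirect method: classifier logit on the vectorial distance, minus prior log-odds."""
    logit = bias + sum(wj * xj for wj, xj in zip(weights, vec))
    return logit - prior_log_odds

close_pair = [0.1] * P  # small component-wise differences
far_pair = [2.0] * P    # large component-wise differences
print("direct:  ", direct_log_lr(close_pair), direct_log_lr(far_pair))
print("indirect:", indirect_log_lr(close_pair), indirect_log_lr(far_pair))
```

In this sketch a positive log likelihood ratio supports the same-source
hypothesis and a negative one supports the different-source hypothesis. The
indirect variant keeps one weight per feature, which is how a vectorial
distance can retain information that the scalar distance discards.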