Machine learning has been increasingly used in crime fighting in recent years, from early experiments in predictive policing to the use of facial recognition technology in China. Similar technology has been deployed in airport security, whilst smart image recognition has also allowed CCTV to take on predictive capabilities.
A recent project from researchers at the Forensics and National Security Sciences Institute (FNSSI) has used machine learning to better profile criminals, with a specific focus on the DNA contained in samples.
“There is a massive amount of data that is not being considered, simply due to our limited capability as human beings,” the researchers explain.
Probabilistic Assessment
The researchers have developed a machine-learning-based approach for predicting the number of individuals contributing to mixed DNA samples. The method, which they call the Probabilistic Assessment for Contributor Estimate (PACE), has been licensed to software company NicheVision in the hope that it will be commercialized.
The current approach for separating mixed DNA samples to identify individual genetic information requires analysts to know how many people contributed to the sample, and determining that number is fiendishly difficult.
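To see why the count is so hard to pin down, consider a common rule of thumb known as maximum allele counting. This is not PACE, just a conventional baseline sketched here for illustration. Each person carries at most two alleles at any locus, so the locus with the most distinct alleles sets a floor on the number of contributors. The profile below is hypothetical, and the function name `min_contributors` is an assumption made for this example.

```python
from math import ceil


def min_contributors(profile):
    """Lower bound on contributors: a person has at most two alleles per locus."""
    most_alleles = max(len(set(alleles)) for alleles in profile.values())
    return ceil(most_alleles / 2)


# Hypothetical mixed profile: locus name -> alleles observed (illustrative values).
mixture = {
    "D8S1179": [12, 13, 14],
    "TH01": [6, 9.3],
    "FGA": [20, 21, 22, 24, 25],
}

print(min_contributors(mixture))  # prints 3: five alleles at FGA need at least three people
```

The result is only a floor. Contributors often share alleles, so a four-person mixture can easily look like a three-person one, which is the gap a learned model tries to close.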
The researchers used machine learning to train computers to solve such problems on their own, and in double-quick time. There is ample data with which to train the algorithms, which makes forensic science a good proving ground for machine learning.
Training data
The researchers trained the algorithm on a huge volume of data obtained from the New York City Office of the Chief Medical Examiner and the Onondaga County Center for Forensic Sciences, and PACE’s predictions improved steadily as training progressed. It was eventually tested on mixed samples with known numbers of contributors, and it identified those numbers with flying colors.
Indeed, PACE was so good that it outperformed a human expert by 6% on a three-person sample, with that figure rising to 20% when the sample contained four people’s DNA. As with most other algorithmic approaches, it can also do in seconds what would normally take several hours using current methods. It’s a result that the researchers believe represents a huge leap forward in the way DNA is analyzed.
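The train-then-test workflow described in this section can be sketched in a few lines. What follows is a toy illustration and not the PACE algorithm: the data is simulated rather than drawn from casework, the single feature (total distinct alleles across loci) and the nearest-centroid classifier are stand-ins chosen for brevity, and every name and constant is an assumption made for the example.

```python
import random

LOCI, ALLELES = 10, 10  # assumed sizes for the synthetic data, not from the study


def simulate(n_people, rng):
    """Total distinct alleles seen across all loci for a mixture of n_people."""
    total = 0
    for _ in range(LOCI):
        # Each person contributes two alleles per locus, drawn uniformly here.
        seen = {rng.randrange(ALLELES) for _ in range(2 * n_people)}
        total += len(seen)
    return total


def make_samples(per_class, rng):
    """Build (feature, known contributor count) pairs for 1 to 4 contributors."""
    return [(simulate(n, rng), n) for n in (1, 2, 3, 4) for _ in range(per_class)]


def train(samples):
    """Learn one centroid (mean feature value) per contributor count."""
    sums, counts = {}, {}
    for feature, label in samples:
        sums[label] = sums.get(label, 0) + feature
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, feature):
    """Return the contributor count whose centroid is closest to the feature."""
    return min(centroids, key=lambda label: abs(centroids[label] - feature))


rng = random.Random(0)  # fixed seed so the run is repeatable
training = make_samples(200, rng)
held_out = make_samples(50, rng)  # known answers, never seen during training

centroids = train(training)
correct = sum(predict(centroids, feature) == n for feature, n in held_out)
accuracy = correct / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

Scoring the model only on held-out samples with known contributor counts mirrors the evaluation described above: the answer is known in advance, so every prediction can be checked.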
“Incremental improvements happen in technology development all the time, but this could completely change how the problem of ‘deconvoluting’ mixed samples is solved,” they say. “It looks like disruptive technology.”