We ran a number of sessions at the recent ICAC (Internet Crimes Against Children) taskforce event in Atlanta, GA, and before our main presentation we polled attendees to see which data sources they want to match against when running triage for CSAM.
These are the responses we received:
On average participants selected three options.
Historically most police agencies have had limited choices of data to power triage. In practice most used a hash list supplied by a national agency or their tool vendor. It's therefore unsurprising to see Project VIC at the top of the list.
It's also not surprising to see that images from the Cybertip under investigation are a popular choice.
I'm delighted to see that more than half of respondents also recognised value in data from other ICACs, and more than a third wanted to use their own historic data and data from international sources.
Cyacomb customers now have access to around 10 major Contraband Filters produced by leading data owners including Project VIC, Interpol, IWF and numerous US ICACs. Some of these organisations will also provide hash lists to law enforcement, but at least half do not routinely do so. This is usually because of concerns around the privacy and security of hashes, especially in the UK and EU where (whether you agree on this point or not) they are regarded as personally identifying information of victims and offenders. Contraband Filter technology is secure by design, and so data owners are confident sharing Contraband Filters where they would not be able to share hash lists for use with other tools.
It is also possible to use Cybertip data with Cyacomb's tools (get in touch if you need help doing this) and that's something we plan to make even simpler in the future.
I'm delighted to see this result, because too often I hear people stating confidently that ‘you should use the biggest authoritative dataset you can (e.g. Project VIC) and you don't need anything else - any smaller dataset will be included in the bigger one...’
We know that's not true. And here’s why...
While Contraband Filters are secure by design (neither we nor anyone else can see the contents), we are able to draw some inferences about how much overlap there is in the content represented by the filters. It's not an exact science, but we can say with a high degree of confidence that while some of the largest filters might overlap by as much as a third of their content, most overlap by far less.
Some of the uniqueness might be from images that look the same but have been slightly altered, some from images that have been graded differently, and some from corrupt or incompletely forensically recovered images. Some will come from images that are new and have been seen only once or on a handful of cases, so they have not yet made their way into all datasets. Some uniqueness may also be due to regional variations in what is circulating.
Cyacomb Examiner and Examiner Plus both allow users to load multiple Contraband Filters at once very simply and with almost no impact on performance.
Practical experience tells us that the uniqueness in each Contraband Filter adds value, with users regularly telling us that they get important results from different sources when using more than one.
This means that every additional Contraband Filter used is adding to the probability of finding content at triage.
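To make the overlap argument concrete, here is a minimal sketch of the arithmetic. It is purely illustrative: the contents of real Contraband Filters cannot be inspected, so plain Python sets of made-up identifiers stand in for the data sources, and the function name `coverage_gain` and the sizes and overlaps are assumptions chosen for the example, not measurements.

```python
def coverage_gain(sources):
    """For each source, in order, report how many items it adds that
    earlier sources did not already cover, and the running total."""
    seen = set()
    report = []
    for name, items in sources:
        new_items = len(items - seen)   # items unique to this source so far
        seen |= items                   # combined coverage after adding it
        report.append((name, new_items, len(seen)))
    return report


# Hypothetical stand-in data: integers play the role of distinct images.
source_a = set(range(0, 90))      # 90 items
source_b = set(range(60, 120))    # 60 items, 30 of them shared with source_a
source_c = set(range(110, 130))   # 20 items, 10 of them shared with source_b

for name, new, total in coverage_gain(
    [("A", source_a), ("B", source_b), ("C", source_c)]
):
    print(f"{name}: +{new} new, {total} total")
```

Even in this toy case, where the second source shares half of its content with the first, adding it still grows the combined coverage by a third. A smaller source only adds nothing when it is wholly contained in a larger one, which is not what we see in practice.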
Today most data exists in silos, some larger than others, but none of them complete. The challenges in resolving this are huge. It would require international co-operation not just on matters of privacy and data protection, but also on the different classification schemes and legal thresholds, as well as technical implementation and cybersecurity.
Until we find a solution to that problem, Cyacomb's Contraband Filter technology helps get one step closer to the ideal of the best possible data being available wherever it is needed.
No harm can come from using multiple sources, so why not make the most of our ever-expanding CSAM data ecosystem?