In collaboration with Dr James Walkerdine & Dr Phil Greenwood of Relative Insight and Dr Paul Rayson & Dr Alistair Baron of the University of Lancaster.
Big Data linguistic research is pioneering a new field in online child protection called Digital Persona Analysis (DPA), automating the process of detecting sexual predators online who masquerade as children.
The research, accredited to the Global Uncertainties Programme (predecessor to PaCCS), was led by Professor Awais Rashid of the University of Lancaster and funded by the Engineering and Physical Sciences Research Council (EPSRC) and the Economic and Social Research Council (ESRC).
More children than ever are using social media and social networking sites, increasing the number of children at risk from sexual predators. Online child protection is now a key concern.
DPA analyses vast quantities of data at a higher level, reporting individual personas and behaviours to investigators. It exploits the “Isis Toolkit” (part of a larger Isis project), which aimed to detect criminals who hid behind multiple identities (mainly adults posing as children). This research built on internationally leading work on corpus comparison techniques that used statistical natural language analysis to look at the conversational behaviour of the British population in the 1990s. Most of the research used semantic analysis, in which keywords are characterised by their contextual information. This made it possible to operate in the face of noisy language data and deceptive behaviour, enabling the Isis Toolkit to detect masquerading tactics with a high degree of accuracy.
‘The Isis Toolkit can detect, with an accuracy of 94%, when an adult is masquerading as a child compared to children participating in controlled experiments.’
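To give a flavour of what a corpus comparison involves, the sketch below scores how strongly each word is over-represented in one body of text relative to another, using the log-likelihood keyness statistic. This is an illustration only: the source does not say which statistic the Isis Toolkit uses, the function names `log_likelihood` and `keywords` are hypothetical, and the two short token lists are invented toy data rather than anything from the research.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a, b: occurrences of the word in the target and reference corpus.
    c, d: total number of tokens in the target and reference corpus.
    """
    # Expected counts if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # Terms with a zero observed count contribute nothing.
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target_tokens, reference_tokens, top_n=5):
    """Return words over-used in the target corpus, highest keyness first."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    c, d = sum(target.values()), sum(reference.values())
    scores = {}
    for word, a in target.items():
        b = reference[word]  # Counter returns 0 for unseen words
        # Keep only words relatively more frequent in the target corpus.
        if a / c > b / d:
            scores[word] = log_likelihood(a, b, c, d)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy data, invented for illustration only.
target_corpus = "lol lol lol hi".split()
reference_corpus = "hello hi there".split()
print(keywords(target_corpus, reference_corpus))
```

A real system would compare much larger corpora and, as the text notes, would work with semantic categories rather than raw word counts alone.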
This highly interdisciplinary research (combining computer and behavioural science with linguistics) has led to collaboration between departments within the university and a wide range of other communities outside academia. This work has delivered impact in four different areas:
The Isis Toolkit has been licensed by the spin-out company Relative Insight, where it has been successfully incorporated within broader security and child protection commercial offerings. The generalisation of the technology and the scaling up of the analysis capabilities via a cloud-enabled API now allow it to be applied to any type of investigative activity where digital text needs to be analysed. Beyond the security domain, the company has diversified further and uses the technology to help brands and advertising agencies extract insights from digital sources in order to better understand their target consumers. Due to the success of the latter, Relative Insight has grown rapidly and now employs 13 people, with clients including Havas, Saatchi & Saatchi, Ogilvy, Twitter and Microsoft Mobile.
- Law Enforcement
Successful live trials with UK Police Forces and the Child Exploitation & Online Protection Centre (CEOP) have demonstrated the accuracy of the Isis Toolkit on real data sets and reduced the amount of analysis time required.
“[The Isis toolkit] provides the ability to focus analysis on specific information [and] allows investigations to be more focused and therefore potential victims of grooming or contact abuse to be identified more easily”
Quote from the evaluation of the aforementioned live trials.
The Toolkit has also been licensed for use by the Royal Canadian Mounted Police, who see this research as an ‘operational necessity’. Large-scale agreements with other international customers could emerge in the future.
Through developing the Toolkit, researchers have created and delivered internet safety lessons to over 500 students. This has generated strong links with teachers, leading to the development of comprehensive lesson plans on e-safety topics for Key Stages 2-5 that have been rolled out across the region through the South Lakes Teaching School Alliance (SLTSA). The Isis Toolkit has also been deployed worldwide through the release of the free iTunes app ChildDefence, which empowers children to protect themselves online. This was built upon in 2014, when it formed the basis of one of the WeProtect projects, an online child protection initiative set up by Prime Minister David Cameron.
- Internet Governance
The research led to a policy paper prepared for BCS, The Chartered Institute for IT, and presented to Alun Michael MP in 2009. This paper was subsequently selected as the single UK contribution to the 2009 and 2010 Internet Governance Forums (in Sharm El Sheikh and Vilnius respectively). The research also provided written evidence to the Commons Select Committee on Education (2010), and contributed to the Proposal for a Directive of the European Parliament and of the Council on combating the sexual abuse, sexual exploitation of children and child pornography, repealing Framework Decision 2004/68/JHA (COM/2010/0094).
A report requested by the European Parliament’s Committee on Gender Equality used the Isis Toolkit as a case study to support the use of a modified toolkit to help detect and manage cyber coercion and rape of women and girls.
Simply by raising awareness of online protection issues, this research inevitably has an impact on the general public. In the area of online child protection it is crucially important both to influence public policy and to inform the general public.