Text mining to support the evaluation of research grant applications
Olivier Eulaerts
Text Mining & Analysis Competence Centre
DG Joint Research Centre, European Commission
European Commission's science and knowledge service
- Support EU policies with independent evidence throughout the whole policy cycle.
- Contributing to e.g. a healthy and safe environment, secure energy supplies, sustainable mobility and consumer health and safety.
JRC Competence centre on text mining
Support policy makers with text mining tools and services across policy fields.
- Text mining
- Data harvesting, processing and visualisation
- Computational linguistics
- Scientometrics
- IT product development
EMM OSINT Suite
- desktop software application
- find, acquire, extract and analyse information from the Internet and local sources
- contains tools to automate various tasks in the process of gathering intelligence from open sources
Europe Media Monitor
- see, explore and understand current news reported by the world's online media
- monitoring more than 8,000 news sources
- 70 languages
- advanced information extraction techniques
- automatically determines what is being reported, where things are happening, who is involved, and what they said
TIM innovation suite
- tools to explore and map technological development
- bridging patent data, scientific publications data and grant data
- technology monitoring and detection of trends
What is the issue? https://conservationbytes.com/2015/05/04/twenty-tips-for-writing-a-research-proposal/
What is the issue? Duplication of research grants
- Difficulty detecting scientific overlap in research grant applications and grants
- No means to detect grant applications submitted to two or more different funding sources
- No means to spot resubmissions of previously failed applications
What is the issue?
Study in the US: "Funding agencies urged to check for duplicate grants", Nature, January 2013, volume 493.
- Reviewed US grant applications in publicly accessible databases.
- 1,300 applications with potential overlap (out of more than 850,000 applications).
- 167 pairs very similar.
- ~$70 million in overlapping funds may have been awarded over the period 2002-2012.
Europe? No such study. Or is there?
European context
- 28 (fragmented) public funding systems for research in Member States
- Funding for research at international level (fragmented)
- 14 (fragmented) public funding systems in H2020 associated countries
JRC contribution: semantic similarity platform for research grant applications
- To give evaluators the means to compare incoming applications to a corpus of grants and other relevant documents
  - Increase quality of applications
  - Decrease duplication
- To give applicants the possibility to retrieve previous similar grants
  - Increase quality of incoming applications
- To give non-public funding entities the means to better assess the quality of submitted projects
- To support the dialogue between the European Commission and funding agencies on data standards
Using (open) grant data from MS, the Commission, and other sources. (Legal support from the Central IP Service of the Commission.)
Evaluation process
- Application: translation + English text in user interface
- Grants data from funding agencies
- Indexing: index of grants data + other relevant data
- Semantic comparison module + entity matching module
- Semantically similar grants, publications
- Flagging of application/grant pairs with similar applying entities
- Expert evaluation
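The comparison and flagging steps on this slide can be illustrated with a small sketch. It is not the JRC platform's implementation: it assumes a plain TF-IDF bag-of-words model with cosine similarity as a stand-in for the semantic comparison module (which the slides do not describe), and a simple normalised-name overlap as a stand-in for the entity matching module. All function names, grant identifiers, texts and organisation names below are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(corpus):
    """Index a corpus {doc_id: text} as TF-IDF vectors (assumed model)."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in corpus.items()}
    n_docs = len(corpus)
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())
    # Smoothed inverse document frequency, so every weight stays positive.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        doc_id: {t: tf * idf[t] for t, tf in c.items()} for doc_id, c in counts.items()
    }
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(application_text, idf, vectors, top_k=3):
    """Rank indexed documents by similarity to an incoming application."""
    counts = Counter(tokenize(application_text))
    query = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = sorted(
        ((cosine(query, vec), doc_id) for doc_id, vec in vectors.items()),
        reverse=True,
    )
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


def shared_entities(application_entities, grant_entities):
    """Return applying entities that appear in both lists after normalisation."""
    norm = lambda name: " ".join(tokenize(name))
    return sorted({norm(e) for e in application_entities} & {norm(e) for e in grant_entities})


# Hypothetical toy corpus; a real index would hold grants data from funding agencies.
corpus = {
    "G1": "Hydraulic actuators for heavy machinery control",
    "G2": "Nanotechnology coatings for solar cells",
    "G3": "Oxidative catalysis using metalloenzyme models",
}
idf, vectors = build_index(corpus)
ranked = most_similar("Novel hydraulic actuators and control systems", idf, vectors, top_k=2)
flags = shared_entities(
    ["University of Example", "ACME Labs"],
    ["university of example", "Other Institute"],
)
print(ranked[0][0], flags)
```

In this toy run the application ranks closest to the hydraulic actuators grant, and the pair is flagged because one applying entity appears on both sides. A production system would likely need multilingual handling and fuzzier entity matching than exact normalised names.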
Technical feasibility: "It is working on my machine"
- Most similar patents to a grant on hydraulic actuators. (Grant from the National Research, Development and Innovation Office of Hungary.)
- Most similar EU grants to a grant related to nanotechnology. (Grant from FP7.)
- Most similar publications to a proposal on oxidative catalysis using metalloenzyme. (Grant from the National Research, Development and Innovation Office of Hungary.)
What is next?
Now: Finalising proposal before submitting for ISA² funding
- Public funding agencies from Spain (FECYT, SEIDI) and Hungary (NKFIH) ready to take part.
- Looking for 2 additional MS to be represented for the proof-of-concept phase.
2018: Proof-of-concept
- User requirements (MS funding agencies + EC)
- Development of pilot platform + networking
- Evaluation by evaluators
- Go/no-go
(2019-2020: Full deployment)
Thank you for your attention!
On JRC: https://ec.europa.eu/jrc/en
On TIM: www.timanalytics.eu
On EMM: http://emm.newsbrief.eu/newsbrief/clusteredition/en/latest.html
For any inquiry: JRC-TMA-CC@ec.europa.eu or Olivier.eulaerts@ec.europa.eu