WP1 - Web Scraping for Job Vacancy Statistics

Big Data ESSNet CG Meeting, Brussels, 26-27 October 2017
Nigel Swier

Rationale
Current Official Estimates (Survey) / Online data
- Frequency: Quarterly / Real-time?
- Industry Sector
- Enterprise Size
- Job type / skills
- Geography
- National Totals
- More frequent
- More timely
- More granular
- Less burden
- Cheaper???

Participants (BD ESSNet WP1)
- Original Partners: United Kingdom (lead), Germany, Sweden, Slovenia, Italy, Greece
- Joined for SGA-2: France, Belgium, Denmark, Portugal

Summary of Issues
- Complex and dynamic landscape. Many possible routes for accessing OJV data
- Processing OJV data is highly resource intensive
- Fundamental differences between OJV data and JVS concepts
- Incomplete (and unrepresentative) coverage
- Technology Platforms
- Data Science Skills

Review of SGA-2 Objectives

Task 1: Data Access
Objectives:
- To explore the feasibility of web scraping job vacancies from enterprise websites using the approaches developed by WP2.
- To compile a list of URLs linked to enterprise units on the business register.
Review:
- The methods developed by WP2 will only provide limited information. This will not help us deliver experimental outputs by the end of the ESSNet, but it could be of benefit in the longer term.
- UK have developed a framework for building website-specific mini-bots to obtain OJV counts from enterprise websites.
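The slides do not describe how the UK mini-bot framework is built. As a rough illustration of the general idea only, the sketch below keeps a registry of small site-specific extractors, each of which turns one enterprise's careers page into a vacancy count. The domain name, the `mini_bot` decorator, the helper functions and the assumed page layout are all hypothetical, and no page is actually fetched.

```python
from html.parser import HTMLParser
from typing import Callable, Dict

# Registry of site-specific "mini-bots": domain -> function(html) -> vacancy count.
MINI_BOTS: Dict[str, Callable[[str], int]] = {}


def mini_bot(domain: str):
    """Decorator that registers a count extractor for one enterprise website."""
    def register(func):
        MINI_BOTS[domain] = func
        return func
    return register


class _ClassCounter(HTMLParser):
    """Counts start tags that carry a given CSS class."""

    def __init__(self, css_class: str):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_by_class(html: str, css_class: str) -> int:
    parser = _ClassCounter(css_class)
    parser.feed(html)
    return parser.count


@mini_bot("example-enterprise.test")  # hypothetical site
def example_enterprise(html: str) -> int:
    # Assumption: this site lists each vacancy as <li class="job">.
    return count_by_class(html, "job")


def ojv_count(domain: str, html: str) -> int:
    """Dispatch an already-fetched page to the mini-bot for its domain."""
    return MINI_BOTS[domain](html)


sample_page = (
    '<ul><li class="job">Analyst</li>'
    '<li class="job featured">Driver</li>'
    '<li class="ad">Not a vacancy</li></ul>'
)
print(ojv_count("example-enterprise.test", sample_page))  # prints 2
```

Keeping each extractor tiny and registered by domain means a layout change on one site only affects that site's mini-bot.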

Task 2: Data Handling
Objectives:
- If useful information can be extracted from enterprise websites, to develop a method for integrating this with data from job portals.
- To investigate and develop text-mining and machine learning approaches to extract information from unstructured text (e.g. supplementary information for coding/validating occupation, deriving skills, qualifications).
Review:
- Several text mining / classification experiments underway or planned: Greece (ISCO-08), France (FAP/ROME), Belgium (NACE)
- Avoiding overlaps with CEDEFOP

Task 3: Methodology and Technology
Objectives:
- To refine methods to improve the quality of the experimental job vacancy estimates produced during SGA-1, including improvements to linking and handling of jobs advertised on the web.
- To consider which stages of the GSBPM for job vacancy statistics could incorporate these new data sources and methods (e.g. data collection, data integration, non-response adjustment, increased precision).
Review:
- Still a lot of work to do around quality and estimation. This will not be complete by the end of the ESSNet.

Task 4: Statistical Outputs
- To produce improved experimental estimates incorporating additional sources (more details later)
- To produce new experimental statistical products in the domain of job vacancies, e.g. estimates by geography and/or occupation group (more details later)
- To explore whether the findings of this pilot could be used for new applications. For example:
  - Comparing vacancies and associated skills requirements within an area to skills within the local labour market (explored as part of the Eurostat Hackathon)
  - Maintenance of occupation classification and coding frames (Greece, Sweden, France)
  - An input into flash estimates for economic statistics (UK focus, more details later)

Task 5: Future Perspectives
- Hold a two-day workshop for sharing experiences in the field of job vacancy data for official statistics purposes (September/October 2017). Complete!
- Develop and implement a strategy for ongoing engagement and development on the use of web scraped job vacancy data for statistical purposes within the ESS.
- A longer-term roadmap for moving experimental ESSnet research into statistical production.
- Basis for collaboration established with CEDEFOP
- Continue as part of next ESSNet?

WP1 Meeting: Thessaloniki, 21-22 September

CEDEFOP Collaboration
- CEDEFOP project to scrape and process OJV data for all member states
- Completion in 2020 (but with some data available from end-2018)
- Need to avoid duplicating processes / activities
- CEDEFOP have identified similar issues around data quality
- Question: How can the ESS(Net) add value?
- Answer: Expertise in quality and privileged access to JVS micro data

Meeting objectives:
- Fully explore what experimental outputs could be produced by the end of SGA-2
- Integrate new partners (and new people) into the WP
- Elaborate the collaboration with CEDEFOP
- Agree roles and develop a plan for activities and deliverables for SGA-2
- Think about the future

Key Outcomes:
- Arrangements for sharing code & collaborating (e.g. Slack)
- Concrete actions agreed with CEDEFOP
- Country-based pilot research plans for SGA-2 focused on producing concrete results
- Information shared through an expanded network

SGA-2 Research Plans

United Kingdom
Data: Burning Glass, 2 major job search engines plus web scraping framework, CEDEFOP pilot, JVS
Processing/Methodology:
- Match OJV counts (from various sources) to JVS reporting units.
- Use the JVS counts and machine learning to train a model with OJV counts, industry and size indicators as features
- Use the model to predict current vacancy estimates
Expected Output(s):
- Weighted estimates?
- Job vacancy nowcast estimates
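The slides do not say which machine learning model is planned. The sketch below swaps in a much simpler technique, a JVS/OJV ratio estimated separately for each industry and size stratum, purely to show the train-then-predict flow on matched reporting units. All figures, the stratum labels and the function names are made up for illustration.

```python
from collections import defaultdict

# Made-up training rows, one per JVS reporting unit matched to OJV counts:
# (industry, size_band, ojv_count, jvs_count)
training = [
    ("C", "large", 10, 20),
    ("C", "large", 30, 60),
    ("G", "small", 4, 4),
    ("G", "small", 6, 8),
]


def fit_ratio_model(rows):
    """Estimate a JVS/OJV ratio for each (industry, size) stratum."""
    ojv_total = defaultdict(int)
    jvs_total = defaultdict(int)
    for industry, size, ojv, jvs in rows:
        ojv_total[(industry, size)] += ojv
        jvs_total[(industry, size)] += jvs
    # Strata with no online ads cannot yield a ratio and are skipped.
    return {k: jvs_total[k] / ojv_total[k] for k in ojv_total if ojv_total[k] > 0}


def predict(model, industry, size, ojv):
    """Predict the survey-concept vacancy count from a current OJV count.

    Raises KeyError for a stratum that was not seen in training.
    """
    return model[(industry, size)] * ojv


model = fit_ratio_model(training)
print(predict(model, "C", "large", 15))
```

A real model would need many more units per stratum, and a check of how stable the ratios are over time, before the predictions could be treated as nowcasts.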

Enterprise count comparison (by portal)

Greece
Data:
- Scraped job ads
- Already manually coded data to be used as training data
Methodology: Use of text mining and ML approaches to classify job titles (and job descriptions) to ISCO-08
Expected Output: Occupation codes classified to job titles
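The slides do not name the classifier used in the Greek experiment. As one possible illustration, the sketch below trains a small multinomial naive Bayes model on job-title tokens and assigns an ISCO-08 major group. The training titles are invented, English-language stand-ins for manually coded data, and the function names are assumptions.

```python
import math
from collections import Counter, defaultdict

# Made-up training pairs: (job title, ISCO-08 major group).
# "2" = Professionals, "5" = Service and sales workers,
# "8" = Plant and machine operators, and assemblers.
TRAIN = [
    ("software developer", "2"),
    ("senior java developer", "2"),
    ("staff nurse", "2"),
    ("restaurant cook", "5"),
    ("waiter", "5"),
    ("shop sales assistant", "5"),
    ("truck driver", "8"),
    ("bus driver", "8"),
]


def tokenize(title):
    return title.lower().split()


def train(pairs):
    """Count titles per class and tokens per class."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for title, code in pairs:
        class_counts[code] += 1
        for tok in tokenize(title):
            token_counts[code][tok] += 1
            vocab.add(tok)
    return class_counts, token_counts, vocab


def classify(model, title):
    """Return the class with the highest log-posterior for the title."""
    class_counts, token_counts, vocab = model
    total = sum(class_counts.values())
    best_code, best_score = None, -math.inf
    for code in sorted(class_counts):
        n_tokens = sum(token_counts[code].values())
        score = math.log(class_counts[code] / total)
        for tok in tokenize(title):
            # Laplace (add-one) smoothing so unseen tokens do not zero the score.
            score += math.log((token_counts[code][tok] + 1) / (n_tokens + len(vocab)))
        if score > best_score:
            best_code, best_score = code, score
    return best_code


model = train(TRAIN)
print(classify(model, "junior developer"))  # prints 2
print(classify(model, "delivery driver"))   # prints 8
```

The same structure extends to job descriptions by feeding description tokens alongside title tokens, at the cost of a much larger vocabulary.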

Slovenia
Data: Data from two job portals, enterprise websites, administrative data from Employment Service
Processing/Methodology: Expand framework for collecting data by web scraping, administrative data and (some) enterprise websites
Expected Output:
- Estimates of available OJVs (reference day/month)
- Estimates of newly available OJVs (reference day/month)
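The two expected outputs correspond to a stock (ads online on a reference day) and a flow (ads first posted in a reference period). A minimal sketch of the distinction follows, assuming each ad carries a posting date and an optional removal date. The records and function names are invented, and treating an ad as online up to but not including its removal date is an assumption.

```python
from datetime import date

# Made-up ads: (ad id, date posted, date removed or None if still online)
ADS = [
    ("a1", date(2017, 8, 20), date(2017, 9, 10)),
    ("a2", date(2017, 9, 5), None),
    ("a3", date(2017, 9, 18), date(2017, 9, 25)),
    ("a4", date(2017, 10, 2), None),
]


def available_on(ads, reference_day):
    """Stock: number of ads online on the reference day."""
    return sum(
        1
        for _, posted, removed in ads
        if posted <= reference_day and (removed is None or removed > reference_day)
    )


def newly_available_in(ads, year, month):
    """Flow: number of ads first posted during the reference month."""
    return sum(1 for _, posted, _ in ads if (posted.year, posted.month) == (year, month))


print(available_on(ADS, date(2017, 9, 20)))  # prints 2 (a2 and a3)
print(newly_available_in(ADS, 2017, 9))      # prints 2 (a2 and a3)
```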

France
Data:
- Web scraped data
- Public Employment agency data (partnerships with 150 job portals)
Processing/Methodology:
- Cleaning, deduplication, harmonising site-specific nomenclatures
- Matching OJV with administrative and JVS data
Expected Output: Portal-specific vacancy count (monitor trends over time and compare with JVS)
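The deduplication rules used in the French pilot are not given in the slides. One simple approach, sketched below, treats two ads as duplicates when their title, employer and location agree after normalisation. The choice of key fields, the sample ads and the portal names are all assumptions.

```python
import re
import unicodedata


def normalise(text):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def deduplicate(ads):
    """Keep the first ad seen for each normalised (title, employer, location)."""
    seen = set()
    unique = []
    for ad in ads:
        key = (
            normalise(ad["title"]),
            normalise(ad["employer"]),
            normalise(ad["location"]),
        )
        if key not in seen:
            seen.add(key)
            unique.append(ad)
    return unique


# Made-up ads from two hypothetical portals.
ads = [
    {"portal": "portal_a", "title": "Infirmier(ère)", "employer": "Clinique Dupont", "location": "Lyon"},
    {"portal": "portal_b", "title": "INFIRMIER (ERE)", "employer": "clinique dupont", "location": "Lyon "},
    {"portal": "portal_a", "title": "Comptable", "employer": "Clinique Dupont", "location": "Lyon"},
]

unique_ads = deduplicate(ads)
print(len(unique_ads))  # prints 2
```

Exact-key matching of this kind misses reworded reposts. It also merges genuinely separate vacancies that share a title, employer and location, so counts produced this way are a lower bound on distinct ads.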

Germany
Data: Federal Employment Agency (portals and JVS data), CEDEFOP pilot, Stepstone
Processing/Methodology:
- Matching of OJV and JVS data
- Explore feasibility of matching (particular challenges in Germany)
Expected Output: Reliable matching methodology
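The slides do not specify the matching keys or the particular challenges in Germany. As a generic illustration of name-based matching between advertised employers and register units, the sketch below strips common legal-form suffixes and uses the standard library's `difflib.get_close_matches` with a similarity cutoff. The register entries, unit ids, suffix list and the 0.85 cutoff are all assumptions.

```python
import difflib
import re

# Illustrative, not exhaustive, list of legal-form tokens to ignore.
LEGAL_FORMS = {"gmbh", "ag", "kg", "ug", "se"}


def normalise_name(name):
    """Lower-case, drop punctuation and legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9äöüß]+", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)


def match_employer(ojv_name, register, cutoff=0.85):
    """Return the unit id of the closest register name, or None if no match.

    `register` maps normalised enterprise names to unit ids.
    """
    candidates = difflib.get_close_matches(
        normalise_name(ojv_name), list(register), n=1, cutoff=cutoff
    )
    return register[candidates[0]] if candidates else None


# Made-up register extract: enterprise name -> unit id.
REGISTER = {
    normalise_name(name): unit_id
    for name, unit_id in [
        ("Muster Logistik GmbH", "U001"),
        ("Beispiel Bau AG", "U002"),
    ]
}

print(match_employer("Mustr Logistik", REGISTER))   # misspelt name, prints U001
print(match_employer("Beispiel-Bau", REGISTER))     # prints U002
print(match_employer("Andere Firma KG", REGISTER))  # prints None
```

The cutoff trades false matches against missed matches, so it would need tuning against a manually verified sample.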

Sweden
Data:
- Swedish Employment Agency
- 2 large job portals
Processing/Methodology:
- De-duplication and matching
- Further develop quality framework
- Time series modelling
Expected Output:
- Time series comparisons of OJV and JVS data
- Disaggregations (geography, NACE, ISCO)

Belgium
Data: Administrative data from regional employment agencies
Processing/Methodology:
- Machine learning model for predicting NACE code based on job description text.
- Multi-lingual model (French, Dutch, German, English)
Expected Output: Model for predicting NACE

Looking Ahead

Short-term (until May 2018):
- Workshop planned to validate CEDEFOP web scraping system (March 2018, Milan)
- Strategy for ongoing engagement: March 2018
- Final SGA-2 technical report (including a roadmap for moving experimental research into production): May 2018 (Review at beginning of May?)
Long-term:
- An ESSNet network collaborating with CEDEFOP?
- Part of the next ESSNet? (but still might not completely deliver production-ready outputs)

Budget Issues
- Denmark may need to withdraw
- Portugal's role in WP1 still not defined