Following on from our feature on CVs last month, we investigate CV parsing – that is, extracting information from CVs and loading it into databases. Our expert contributors explain its significance, and what the future may hold.
Michelle Stewart, Marketing Manager at SkillsMarket, describes the background to CV parsing for us: “Whilst the introduction of online job boards and email has had a dramatic impact on the ease and speed with which recruiters can advertise jobs and applicants can respond, nothing in this world comes for free. The downside has been the dramatic increase in the administration burden on recruiters, with recruiters in many cases becoming less productive. Generally speaking it might take a recruiter just 5 minutes to identify whether the candidate already exists on the recruitment database, update or add the contact details and any other relevant information and then attach a CV, but when you receive just 12 a day, this turns into an hour’s work. And recruiters’ time is not cheap.
An answer had to be found, and fast, and it had to be centred around a technical solution that could handle unlimited volumes of CVs. Around the world a number of academics and experts from our fields started to look at this field and apply document parsing techniques from other industries to the CV. The biggest complication was that every CV is unique in layout, format, content and electronic file type.”
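To put rough numbers on Michelle Stewart's example, the short Python sketch below multiplies out the manual data-entry time. The 5 minutes per CV and 12 CVs a day are her figures; the function names and the hourly cost are illustrative assumptions, not data from any contributor.

```python
# Back-of-envelope model of the manual data-entry burden.
# 5 minutes per CV and 12 CVs per day come from the article;
# the hourly cost used below is a made-up placeholder.

def daily_admin_minutes(cvs_per_day: int, minutes_per_cv: float = 5.0) -> float:
    """Minutes per day one recruiter spends entering CVs by hand."""
    return cvs_per_day * minutes_per_cv


def daily_admin_cost(cvs_per_day: int, hourly_cost: float,
                     minutes_per_cv: float = 5.0) -> float:
    """Daily cost of that time, given an assumed hourly cost."""
    return daily_admin_minutes(cvs_per_day, minutes_per_cv) / 60.0 * hourly_cost


if __name__ == "__main__":
    print(daily_admin_minutes(12))                 # 12 CVs x 5 minutes
    print(daily_admin_cost(12, hourly_cost=30.0))  # hypothetical hourly cost
```

At 12 CVs a day the first call gives 60 minutes, the hour's work she describes, and the figure grows linearly with volume.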
Steven P Finch, Founding Director and Chairman of Daxtra Technologies, tells us of CV parsing’s James Bond-type origins: “Enter a new breed of sophisticated CV parsing and analysis systems that work by applying highly complex pattern and language analysis techniques originally developed in the 1990s under military funding programmes to analyse vast collections of intelligence reports for shady intelligence agencies.
Adapting these algorithms to be sensitive to the type of information written in CVs allows the new generation of parsers to extract not only contact information, but also rich details of work histories, educational details and the skills that candidates have in the context in which they have used them. These systems can realistically achieve accuracies in excess of 92%, which can be described as ‘near human accuracy’. They can process hundreds or thousands of CVs every day, they never get tired, and they do not suffer from ‘Friday afternoon syndrome’. Most large recruitment agencies (and many smaller ones) use some form of automated data entry system that uses one of these new, more accurate systems from a small clutch of vendors such as the UK’s Daxtra Technologies, Canada’s TTC or Burning Glass of California. This allows them to load thousands of CVs onto their databases every day much more quickly and cost-effectively than doing it manually.”
Davor Miskulin, Director of Sales at Burning Glass Technologies, says that parsing is now starting to be widely used across the online recruitment industry, from job boards to agencies to applicant tracking systems, as well as in the corporate HR departments of large employers. He explains how it works: “We deploy patented neural network technology to identify candidate data directly from the resume with the highest degree of precision. Using a set of techniques called Statistical Natural Language Processing (SNLP), our artificial intelligence engine is able to read and understand resumes by making probabilistic determinations based on linguistic context and inference. These algorithms are powered by data contained in our Knowledge Mine™, a unique repository of historical career data extracted from millions of placement decisions. This is important because no two resumes are identical, either in format or content. To keep the model in tune with emerging trends, the Knowledge Mine™ is continually updated with new resumes approximately every twelve months.”
Advantages of CV parsing
Davor Miskulin says his technology suite will unlock significant value for its users by dramatically reducing the cost of resume processing, by helping recruiters to focus on the most promising candidates, and by significantly improving both the candidate experience and the management of the existing employee base. Christine Ducos, Marketing Manager at Taleo, says: “CV processing tools are designed to integrate with any recruiting solution. Whether you’ve invested in a commercial or custom-built system, these industry-leading tools can significantly improve your hiring process by targeting known problem points and improving the hiring experience for candidates and recruiters.” She says features include:
• Integrate with online application forms to minimise completion time and improve the candidate experience
• Enable streamlining of the application process to attract valuable passive candidates
• Eliminate manual data entry
• Process incoming e-resumes quickly and efficiently
• Scale well to handle any volume
• Convert resumes from any file type (MS Word, PDF, RTF etc.) into standardised fielded data
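As a rough illustration of what “standardised fielded data” means, the Python sketch below pulls a name, email address and phone number out of plain CV text. It is a deliberately naive toy built on regular expressions and is not how the vendors quoted here work (they describe statistical language analysis). The `ContactRecord` type, the patterns, the sample CV and the rule that the first non-empty line is the name are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified patterns, chosen for this example only; real CVs vary far more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d ()-]{7,}\d")


@dataclass
class ContactRecord:
    """Hypothetical fielded record for one candidate."""
    name: Optional[str]
    email: Optional[str]
    phone: Optional[str]


def extract_contact(cv_text: str) -> ContactRecord:
    """Turn unstructured CV text into a small fielded record."""
    lines = [ln.strip() for ln in cv_text.splitlines() if ln.strip()]
    # Naive assumption: the first non-empty line is the candidate's name.
    name = lines[0] if lines else None
    email = EMAIL_RE.search(cv_text)
    phone = PHONE_RE.search(cv_text)
    return ContactRecord(
        name=name,
        email=email.group(0) if email else None,
        phone=phone.group(0) if phone else None,
    )


SAMPLE_CV = """Jane Example
Email: jane.example@example.com
Tel: +44 20 7946 0000
Skills: Java, SQL
"""

if __name__ == "__main__":
    print(extract_contact(SAMPLE_CV))
```

A sketch like this breaks as soon as the layout changes, for instance when the name is not on the first line, which is exactly the “every CV is unique” problem the contributors describe.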
So where's the catch?
Steven Finch says: “Although the level of accuracy of these parsing engines is very good for some sets of CVs, they are sensitive to subtleties of language and local convention. Just because one product works well on CVs written in English from America, for example, does not mean it will work as well for CVs from Britain, let alone ones written in French or German! Although it is quite usual for vendors to advertise that their system works on all languages, some fail to mention very significant differences in performance. Companies who get a wide geographical distribution of applications need to take care to evaluate these products thoroughly before committing to buy. A difference in accuracy between 88% and 94% may not sound huge, but it represents a near doubling in the number of errors the system will make, and a near doubling in the costs of having to manually put the errors right.”
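Steven Finch's point is easier to see when accuracy is turned into an error rate: 88% accuracy means 12 errors per 100 extracted fields, while 94% means 6. The Python sketch below does that arithmetic. The function names and the figure of 1,000 fields are assumptions for illustration, and it treats every field as equally likely to be wrong, which is a simplification.

```python
def error_rate(accuracy: float) -> float:
    """Share of extracted fields that are wrong, given accuracy in [0, 1]."""
    return 1.0 - accuracy


def expected_errors(fields: int, accuracy: float) -> float:
    """Expected number of wrong fields out of `fields` extracted."""
    return fields * error_rate(accuracy)


def error_ratio(lower_accuracy: float, higher_accuracy: float) -> float:
    """How many times more errors the less accurate parser makes."""
    return error_rate(lower_accuracy) / error_rate(higher_accuracy)


if __name__ == "__main__":
    for acc in (0.88, 0.94):
        errors = expected_errors(1000, acc)
        print(f"{acc:.0%} accurate: about {errors:.0f} errors per 1,000 fields")
    print(f"Error ratio: {error_ratio(0.88, 0.94):.1f}x")
```

If the cost of manual correction scales with the number of errors, the six-point gap in accuracy translates into roughly twice the clean-up work.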
However, Davor Miskulin says that because his software is language independent at its core, an additional advantage is that it can handle other languages as required, such as French, Dutch and German, as well as Chinese and Japanese. In addition, models are geography specific, so that every country/language has its own model, capturing details about candidate skills, knowledge and experience as well as career transition as dictated by the local employment market. The disadvantage, as he sees it, is that: “Like any automated technology, parsing is not 100% accurate… leading vendors are currently achieving between 90-95% accuracy and it seems that given the problem they are facing (unstructured text) this is the practical limit of the parsing systems. However, one should not forget that this level of accuracy is only a few percentage points lower than manual human (expensive) processing.” Michelle Stewart sees similar potential problems: “Whilst a CV parser will automatically extract data from a CV, there is still a need to check for duplicates, handle errors and fill in the gaps that the CV parser will inevitably leave behind. Those companies using an off the shelf CV parser typically end up paying an in-house resource to manage this for them.”
The future
Christine Ducos predicts: “As parsing technology improves over time, this can only increase the speed of extraction and also the quality of the fit of CV type unstructured data to structured application fields.”
Chris Buckley at HR Smart says: “Leading providers have introduced ‘intelligent parsing’ where the technology can look at the CV of a person and conduct a database search to find similar people (i.e. ‘Find me one like this’). The same technology can be used to compare job attributes with CVs by linking it to a ‘knowledge database’ (i.e. ‘Find me people for this job’). Latest developments include the ability to infer skills from a person’s CV, even ones they did not know they had!” He also believes that Applicant Tracking Systems will use intelligent parsing to constantly monitor corporate as well as public CV databases to ensure they are the first to contact job seekers who stand a high chance of meeting the job and company requirements.
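The ‘Find me one like this’ idea can be sketched very simply once CVs have been parsed into skill lists. The Python example below ranks candidates by Jaccard similarity (shared skills divided by all skills) against a reference profile. This is a stand-in technique chosen for illustration, not the method used by any provider mentioned here, and the candidate names and skills are invented.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two skill sets: |a & b| / |a | b| (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_similar(reference: Set[str],
                 candidates: Dict[str, Set[str]],
                 top_n: int = 3) -> List[Tuple[str, float]]:
    """Return the top_n candidates most similar to the reference profile."""
    scored = [(name, jaccard(reference, skills))
              for name, skills in candidates.items()]
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Invented example data: parsed skills keyed by a hypothetical candidate ID.
CANDIDATES = {
    "candidate_a": {"java", "sql", "linux"},
    "candidate_b": {"java", "sql"},
    "candidate_c": {"marketing", "sales"},
}

if __name__ == "__main__":
    reference_profile = {"java", "sql", "python"}
    for name, score in find_similar(reference_profile, CANDIDATES):
        print(f"{name}: {score:.2f}")
```

The same function covers ‘Find me people for this job’ if the reference set is built from the job's required skills instead of a CV. Inferring skills a candidate did not list would need the kind of knowledge database Buckley mentions, which this sketch does not attempt.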
Davor Miskulin thinks that today’s advanced search technologies, which can couple the processes at the initial stages of recruitment, and intelligent search engines that look for a ‘work experience story’ rather than isolated keywords, will further differentiate the role and type of services that generalist job boards can offer as they attempt to resist downward pricing pressure from the likes of Google and Craigslist. He foresees that such services will be packaged as ‘premium’ job placement services, thus in effect delivering real revenue from CV databases. He predicts that: “2007 will for sure not be a repetition of 2006 and business as usual – ‘number of visitors’ will matter as much as the ‘perceived value’ it delivers both to advertisers and candidates.”
Steven Finch predicts: “Perhaps some of the most exciting advances are yet to come. Our R&D effort in DAXTRA is in constant search for new ways of streamlining the recruitment process, and several exciting new advances are already coming to the market. Candidate profiling/routing promises to classify candidates according to their industry sector, career stage and level, their particular skills and qualifications so that they may be screened, or appropriate vacancies automatically suggested. The goal is to help recruiters more effectively find the people they need for the vacancies they need to fill.”
So as we have seen, the need for CV parsing is obvious in this online recruitment age. As Steven Finch concludes: “Recruitment will always be about people, but new technology such as the new generation of CV parsing and analysis systems promise to give recruiters a very effective tool they can use to match the right candidate to the right vacancy. And earn the right fee!”
Case studies from SkillsMarket
Les Duncan, UK Managing Director of Hays IT, Europe’s largest specialist recruitment company, commented: “As soon as we receive a CV we now automatically convert it into an iProfile for candidates to verify online. It has already transformed the way in which jobseekers keep us updated with their latest skills and availability. As a result of the iProfile we also get updated with the latest details on over 35% of all jobseekers, which compares to an industry average of approximately 9% even with a healthy spend on advertising. More importantly we also get candidates updating that are not actively looking for work and these are invaluable during a skills shortage as there is now.”
Marcus Villa-Buil, Director of one of the UK’s leading recruitment companies, Veritas, commented: “Initially we chose to go with an off the shelf CV parser. Whilst it was significantly faster than manual data entry we found that we still needed to employ resource to manage all the administration that came with it and because it didn’t achieve the same accuracy levels as manual data entry we also needed to undertake our own final data accuracy checks. We eventually decided to switch to SkillsMarket’s CV processing service simply because they make CV processing easy by taking away all this admin for you.”
CV Parsing – what is it and why do you need it?

Online Recruitment Magazine Feature