The Right — and Wrong — Way to Use Work-Sample Tests

Imagine that you’re in a band and you need to find a new guitarist. A friend of yours recommends someone they know who has been playing the guitar for over 10 years and even graduated from Berklee College of Music, one of the most prestigious music schools. That all sounds good, but you’d probably still want to hear them play first, wouldn’t you?

Oddly enough, most companies do not embrace this idea. They base their hiring decisions on education level, years of experience, candidate claims, and similar factors, without ever seeing how the candidate performs real work.

That’s where work-sample tests come in. Few hiring methods, if any, are better, because work-sample tests are based on the idea that the best predictor of future behavior is observed behavior in similar conditions.

Predictive Validity

To understand why work-sample tests are so good at predicting job performance, we need to look at their predictive validity, which is a psychometric measure that shows how well a score on an assessment predicts some criterion. In other words, it measures how well a test score forecasts the outcome you care about, which in hiring is later job performance.
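Predictive validity is usually reported as a correlation between assessment scores and a later measure of the criterion. The sketch below shows that calculation on a small, entirely hypothetical data set: the scores, the ratings, and the function name `pearson_r` are made up for illustration and do not come from any study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: work-sample scores at hiring time (0-100) and
# manager performance ratings (1-5) for the same people a year later.
test_scores = [55, 62, 70, 48, 90, 81, 66, 74]
performance_ratings = [2.9, 3.1, 3.8, 2.5, 4.6, 4.0, 3.0, 3.9]

validity = pearson_r(test_scores, performance_ratings)
print(f"Estimated predictive validity: {validity:.2f}")
```

A value near 1 would mean the test ranks people almost exactly as their later performance does, while a value near 0 would mean the test tells you almost nothing. Eight data points are far too few for a real validity estimate; the example only illustrates what the number represents.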

Research shows that work-sample tests have the highest validity when it comes to predicting the potential job performance of applicants. (Only two other hiring methods have a similar predictive validity: cognitive ability tests and structured interviews.)

Various types of work-sample assessments are already relatively common in certain industries. For instance, candidates for programming jobs are often screened using small programming tasks as the very first step in the hiring process. Though the approach is not as common in other industries, there is no reason why it shouldn’t be. All you need to know is how to ask the right questions.

Types of Questions to Ask

Work-sample assessments are essentially practical tests of skill. But for them to accurately assess a candidate’s abilities, you need to understand which level of skill the test questions should target. Bloom’s Taxonomy can help us here.

The framework identifies six cognitive abilities related to learning and orders them by complexity, from lowest to highest level. Here’s a brief overview of each level, using foreign-language learning as an example:

Level 1: Remembering. This level is simply about recalling or recognizing pieces of information. You might remember that the French word for dog is chien, but this doesn’t make you proficient in the language.

Level 2: Understanding. Understanding pieces of text or audio in a foreign language is a step above merely remembering specific words. But this is still a passive skill that’s relatively easy to learn compared to being able to express yourself in another language.

Level 3: Applying. Using language to have a conversation or write an email requires both remembering words and understanding context to apply that knowledge practically. It demonstrates some language proficiency.

Level 4: Analyzing. Analyzing requires the ability to break down information into essential parts, often to solve a problem. For example, you might need to figure out the intention or meaning behind a piece of text in order to translate it accurately.

Level 5: Evaluating. Evaluating requires the ability to analyze something to determine its value, like discussing the literary value of a book. This means you need to be able to judge and criticize information and draw conclusions.

Level 6: Creating. Creating requires all the previous levels of knowledge to make something new, such as writing an article, essay, or fictional story.

The only levels you should be concerned about are 3, 4, and 6. You can safely ignore the others.

Why? Because Level 1 and 2 questions are trivial and answers can be easily searched for online. Meanwhile, Level 5 questions do demonstrate a large amount of knowledge; however, it’s extremely difficult to score them since evaluation is subjective.

If someone wrote a long argument about why your favorite movie is terrible, would you give them a good or bad score? It would be hard to stay unbiased, but even if you end up agreeing, this doesn’t necessarily mean that the person is right, just that they are convincing. Someone displaying this level of competence may be very knowledgeable about the subject, but it’s difficult and time-consuming to judge their ability.

Surprisingly, Level 6 questions are quite easy to score objectively. If you’re hiring a journalist, you can ask them to write a news report based on a fictional story. If you’re hiring a programmer, you can ask them to create a small program.
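For the programming case, one way to keep scoring objective is to run every submission against the same fixed set of test cases and count how many pass. The following is a minimal sketch under that assumption; the task, the test cases, and the names `score_submission`, `candidate_a`, and `candidate_b` are hypothetical and only illustrate the idea.

```python
def score_submission(candidate_fn, test_cases):
    """Return the fraction of test cases that the candidate's function passes."""
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            # A crash counts as a failed case instead of aborting the scoring run.
            pass
    return passed / len(test_cases)


# Hypothetical task: "return the words of a sentence in reverse order,
# separated by single spaces".
TEST_CASES = [
    (("hello world",), "world hello"),
    (("one",), "one"),
    (("",), ""),
    (("a b  c",), "c b a"),  # two spaces between "b" and "c"
]


def candidate_a(sentence):
    # Splits on any run of whitespace, so repeated spaces are handled.
    return " ".join(reversed(sentence.split()))


def candidate_b(sentence):
    # Splits on single spaces only, so repeated spaces produce empty "words".
    return " ".join(reversed(sentence.split(" ")))


score_a = score_submission(candidate_a, TEST_CASES)
score_b = score_submission(candidate_b, TEST_CASES)
print(f"Candidate A: {score_a:.0%}, Candidate B: {score_b:.0%}")
```

Every candidate is measured against the same cases, so the score does not depend on who is grading. In practice, submissions from strangers should run in an isolated environment rather than in the same process as the scoring code, as they do in this sketch.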

Less Bias, Quicker Results

One of the biggest benefits of work-sample testing, aside from being able to see directly what a candidate can do, is that it helps you avoid bias.

A good work-sample test leaves little room for bias. By giving all candidates the same assessment and judging them on their knowledge and skills, you’re not taking into account age, race, gender, sexuality, or any other irrelevant characteristic that can cause bias, even subconsciously. Seeing what each candidate can do under the same conditions allows you to compare all candidates objectively and transparently, based purely on their abilities.

This is also why it’s a good idea to have a short work-sample test (around 30 minutes) as the very first step in your hiring process. Each corporate job opening gets 250 applicants, on average. Work-sample tests are both a quicker and more accurate way to narrow down that pool, compared to resume screening.

As soon as a candidate applies for a job opening, you can send them a work-sample test, or even include a link to an online work-sample assessment in your application form. This way you can quickly and easily narrow down the list of candidates to the more qualified ones right from the start. You can always have a second round of testing later if you want a longer and more in-depth assessment of skills.

Resume screening is the typical first step in the hiring process for many companies, but it is so inaccurate that it’s close to useless, and it can easily trigger unconscious biases. Work-sample tests are much better on both counts.

Mistakes to Avoid

Some companies make the mistake of asking candidates to do too much work. They might have trial projects that last a day or even several days. Such tasks miss the point entirely. The tasks should represent work, yes, but they should be a sample of work, and take at most a few hours.

Yet even a few hours is on the longer side; you can have a work-sample task that takes about 30 minutes. Three such work-sample tasks would take an hour and a half. That is more than enough time to assess a candidate’s abilities.

Any task that takes too long will just make candidates give up and decide that it’s not worth their time. They may feel like you’re asking them to do unpaid work, rather than just testing their skills.

It’s also important to remember that work-sample tasks shouldn’t be used as the sole method of selecting whom to hire. As accurate as they are, no hiring method is good enough on its own to avoid the necessity of using other methods. Work-sample tests are best combined with other highly effective hiring methods, such as cognitive ability tests and structured interviews.

Ultimately, work-sample tests are one of the most accurate and effective hiring methods. You can use them to find out which candidates are actually good at what they claim to be good at, and to filter out those who are only good at promoting themselves without being able to back it up. They can also help you find humbler candidates who don’t like bragging about their accomplishments but are highly skilled. Don’t miss out on them.