

Is Your Hiring Test a Joke?

by Aug 23, 2012, 10:09 am ET

When something looks good on the surface but is completely without merit, it is called a joke. You might not have thought of this before, but many hiring tests fit that bill. I’m talking about tests that deliver numbers and data that look good on the surface but do nothing to predict candidate job success … in other words, scores do a better job predicting vendor sales than employee performance. Let me explain why, beginning with how professionals develop a hiring test.

What Works: Professional Standards

Professionals always start with a job theory that sounds something like this: “I believe factor X affects job performance.”

Next, they draft some X items and give their test to hundreds of people, tweaking and tuning the items along the way. Then they use one or more methods to test whether scores are directly associated with job performance; for example they might give their test to everyone upon hiring, ignore the scores, and later compare test scores to job performance. This is called predictive validity. They could also give their test to people already on the job and compare test scores to job performance. This is called concurrent validity. Both methods have their pros and cons.
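Here is a rough sketch of what "compare test scores to job performance" can look like in practice. It computes a plain Pearson correlation between scores collected at hire and performance ratings collected later. The numbers are made up for illustration, and the function name pearson_r is my own; a real validity study needs far more people and far more care than this.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: test scores recorded at hire (and ignored for the
# hiring decision), paired with performance ratings gathered later.
test_scores = [55, 62, 70, 48, 81, 66, 59, 74]
performance = [3.1, 3.4, 4.0, 2.8, 4.5, 3.2, 3.0, 4.1]

validity = pearson_r(test_scores, performance)
print(f"validity coefficient r = {validity:.2f}")
```

A coefficient near zero means the scores tell you nothing about performance, no matter how polished the score report looks.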

Drafting a stable, solid, and trustworthy hiring test takes months of writing, editing, running studies, and systematically examining the guts of the test at both the item and factor level. This is the only way to know that test scores consistently and accurately predict job performance.

Bad Joke Examples

A while ago I reviewed a test supposedly developed for retail hires. The vendor’s own test manual showed scores predicted nothing. Not shrinkage. Not theft. Not turnover. Not performance. Zilch … nada … nothing! Still, the vendor, with a straight face, claimed it “could be helpful” for hiring … you know … like claiming it predicts job performance even though it doesn’t?

Another time I was asked by a proud author to look at their web test. I intentionally answered every multiple choice question with the same letter (i.e., a technique to see if it would produce junk scores). After the vendor told me the test results described me exactly, I explained what I did. Then, I went on to explain the kind of work necessary before it could be considered professional. They replied their investors would never stand for that. Wouldn’t it be nice to have … you know … accuracy?

In a final example, a user claimed a certain well-known test would predict management success based on ego-drive. He maintained this trait was desirable for managers. I said that was a nice thought, but if I were rejected for having a low ego-drive score, I would want to see proof ego-drive was necessary for job performance and then demand to see a study that showed my score predicted job performance. We did not talk much after that. I guess I was being downright unreasonable by expecting a test user to show scores predicted job performance.

How to Develop a Joke Test: Begin With Ignorance

Ignorance is not a permanent condition. It can be fixed. So why do people think, without taking a single class in identifying job skills, measuring job performance, or psychometrics, they know how to develop a hiring test that meets professional standards?

It takes cooperative organizations, patient candidates, honesty, accuracy, and a boatload of statistical work. In fact, here is a link to a book on how professionals do it. If you think you want to develop a test, or fix the one you market now, read this book thoroughly. If you only want to buy a good test, ask your vendor for proof he/she followed the standards. If the vendor never heard of it, or claims it’s too complicated for the average person, then the test is probably bogus!

How to Develop a Joke Test: Assume Personality Scores = Skill

I attended a course on the DISC once when the instructor mentioned it was often used to hire salespeople. What? DISC factors predict job performance? DISC scores are just differences between how people answer questions, not differences in performance! Not only is DISC scoring weird, the “either/or” scoring method requires rejecting one factor every time another is chosen — thus two people can provide completely different answers but get the same score! Furthermore, its theory was originally based on soldier-behavior under combat conditions. And, just because the vendor thinks all salespeople should be pushy, does that mean all customers enjoy dealing with salespeople who are high D’s?

Personality score differences are not skill differences.
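To see how "either/or" scoring can give two people with opposite answers the same profile, here is a toy sketch. The four items and the scoring rule are invented for illustration only; this is not any vendor's actual scoring key.

```python
from collections import Counter

# Hypothetical forced-choice items: each one pairs two factor labels,
# and the respondent must pick exactly one (index 0 or index 1).
ITEMS = [("D", "S"), ("S", "D"), ("I", "C"), ("C", "I")]

def tally(choices):
    """Count how often each factor was chosen across all items."""
    if len(choices) != len(ITEMS):
        raise ValueError("one choice per item is required")
    counts = Counter({factor: 0 for factor in "DISC"})
    for pair, pick in zip(ITEMS, choices):
        counts[pair[pick]] += 1
    return dict(counts)

person_a = [0, 0, 0, 0]  # picks D, S, I, C
person_b = [1, 1, 1, 1]  # picks S, D, C, I: the opposite on every item

print(tally(person_a))
print(tally(person_b))
```

Both tallies come out identical even though the two people disagreed on every single item. Notice too that the factor counts always add up to the number of items, so a high score on one factor forces lower scores elsewhere.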

How to Develop a Joke Test: Average Everything

Averages are particularly insidious because they look job-credible. For example, a vendor gives a generic (usually homegrown) test to 100 truck drivers, or 200 salespeople, or some other job title, averages the scores, and exclaims his/her test scores predict success in driving a truck, selling, or in some other occupation! Are all the people in the sample equally competent? Did they all earn high marks for job performance or low turnover? Are all the truck drivers in the group doing identical work? How might you explain why some individual truck drivers score exactly the same as individuals in other jobs?

Remember that, on average, a person with one foot in a fire and the other in a bowl of ice is perfectly comfortable. Of course, a disreputable test vendor is perfectly comfortable selling junk because he/she really does not know, think, or care about selling averages.
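The fire-and-ice problem is easy to show with numbers. In this small sketch, two invented groups of scores share exactly the same average while looking nothing alike underneath. The data and the summarize helper are assumptions for illustration.

```python
from statistics import mean, pstdev

def summarize(scores):
    """Return (mean, population standard deviation) for a list of scores."""
    return mean(scores), pstdev(scores)

# Hypothetical test scores for two groups holding the same job title
steady_group = [50, 50, 50, 50]
mixed_group = [10, 90, 20, 80]

for label, scores in (("steady", steady_group), ("mixed", mixed_group)):
    avg, spread = summarize(scores)
    print(f"{label}: mean={avg}, spread={spread:.1f}")
```

Both groups average 50, yet one is uniform and the other is all over the map. An "average truck driver score" says nothing about which of these two pictures you are looking at.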

How to Develop a Joke Test: Toss and Stick

Imagine giving a test to a high-performing group of employees, averaging their scores, and using the mean as the job target. Whoa! The state of job prediction science just regressed to casting lots. This technique is plagued with problems: the vendor assumes each factor affects job performance; average scores hide individual differences; people in the low group are often ignored; and, the biggest joke of all … the differences probably happened by chance. I had one vendor tell me that “Toss-and-Stick” was just another way to confirm a test works. I must have missed that class in grad school.
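One simple way to check whether a "high performer" average could have happened by chance is to try every other way of picking a group of the same size and see how often the gap is just as big. The sketch below does that on six invented scores; the function name chance_share and the data are assumptions, and this is only one of several checks a professional would run.

```python
from itertools import combinations
from statistics import mean

def chance_share(scores, high_idx):
    """Share of all same-size splits whose mean gap is at least as large
    as the gap between the labeled 'high' group and everyone else."""
    n, k = len(scores), len(high_idx)
    if not 0 < k < n:
        raise ValueError("the high group must be a non-empty proper subset")

    def gap(idx):
        chosen = set(idx)
        high = [scores[i] for i in chosen]
        rest = [scores[i] for i in range(n) if i not in chosen]
        return abs(mean(high) - mean(rest))

    observed = gap(high_idx)
    splits = list(combinations(range(n), k))
    # Small tolerance so ties are not lost to floating-point rounding
    as_large = sum(1 for idx in splits if gap(idx) >= observed - 1e-9)
    return as_large / len(splits)

# Hypothetical test scores; the first three people are the "high performers"
scores = [52, 47, 55, 49, 51, 46]
high_performers = [0, 1, 2]

print(chance_share(scores, high_performers))
```

With these numbers, 8 of the 20 possible three-person groups show a gap at least as large as the "high performer" group does. A target score built on a difference that arbitrary groupings reproduce that often is not evidence of anything.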

How to Develop a Joke Test: Circus Acts

Let me introduce you to Professor Bertram Forer. Forer gave his college students a personality test, but instead of giving back their actual scores, he gave each student an identical report gathered from several horoscopes. Using a 0-to-5 agreement scale, students averaged 4.26. In other words, although entirely different people received the same personality description, virtually all individuals agreed it described them to a “T”. This effect was later termed the Barnum Effect, after PT Barnum, who always made sure he had something for everyone. Junk test vendors take advantage of the Forer effect when people get so excited about their test scores in a training workshop that they want to take the test into the hiring/promotion arena.

Another user Circus Act is the “one-off” effect. That is, some users tend to think their recollection of one or two exceptions makes the rule. This often sounds like, “That can’t be right. I knew someone who …” That’s bad human judgment at work, and a great reason why people need to base hiring/promotion decisions on hard test facts. And, let’s not forget, interviews are tests: verbal ones. They have something to measure, use questions, and have right/wrong answers.

How to Develop a Joke Test: Summary

The marketplace is filled with junk and deception: wrong-headed vendors seek more sales; trainers and managers mistakenly think training tests predict job performance; professional test practices are treated with ignorance and disrespect; occupational averages wrongly predict performance; meaningless organizational groupings and averages predict nothing; and so forth.

Think about it. When someone uses or sells an unprofessional test they are really saying, “I don’t care how many careers are ruined by my bogus test scores, or how much money is lost by making a bad hire, these inaccurate tests help make better hiring decisions.”

Are you laughing yet?



This article is provided for informational purposes only and is not intended to offer specific legal advice. You should consult your legal counsel regarding any threatened or pending litigation.

  • charles handler

    Wendell, thanks for the excellent insight. I fight battles every day to help educate my clients about what it takes to do things right. On the surface tests may all look the same, but what lies beneath is a serious differentiator. Testing is serious business, and if you don’t start with reliable and sound items that you can prove measure what you intend them to, you don’t have a real test.

    I encourage test consumers to involve a professional if you are not sure what to look for. It will make a big difference in the end.

  • Paul Basile

    Wendell, all true and useful. Also true and important is that there are good assessments, excellent assessments with predictive and concurrent validity that are massively underused. Like all good things, good assessments attract poor imitators. But let’s not forget that there exist legitimate and valuable solutions to be leveraged.

  • Paula Soileau

    Nice piece, and I agree with comments from Dr Handler and Paul Basile. There are good assessments out there that are predictive and actually make a difference. It is up to the buyer to understand what an assessment measures, and whether it is predictive. I run across businesses frequently that think an assessment is an assessment, or that don’t know what the specific assessment they’re using really measures and whether it is predictive. I also talk with businesses who, well-intended, use good tools but apply them to uses they were not intended for.

  • Keith Halperin

    Thanks Dr. Williams. Adding on to what you said, ISTM that the purpose of selling products and services for recruiting, hiring, and many other purposes is not to provide something that works, but to:
    1) Enrich the seller/provider of the good or service.
    2) Make the buyer feel better for buying the good or service.

    I would like to see a neutral, objective organization which doesn’t accept advertising (like Consumer’s Union) provide empirical, data-based reviews and ratings of recruiting products and services (and *employers). Then it wouldn’t be “recruiter/manager/applicant/employee emptor”.


    Keith “Show Me the Proof” Halperin

    * I thought Glassdoor might be like this, but I was wrong.

  • Ken Schmitt

    Excellent article: there’s nothing funny about a poor hiring process and bad hiring. In an article I wrote this week entitled “Is Your Hiring Process a Turn-Off,” I looked at ineffective hiring processes as well. If it isn’t poor tests, it’s inefficient and ineffective hiring procedures. Hiring “best in class” talent requires a “best in class” process.
    Ken Schmitt

  • Keith Halperin

    @ Ken: “Best in class”?
    Most companies are lucky to get “showed up in class”….


    Keith “Was Best in Class Many, Many Years Ago in A School Far, Far Away” Halperin

  • Carol Schultz

    Wendell: Thank you, thank you, thank you for writing this. I believe there are something like six thousand of these types of tests. I’d assert there are so many because, as PT Barnum said, “There’s a sucker born every minute.” Why else would there be so many?

    I can’t tell you the number of vendors that have called me to present their profile system and how it’s the “best”. They want me to recommend it to my clients. I’ve also had vendors try to get me to train on their products (for thousands of dollars) so I can then sell directly to my clients for something like 20%. Clearly there are plenty of suckers out there who buy into this model. Can you imagine Oracle or SAP telling their resellers that they have to pay thousands of dollars for the privilege of reselling a product that ultimately makes the OEM millions?

    I recently had a client in NYC ask me about profiling his organization. I told him I don’t use these types of tests unless it’s absolutely necessary and only as an adjunct to a process we put in place that works.