Promises, Promises: How to Identify a Bad Hiring Test (Part I of II)

You probably already know there are hundreds of self-report tests promising great hires. What you might not know is that most of them are poorly designed junk. Why is this important to know? First, the test user, not the test vendor, is primarily responsible for test use. Second, junk tests hire too many wrong people and turn away too many right ones. Finally, if the first two do not bother your conscience, consider that the cost of poor hiring practices is estimated at between 20% and 50% of yearly payroll. So how do you identify the junk tests? Let’s start with a few basics.

What’s a Test?

A test is anything used to distinguish between a qualified and unqualified job candidate. It has questions. It has answers. And, it is scored. The most popular test (and one of the least accurate) is called an interview. Tests also come in other formats including application forms, resume reviews, candidate sourcing, web applications, pencil and paper questionnaires, and so forth. Everyone uses tests. Get used to the idea.

Better, Not More

In your heart of hearts, you probably already know that not all test factors are important to performance in all jobs. People who do the research consistently identify six job-fit factors, three job-performance factors, and three job-skill factors that make the performance short list. What about the rest? Overlapping or irrelevant to job performance. In fact, thanks to computers, we know that approximately 28,000 personality-related descriptors collapse into only about five to seven general factors, often loosely referred to as the Big Five. In other words, you can ask someone to complete a 28,000-item questionnaire, score their answers, and get only about five to seven common themes. More is not better; more is just confusing.

On the other hand, there are just three critical skill factors that consistently relate to job performance. These are mental horsepower, organizational ability, and interpersonal skills. It’s common sense. High performers are smart enough to do the job, self-organized, and have people skills. What about tests that ask a few questions and then produce long narrative reports? Well, I think you already know the answer to that question: it sells more tests.

How do you know which factors should be measured for your job? Isolate only the ones that directly affect high and low performance. Unsure about which factors or whose test to use? Stick with unstructured interviews. At least your mistakes will be randomized.

Hide and Seek

Do tests that rely on self-descriptive questions uncover hidden secrets? Consider this. You answer a few questions (e.g., Do you like being around people? Are you generally the life of the party? Are you outgoing? Do you easily talk with strangers?) and then someone adds up your scores. Miracle of miracles! Your personal report says you like being around people, being the life of the party, being outgoing, and communicating easily with strangers! Tests that just total up your scores and report back either your own answers or words that could describe almost anyone should stay in the training room. Only a well-designed hiring test can identify patterns that might hinder or facilitate work.

Performance or Preferences

Self-report tests come in a variety of applications. Clinical tests evaluate dysfunctional behavior so professionals can effectively treat patients (e.g., the MMPI). They often contain personally invasive questions about sexuality, violence, psychopathic thoughts, and anti-social behavior (i.e., bad mojo unless you are a licensed professional treating a mental patient).

Workshop tests evaluate differences between people (e.g., DISC, MBTI, ACL, and so forth). They often contain factors the author thinks are interesting. Many of the most popular workshop tests are based on old, obsolete research. Their authors are long gone. One test I recently reviewed even quoted Hippocrates as a source. I’m surprised they did not reference the four-humors medical theory as well.

Hiring tests are usually deadly dull. Casual readers should immediately be suspicious of any vendor who claims their test invokes lost mysteries, the author was the founder of a mystery school, or alien astronauts hand-delivered this wisdom to earth 1,000 years ago. Also beware if the vendor is well-known for its training workshops, is not a trained selection test expert (there are only a few thousand in the world), or will sell the test to anyone without examining their educational pre-qualifications. That should be your second warning not to use it for selection.

I find it’s fun to ask vendors, “Was your test specifically developed to predict job performance?” Legitimate ones are quick to produce pages and pages of studies and data. Wannabes either get surly or argue, “No, but the scores can be useful when making a hiring decision!” Huh? On one hand, they say their test does not predict performance, but on the other hand, they say it can be “useful” in making a hiring decision. Whoa! Did someone put funny stuff in the brownies? If a specific test cannot tell me about a candidate’s potential job performance, am I not wasting my organization’s money and taking considerable risk with someone’s welfare?

Legitimate hiring tests are not only designed to predict job performance, but they also include boring job-related factors and studies that vendors are only too happy to share. By the way, if you don’t understand the mumbo-jumbo, then hire a selection test expert for a few hours to review it. It’s cheaper than a bad hire. I just finished reviewing a vendor’s report that contained so many errors, wrong terms, bad science, and misinformation that I considered submitting it to Saturday Night Live.

Lies, Damn Lies, and Statistics

Reasonable recruiters and hiring managers want to reduce turnover, improve productivity, and minimize training. They do this by testing applicants before putting them on the job. In other words, they want test-score assurance their decision will be the right one. Here is the bad news. Although you might have hoped you would never again encounter statistics, it is the language of test validation. And, brothers and sisters, it’s an area where many vendors use smoke and mirrors. Here are some examples:

Small or Unequal Samples: small or unequal sample sizes often produce squirrelly data that cannot be trusted or generalized.
Group Clustering: dividing people into two groups and comparing their scores only tells you the groups are different. It does not tell you anything about individual skills.
Shot-Gunning: giving a broad multi-factor test to a group and looking for correlations is no assurance that any factor causes performance (e.g., both ice cream sales and shark attacks increase in the summer, yet neither causes the other).
Unclear Targets: productivity has many definitions; you need production data that is trustworthy and free from personal opinion.
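
If you want to see why small samples and shot-gunning deserve suspicion, here is a rough Python sketch. It is my own illustration, not any vendor's method, and every name and number in it is an assumption chosen for the demo: 30 made-up test "factors" and a made-up performance rating, all pure random noise with no real relationship between them. The function reports the strongest correlation that turns up by chance alone.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_people, n_factors, seed=0):
    """Largest |r| between random 'factors' and random 'performance'.

    Hypothetical setup: every score is independent noise, so any
    correlation found here is spurious by construction.
    """
    rng = random.Random(seed)
    performance = [rng.gauss(0, 1) for _ in range(n_people)]
    best = 0.0
    for _ in range(n_factors):
        factor = [rng.gauss(0, 1) for _ in range(n_people)]
        best = max(best, abs(pearson(factor, performance)))
    return best


# Same 30 meaningless factors, tested on a small group and a large one.
small = strongest_chance_correlation(n_people=15, n_factors=30)
large = strongest_chance_correlation(n_people=1500, n_factors=30)
print(f"Best chance correlation, 15 people:   {small:.2f}")
print(f"Best chance correlation, 1500 people: {large:.2f}")
```

With only 15 people, at least one meaningless factor will usually look impressively related to performance; with 1,500 people, the best chance correlation shrinks toward zero. A vendor who tests many factors on a handful of employees and shows you only the winner is handing you the first number.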

I’ll continue this article in part two.