Some readers might recall Cliff Clavin, a character on the old “Cheers” television program. Cliff was a postal worker with a massive blind spot: in spite of all evidence to the contrary, Cliff thought he was smart. Unfortunately, Cliff’s IQ was only slightly higher than first-class mail. Albert Einstein, on the other hand, was a theoretical physicist who expanded our understanding of the physical universe. He was so smart that researchers kept his brain in a jar for study (after he died, that is).

Now, let’s suppose both Cliff and Al decided to apply for a job. Let’s further suppose they both took a test that asked questions about their intelligence, problem-solving ability, school subjects, success attitudes, sales ability, customer service, and management style. Will their test scores accurately predict ability? Hopefully, you said, “No way! A management test, personality test, sales test, or any other kind of self-reported test generally predicts success only if someone is too dull to fake good. That is, we could probably trust a low score, but we would have to be very cautious of high ones.” Excellent response!

We should also not be surprised to learn that controlled research studies confirm that people who “fake good” on self-reported tests can outscore folks who give honest responses. Burn that into memory: people who “fake good” on self-reported tests can outscore folks who give honest responses. This is the problem with many tests marketed for hiring. At first blush, they may seem like the answer to all your prayers, but experience shows they give a false sense of security. Al’s high score in “problem solving,” for example, might be the same as Cliff’s, with one “small” difference: Al has an abundance of mental horsepower that Cliff lacks.

Validity?

Validity means someone conducted a formal study showing that test scores predicted performance for a specific job. The same validity study cannot be assumed to work for your job.
Validity is local: local to the organization and local to the job. Local. The only time a test user should trust someone else’s validity data is when he or she knows (really, really knows) that both jobs are virtually the same. But since everyone insists that his or her company is different, using external validity studies becomes problematic, yes?

Well, let’s just make validity even more complicated. Score-performance relationships are often assumed to fall along a straight line: a score of 10 equals 10% performance, 50 equals 50% performance, 100 equals 100% performance, and so forth. That’s what traditional statistics evaluate: straight-line, normally distributed relationships. The trouble with these relationships, however, is that they are generally not linear. A 20% difference in test scores seldom translates into a 20% difference in job performance. Test scores and performance ratings are often error-filled, and test scores can be too low, just right, or too high. For example:
Sorry about that last mental image. It was cruel, but a few weeks of therapy should help.

Putting Your Gut First?

Psychologists tend to be a pretty liberal bunch. While I was in grad school, many of my classmates argued for the “job equality” of men and women. Nothing wrong there. So I tried a little experiment in cognitive psychology to see if inner feelings matched public words. I divided the class into four groups and gave each group private instructions: Group 1 was to brainstorm a list of desirable business and management adjectives; Group 2, a list of undesirable business and management adjectives; Group 3, a list of “male” adjectives; and Group 4, a list of “female” adjectives. When everyone was done, I asked each group to report. Guess what? The male adjectives matched the desirable business and management list, while the female adjectives matched the undesirable one. Their inner feelings “short-circuited” their public statements! This exercise demonstrated how internal stereotypes can unconsciously affect external decisions (even among folks who argued they knew better). The same error-prone stereotyping applies to models such as social styles, MBTI, DISC profiles, sales styles, and leadership styles: fun, but often impractical, unrealistic, and downright pejorative to qualified applicants.

Take a Flyer on Poor Test Data?

A few articles ago, some readers mounted a micro-attack against the use of empirical data to make hiring decisions. The argument was, “Tests cannot tell us everything about a candidate. Sometimes you have to ‘go with your gut’ and ‘take a flyer’!” (I think that means to take a chance.) Okay. Of all the hiring opinions I have heard, that is certainly one of them. We can argue all day from the sidelines, but are most line managers willing to take a chance on an untried candidate? I floated the idea of “taking a flyer” with a few of them.
Their reaction was not positive. In fact, the managers I spoke to were downright hostile to the suggestion that they interview an unqualified applicant or hire someone who could not demonstrate skills before starting a job. Hmmm, I wonder why?

Conclusion

Testing is like quicksand. It looks harmless and easy, but it is very deep and has the potential to swallow users without a trace.
And finally, always be certain your tests can separate the Einsteins from the Clavins.