Validation: Sense or Nonsense?

Why don’t physicians bleed patients anymore to let out the “bad” blood? Why did they stop freely administering opiates, radioactive water, and addictive drugs as cure-alls? Why don’t they feel the bumps on your head to diagnose personality type? Because these are voodoo science. But voodoo science is not limited to medicine. It is still alive and well in testing and selection.

In fact, I thought it might be time to revisit some nonsense validation procedures. In other words, it is time to learn whether test usage predicts your candidate’s job performance or just the vendor’s revenue.

Occupational Norms

You might have heard it before: We have norms for truck drivers, customer service, managers, fill-in-the-blank positions. Yes, somewhere along the line, an enterprising organization had a bunch of tests, a bunch of data, and a bunch of occupational titles. Someone suggested, “What a great idea if we could examine the norms of each occupation and see if they are different!” So far so good. Then they said, “Why don’t we use this information to sell more tests!” Bad idea. Voodoo science. They should have stopped while they were ahead. I’ll explain why.

Let’s suppose someone surveyed 100 truck drivers (TD), 100 customer service reps (CSR), and 100 managers (MGR). Do you suppose they all perform equally? Were they all high performers? Identical within each group? Hardly! Just clustering norms into groups is no assurance that any single person in that group will conform to the overall norm. Nor does it tell you whether a conforming person is a high or low performer, whether a non-conforming person is a low or high performer, or whether any applicant who fits (or doesn’t fit) the norm will be good or bad.

Test sites that offer occupational norms are interesting, but they are only brain candy. It might be interesting to learn groups are different, but wouldn’t you really like to know whether the applicant can do the job? Deciding to hire or not hire a candidate based on whether he or she matches a group norm is voodoo science.
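For the statistically curious, here is a minimal Python sketch of the problem. It is a simulation under made-up assumptions, not real test data: the group labels, the norm means (40 and 60), and the performance scale are all hypothetical, and performance is deliberately generated independently of the test score. It illustrates how two occupations can have clearly different norms while the score says nothing about who performs well inside either group.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_group(rng, norm_mean, n=1000):
    """Hypothetical occupation: test scores cluster around norm_mean,
    while job performance is drawn independently of the score."""
    scores = [rng.gauss(norm_mean, 10) for _ in range(n)]
    performance = [rng.gauss(50, 10) for _ in range(n)]
    return scores, performance


rng = random.Random(0)  # fixed seed so the sketch is repeatable
td_scores, td_perf = simulate_group(rng, norm_mean=40)    # "truck drivers"
mgr_scores, mgr_perf = simulate_group(rng, norm_mean=60)  # "managers"

# The occupational norms differ by design...
norm_gap = sum(mgr_scores) / len(mgr_scores) - sum(td_scores) / len(td_scores)

# ...yet within each group the score does not track performance.
r_td = pearson(td_scores, td_perf)
r_mgr = pearson(mgr_scores, mgr_perf)

print(f"gap between group norms: {norm_gap:.1f} points")
print(f"score vs. performance, truck drivers: r = {r_td:.2f}")
print(f"score vs. performance, managers:      r = {r_mgr:.2f}")
```

In this toy setup, matching an applicant to the "manager" norm would sort people by occupation-typical scores and nothing else. Whether a real test behaves this way is exactly what a proper validation study has to establish.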

High-Low Groups

OK. Now we are going to ramp it up a bit. A few years back, a skeptical reader told me his test company divided people into a high- and a low-performance group, gave both groups the same test, developed norms, and used those norms to hire people. I said, “Sorry, but no.” He wrote back to say that his boss said there was more than one way to “validate” tests. “Yeah,” I thought, “There is a right way and a voodoo science way.”

Dividing groups into high and low performers has more than a few problems. In addition to the ones I described in occupational norms, we now have this thing called “performance.” In many cases, a performance rating is half fact and half opinion. We just do not know which is which. For example, I have seen many employees who were job duds, but skilled at sucking up to their managers. Co-workers knew they were slackers, but the co-workers’ opinions were unimportant. The manager thought the dud did a good job. Accurate testing depends on knowing whether you are measuring actual job skills or suck-up skills. In the meantime, you can add validation based on high-low group averages to validation based on performance group classifications. Both are examples of voodoo science because they tell you nothing about the individual.

Training Tests

At one time, I was like everyone else: wowed by a training test. I answered 20 questions about being reserved, quiet, thoughtful, and so forth. I got the results back in the workshop. The scores in the 367-page report indicated I was reserved, quiet, and thoughtful. Amazing! “What a powerful tool for hiring!” I thought. Wrong! Voodoo science.

Training tests can tell us a great deal about a person’s self-descriptions, but only if the test-taker is in touch with reality, honest, and knows himself or herself well. Those are big assumptions. The second thing about a training test is that it was not designed to predict differences in job performance. That is a special condition. For example, do you know equally successful people in your organization who have different personality types? Do you know people with the same personality type who perform at substantially different levels? If you hire salespeople, customer service reps, or managers with the same personality profile, do you also have the luxury of selecting customers, prospects, and subordinates with matching personalities?

Personality differences and similarities must be carefully thought out. The only time they can be used to predict job performance is when you can separate correlation (e.g., shark attacks and ice cream sales are positively correlated) from cause (e.g., sharks attack swimmers who resemble shark food). Whoa! I can hear the keys furiously typing … “But, what about culture and manager match???” Yes, that is also important, but successful hiring requires knowing your priorities, in this order: first, job skills; second, manager chemistry; and third, culture fit. Think about it. When was the last time you heard a manager comment, “Sure, he’s a job-dufus, but we have good chemistry!” or, “Yep, she cannot find her way out of a paper bag, but she really fits our culture!” Voodoo science: Job first, manager and cultural fit second.

Bogus Tests

Can you trust a scale that reports a different number every time you stand on it? A scale that is not calibrated to a uniform standard? A scale that reads weight when you really need volume? This nonsense represents what happens when test vendors fail to follow professional test development standards. A hiring test is supposed to measure something directly related to job performance, deliver stable scores over time, and accurately predict job performance. Anything less will produce bad hires and reject good candidates. If your vendor wants to sell you a hiring test, ask to see the report showing he or she followed professional hiring standards. Oh, yes, be sure to avoid vendors who promote matching candidates to occupational norms, high-low group validations, or cross-market their test for training. Users, not vendors, are responsible for test use. After all, your job is to make sure scores accurately predict performance for your job and your organization.
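The scale analogy maps onto two numbers you can ask a vendor about: stability of scores over time and how well scores track later job performance. The Python sketch below is only an illustration under stated assumptions. The function names, the five-candidate data sets, and the supervisor ratings are all invented for the example, and a real validation study would need far more people and a professionally designed procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def score_stability(first_sitting, second_sitting):
    """Do the same people get similar scores when tested twice?"""
    return pearson(first_sitting, second_sitting)


def performance_link(test_scores, job_performance):
    """Do test scores line up with later performance on the job?"""
    return pearson(test_scores, job_performance)


# Hypothetical scores for the same five people, tested twice.
first_sitting = [10, 20, 30, 40, 50]
stable_retest = [12, 19, 33, 38, 51]   # a scale that reads about the same
erratic_retest = [30, 10, 50, 20, 40]  # a scale that reads differently each time

stable_r = score_stability(first_sitting, stable_retest)
erratic_r = score_stability(first_sitting, erratic_retest)

# Hypothetical later performance ratings for the same five people.
job_performance = [2, 3, 3, 4, 5]
validity_r = performance_link(first_sitting, job_performance)

print(f"stable test, retest correlation:  {stable_r:.2f}")
print(f"erratic test, retest correlation: {erratic_r:.2f}")
print(f"score vs. job performance:        {validity_r:.2f}")
```

A test whose retest correlation looks like the erratic case is the scale that reports a different number every time you stand on it. A test can also be perfectly stable and still have no link to performance, which is the scale that reads weight when you need volume.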

Performance Prediction

Imagine attending a Witch Doctor convention. People attending the workshops are arguing violently about which color of chicken feather is most effective in curing disease. You suggest they use modern antibiotics. The group hurls back a challenge, “Antibiotics are not perfect … we reject them!” Then they go back to something they know: arguing about chicken feathers. Voodoo science.

Hiring is a probability game with both controllable and uncontrollable variables. We won’t worry about uncontrollable stuff, but we do know that antibiotics are better than feathers. We also know high-quality hiring depends on identifying critical job skills and accurately measuring them. Identification, accuracy, and criticality get more high performers. It’s a fact.

Casual interviews are the chicken feather of choice, but once they have screened out the blatantly unqualified candidates, they are no better than chance. This is due to applicant faking, unclear questioning techniques, personality factors, unclear objectives, and so forth. Adding behavioral event interview structure helps improve interview accuracy by clarifying critical factors, improving probing techniques, and making it hard for candidates to fake answers. But it is not easy to do and takes time to master.

Now here is the part people would rather not hear. Without getting all statistical, the more a test resembles the critical elements of the job, the more accurate it will probably be. For example, is an interview or actually solving a problem more accurate at predicting problem solving? A pencil-and-paper test or a realistic sales simulation at predicting sales success? A personality test or a planning exercise? Remember. Skills first. Manager and cultural fit second. What you measure will always be what you get. What you ignore (or mis-measure) is always left to chance.

Money Money Money

Here are some final thoughts about organizations both large and small. Line managers know the most about employee performance problems and their associated costs. They have the budget and they feel the pain. Aside from lawsuits and EEOC challenges, HR often has no idea what bad hires cost. I think this is because many HR departments have little budget and less pain. Although HR has the greatest potential to do something about bad hires, it tends to do the least. I think it all comes down to money and time.

HR is seldom willing to spend the money and take the time to calculate employee ROI. Instead, it tends to spring for web-screening services that reduce its own department workload, but do little to improve employee quality organization-wide. Meanwhile, line managers are left to their own devices. Considering the difference between its perceived and potential value to organizational profit, my advice to HR is to toss out voodoo science practices and work with line managers to calculate the financial benefit of reducing turnover, improving new-hire performance, and reducing training time. Then ask line managers for the budget to do something about it.
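The arithmetic does not have to be fancy. Here is a back-of-the-envelope Python sketch for the turnover piece alone. Every figure in it (headcount, turnover rates, replacement cost, program cost) is a placeholder I made up for illustration, and the function names are hypothetical. Plug in numbers from your own line managers, and treat the result as a rough estimate rather than an audited one.

```python
def turnover_savings(headcount, turnover_before, turnover_after, cost_per_replacement):
    """Annual savings from replacing fewer people.

    Turnover rates are fractions per year (0.30 means 30%).
    cost_per_replacement should bundle recruiting, training, and lost output.
    """
    avoided_replacements = headcount * (turnover_before - turnover_after)
    return avoided_replacements * cost_per_replacement


def simple_roi(savings, program_cost):
    """Net return per dollar spent on better hiring practices."""
    return (savings - program_cost) / program_cost


# Placeholder figures only; substitute your organization's own.
savings = turnover_savings(
    headcount=200,
    turnover_before=0.30,
    turnover_after=0.20,
    cost_per_replacement=15_000,
)
roi = simple_roi(savings, program_cost=50_000)

print(f"estimated annual savings: ${savings:,.0f}")
print(f"return per dollar spent:  {roi:.1f}")
```

Savings from better new-hire performance and shorter training time can be added the same way, one line item at a time, once line managers supply the numbers.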