Internet Sourcers are continuously looking for new places on the Web to search for talent. The traditional “non-traditional” methods of building long search strings, flipping and x-raying sites, and peeling back URLs are here to stay, but we are always in need of ways to uncover new data that can be flipped and x-rayed. Lately, Internet researchers have been talking about a phenomenon called the “Invisible Web,” which refers to places on the Internet that are not readily accessible by traditional search engines.
When search engines index pages on the Web, they do so through very powerful Web spiders and robots that follow links and index every word on every page they reach. A growing number of pages on the Web simply cannot be indexed by this spidering technology: these make up the Invisible Web. Since Invisible Web pages are not indexed by the search engines, they will never appear in search results.
These pages should not be confused with Web pages that are indexable by the search engines but for various reasons are not actually indexed. While the major search engines do an excellent job of indexing a significant portion of the Web, the process is very intensive and demanding on the host computers. Simply put, search engines are not equipped to index every indexable page, particularly since the number of new pages is increasing dramatically every day. Current statistics indicate that even the best search engines index less than one-third of what is indexable.
The Invisible Web is different. It consists primarily of information stored in databases. When a spider comes across a database, it typically can index only the address of the database; it cannot crawl through any of the information actually stored in it. This presents a challenge for researchers, because there are thousands of free-to-search databases filled with extremely valuable information that is not readily accessible through traditional search methods.
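To make the distinction concrete, here is a minimal sketch of a link-following spider run against a made-up miniature site. The page addresses, the `SITE` map, and the `DATABASE_RECORDS` list are all hypothetical; the point is only that records produced in response to a search form are never linked from any page, so a spider that follows links never reaches them.

```python
from collections import deque

# Hypothetical miniature site: each page maps to the links a spider can see on it.
SITE = {
    "/": ["/about", "/patents/search"],
    "/about": ["/"],
    "/patents/search": [],  # a search form; results are generated per query
}

# Hypothetical records that exist only as answers to a query, never as links.
DATABASE_RECORDS = ["/patents/record?id=1", "/patents/record?id=2"]


def crawl(start):
    """Follow links breadth-first and return the pages reached, in order."""
    indexed = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        indexed.append(page)
        for link in SITE.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


indexed_pages = crawl("/")
invisible = [record for record in DATABASE_RECORDS if record not in indexed_pages]
print("Indexed by the spider:", indexed_pages)
print("Never reached:", invisible)
```

In this toy example the spider indexes the address of the search page but none of the records behind it, which mirrors the situation described above.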
In order to access the information, a researcher needs to visit the website that houses the database. To do this, the researcher must first know that the database actually exists. Search strings that attempt to uncover information about a candidate, a skill set, or a competitor will not necessarily identify the databases that may house this information.
Fortunately, research librarians have been on the case for the last few years, and some have compiled lists of databases that can be accessed by the general public but that most people do not know exist. Gary Price, a research librarian at George Washington University, is at the forefront of this work. He has developed one of the most comprehensive collections of lists and links to databases. Gary’s website contains links to hundreds of lists and databases on the Web, including such things as transcripts of speeches by business and government leaders, databases of press releases, links to worldwide news resources, lists of the company rankings published by major business magazines and trade journals, demographic and economic resources, patent information, and much, much more.
One of the most beneficial features of Gary’s site is that he has created a database of his database links. When visiting the site, click on the “Direct Search” link, and it will take you to the database. Simply enter the topic of interest into his search interface and the tool will find the database listings that might best fit your needs. It is not a very sophisticated database, but it is a great timesaver. For the innovative Internet Sourcer, sites like Gary’s offer a goldmine of information. Access to databases that many sourcers never knew existed greatly increases the amount of information available for finding the right candidates.
For example, a researcher looking for a senior executive with particular industry experience now has access to a database of press releases, which she can search to find the names of potential candidates. While press releases sometimes show up in the results of a general search engine, it is much easier to search a captive database. Similarly, a sourcer looking for names of PR Managers can search the press releases to find who published them. Typically the PR contact name and phone number are listed at the bottom of the press release. An Internet Sourcer looking to find a senior developer with specific industry expertise can research the patent databases to find names of individuals listed on patents. Often the names of all the contributors, both major and minor, are listed in the patent submissions. These names can then be researched further through the search engines to find additional information. While the individuals listed in the press releases or patent information may not be the exact target candidates, links to their company websites can lead to a wealth of resources that can be flipped, x-rayed, and peeled back for additional information.
In addition to Gary Price’s site, there are other sites that provide resource links to the Invisible Web. They include:
- Invisibleweb.com – This aptly named site, developed by IntelliSeek, is a database of links to searchable resources on the Web. IntelliSeek uses a custom-developed crawler that detects just the searchable sources on the Web. Subject-matter experts (yes, humans) then review each resource to determine the type and quality of its information, indexing and categorizing it into the database.
- Fossick.com – Fossick is an Australian/New Zealand term which is used to mean “ferret out, rummage, and search”. This site lets you do just that, offering a selective collection of over 3,000 specialty search engines and topical guides.
- Roger Williams University Library – Offers its own links to Invisible Web resources and databases.
- Wall Street Executive Library – A site developed to allow senior-level professionals to easily do business research in the areas of Current Events, Finance, Stocks, Economic Trends, Demographics, and other industry information. It provides links to many of the top resources on the Web.
- Complete Planet – As of the end of December 2000, this site provided links to over 38,500 databases and search resources on the Internet.
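Once a press release database has been found through one of these sites, the earlier observation that the PR contact's name and phone number usually sit at the bottom of a release can be turned into a small script. The sketch below is illustrative only: the sample release, the names in it, and the assumed layout (a "Contact:" line, the name on the next line, and a phone number at the start of a later line) are all hypothetical, and real releases vary widely.

```python
import re

# Hypothetical press release text; real releases differ in layout.
SAMPLE_RELEASE = """\
Acme Widgets Names New Chief Technology Officer

SPRINGFIELD - Acme Widgets today announced that Jane Doe has joined as CTO.

Contact:
Pat Example
PR Manager, Acme Widgets
555-0100
"""

# Assumed layout: "Contact:" on its own line, the contact name on the next
# line, then a phone number (###-####) at the start of some later line.
CONTACT_PATTERN = re.compile(
    r"^Contact:\s*\n(?P<name>[^\n]+)\n(?:[^\n]*\n)*?(?P<phone>\d{3}-\d{4})",
    re.MULTILINE,
)


def extract_contact(release_text):
    """Return the contact name and phone as a dict, or None if the layout differs."""
    match = CONTACT_PATTERN.search(release_text)
    if match is None:
        return None
    return {"name": match.group("name").strip(), "phone": match.group("phone")}


print(extract_contact(SAMPLE_RELEASE))
```

A release that does not follow the assumed layout simply yields `None`, so any names gathered this way should still be checked by hand.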
The bottom line is that, through the work of companies like IntelliSeek and researchers like Gary Price, the Invisible Web is no longer invisible. Increasingly there are resources to uncover and reach these databases. Internet research librarians are having a field day, and so should Internet Sourcers seeking candidates and competitive intelligence. The research process includes just a few simple steps. First, you must uncover the appropriate database to search by exploring some of the sites listed above. Second, you must query these databases to gather useful information. Finally, you should use that information as input into search strings to uncover yet more useful information. In the end you will save time and money in finding the “passive” candidates for your hard-to-fill positions.
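The final step, feeding names gathered from a database back into search strings, can be sketched in a few lines. This is a minimal illustration rather than a prescribed method: the names and keywords are hypothetical, and the quoted-phrase and `AND` syntax is an assumption, since operator support differs from one search engine to another.

```python
def build_search_strings(names, keywords):
    """Pair every name with every keyword as a quoted Boolean search string."""
    strings = []
    for name in names:
        for keyword in keywords:
            strings.append(f'"{name}" AND "{keyword}"')
    return strings


# Hypothetical names pulled from a press release or patent database.
names_from_database = ["Jane Doe", "Pat Example"]
# Hypothetical terms a sourcer might pair with each name.
keywords = ["resume", "Acme Widgets"]

for search_string in build_search_strings(names_from_database, keywords):
    print(search_string)
```

Each resulting string can then be pasted into a general search engine and refined with the usual flipping, x-raying, and peeling-back techniques.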