Randy Collica is a modern-day treasure hunter. As a senior business analyst in Palo Alto-based Hewlett-Packard Co.’s customer data and knowledge services department, his job is to mine data in search of insight that can help marketers better understand various customer segments. He stumbled upon a veritable gold mine a few years ago as he riffled through notes taken by HP’s call-center representatives. “I just knew there had to be nuggets of valuable information in there, given the volume of data we had,” says Collica. “But I also knew that finding them would be impossible if we didn’t have a tool to automate the analysis.”

Although standard data-mining systems can detect patterns hidden within structured tables of information, such as the transactional data of an ERP system, they are essentially useless with unstructured data, and notes taken during a phone call are about as unstructured as data gets. So Collica turned to text mining, a type of data-mining technology that combs through text and gives it structure so it can be analyzed.

Collica’s hunch turned out to be right: text mining revealed, as one example, that customers in lower-value segments ask a lot more questions about business processes, such as HP’s contract-negotiation procedures, than do the company’s best customers. “That insight has been invaluable in helping marketers come up with solutions and campaigns targeted at different customer groups,” says Collica.

The latest generation of technology, developed by vendors flush with post-9/11 government investment (see sidebar, page 81), is still far from perfect. But it is allowing corporations with large data sets to perform important feats they couldn’t before. “It really is the next frontier of understanding in business intelligence,” says Martin Schneider, an analyst at The 451 Group in New York.

Key to the improvements have been advances in natural language processing, a method of extracting meaning from printed words that now allows the software to “understand” complex phrases about 80 percent of the time. Text-mining systems can also be programmed to assign value to expressions. Suppose a telesales representative has entered the following note: “Nov. 15 - Cstmr not happy w/cell phone. Wants to switch to Yellow Inc.” The software can recognize that November 15 is a date; that “cstmr” is a customer; that he has a cell phone and is unhappy, which is bad; and that he wants to switch to a competitor, which is worse.
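To make the telesales example concrete, here is a minimal sketch in Python of how such a note might be turned into fields. It is only an illustration: the lookup tables, the field names, and the `parse_note` function are all hypothetical, and it relies on a handful of hand-written rules, whereas commercial text-mining products use far richer natural language processing.

```python
import re

# Hypothetical lookup tables for illustration only; a real system would
# rely on large lexicons and language models, not a few hand-written rules.
ABBREVIATIONS = {"cstmr": "customer", "w/": "with "}
NEGATIVE_PHRASES = ("not happy", "unhappy", "upset")
COMPETITORS = ("Yellow Inc.",)

# Matches dates written like "Nov. 15" or "Nov 15".
DATE_PATTERN = re.compile(
    r"\b(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\.? (\d{1,2})\b"
)


def parse_note(note):
    """Turn a free-text call note into a small structured record."""
    # Expand shorthand so later rules see ordinary words.
    text = note
    for short, full in ABBREVIATIONS.items():
        text = re.sub(re.escape(short), full, text, flags=re.IGNORECASE)
    lowered = text.lower()

    date_match = DATE_PATTERN.search(text)
    competitor = next(
        (name for name in COMPETITORS if name.lower() in lowered), None
    )
    is_negative = any(phrase in lowered for phrase in NEGATIVE_PHRASES)

    return {
        "date": f"{date_match.group(1)} {date_match.group(2)}" if date_match else None,
        "subject": "customer" if "customer" in lowered else None,
        "product": "cell phone" if "cell phone" in lowered else None,
        "sentiment": "negative" if is_negative else "neutral",
        "competitor": competitor,
        # Naming a competitor alongside "switch" is treated as the worse signal.
        "churn_risk": competitor is not None and "switch" in lowered,
    }


record = parse_note(
    "Nov. 15 - Cstmr not happy w/cell phone. Wants to switch to Yellow Inc."
)
```

Each key in `record` corresponds to one of the things the software is said to recognize: the date, the customer, the product, the unhappiness, and the intent to switch to a competitor.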

Once that kind of information is extracted, it can be structured in a format similar to a database and further analyzed, often more quickly than a human analyst can locate his reading glasses.
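As a rough sketch of what that further analysis can look like, the Python below counts call topics by customer segment once the notes have been reduced to records. The records, the field names, and the `topic_counts_by_segment` function are invented for illustration and do not come from any company mentioned here.

```python
from collections import Counter

# Made-up records, shaped like the output of a text-mining pass.
records = [
    {"segment": "high-value", "topic": "product"},
    {"segment": "lower-value", "topic": "business process"},
    {"segment": "lower-value", "topic": "business process"},
    {"segment": "lower-value", "topic": "product"},
]


def topic_counts_by_segment(rows):
    """Count how often each topic appears within each customer segment."""
    counts = {}
    for row in rows:
        counts.setdefault(row["segment"], Counter())[row["topic"]] += 1
    return counts


summary = topic_counts_by_segment(records)
```

With the text reduced to rows like these, a question such as "which segment asks most about business processes?" becomes an ordinary tally.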

And the possibilities aren’t limited to customer service. San Francisco-based LoanPerformance, a provider of credit-risk-decision support tools for residential mortgage operators, uses text mining to offer its clients improved predictive analytics. Traditional risk-scoring solutions for loss mitigation and delinquency management incorporate only structured data such as a borrower’s interest rate, outstanding balance, and monthly payments. That ignores rich information that could help a mortgage servicer better determine how likely a delinquent borrower is to miss more payments or, ultimately, default. “If someone says they missed a payment because they lost their job, that’s different from ‘I forgot to send my check,’” explains Damien Weldon, director of mixed-data analytics at LoanPerformance. When the company included data mined from call-center conversations in its scoring calculations, accuracy rose by 15 to 20 percent.

Text mining is finding fertile ground in the life-sciences and pharmaceutical industries, too. The brain-tumor research department at Children’s Memorial Hospital in Chicago uses text mining to comb through reams of medical journals and unearth gene-pairing information that can accelerate critical scientific breakthroughs.

Pharmaceutical companies like Pfizer Inc. mine patent documentation for insight on new directions in research. “This serves as an early-warning system to identify trends,” explains Mark Burfoot, head of information management at Pfizer in St. Louis. “We can see what competitors are doing and, by linking that information with our own R&D data, make a decision about whether it’s an area we should be looking into.”

Text mining often starts as a way to automate manual processes and then spreads as companies see its potential. At Bank of America N.A., the E-commerce team used to manually read, sort, and categorize the comments it received on surveys and feedback forums. Now text mining does the job instantly, producing graphs and charts about prevailing attitudes that help the team prioritize proposed service enhancements. Johnson Controls Inc., the Milwaukee-based auto-parts supplier, first started using text mining in its call center several years ago, then began mining notes from the company’s 7,000 field-maintenance and installation engineers, searching for ways to improve products and reduce maintenance costs. More recently it has set up a program to scour Web logs and chat rooms to assess consumer opinions on car batteries. Next, the company plans to mine warranty claims for early warnings on product defects.