stata bleg: text matching
Attention Stata people (esp. Sr. Rossman): Let’s say I have a database of articles, with a variable holding each author’s name. I want to match the author’s name with other data (e.g., Fabio Rojas is matched with height 5′ 8″).
The command is merge 1:m, but there’s a problem. Let’s say that my author database doesn’t spell names consistently (e.g., Fabio G. Rojas or fabio rojas). Then those authors won’t match, and the merged data set will have missing data.
Is there a way in Stata to offer the programmer a list of possible matches, so as to minimize missing data caused by variations in spelling? If not, what program or language has an easy-to-use toolbox for this sort of stuff?
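To make the request concrete, here is a rough sketch of the "offer a choice of possible matches" idea. It is in Python rather than Stata, using the standard library's difflib module. The function names (normalize, candidate_matches), the sample author list, and the 0.6 similarity cutoff are all made up for illustration, and the cleanup step only handles case, periods, and extra spaces.

```python
import difflib

def normalize(name):
    # Lowercase, drop periods, and collapse runs of whitespace.
    return " ".join(name.lower().replace(".", "").split())

def candidate_matches(name, known_names, n=3, cutoff=0.6):
    # Map each normalized spelling back to the spelling used in the author file.
    # (Assumes no two known names normalize to the same string.)
    lookup = {normalize(k): k for k in known_names}
    # get_close_matches returns up to n candidates, most similar first.
    close = difflib.get_close_matches(normalize(name), list(lookup), n=n, cutoff=cutoff)
    return [lookup[c] for c in close]

# Hypothetical author file and two variant spellings from the article file.
authors = ["Fabio Rojas", "Gabriel Rossman"]
for variant in ["Fabio G. Rojas", "fabio rojas"]:
    print(variant, "->", candidate_matches(variant, authors))
```

The list of candidates could then be reviewed by hand, and the chosen spelling written back to the article file before running the usual exact merge.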