I am working on a project that stores the names of various drugs. I often find entries like Proscratinol and Proscratinol XR (extended release). I would like a query that picks up all names of this kind so that I can put the “parent” drug in a table and have these “baby” drugs reference it. That way, when I write a query to count the number of drugs, I don't double-count Proscratinol just because it has XR, CR, and other versions. I wrote the following to take a stab at it.
;with x as (
    select drug_name from rx group by drug_name
)
select distinct *
from x, x as x2
where LEFT(x2.drug_name, 5) = LEFT(x.drug_name, 5)
  and x.drug_name != x2.drug_name
This gives me a list of all drugs whose names share the same first five letters. Five is completely arbitrary. What I have so far is good enough, but I would like to order the results by descending similarity. To do that, I need to know how many characters, counting from the left, the two names have in common.
e.g. Phenytoin and Phelip would be 3 (their first three letters are the same)
;with x as (
    select drug_name from rx group by drug_name
)
select x.drug_name as xDrugName,
       x2.drug_name as x2DrugName,
       -- 6 is hard-coded here; this is the number I want computed per pair
       case when LEFT(x2.drug_name, 6) = LEFT(x.drug_name, 6)
            then LEN(LEFT(x.drug_name, 6))
            else 0
       end as matchLength
from x, x as x2
where LEFT(x2.drug_name, 5) = LEFT(x.drug_name, 5)
  and x.drug_name != x2.drug_name
group by x.drug_name, x2.drug_name
Instead of hard-coding the integer passed to LEFT in the query above, I need an expression that returns how many leading characters the two strings have in common. Is there a good way to do this?
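To make the target concrete, here is a rough sketch of the kind of result I am after, not a finished solution. It assumes SQL Server, assumes drug names are at most 50 characters long, and uses a hypothetical helper CTE called nums (built from sys.all_objects) to supply the integers 1 to 50. Because a match on the first n characters implies a match on every shorter prefix, the largest matching n is the shared prefix length.

```sql
-- Sketch only: assumes SQL Server and names of at most 50 characters.
;with x as (
    select drug_name from rx group by drug_name
),
nums as (
    -- hypothetical numbers list 1..50; any tally table would do
    select top (50) row_number() over (order by (select null)) as n
    from sys.all_objects
)
select x.drug_name  as xDrugName,
       x2.drug_name as x2DrugName,
       max(nums.n)  as commonPrefixLength   -- longest prefix on which both names agree
from x
join x as x2
  on x.drug_name <> x2.drug_name
join nums
  on nums.n <= LEN(x.drug_name)
 and nums.n <= LEN(x2.drug_name)
 and LEFT(x.drug_name, nums.n) = LEFT(x2.drug_name, nums.n)
group by x.drug_name, x2.drug_name
order by commonPrefixLength desc;
```

Pairs that do not even share a first letter drop out of this result because of the inner join, and whether the comparison is case-sensitive depends on the column's collation. It also compares every pair of names against up to 50 prefix lengths, so it may be slow on a large table.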