If you want to accept all of these values for plant height (3 mm, 3 cm, 3 inches, 3", 3', 3), then what you want to do is impossible. There is no automated way to determine whether a bare "3" means 3 cm or 3 inches.
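To make the point concrete, here is a minimal Python sketch of a parser that converts heights to millimetres only when a unit is stated, and refuses to guess when it is missing. The function name `parse_height_mm`, the unit table, and the choice of millimetres as the target unit are assumptions made for illustration, not something from the question.

```python
import re

# Hypothetical unit table: multipliers to millimetres.
_MM_PER_UNIT = {
    "mm": 1.0,
    "cm": 10.0,
    "in": 25.4,
    "inch": 25.4,
    "inches": 25.4,
    '"': 25.4,
    "ft": 304.8,
    "'": 304.8,
}

# A number, optionally followed by a unit made of non-digit characters.
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(\D*?)\s*$")


def parse_height_mm(text):
    """Return the height in millimetres, or raise ValueError if unclear."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a measurement: {text!r}")
    number, unit = match.groups()
    unit = unit.strip().lower()
    if unit == "":
        # A bare "3" is ambiguous, so reject it instead of guessing.
        raise ValueError(f"no unit given, cannot guess: {text!r}")
    if unit not in _MM_PER_UNIT:
        raise ValueError(f"unknown unit {unit!r} in {text!r}")
    return float(number) * _MM_PER_UNIT[unit]
```

With this sketch, `parse_height_mm("3 cm")` returns `30.0`, while `parse_height_mm("3")` raises `ValueError`. Rejected values would have to go back to the user or to a manual review step.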
There are several different ways to solve this problem. But the approach you take depends in part on who is interested in the data.
If the data your users upload is mostly used by those same users, they will put up with more work to get it right. If the data is mostly used by other people, they won't.
You can be strict or liberal in what you accept. The stricter you are, the more of your process you can automate.
You could be really strict by giving your users an Excel spreadsheet with validation built in.
If you need to be very liberal in what you accept, you probably need several tables that map one standard value to many non-standard values. A table for mapping colors might look like this:

    Standard   User-supplied
    --
    brown      brown
    brown      brownish
    brown      brn
    brown      puce
At first you will have to populate this table by hand. But the more data you process, the more likely it is that a user-supplied color will already be in the table.
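As a rough sketch of how such a mapping could be applied, the Python below looks each user-supplied color up in a dictionary and collects anything it does not recognize so a person can add it later. The names `COLOR_MAP` and `standardize_colors`, and the choice to lowercase and trim values before the lookup, are assumptions for illustration.

```python
# Hypothetical mapping, seeded with the example colors from this answer.
COLOR_MAP = {
    "brown": "brown",
    "brownish": "brown",
    "brn": "brown",
    "puce": "brown",
}


def standardize_colors(values, mapping):
    """Map raw values to standard ones; report values not yet in the mapping."""
    cleaned = []
    unmapped = []
    for raw in values:
        key = raw.strip().lower()
        if key in mapping:
            cleaned.append(mapping[key])
        else:
            # Leave a gap and remember the value for manual review.
            cleaned.append(None)
            if key not in unmapped:
                unmapped.append(key)
    return cleaned, unmapped
```

Calling `standardize_colors(["Brn", " puce ", "tan"], COLOR_MAP)` would standardize the first two values and report `"tan"` as unmapped. Once someone decides what `"tan"` should map to and adds it, later uploads containing it go through without manual work.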
This table doesn't have to live in the database. It can be external, and the data-cleaning process can be external too. The "T" part of ETL for data warehouses is sometimes done this way. ("ETL" means "Extract, Transform, and Load.")
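One way an external mapping might be used is sketched below: the mapping is kept in a CSV file outside the database and applied as a transform step before loading. The column names `standard` and `user_supplied`, the `color` field, and the function names are hypothetical. An in-memory string stands in for the file so the example runs on its own.

```python
import csv
import io


def load_mapping(csv_file):
    """Read a two-column CSV into a {user_supplied: standard} dictionary."""
    reader = csv.DictReader(csv_file)
    return {
        row["user_supplied"].strip().lower(): row["standard"].strip()
        for row in reader
    }


def transform(rows, mapping):
    """Yield copies of rows with 'color' standardized when a mapping exists."""
    for row in rows:
        key = row["color"].strip().lower()
        # Unknown colors pass through unchanged in this sketch.
        yield {**row, "color": mapping.get(key, row["color"])}


# Stand-in for an external file such as a CSV kept next to the ETL scripts.
SAMPLE_CSV = """standard,user_supplied
brown,brown
brown,brownish
brown,brn
brown,puce
"""

mapping = load_mapping(io.StringIO(SAMPLE_CSV))
uploaded = [{"plant": "A", "color": "Brn"}, {"plant": "B", "color": "teal"}]
cleaned = list(transform(uploaded, mapping))
```

Here plant A ends up with the color `brown`, and plant B keeps `teal` because the mapping has no entry for it. Whether unknown values should pass through, be rejected, or be queued for review is a design choice that depends on how liberal you want to be.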
Mike Sherrill 'Cat Recall'