The problem is that you represent nominal variables on a continuous scale, which imposes (false) ordinal relations between classes when using machine learning methods. For example, if you encode an address as one of six possible integers, then address 1 is closer to address 2 than address 3,4,5,6. This will cause problems when you try to find out something.
Instead, translate the 6-digit categorical variable into six binary variables, one for each categorical value. Then your original function will give six functions in which there will be only one. Also, keep age as an integer value as you lose information, making it categorical.
As for the approaches, this is unlikely to be of great importance (at least at the initial stage). Go with what is easier for you to implement. However, before running on your test suite, make sure that you perform any cross-validation selection option in dev, since all algorithms have parameters that can significantly affect the accuracy of training.
source share