I think the formula is slightly incorrect. It should be:
    distance(mm) = (focal length(mm) * real height of object(mm) * image height(pixels))
                   ----------------------------------------------------------------------
                             (object height(pixels) * sensor height(mm))
You will need to know (1) the actual height of the object (in mm), (2) the height of the object in the image (in pixels), and (3) the height of the sensor (in mm).
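For reference, here is a minimal Java sketch of that formula; the method and parameter names are just illustrative, not from any particular library:

```java
/**
 * Estimates the distance to an object using the pinhole-camera relationship above.
 * All names are placeholders; plug in whatever variables your app already has.
 */
public final class DistanceEstimator {

    public static double distanceMm(double focalLengthMm,
                                    double realObjectHeightMm,
                                    double imageHeightPx,
                                    double objectHeightPx,
                                    double sensorHeightMm) {
        return (focalLengthMm * realObjectHeightMm * imageHeightPx)
                / (objectHeightPx * sensorHeightMm);
    }

    public static void main(String[] args) {
        // Example: 4 mm lens, 1.5 m tall object, 1080 px frame height,
        // object spans 240 px, 4.8 mm sensor height -> distance in mm.
        double d = distanceMm(4.0, 1500.0, 1080.0, 240.0, 4.8);
        System.out.println("Estimated distance: " + (d / 1000.0) + " m");
    }
}
```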
For (1) you need to measure in the real world, or calculate it relative to other things (for example, if there is something else of a known height at the same distance, you could determine the height relative to that object). If you can't do that, you need some sort of calibration routine. Depending on the application, you might consider building a database of 'known objects' and using their sizes as estimates. For example, if your app were a dashboard camera that measures the distance to vehicles in front of you, you could define standard 'car', 'truck', 'motorcycle', and 'bike rider' objects with assumed sizes. When measuring a distance, you would apply the appropriate pre-defined size in the formula. It wouldn't be 100% accurate, but it would be close (and you can fine-tune the database if you need more options).
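As a rough sketch of that 'known objects' idea, a simple lookup table of assumed heights could look like the following; the class names and heights are made-up placeholders, not measured values:

```java
import java.util.HashMap;
import java.util.Map;

public final class KnownObjectHeights {
    // Assumed average heights in mm; tune these for your own application.
    private static final Map<String, Double> HEIGHTS_MM = new HashMap<>();
    static {
        HEIGHTS_MM.put("car",        1500.0);
        HEIGHTS_MM.put("truck",      3000.0);
        HEIGHTS_MM.put("motorcycle", 1200.0);
        HEIGHTS_MM.put("bike rider", 1700.0);
    }

    /** Returns the assumed real-world height for a classified object, or null if unknown. */
    public static Double heightMm(String objectClass) {
        return HEIGHTS_MM.get(objectClass);
    }
}
```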
For (2) you need to do image processing to find the location of the object in the image. This may also mean classifying the object as one of your pre-defined types, as described above.
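Whichever detector or classifier you use, all the formula needs from it is the object's pixel height (plus its class, if you use a lookup table like the one above). A hypothetical glue method, assuming the Detection type shown here and the helpers sketched earlier, might look like this:

```java
public final class VehicleRanging {
    /** Minimal detection result; your detector will have its own equivalent. */
    public static class Detection {
        public final String objectClass;      // e.g. "car"
        public final int boundingBoxHeightPx; // height of the detected object in pixels
        public Detection(String objectClass, int boundingBoxHeightPx) {
            this.objectClass = objectClass;
            this.boundingBoxHeightPx = boundingBoxHeightPx;
        }
    }

    /**
     * Combines a detection with the camera parameters to estimate distance in mm.
     * Returns -1.0 if no size assumption exists for the detected class.
     */
    public static double estimateDistanceMm(Detection det,
                                            double focalLengthMm,
                                            double imageHeightPx,
                                            double sensorHeightMm) {
        Double realHeightMm = KnownObjectHeights.heightMm(det.objectClass);
        if (realHeightMm == null) {
            return -1.0; // unknown object class
        }
        return DistanceEstimator.distanceMm(focalLengthMm, realHeightMm,
                imageHeightPx, det.boundingBoxHeightPx, sensorHeightMm);
    }
}
```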
For (3) you can make an assumption, or you could keep a database of known hardware, detect which device is in use, and apply the appropriate size. For example, find out what sensor the Nexus 5 uses and put its dimensions in the database. Then when a user runs the app, check the phone model and, if it is a Nexus 5, load the corresponding sensor size.
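A sketch of that per-device lookup, keyed on the device model string (e.g. android.os.Build.MODEL), could look like this. The sensor heights below are placeholders you would replace with values from the spec sheets; on devices with the Camera2 API you may also be able to query CameraCharacteristics.SENSOR_INFO_PHYSICAL_SIZE instead of maintaining a table.

```java
import java.util.HashMap;
import java.util.Map;

public final class SensorDatabase {
    // Sensor heights in mm, keyed by device model string.
    // The values here are placeholders -- look up the real spec for each device you support.
    private static final Map<String, Double> SENSOR_HEIGHT_MM = new HashMap<>();
    static {
        SENSOR_HEIGHT_MM.put("Nexus 5", 3.5); // placeholder value
    }

    private static final double DEFAULT_SENSOR_HEIGHT_MM = 3.6; // fallback assumption

    /** Looks up the sensor height for the given device model, falling back to a default. */
    public static double sensorHeightMm(String deviceModel /* e.g. android.os.Build.MODEL */) {
        return SENSOR_HEIGHT_MM.getOrDefault(deviceModel, DEFAULT_SENSOR_HEIGHT_MM);
    }
}
```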