Make a project that will get data from a wikipedia page that will be used to instantiate a Country class object which I will use to insert data into a database table of Countries.
Whats the problem?
Well I have used jsoup to extract the list of countryNames from wikipedia and with simple string concatenation (https://en.wikipedia.org/wiki/ + countryName) to iterate country pages and extract data.
The problem I have is that the data from the pages changes, sometimes it is not where I expected it to be. My idea was to use webdriver and with help of By.xpath to select elements, this will not work out since I have already ran into a problem:
Religion in France can be found in the side table(the class="infobox geography vcard" one) but for Algeria it is not so easy, as the same xpath would give me the type of government.
But if I go to the religious part of the wikipage for France I will end up with France being a secular country instead of getting the list of religions in France. While for Algeria that is where I will find the religions since they are not listed in the side table(the class="infobox geography vcard" one).
Any idea how I should go on about this? Give up on webdriver and find some library like jsoup to extract data? Still i do not see a way around the difference between pages for different countries.