BookstoreScraper is an application based on Spring Boot. It scrapes data from the two biggest Polish online bookstores - EMPIK and MERLIN. You can scrape books using 3 options:
most precise book (you simply give a title and it looks for the closest matching book)
categorized book (currently 5 book categories are available: BIOGRAPHY, CRIME, GUIDES, FANTASY, ROMANCES)
There is also a ranking option. It compares books from each bookstore, and if a title appears in more than one bookstore, the book is placed higher in the ranking.
There is also a history system which tracks every action of the logged-in user, since simple security is provided.
There are a lot of classes, so I will paste only the most important ones. If you like, I can also post a GitHub link so you can check out the whole project and comment on the structure and on the unit tests that are not posted here.
Let's start with the class responsible for scraping data from the site:
MerlinSource, which implements the BookServiceSource interface. I'm not going to paste EmpikSource as it looks really similar.
BookService - it fetches the results from the scraping classes and wraps them into a Map.
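To make the description concrete, here is a minimal sketch of how a service could collect results from several sources into a Map. Everything in it is an assumption for illustration: the method names on BookServiceSource, the Bookstore enum, the use of plain String titles instead of a book type, and the class name BookServiceSketch are hypothetical and may not match the real project.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical enum naming the two bookstores.
enum Bookstore { EMPIK, MERLIN }

// Assumed shape of the source interface; the real BookServiceSource may differ.
interface BookServiceSource {
    Bookstore getName();
    List<String> getMostPreciseBook(String title);
}

// Sketch of a service that asks every source and groups the results by bookstore.
final class BookServiceSketch {
    private final List<BookServiceSource> sources;

    BookServiceSketch(List<BookServiceSource> sources) {
        this.sources = sources;
    }

    Map<Bookstore, List<String>> getMostPreciseBooks(String title) {
        Map<Bookstore, List<String>> result = new EnumMap<>(Bookstore.class);
        for (BookServiceSource source : sources) {
            // One map entry per bookstore, holding whatever that source returned.
            result.put(source.getName(), source.getMostPreciseBook(title));
        }
        return result;
    }
}
```

Taking the sources as a list means a third bookstore could be added without touching the service, and the sources can be replaced with stubs in unit tests.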
CategorizedBookRankingService - as I said, this is the service responsible for comparing books and creating the ranking.
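The ranking idea (a title that repeats across bookstores ranks higher) can be sketched roughly as below. This is not the actual CategorizedBookRankingService: the class name RankingSketch, the rank method, the use of plain title strings, and the trim/lower-case normalization are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class RankingSketch {

    // Counts in how many bookstores each normalized title appears,
    // then sorts the titles by that count, highest first.
    static List<Map.Entry<String, Integer>> rank(Map<String, List<String>> titlesByStore) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> titles : titlesByStore.values()) {
            // A set, so a title listed twice by the same store counts only once.
            Set<String> unique = new LinkedHashSet<>();
            for (String title : titles) {
                unique.add(title.trim().toLowerCase(Locale.ROOT));
            }
            for (String title : unique) {
                counts.merge(title, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> ranking = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so titles with equal counts keep their first-seen order.
        ranking.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return ranking;
    }
}
```

Exact string matching after normalization is the simplest choice; titles that differ by subtitle or edition would need a fuzzier comparison.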
Let's move on to the Account stuff.
AccountService - creates users.
LoggedAccountService - retrieves the id of the logged-in account.
MerlinUrlProperties - I have a .yml file with the corresponding URLs.
JSoupConnector - creates a connection to a URL and returns the document.
HistorySystemService - the service responsible for fetching and saving account history.
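As a rough illustration of the save/fetch responsibility described here, a history store could look like the sketch below. It keeps everything in memory and records actions as plain strings; the class name HistorySketch and both method names are hypothetical, and the real service presumably persists history through a repository instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a history service, keyed by account id.
final class HistorySketch {
    private final Map<Long, List<String>> actionsByAccount = new HashMap<>();

    // Appends one action to the given account's history.
    void save(long accountId, String action) {
        actionsByAccount.computeIfAbsent(accountId, id -> new ArrayList<>()).add(action);
    }

    // Returns a read-only view of the account's history (empty if none was saved).
    List<String> fetch(long accountId) {
        List<String> actions = actionsByAccount.getOrDefault(accountId, Collections.emptyList());
        return Collections.unmodifiableList(actions);
    }
}
```

Returning a read-only view keeps callers from changing the stored history by accident.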
I also wanted to post MerlinUrlPropertiesTest and AccountHistorySystemServiceTest, but I couldn't because the post was too long.
That's all for now. Looking forward to hearing your opinions. Thanks.
Bump. Really looking forward to seeing what I did wrong. Thanks.