We have around 50 microservices. To monitor them, we created an application that checks the availability of critical functionality in each service and sends an email alert if any service is down. The problem with this approach is that it consumes too many SiteScope alert licenses, which are costly. Is there a better way to monitor microservices?
I'm not particularly familiar with Sitescope, so can't compare like-for-like.
In your approach:
> an application that checks the availability of critical functionalities and if any of the services are down alert will be sent on mail
Are you checking availability (i.e. can the test application call this function) _or_ functionality (i.e. making a test/synthetic call to check that the functionality actually works)?
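To make the distinction concrete, here is a minimal sketch using only the Python standard library. The function names, the `expected_key` parameter and the idea that the service returns a JSON object are all assumptions for illustration, not anything taken from your setup or from SiteScope.

```python
import json
import urllib.error
import urllib.request


def http_get(url, timeout=3.0):
    """Return (status_code, body_bytes); status_code is None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as exc:
        # The service answered, but with an error status (4xx/5xx).
        return exc.code, exc.read()
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, timeout, etc.
        return None, b""


def is_available(url, fetch=http_get):
    """Availability: the service answers at all and is not failing with 5xx."""
    status, _ = fetch(url)
    return status is not None and status < 500


def is_functional(url, expected_key, fetch=http_get):
    """Functionality: a synthetic call returns 200 and a plausible JSON payload.

    `expected_key` is a hypothetical field that a healthy response would contain.
    """
    status, body = fetch(url)
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and expected_key in payload
```

A service can pass the first check and still fail the second, for example when it returns 200 with an error page. The `fetch` parameter is only there so the checks can be exercised without a live service.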
In general, I'd aim to build three layers of instrumentation in a microservice application:
* Log aggregation i.e. consuming and indexing logs from all microservices (using ELK stack, Splunk, cloud provider log tools, etc). These allow you to investigate system behaviour, but can also drive dashboards and alerts.
* Time series metrics for dashboards and alerts (using Prometheus and Grafana, or a commercial tool like Datadog or Amazon CloudWatch, depending on where/how you host). This data might be emitted from multiple sources e.g. infrastructure, app frameworks, service code...
* Distributed tracing (using Jaeger, OpenTracing, or a commercial option) to understand service interactions.
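As a rough illustration of the metrics layer, the sketch below counts requests per service and outcome and renders them in the Prometheus text exposition format, so that one central alert rule can cover all services instead of one alert per check. It is hand-rolled for illustration only: the class name, metric name and labels are assumptions, and in practice you would more likely use an official Prometheus client library.

```python
from collections import Counter


class RequestMetrics:
    """Hypothetical in-process counter, keyed by (service, outcome)."""

    def __init__(self):
        self._counts = Counter()

    def record(self, service, outcome):
        """Count one handled request, e.g. record("orders", "ok")."""
        self._counts[(service, outcome)] += 1

    def render(self):
        """Render all counters as Prometheus text exposition format."""
        lines = [
            "# HELP requests_total Requests handled, by service and outcome.",
            "# TYPE requests_total counter",
        ]
        for (service, outcome), value in sorted(self._counts.items()):
            lines.append(
                f'requests_total{{service="{service}",outcome="{outcome}"}} {value}'
            )
        return "\n".join(lines) + "\n"


metrics = RequestMetrics()
metrics.record("orders", "ok")
metrics.record("orders", "error")
print(metrics.render())
```

Each service would expose the rendered text on an HTTP endpoint for the metrics server to scrape; alerting on the error rate then happens in one place.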
This excerpt from the book Production-Ready Microservices, by Susan Fowler, provides a list of useful questions to assess the production-readiness of microservices and a microservice ecosystem. The questions are related to key metrics, logging, dashboards, alerting, and on-call rotations to give you an idea of what kinds of tools and processes you need to have in place to effectively monitor microservices.