How to Build a Flexible API Monitoring Exporter with Gin-Vue-Admin and Prometheus
This article walks through extending a simple Prometheus Exporter into a full-featured API monitoring solution using Gin-Vue-Admin, detailing backend task scheduling, database schema, multi-protocol checks (HTTP, TCP, DNS, ICMP), dynamic cron management, and frontend integration for managing and visualizing health metrics.
Previously a simple Prometheus Exporter was created. The new version adds several features:
Interface management via a frontend page with data stored in a database
Response validation in addition to status‑code checks
Frontend display of interface availability percentage
Flexible configuration of probe items, including adjustable frequency, result validation, and enable/disable control
The implementation follows a straightforward flow:
User creates a probe task, which is saved to the database
The backend registers a scheduled job for the new task
A backend goroutine watches for updates or deletions and adjusts the scheduled jobs accordingly
The probe task generates Prometheus metrics for monitoring and alerting
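The third step of this flow is essentially a small decision table: given what is stored in the database and what is currently registered in the scheduler, decide whether to add, replace, remove, or leave a job. The following is a minimal sketch of that decision in isolation. The names taskState, decide, and the action constants are hypothetical and do not appear in the project; the sketch only mirrors the behaviour described above.

```go
package main

// action describes what the scheduler should do for one probe task.
type action string

const (
	actionAdd     action = "add"
	actionReplace action = "replace"
	actionRemove  action = "remove"
	actionNone    action = "none"
)

// taskState is a hypothetical, trimmed-down view of a stored probe task.
type taskState struct {
	Cron    string
	Enabled bool
	Deleted bool
}

// decide compares the stored task with the currently registered job.
// registeredSpec is ignored when registered is false.
func decide(t taskState, registered bool, registeredSpec string) action {
	active := t.Enabled && !t.Deleted
	switch {
	case !registered && active:
		return actionAdd
	case !registered:
		return actionNone
	case !active:
		return actionRemove
	case registeredSpec != t.Cron:
		return actionReplace
	default:
		return actionNone
	}
}
```

Keeping the decision separate from the timer calls makes it easy to unit-test without a running scheduler.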
Backend Implementation
Define the database model for a probe task:
<code>type DialApi struct {
	global.GVA_MODEL
	Name string `json:"name" form:"name" gorm:"column:name;default:'';comment:interface name;size:32;"`
	Type string `json:"type" form:"type" gorm:"column:type;default:'';comment:probe type HTTP TCP ICMP DNS;size:8;"`
	HttpMethod string `json:"httpMethod" form:"httpMethod" gorm:"column:http_method;default:GET;comment:HTTP request method;size:8;"`
	Url string `json:"url" form:"url" gorm:"column:url;comment:probe address;size:255;" binding:"required"`
	RequestBody string `json:"requestBody" form:"requestBody" gorm:"column:request_body;comment:request body;size:255;"`
	Enabled *bool `json:"enabled" form:"enabled" gorm:"column:enabled;default:false;comment:whether the task is enabled;" binding:"required"`
	Application string `json:"application" form:"application" gorm:"column:application;comment:owning application;size:32;"`
	ExceptResponse string `json:"exceptResponse" form:"exceptResponse" gorm:"column:except_response;comment:expected response content;size:32;"`
	HttpStatus int `json:"httpStatus" form:"httpStatus" gorm:"column:http_status;type:smallint(5);default:200;comment:expected status code;size:16;"`
	Cron string `json:"cron" form:"cron" gorm:"column:cron;comment:cron expression;size:20;"`
	SuccessRate string `json:"successRate" form:"successRate" gorm:"column:success_rate;comment:probe success rate"`
	CreatedBy uint `gorm:"column:created_by;comment:creator"`
	UpdatedBy uint `gorm:"column:updated_by;comment:last updater"`
	DeletedBy uint `gorm:"column:deleted_by;comment:deleter"`
}
</code>The struct captures fields such as the probe URL, expected response, status code, and cron expression, all of which are filled in through the frontend.
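One detail of the model is that Enabled is a *bool rather than a bool. A likely reason (an assumption, the article does not state it) is the binding:"required" tag: Gin's default validator treats a zero value as missing, so a plain bool set to false would be rejected. A pointer lets the handler tell "not sent" apart from "sent as false". The sketch below shows that distinction using only encoding/json; probeForm and enabledState are hypothetical names for illustration.

```go
package main

import "encoding/json"

// probeForm is a hypothetical, reduced version of the request payload.
type probeForm struct {
	Url     string `json:"url"`
	Enabled *bool  `json:"enabled"`
}

// enabledState reports whether "enabled" was present in the JSON payload
// and, if so, which value it carried.
func enabledState(payload string) (present bool, value bool, err error) {
	var f probeForm
	if err = json.Unmarshal([]byte(payload), &f); err != nil {
		return false, false, err
	}
	if f.Enabled == nil {
		// The key was omitted (or explicitly null).
		return false, false, nil
	}
	return true, *f.Enabled, nil
}
```

Code that dereferences Enabled should therefore guard against nil, since records created outside the validated API path may not set it.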
The Run method loads the probe tasks from the database and registers a scheduled job for each one, using the timer utilities of gin-vue-admin:
<code>func (j *StartDialApi) Run() {
	var dialService = service.ServiceGroupApp.DialApiServiceGroup.DialApiService
	pageInfo := dialApiReq.DialApiSearch{}
	dialApiInfoList, _, err := dialService.GetDialApiInfoList(pageInfo)
	if err != nil {
		global.GVA_LOG.Error("failed to load the probe task list", zap.Error(err))
		return
	}
	for _, dialApi := range dialApiInfoList {
		// Normalize the stored expression before handing it to the timer.
		dialApi.Cron = utils.ConvertToCronExpression(dialApi.Cron)
		dialService.AddSingleDialApiTimerTask(dialApi)
	}
}
</code>Run only gathers the tasks and normalizes their cron expressions. AddSingleDialApiTimerTask then checks whether a scheduled task already exists and, if it does not and the task is enabled, adds it to the timer.
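The helper utils.ConvertToCronExpression is not shown in the article. Because the timer is created with cron.WithSeconds(), expressions need a leading seconds field, so one plausible job for such a helper is to turn a standard five-field expression into a six-field one. The following is a sketch under that assumption only; toSecondsCron is a hypothetical stand-in, not the project's implementation.

```go
package main

import "strings"

// toSecondsCron is a hypothetical stand-in for utils.ConvertToCronExpression.
// It assumes the stored expression uses standard five-field cron syntax and
// prepends a "0" seconds field. Any other input (for example an expression
// that already has six fields) is returned with its whitespace normalized.
func toSecondsCron(expr string) string {
	fields := strings.Fields(expr)
	if len(fields) == 5 {
		return "0 " + strings.Join(fields, " ")
	}
	return strings.Join(fields, " ")
}
```

For example, toSecondsCron("*/5 * * * *") yields "0 */5 * * * *", which fires at second zero of every fifth minute.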
The core timer-addition logic is encapsulated in AddSingleDialApiTimerTask:
<code>func (dialService *DialApiService) AddSingleDialApiTimerTask(dialApiEntity dialApi.DialApi) {
	// Cron expressions carry a leading seconds field.
	option := []cron.Option{cron.WithSeconds()}
	idStr := strconv.Itoa(int(dialApiEntity.ID))
	cronName := global.DIAL_API + idStr
	taskName := global.DIAL_API + idStr
	// A task should run only while it is enabled and not soft-deleted.
	active := dialApiEntity.Enabled != nil && *dialApiEntity.Enabled && !dialApiEntity.DeletedAt.Valid

	task, found := global.GVA_Timer.FindTask(cronName, taskName)
	if found {
		switch {
		case !active:
			global.GVA_LOG.Info(fmt.Sprintf("stopping probe task: %s", dialApiEntity.Name))
			global.GVA_Timer.RemoveTaskByName(cronName, taskName)
		case task.Spec != dialApiEntity.Cron:
			// The schedule changed: drop the old job and register it again.
			global.GVA_LOG.Info(fmt.Sprintf("rescheduling probe task: %s", dialApiEntity.Name))
			global.GVA_Timer.Clear(cronName)
			dialService.AddSingleDialApiTimerTask(dialApiEntity)
		}
		return
	}
	if !active {
		return
	}

	_, err := global.GVA_Timer.AddTaskByFunc(cronName, dialApiEntity.Cron, func() {
		results := global.HealthCheckResults
		// Touch both series so they exist (with value 0) before the first result.
		results.WithLabelValues(dialApiEntity.Name, dialApiEntity.Type, "success").Add(0)
		results.WithLabelValues(dialApiEntity.Name, dialApiEntity.Type, "failed").Add(0)

		var ok bool
		var err error
		switch dialApiEntity.Type {
		case "HTTP":
			ok = checkHTTP(dialApiEntity)
		case "TCP":
			ok, err = checkTCP(dialApiEntity)
		case "DNS":
			ok, err = checkDNS(dialApiEntity)
		case "ICMP":
			ok, err = checkICMP(dialApiEntity)
		default:
			global.GVA_LOG.Error("unknown probe type", zap.String("DetectType", dialApiEntity.Type))
			return
		}

		status := "failed"
		if ok {
			status = "success"
		}
		results.WithLabelValues(dialApiEntity.Name, dialApiEntity.Type, status).Add(1)
		logHealthCheckResult(ok, err, dialApiEntity, dialApiEntity.Type)
		getSuccessRateFromPrometheus(dialApiEntity)
	}, taskName, option...)
	if err != nil {
		global.GVA_LOG.Error(fmt.Sprintf("failed to add probe timer task %s (%s): %s", idStr, dialApiEntity.Name, err.Error()))
	}
}
</code>When a task runs, it records success/failure metrics, logs the result, and updates the success rate by querying Prometheus:
<code>func getSuccessRateFromPrometheus(dialApiEntity dialApi.DialApi) {
	// Per-second rate of successful probes and of all probes over the last hour.
	successQuery := fmt.Sprintf(`sum(rate(health_check_results{name="%s", type="%s", status="success"}[1h]))`, dialApiEntity.Name, dialApiEntity.Type)
	totalQuery := fmt.Sprintf(`sum(rate(health_check_results{name="%s", type="%s"}[1h]))`, dialApiEntity.Name, dialApiEntity.Type)
	successResponse, err := utils.QueryPrometheus(successQuery, global.GVA_CONFIG.Prometheus.Address)
	if err != nil {
		global.GVA_LOG.Error("Failed to query success rate from Prometheus", zap.Error(err))
		return
	}
	totalResponse, err := utils.QueryPrometheus(totalQuery, global.GVA_CONFIG.Prometheus.Address)
	if err != nil {
		global.GVA_LOG.Error("Failed to query total rate from Prometheus", zap.Error(err))
		return
	}
	var successValue, totalValue float64
	// Each result holds a [timestamp, "value"] pair; the value arrives as a string.
	for _, result := range successResponse.Data.Result {
		if v, ok := result.Value[1].(string); ok {
			if f, err := strconv.ParseFloat(v, 64); err == nil {
				successValue = f
			}
		}
	}
	for _, result := range totalResponse.Data.Result {
		if v, ok := result.Value[1].(string); ok {
			if f, err := strconv.ParseFloat(v, 64); err == nil {
				totalValue = f
			}
		}
	}
	if totalValue > 0 {
		successRate := CalculateSuccessRate(successValue, totalValue)
		var dialService = DialApiService{}
		dial, err := dialService.GetDialApi(strconv.Itoa(int(dialApiEntity.ID)))
		if err != nil {
			global.GVA_LOG.Error("failed to load probe task", zap.Error(err))
			return
		}
		// Write to the database only when the formatted rate actually changed.
		successRateStr := fmt.Sprintf("%.2f", successRate)
		if dial.SuccessRate != successRateStr {
			dial.SuccessRate = successRateStr
			if err := dialService.UpdateDialApi(dial); err != nil {
				global.GVA_LOG.Error("failed to update the task success rate", zap.Error(err))
				return
			}
		}
	}
}

// CalculateSuccessRate returns success/total as a percentage, or 0 when total is 0.
func CalculateSuccessRate(success, total float64) float64 {
	if total == 0 {
		return 0
	}
	return (success / total) * 100
}
</code>Protocol-specific check functions are provided for HTTP, TCP, DNS, and ICMP:
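<code>// probeTimeout bounds every probe so a hung target cannot block the scheduler.
const probeTimeout = 5 * time.Second

func checkHTTP(dialApiEntity dialApi.DialApi) bool {
	idStr := strconv.Itoa(int(dialApiEntity.ID))
	client := &http.Client{Timeout: probeTimeout}
	var response *http.Response
	var httpErr error
	switch dialApiEntity.HttpMethod {
	case "GET":
		response, httpErr = client.Get(dialApiEntity.Url)
	case "POST":
		response, httpErr = client.Post(dialApiEntity.Url, "application/json", strings.NewReader(dialApiEntity.RequestBody))
	default:
		global.GVA_LOG.Error("unsupported HTTP method: " + dialApiEntity.HttpMethod)
		return false
	}
	if httpErr != nil {
		global.GVA_LOG.Error("probe failed: "+dialApiEntity.Url, zap.Error(httpErr))
		return false
	}
	// Always release the connection, whatever the outcome.
	defer response.Body.Close()
	if response.StatusCode != dialApiEntity.HttpStatus {
		global.GVA_LOG.Info(idStr + ":" + dialApiEntity.Name + " probe result does not match the expectation")
		return false
	}
	if dialApiEntity.ExceptResponse == "" {
		return true
	}
	// An expected response is configured: the body must contain it.
	bodyBytes, err := io.ReadAll(response.Body)
	if err != nil {
		return false
	}
	return strings.Contains(string(bodyBytes), dialApiEntity.ExceptResponse)
}

// checkTCP succeeds when a TCP connection to host:port can be opened.
func checkTCP(dialApiEntity dialApi.DialApi) (bool, error) {
	conn, err := net.DialTimeout("tcp", dialApiEntity.Url, probeTimeout)
	if err != nil {
		return false, err
	}
	defer conn.Close()
	return true, nil
}

// checkDNS succeeds when the host name resolves to at least one address.
func checkDNS(dialApiEntity dialApi.DialApi) (bool, error) {
	_, err := net.LookupHost(dialApiEntity.Url)
	if err != nil {
		return false, err
	}
	return true, nil
}

// checkICMP succeeds when at least one echo reply is received.
func checkICMP(dialApiEntity dialApi.DialApi) (bool, error) {
	pinger, err := ping.NewPinger(dialApiEntity.Url)
	if err != nil {
		return false, err
	}
	pinger.Count = 2
	// The pinger's default timeout is very long, so an unreachable host
	// would otherwise keep Run waiting for replies.
	pinger.Timeout = probeTimeout
	if err = pinger.Run(); err != nil {
		return false, err
	}
	// Run returns nil even when every packet was lost, so check the statistics.
	if pinger.Statistics().PacketsRecv == 0 {
		return false, fmt.Errorf("no ICMP reply from %s", dialApiEntity.Url)
	}
	return true, nil
}
</code>A background goroutine watches update and delete channels to modify or remove scheduled jobs accordingly: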
<code>func startUpdateDialCron() {
	var dialService = service.ServiceGroupApp.DialApiServiceGroup.DialApiService
	for {
		select {
		case updateId := <-global.UpdateDialAPIChannel:
			if updateId == "" {
				continue
			}
			dial, err := dialService.GetDialApi(updateId)
			if err != nil {
				global.GVA_LOG.Error("failed to load probe task", zap.Error(err))
				continue
			}
			global.GVA_LOG.Info("updating scheduled task", zap.String("updateId", updateId))
			cronName := global.DIAL_API + updateId
			taskName := global.DIAL_API + updateId
			// Drop the existing job so the new one picks up every changed field.
			if _, found := global.GVA_Timer.FindTask(cronName, taskName); found {
				global.GVA_Timer.Clear(cronName)
			}
			// Re-register even when no job existed, so that a task switched
			// back on is scheduled again; disabled tasks are skipped by
			// AddSingleDialApiTimerTask.
			dial.Cron = utils.ConvertToCronExpression(dial.Cron)
			dialService.AddSingleDialApiTimerTask(dial)
		case deleteId := <-global.DeleteDialAPIChannel:
			if deleteId == "" {
				continue
			}
			cronName := global.DIAL_API + deleteId
			taskName := global.DIAL_API + deleteId
			if _, found := global.GVA_Timer.FindTask(cronName, taskName); found {
				global.GVA_LOG.Info("deleting scheduled task", zap.String("deleteId", deleteId))
				global.GVA_Timer.RemoveTaskByName(cronName, taskName)
			}
		}
	}
}
</code>Frontend Display
A simple Vue page is built to manage probe tasks, allowing creation, editing, enabling/disabling, and viewing of success rates.
The interface also shows detailed task information and provides controls to toggle task status.
Monitoring and Alerting
Metrics are exposed to Prometheus under the name health_check_results. Creating a Prometheus scrape job for this exporter allows collection of success/failure counts.
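To make the scraped data concrete, the sketch below renders a few samples in the Prometheus text exposition format. The project itself relies on the Prometheus client library (global.HealthCheckResults) to do this; the hand-rolled renderExposition function and the sample type here are hypothetical and serve only to illustrate what the output looks like. Label names follow the queries used earlier (name, type, status).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is one counter series of the health_check_results metric.
type sample struct {
	Name, Type, Status string
	Value              float64
}

// renderExposition formats samples in the Prometheus text exposition format.
// It uses Go's %q quoting for label values, which is adequate for plain
// ASCII values such as the ones in this example.
func renderExposition(samples []sample) string {
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]sample(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Name != sorted[j].Name {
			return sorted[i].Name < sorted[j].Name
		}
		return sorted[i].Status < sorted[j].Status
	})
	var sb strings.Builder
	sb.WriteString("# TYPE health_check_results counter\n")
	for _, s := range sorted {
		fmt.Fprintf(&sb, "health_check_results{name=%q,type=%q,status=%q} %g\n",
			s.Name, s.Type, s.Status, s.Value)
	}
	return sb.String()
}
```

Each probe task contributes two series, one per status, which is why the scheduled job touches both counters before recording a result.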
An alert rule can be defined to trigger when the success rate falls below a threshold (e.g., 100%).
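The alert condition is a ratio of two counter rates, the same shape as the queries in getSuccessRateFromPrometheus. As a rough sketch, the helpers below build that PromQL expression for one task. The function names, the range window, and the threshold are assumptions chosen for illustration; only the metric and label names come from the code above.

```go
package main

import "fmt"

// successRatioExpr builds a PromQL expression for the share of successful
// probes of one task over the given range window (for example "5m").
func successRatioExpr(name, probeType, window string) string {
	selector := fmt.Sprintf(`name=%q, type=%q`, name, probeType)
	return fmt.Sprintf(
		`sum(rate(health_check_results{%s, status="success"}[%s])) / sum(rate(health_check_results{%s}[%s]))`,
		selector, window, selector, window,
	)
}

// alertExpr wraps the ratio in a comparison, so the query returns a series
// only while the ratio is below threshold (a fraction between 0 and 1).
func alertExpr(name, probeType, window string, threshold float64) string {
	return fmt.Sprintf("%s < %g", successRatioExpr(name, probeType, window), threshold)
}
```

A threshold of 1 corresponds to the 100% example: any failed probe inside the window makes the expression return a result.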
With these components, the system provides a complete API monitoring solution that stores configuration, schedules checks, records results in Prometheus, and visualizes success rates on the frontend.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.