Monitoring yente
We recommend you monitor your production deployment of yente and set up alerting in order to make sure your deployment is operational and serving up-to-date data.
Service health endpoints
yente provides standard health check endpoints:
/healthz: Returns200 OKif the Python application is responsive. Use this for basic liveness probes./readyz: Returns200 OKif the search index is available and searchable. Use this for readiness probes to ensure the service doesn't receive traffic before the initial indexing is complete.
Note that /readyz will return 200 OK even if the index is stale, as long as it is searchable. Read on for how to monitor data freshness.
Monitoring catalog and index freshness
To ensure that your yente instance is serving the latest data and that the indexing process is working correctly, you should monitor the state of the data catalog and the search index.
The /catalog endpoint provides information about the datasets configured in your instance and their current indexing status. If you're running yente with the
default configuration, indexing the default collection
usually looks something like this:
{
"datasets": [
{
"name": "default",
"title": "OpenSanctions",
"updated_at": "2024-01-27T12:54:18",
"version": "20260127125418-igl",
"index_version": "20260127125418-igl",
"index_current": true,
"load": true
// ...
},
],
"current": ["default"],
"outdated": [],
"index_stale": false
}
A few fields are of interest here:
updated_at: The timestamp when the dataset was last updated, i.e. the timestamp of the latest available version in the catalog.version: The version identifier for the latest available data for this dataset.index_version: The version identifier of the data currently loaded in the search index.index_current: A boolean indicating ifindex_versionmatchesversion.index_stale: If any datasets configured to be indexed haveindex_current: false,index_stalewill betrueand those datasets will be inoutdated
For monitoring purposes, I suggest you do the following:
- Check that
updated_atis at most X time old. This will alert you if the catalog isn't being updated anymore (or the default collection hasn't published any new data in that time). We suggest 24 hours to account for any transient issues. - Check if
index_stale. This will let you know if something is preventing new data from being indexed and made searchable.
Note that both are required: if catalog updates are broken, yente will never know that a new version of a dataset has been published, and consequently never mark the index as stale.