
Monitor a Render App: Cron, Workers, Health Checks

15 min read

The short version: To monitor a Render app properly, you have to watch the platform around the web service, not just the web service. A 200 OK on your public URL says nothing about whether your Render Cron Job fired, whether your background worker is still draining its queue, whether a private service is reachable, or whether the last auto-deploy actually came up. Render's own health check barely helps either: it gates deploys and restarts a dead instance, and its notifications, off until you configure them, only cover deploy-failed and service-unhealthy, never a stalled Cron or a logically broken 200. Four Velprove patterns close the gap: a Cron sentinel, a worker heartbeat, a build-SHA assertion, and a browser login monitor that signs in on the real Render user path. The browser login monitor is the one that matters most: it opens a real browser, signs in as a dedicated test user, and asserts on data that only renders if the database read behind your Render app actually succeeded. It is free on every plan at a 15-minute interval. Each runs from 5 global regions on the free plan, no credit card required.

Why a Render URL monitor watches the smallest part of your app

A single GET on your Render web service URL watches exactly one thing: the web service. That is the smallest part of most real Render deployments. The web service is the front door. Behind it sit Cron Jobs that run on a schedule as their own service, background workers that process queues with no inbound traffic at all, private services that talk only on the internal network, and an auto-deploy pipeline that ships on every git push. None of those have a public URL, and a status-code probe pointed at / cannot see any of them.

Render free-tier spin-down and keep-alive economics are a separate hosting decision, covered in the indie-hacker free-stack guide; this post assumes you have already made that call and is about the platform surface a URL monitor cannot see.

The reason the distinction matters: page-level failures change the page, so a URL monitor catches them. Platform-level failures degrade the product without changing the page. Your marketing site can keep serving a clean 200 for hours after the Cron that bills your customers stopped firing, after the worker that sends your emails wedged on a poison message, or after a deploy went green in the dashboard while shipping a broken build. The rest of this post is the platform layer.

What Render's native health check actually does

This is the part most people get wrong, so it goes first. Render has a built-in health check, and it is not an uptime alert. Render's health-check docs describe it plainly: "Every few seconds, Render sends health checks to each running web service and private service instance to confirm it's healthy and able to receive traffic." By default that probe is a TCP socket connection to one of your service's open ports, with port 10000 as the default. It is asking one question: is the port accepting connections.

Render uses the answer for two things. First, deploy gating. When all new instances pass their health checks at the same time, Render considers the deploy successful and starts routing traffic to them. Second, auto-restart. If an instance fails consecutive health checks for 60 seconds, Render automatically restarts it. Both are useful. Render can also notify you, by email or Slack, when a deploy fails or a service becomes unhealthy, but those notifications are off until you enable them under Integrations, and they fire only on those two coarse events. The health check itself just gates the deploy and silently restarts the box.

Two limits compound this. The default probe is a TCP socket, so it confirms the port is open, not that your app returns correct responses. An app that accepts the connection and then 500s on every request passes the default health check indefinitely. And the docs are explicit that "health checks only apply to web services and private services." Background workers and Cron Jobs are not health-checked at all, because they have nothing to probe. So Render's native check is a liveness probe for the front door, scoped to the front door. Its strongest signal, a service-unhealthy email, still only means the port stopped answering. What an HTTP health path should return when you add one is its own design question, covered in what a health endpoint should return. The point here is narrower: an external monitor is the only thing in this stack that catches a Cron that stopped firing, a worker that wedged, or a 200 that is logically broken.

Render Cron Jobs alert on a failed run, not on a run that never fires

If you are coming from Vercel, reset your mental model. On Vercel a Cron Job is a route inside a function. On Render a Cron Job is its own service. That difference is the whole reason this section exists. Render's Cron Job docs state the execution model: "Render guarantees that at most one run of a given cron job is active at a given time," and "Render stops an active run after 12 hours." Billing is prorated by the second with a minimum monthly charge of one dollar per cron job. It is a separate paid service. It is not part of the free tier, which covers web services, Render Postgres, Render Key Value, and static sites only.

Render does send a notification when a cron job execution fails, meaning the job ran and exited non-zero, if you enable failure notifications under Integrations. What it does not give you is a did-not-fire alert. When the scheduler simply stops triggering the job, or a run hangs and never exits, no email goes out, because from Render's side nothing failed. The dashboard shows the last run; nothing tells you the next one never happened. For the Vercel-side version of this same gap, see the Vercel platform-layer guide; the pattern rhymes, but the Render plumbing is different because the Cron is a standalone service, not an embedded route.

The pattern that works is a Cron sentinel. The freshness logic lives on an endpoint, not in the monitor, because a monitor cannot reason about "should this have run by now." The Cron writes a timestamp on success. A tiny companion web service reads that timestamp, computes its age, and returns 503 when the Cron has gone stale:

// companion web service, /cron-freshness
// `db` is your Render Postgres or Key Value client (illustrative handle)
const last = await db.get("cron:billing:lastRun");
const STALE_MS = 25 * 60 * 60 * 1000; // 25h grace for a daily Cron
const ageMs = last ? Date.now() - new Date(last).getTime() : Infinity;
const stale = !Number.isFinite(ageMs) || ageMs > STALE_MS; // missing or unparsable counts as stale

return new Response(stale ? "stale" : "ok", {
  status: stale ? 503 : 200,
});

Note the topology. The Cron Job has no URL of its own, so you do not monitor the Cron directly. You monitor a regular web service that reports on the Cron. The companion can be your existing web service with one extra route. A static Body Contains assertion on today's date does not work here, because the monitor stores whatever string you typed once and then checks for that same, increasingly stale value forever. Let the endpoint compute freshness and let the status code carry the signal.

Setting up the Cron sentinel in Velprove

The sentinel is four concrete steps. The first two are code on your side; the last two are clicks in the Velprove new HTTP monitor wizard.

  1. Have the Cron write a timestamp on success. At the end of the Cron Job, write the current time to Render Postgres or Render Key Value under a fixed key. Do this only after the work is durably done, not on entry, so a stuck run cannot look fresh.
  2. Add a freshness endpoint to a companion web service. Expose a small route that reads the timestamp, computes its age, and returns 503 when the age exceeds the Cron's cadence plus a grace window, 200 otherwise. This can be one extra route on your existing web service.
  3. Create a Velprove HTTP monitor on the endpoint. Create a new HTTP monitor and set the URL to https://<your-app>/cron-freshness. On the Verify step, add two Success Conditions in order: Status Code Equals 200 (the freshness endpoint returns 503 the moment the Cron goes stale, so a 200 is the entire check), and Header Contains on a header your app always sets on this route, for example a static x-app: render-cron-sentinel, so a cached gateway error page that happens to return 200 cannot pass.
  4. Pick a probe region and interval. On the Schedule & Alerts step, choose one of the 5 global regions and set the interval. A daily Cron is comfortable on the 5-minute Free interval; a sub-hourly Cron is worth the 1-minute interval on a paid plan so a missed run surfaces inside one cycle instead of five.
The two Success Conditions for the Cron sentinel. Status Code Equals 200 is the whole check, because the freshness endpoint returns 503 the moment the Cron goes stale. The Header Contains rule on x-app proves the response came from your service, not a cached gateway error page that happens to return 200.
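Step 1 of the setup can be sketched in the same edge-style JavaScript as the freshness endpoint. This is a minimal sketch, not Render API code: `kv` stands in for a Render Key Value or Postgres client (a Map here, so the sketch is self-contained), and the key name and function names are assumptions.

```javascript
// Cron Job side: stamp success only after the work is durably done.
// `kv` is a stand-in for a Render Key Value / Postgres client (hypothetical);
// a Map is used here so the sketch runs on its own.
const kv = new Map();

async function runBillingCron(kv, doBillingWork) {
  await doBillingWork(); // the real work first; a stuck or failed run must not look fresh
  kv.set("cron:billing:lastRun", new Date().toISOString()); // stamp on success only
}
```

If the work throws, the stamp is never written, the freshness endpoint goes stale, and the monitor flips, which is exactly the did-not-fire signal Render does not provide.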

Background workers and private services are invisible to HTTP

Render background workers are the clearest case of a thing a URL monitor structurally cannot see. Render's background-worker docs define them precisely: "Background workers are services that run continuously (like a web service or a private service), but they don't receive any incoming network traffic." No inbound traffic means no public URL means nothing for an HTTP monitor to hit. Private services have the same property on the public side: they are reachable only on the internal network. A worker that quietly dies mid-shift produces zero change in anything a URL probe can observe.

The fix is the same shape as the Cron sentinel, which is the point. The worker writes a timestamp to Render Postgres or Render Key Value after each successful unit of work, or on a fixed internal tick if the work is bursty. A companion web service reads that timestamp, computes its age, and returns 503 when the worker has been quiet longer than its expected cadence allows. A Velprove HTTP monitor asserts Status Code 200 on that endpoint, and Velprove flips the worker to a failing state the moment the heartbeat goes stale. The worker is now observable without ever giving it a public surface. Design the endpoint payload the same way you would any other, using the health-endpoint response-contract guide as the reference.

One discipline matters: the worker must write the timestamp on real progress, not on loop entry. A worker stuck retrying the same poison message in a tight loop is still "running." If it stamps the timestamp at the top of the loop, the sentinel stays green while no actual work completes. Stamp it after a unit of work is durably done, and the sentinel goes red when throughput stops, which is the failure you actually care about.
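Both halves of the heartbeat pattern can be sketched together. This is illustrative, not Velprove or Render API code: `kv` is a stand-in client (a Map here), and the key name, `processOne`, and `heartbeatStatus` are hypothetical names.

```javascript
// `kv` is a stand-in for a Render Key Value / Postgres client (hypothetical).
const kv = new Map();
const HEARTBEAT_KEY = "worker:email:lastProgress"; // key name is an assumption

// Worker side: stamp AFTER a unit of work completes, never on loop entry,
// so a worker wedged on a poison message cannot keep the sentinel green.
async function processOne(kv, job, handle) {
  await handle(job); // durable unit of work first
  kv.set(HEARTBEAT_KEY, Date.now()); // stamp only on real progress
}

// Companion endpoint side: stale when quiet longer than the cadence allows.
function heartbeatStatus(kv, maxQuietMs, now = Date.now()) {
  const last = kv.get(HEARTBEAT_KEY);
  const stale = last === undefined || now - last > maxQuietMs;
  return stale ? 503 : 200;
}
```

A missing stamp counts as stale, so a worker that has never made progress reads as down rather than unknown.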

Did my last auto-deploy actually come up?

Render auto-deploys by default. Render's deploy docs are explicit: "Whenever you push or merge a change to that branch, by default Render automatically rebuilds and redeploys your service." That is convenient and it is also a blind spot. A green deploy in the Render dashboard means the new instances passed the port-level health check from the earlier section. It does not mean the build that came up is the build you intended, or that it serves correct responses.

The probe is a build-SHA assertion. Expose the current git SHA or a release version on a light endpoint, for example /version returning { "sha": "a1b2c3d" }, wired from an environment variable Render sets at build time. A Velprove HTTP monitor asserts Body Contains the SHA you expect to be live. When a deploy reports green but serves a stale or wrong build, the assertion fails and the monitor tells you. This is the cheapest possible deploy verification: one endpoint, one Body Contains rule, no CI integration. It pairs well with a multi-step probe of your auth flow if you want to assert the deploy is functionally healthy, walked through in the multi-step API monitoring walkthrough.
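A minimal sketch of that endpoint, assuming the RENDER_GIT_COMMIT environment variable Render documents for the deployed commit (verify the variable name against Render's env-var reference for your setup):

```javascript
// /version endpoint — a minimal sketch.
// RENDER_GIT_COMMIT is assumed to be the env var Render sets with the
// deployed commit SHA; confirm against Render's environment-variable docs.
function versionHandler(env = process.env) {
  const sha = env.RENDER_GIT_COMMIT ?? "unknown";
  return new Response(JSON.stringify({ sha }), {
    status: 200,
    headers: { "content-type": "application/json" },
  });
}
```

The Velprove monitor then asserts Body Contains on the short SHA you expect; when a green deploy serves a stale build, the body no longer contains it and the monitor fails.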

The browser login monitor on the real Render user path

The patterns above prove the platform's moving parts are alive. They do not prove a real user can sign in and see their data. That is where the differentiator lives, and it is the reason a Render-hosted SaaS needs more than HTTP probes. Velprove's browser login monitor opens a real browser, signs in as a dedicated low-privilege test user, follows the post-login redirect, and asserts a string on the landing page that only renders if a real database read succeeded.

The detail that makes this catch platform failures is the assertion target. Point the login flow at a page whose render depends on a real read from Render Postgres or Render Key Value. By default Velprove verifies login by confirming the URL changed. Under Customize detection, switch Success verification to "Page contains text" and set it to a string that only appears when that read returned data: a customer name, an invoice ID, a known plan label. A web service can return 200 with an intact page shell while the database read behind it silently fails. A text-present assertion on post-database content catches that. A status-code probe never will. This is one of the clearest cases of when a browser monitor beats an HTTP probe.

Use a dedicated test account, never real admin credentials. The test user should have the smallest set of permissions that still renders a real data-backed page. If the account is ever compromised or its session leaks into logs, the blast radius is one throwaway low-rights login, not your production admin. The browser login monitor is free on every plan, including the free plan, at a 15-minute interval. That is enough to catch a multi-hour database-backed outage and a login regression within one window.

The browser login monitor on the real Render user path. It signs in as a dedicated low-privilege test user, then Success verification is switched from the default URL-change to Page contains text so the monitor passes only when a string that depends on a real database read renders on the landing page.

Your monitor is the source of truth, not the status page

Every hosted platform has a status page, and every status page is a lagging indicator by design. It publishes after the vendor has detected, confirmed, and triaged an incident. Your external monitor publishes the moment a probe fails. Those are different clocks, and the gap between them is exactly the window in which your customers are experiencing an outage you have not been told about yet.

This is not a claim about Render specifically having a bad status page. It is the general property of vendor status pages: they are reactive, internal-first, and scoped to platform-wide events. A status page will almost never reflect that your Cron stopped firing, your worker wedged, or your last deploy shipped a broken build, because those are your incidents, not the platform's. The working assumption for anyone running a SaaS hosted on Render is that your Velprove monitor flags first and the vendor status page confirms later, if it ever mentions your problem at all.

The honest probe-cost tradeoff on Render

Render bills web services by instance time, not per request. A monitor hitting your web service every minute does not run up a per-invocation meter the way a function-billed platform would. The real cost question on Render is interval discipline, not invocation count. Probe the things that fail fast at a fast interval, and probe the things that fail slowly at a slower one.

Concretely: a 5-minute interval on the Free plan is fine for a daily Cron sentinel and a worker heartbeat, because a daily job being late is a minutes-to-hours problem, not a seconds problem. A paid Velprove plan buys the 1-minute interval that matters for a sub-hourly Cron or a customer-facing API where one minute of detection lag is one minute of silent revenue loss. The rule is a principle, not a number: match the probe interval to how fast the thing fails, and do not pay for 1-minute probes on a job that runs once a day.

For the Render-versus-Vercel-versus-Cloudflare view of this same platform-layer decision, the sibling guides apply the same patterns to Vercel and to Cloudflare Workers + Pages. If your Render service runs Next.js, the render-layer half is the Next.js render-layer guide.

Frequently Asked Questions

Does Render's built-in health check alert me when my app goes down?

Not on its own. The health check is a liveness probe that gates deploys and restarts a dead instance. Render can email or Slack you when a deploy fails or a service becomes unhealthy, but that is off until you enable it under Integrations and only covers those two coarse events. By default the probe is a TCP socket on the open port, so it cannot tell you the product is logically broken, that a Cron stopped firing, or that a worker wedged. An external uptime monitor is what catches those.

How do I monitor a Render Cron Job?

A Render Cron Job is its own service, not a route inside your web service. Render emails you if a run executes and fails, but it has no did-not-fire alert for a run that never starts. Render Cron Jobs are a separate billable service with a minimum charge of one dollar per month per cron, and they are not part of the free tier. The working pattern is to have the Cron write a timestamp on success, then have a tiny companion web service compute freshness server-side and return 503 when the timestamp is stale. A Velprove HTTP monitor asserts Status Code 200. The endpoint flips to 503 when the Cron stops firing, and the monitor catches it within one probe interval.

How do I monitor a Render background worker that has no URL?

A Render background worker runs continuously but receives no incoming network traffic and has no public URL, so an HTTP monitor cannot reach it directly. Use a heartbeat-endpoint pattern. The worker writes a timestamp into Render Postgres or Render Key Value after each successful unit of work. A small companion web service reads that timestamp, computes its age, and returns 503 when the worker has gone quiet longer than its expected cadence. A Velprove HTTP monitor on that endpoint asserts Status Code 200, and the worker becomes observable without giving it a public surface.

Does Render auto-deploy mean I do not need to verify deploys?

No. Render auto-deploys on every git push to the connected branch by default. A green deploy in the dashboard means the new instances passed Render's port-level health check, not that the application is serving correct responses. Expose the build SHA or release version on a light endpoint, then have a Velprove HTTP monitor assert Body Contains the SHA you expect to be live. When a deploy reports green but ships a broken or stale build, the assertion fails and you find out from the monitor instead of from a customer.

Is monitoring a Render app different from monitoring a Next.js app?

Yes, they are two different layers and they compose. Monitoring a Next.js app is the render layer: cold starts, stale pages, auth-protected routes that return 200 with an empty shell. Monitoring a Render-hosted app is the platform layer: Cron Jobs firing, background workers staying alive, private services reachable, the last auto-deploy actually coming up. Render commonly hosts Next.js, so most teams running Next.js on Render need both, and the companion Next.js monitoring guide is written to be read together with this one.

Can I use Velprove's free plan to monitor a commercial app on Render?

Yes. Velprove allows commercial use on every plan, including free, with no per-monitor surcharge for revenue-generating services. A funded SaaS running its API and background workers on Render can put the Cron sentinel, the worker heartbeat, the deploy-SHA assertion, and one browser login monitor on the free plan and stay compliant indefinitely. You move to a paid plan for faster probe intervals or more browser login monitors, not for permission to monitor something that makes money.

The free Velprove plan covers 10 monitors at a 5-minute HTTP interval, 1 browser login monitor at a 15-minute interval, multi-step API monitors up to 3 steps, and every probe runs from 5 global regions with email alerts and 1 status page. That is enough to land the browser login monitor, the Cron sentinel, the worker heartbeat, and the build-SHA assertion for a single production Render app. Start with the free plan. No credit card required.

Start monitoring for free

Free browser login monitors. Multi-step API chains. No credit card required.

Start for free