How to Evaluate Watchdog Reports?

As part of Monitoring PLUS, Watchdog monitors your website's page speed daily. If any key metric changes, Watchdog notifies you. This text explains who should evaluate these reports and how.

[Image: Watchdog report in email]

A Watchdog speed report has arrived. What now?

Our Watchdog has been fine-tuned based on our consulting experience with clients, and we use it ourselves, but interpreting its reports does require some knowledge.

For example, due to the nature of synthetic measurements, Watchdog might send a notification even when there isn't a major issue on the website.

Armed with the knowledge from this text, you'll handle Watchdog reports like a pro.

How Does the Watchdog Work?

In a nutshell, Watchdog operates as follows:

  • It collects data from synthetic measurements. (See the difference between measurement types.)
  • Every day, it tests all URLs entered in the test settings.
  • From this data, it computes a single site-wide number for each speed metric.
  • Each metric has different limits for permitted changes.
  • If a limit is exceeded, you'll receive a notification via email, Slack, or Teams.

Learn more about how the Watchdog works.
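
To make the aggregation and limits concrete, here is a minimal sketch of the idea in TypeScript. It is an illustration only, not Watchdog's actual implementation: the limit values, the median aggregation, and all names are assumptions.

```ts
// Illustrative sketch only -- not Watchdog's actual implementation.
// The metric names are real; the limits and aggregation are assumptions.

type MetricName = "TTFB" | "FCP" | "LCP" | "TBT" | "CLS";

// Hypothetical per-metric limits for a day-over-day relative change.
const metricLimits: Record<MetricName, number> = {
  TTFB: 0.2, // alert if the site-wide value moves by more than 20 %
  FCP: 0.2,
  LCP: 0.2,
  TBT: 0.5, // TBT is volatile, so a wider band makes sense
  CLS: 0.3,
};

// One site-wide number per metric, e.g. the median across all tested URLs.
function siteWideValue(valuesPerUrl: number[]): number {
  const sorted = [...valuesPerUrl].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

// Compare today's site-wide value against yesterday's.
function shouldNotify(metric: MetricName, yesterday: number, today: number): boolean {
  const relativeChange = Math.abs(today - yesterday) / yesterday;
  return relativeChange > metricLimits[metric];
}
```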

Reports Are Primarily for Developers

We recommend that Watchdog reports be monitored primarily by developers and team members who work on web speed daily, as evaluating them requires technical knowledge and time.

For managers, marketers, UX specialists, and similar professions, we offer other reports in Monitoring PLUS. For instance, a monthly email report on speed status or a team dashboard.

We suggest that managers disable Watchdog notifications:

[Image: Email settings for Watchdog]

You can disable Watchdog notifications in email settings if your colleagues are already monitoring the changes.

How to Generally Evaluate Reports?

When you receive an email or Slack notification, your first questions for evaluating the report should be:

  1. Does the change affect user metrics, i.e., CrUX data?
  2. Were any web updates deployed on the days the metric changed (visible in notes)?
  3. Which specific page types are impacting this change?
  4. Can the change be seen in the test run detail?

TIP: You can find a detailed tutorial on Vzhůru dolů that walks through the functionality of Monitoring PLUS and Watchdog on a concrete example problem.

Does the Change Affect User Metrics?

Those with more experience know that not all metrics are equal. Watchdog collects data from synthetic measurements daily, but we're interested in their impact on user data.

We can't evaluate user data (CrUX) daily yet, as it is aggregated over several weeks.

Let's evaluate the change reported by Watchdog. Check the Domain report, where we have data from the Chrome UX Report (CrUX). Is a problem visible in the same metrics there? Here are a few things to consider:

  • It's good to know that CrUX data is calculated cumulatively, typically for nearly a month back, so only a small change might be visible in the domain data; even a small shift can be significant.
  • Also, note that a change in user data might occur up to three days after a real change on the website.
  • Finally, remember that not all user-obtained metrics can be measured synthetically. See the guides for specific metrics below in this text. For example, the synthetic TBT metric may (or may not) affect interaction speed, i.e., INP.

Remember that user data (CrUX) is calculated cumulatively for the last 28 days and may have a few days' delay.

Continue with further evaluation steps only if you see changes in both Watchdog and user data.

[Image: Watchdog vs CrUX]

Metrics in Watchdog can sometimes have a wild trajectory that doesn't show up in CrUX data. In such cases, there's no need to panic and search for a problem; the long-term trend is what matters. See also the Domain report.
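
If you want to cross-check user data yourself, the same dataset is available through Google's Chrome UX Report API. A minimal sketch in TypeScript, assuming you have a Google API key (the metric identifiers are the API's own):

```ts
// Query the Chrome UX Report (CrUX) API for an origin's field data.
const CRUX_ENDPOINT =
  "https://chromeuxreport.googleapis.com/v1/records:queryRecord";

async function logCruxP75(origin: string, apiKey: string): Promise<void> {
  const response = await fetch(`${CRUX_ENDPOINT}?key=${apiKey}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      origin, // e.g. "https://www.example.com"
      metrics: ["largest_contentful_paint", "interaction_to_next_paint"],
    }),
  });
  const data = await response.json();
  // Each metric carries a histogram and p75 over the rolling 28-day window.
  for (const [name, metric] of Object.entries<any>(data.record.metrics)) {
    console.log(name, "p75 =", metric.percentiles.p75);
  }
}

// Usage: logCruxP75("https://www.example.com", "YOUR_API_KEY");
```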

Is It a Cyclical Problem?

Another common pattern in data is seasonality, causing cyclical repetition of certain issues. How to spot such a problem?

In the Watchdog report, look at the long-term trend (at least 3 months). Some metrics (like TBT) tend to deteriorate and improve cyclically.

Seasonal traffic might also play a role in your case, often reflected in server response time, TTFB.

How to Evaluate Individual Metrics?

In the following section, you'll find specific guides for handling individual metrics. They might look similar, but each metric has its own specifics, so pay attention to each one individually.

Backend (TTFB)

Time To First Byte (TTFB) shows server response time, but it also reflects your entire server infrastructure and users' connection speeds.

It's a very important metric, whose deterioration will directly impact Core Web Vitals, especially load speed (LCP). Due to the crawl budget, i.e., Google's ability to crawl your site, TTFB also affects SEO.

How to evaluate reports on server response time changes (TTFB)?

  1. Does it also affect user metrics?
    Compare the metric's trend in Watchdog with the TTFB value for users (Domains report > Metric distribution trend). See the section Does the Change Affect User Metrics? above.
  2. Were web updates deployed on the days the metric changed?
    Notes in graphs can help you mark important deployments on the site.
  3. Which specific page types affect this change?
    Check the Pages report to see the TTFB metric trend for individual page types on your site. In both synthetic and CrUX data, you'll see if the problem affects the entire site or just specific parts.
  4. Could your server infrastructure be temporarily under greater load?
    During campaigns or seasons, there might be temporary server response deterioration. It's crucial that the optimal threshold of 0.8 seconds in user data isn't exceeded long-term.
  5. Do you see a continuous deterioration in TTFB?
    Monitor memory and CPU usage graphs from your hosting provider. From our experience, hardware upgrades are often underestimated and can be a solution.
  6. Can the change be seen in the test run detail?
    By clicking on the synthetic measurement results graph, you'll access the test run detail with a Lighthouse report. Compare the results with the test run detail from the previous day. The Lighthouse report can also help identify specific problem causes and optimization opportunities for this metric. Developers should also know how to measure Web Vitals directly in the browser and compare them with the test run detail outputs (see the sketch after this list).
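
TTFB can be checked directly in the browser console with the standard Navigation Timing API. A minimal sketch; the phase breakdown is for orientation only, not a Watchdog output:

```ts
// Read TTFB for the current page from the Navigation Timing API.
const [nav] = performance.getEntriesByType(
  "navigation",
) as PerformanceNavigationTiming[];

// responseStart = time until the first byte of the HTML response arrives.
console.log("TTFB:", Math.round(nav.responseStart), "ms");

// A rough phase breakdown helps locate where the time is spent.
console.log("DNS:", Math.round(nav.domainLookupEnd - nav.domainLookupStart), "ms");
console.log("Connect (TCP/TLS):", Math.round(nav.connectEnd - nav.connectStart), "ms");
console.log("Request + server:", Math.round(nav.responseStart - nav.requestStart), "ms");
```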

Focus on optimizing the TTFB metric continuously, not just reactively when a problem arises.

Read our detailed article on how backend developers can help with speed.

First Contentful Paint (FCP)

First Contentful Paint (FCP) shows the time needed to render the first content on your website.

It's an important auxiliary metric, whose changes often affect load speed (LCP) and various user metrics like bounce rate.

How to evaluate reports on FCP metric changes?

  1. Does it also affect user metrics?
    Compare the metric's trend in Watchdog with the FCP value for users (Domains report > Metric distribution trend). See the section Does the Change Affect User Metrics? above.
  2. Were web updates deployed on the days the metric changed? 
    Notes in graphs can help you mark important deployments on the site.
  3. Which specific page types affect this change?
    Check the Pages report to see the FCP metric trend for individual page types on your site. In both synthetic and CrUX data, you'll see if the problem affects the entire site or just specific parts.
  4. Is the core problem measurable by another metric?
    Did server response time, i.e., TTFB metric, also change in reports? Backend speed can directly affect both FCP and LCP metrics, so you often find the culprit here.
  5. Is the problem in critical resources and can it be found in technical indicators?
    Did FCP change without TTFB changing? The difference between these two metrics lies in the critical resources needed for the first page rendering. Look for changes in the Technical report and indicator values like HTML data volume, CSS data volume, the number of blocking JS… There might be a change here causing FCP deterioration.
  6. Can the change be seen in the test run detail?
    By clicking on the synthetic measurement results graph, you'll access the test run detail with a Lighthouse report. Compare the results with the test run detail from the previous day. The Lighthouse report can also help identify specific problem causes and optimization opportunities for this metric. Developers should also know how to measure Web Vitals directly in the browser and compare them with the test run detail outputs (see the sketch after this list).
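
FCP can likewise be read in the browser via the standard Paint Timing API. A minimal sketch:

```ts
// Log First Contentful Paint from the Paint Timing API.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (entry.name === "first-contentful-paint") {
      console.log("FCP:", Math.round(entry.startTime), "ms");
    }
  }
}).observe({ type: "paint", buffered: true });
```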

Focus on optimizing the FCP metric continuously, not just reactively when a problem arises.

Largest Contentful Paint (LCP)

Largest Contentful Paint (LCP) shows the time needed to render the main content on a specific page of your website.

It's one of the three most important metrics, part of Core Web Vitals, and a metric that often correlates with conversion rates on e-commerce sites.

How to evaluate reports on LCP metric changes?

  1. Does it also affect user metrics?
    Compare the metric's trend in Watchdog with the LCP value for users (Domains report > Metric distribution trend). See the section Does the Change Affect User Metrics? above.
  2. Were web updates deployed on the days the metric changed? 
    Notes in graphs can help you mark important deployments on the site.
  3. Which specific page types affect this change?
    Check the Pages report to see the LCP metric trend for individual page types on your site. In both synthetic and CrUX data, you'll see if the problem affects the entire site or just specific parts.
  4. Is the core problem measurable by another metric?
    Did server response time, i.e., TTFB metric, also change in reports? Backend speed can directly affect both FCP and LCP metrics, so you often find the culprit here.
    Problems can also often be found in FCP metric changes, see above. If TTFB didn't change, but both FCP and LCP did, it indicates a problem with FCP. The culprit might be an increase in critical resource data volume.
  5. Is the problem in resources for LCP elements and can it be found in technical indicators?
    If you don't see changes in TTFB or FCP metrics, the problem might be in the difference between FCP and LCP, which involves downloading and rendering the largest page elements. These are often asynchronous – images, JavaScript components, web fonts, and others. Look for changes in the Technical report and indicator values like JS data volume, font data volume, or image data volume.
  6. Can the change be seen in the test run detail?
    Click on the synthetic measurement results graph to access the test run detail with a Lighthouse report. Compare the results with the test run detail from the previous day. The Lighthouse report can also help identify specific problem causes and optimization opportunities for this metric. Developers should also know how to measure Web Vitals directly in the browser and compare them with the test run detail outputs (see the sketch after this list).
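
LCP candidates can be observed in the browser via the Largest Contentful Paint API. A minimal sketch (the `any` cast is only needed because TypeScript's DOM typings don't expose the `element` property):

```ts
// Log Largest Contentful Paint candidates; the last entry reported
// before the first user interaction is the final LCP value.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as any[]) {
    // entry.element reveals which DOM node is the LCP element.
    console.log("LCP candidate:", Math.round(entry.startTime), "ms", entry.element);
  }
}).observe({ type: "largest-contentful-paint", buffered: true });
```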

Focus on optimizing the LCP metric continuously, not just reactively when a problem arises.

Total Blocking Time (TBT)

Total Blocking Time (TBT) shows the total time JavaScript blocks the browser, potentially slowing user interaction responses and worsening the INP metric.

The interaction response metric (INP), an important part of Core Web Vitals, can only be measured for users (CrUX data), so Watchdog can't report its changes.

TBT is measurable synthetically, indicating possible INP deterioration, but there's no direct correlation. TBT deterioration can worsen INP, but INP can worsen without TBT change. Therefore, we recommend monitoring both Watchdog reports (TBT) and user data development (INP).

It's also worth mentioning that TBT is the most "volatile" of all metrics: its values in the graph will "jump" between lower and higher readings. Watchdog intelligently reports only significant changes, and we recommend you do the same. Just take TBT with a bigger grain of salt than the other metrics.

How to evaluate reports on TBT metric changes?

  1. Does it also affect user metrics?
    Compare the metric's trend in Watchdog with the INP value for users (Domains report > Metric distribution trend). See the section Does the Change Affect User Metrics? above.
  2. Were web updates deployed on the days the metric changed?
    Notes in graphs can help you mark important deployments on the site.
  3. Which specific page types affect this change?
    Check the Pages report to see the TBT metric trend for individual page types on your site. In synthetic data, you'll see if the problem affects the entire site or just specific parts. Similarly, you can examine the INP metric development in the same report.
  4. What impact do third-party components have?
    In the "Pages" report, also look at the 3PBT metric trend, which shows the part of TBT attributed by our automation to third-party components. If the third-party blocking time share exceeds half, pay attention. The Test Run Detail report will reveal which specific third parties are problematic.
  5. Can the change be seen in the test run detail?
    Click on the synthetic measurement results graph to access the test run detail with a Lighthouse report. Compare the results with the test run detail from the previous day. The Lighthouse report can also help identify specific problem causes and optimization opportunities for this metric. Developers should also know how to measure Web Vitals directly in the browser and compare them with the test run detail outputs (see the sketch after this list).
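
A rough TBT approximation can be watched in the browser via the Long Tasks API. Note this is a simplification: lab TBT is defined only between FCP and Time to Interactive, while this sketch sums all long tasks it sees:

```ts
// Approximate Total Blocking Time from the Long Tasks API.
// Register the observer as early as possible in page load.
let totalBlockingTime = 0;
new PerformanceObserver((list) => {
  for (const task of list.getEntries()) {
    // Only the portion of a task above 50 ms counts as "blocking".
    totalBlockingTime += Math.max(0, task.duration - 50);
  }
  console.log("TBT so far:", Math.round(totalBlockingTime), "ms");
}).observe({ type: "longtask", buffered: true });
```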

Generally speaking – if the INP metric is okay, but TBT or 3PBT metrics have significantly worsened, then these changes might not be critical.

Still, it's good to focus on optimizing the INP metric continuously, not just reactively when a problem arises.

Cumulative Layout Shift (CLS)

Cumulative Layout Shift (CLS) measures unexpected layout shifts users experience during page loading and subsequent usage.

It's an important metric, one of the three Core Web Vitals that Google uses in SEO and PPC evaluation.

How to evaluate reports on CLS metric changes?

  1. Does it also affect user metrics?
    Compare the metric's trend in Watchdog with the CLS value for users (Domains report > Metric distribution trend). See the section Does the Change Affect User Metrics? above.
  2. Were web updates deployed on the days the metric changed? 
    Notes in graphs can help you mark important deployments on the site.
  3. What if CrUX and synthetic data for CLS differ significantly?
    It's important to understand that CLS measured by Watchdog (i.e., synthetic measurement) can differ from CLS from user data (CrUX). While synthetic measurement can only capture layout shifts visible during the initial page load, CrUX data captures layout shifts during subsequent page usage as well. If you see low synthetic CLS but high CrUX CLS, the unwanted shifts are happening not during the first load but during subsequent page usage.
  4. Which specific page types affect this change?
    Check the Pages report to see the CLS metric trend for individual page types on your site. In both synthetic and CrUX data, you'll see if the problem affects the entire site or just specific parts.
  5. Can the change be seen in the test run detail?
    Click on the synthetic measurement results graph to access the test run detail with a Lighthouse report. Compare the results with the test run detail from the previous day. The Lighthouse report can also help identify specific problem causes and optimization opportunities for this metric. Developers should also know how to measure Web Vitals directly in the browser and compare them with the test run detail outputs (see the sketch after this list).
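
Layout shifts can be observed in the browser via the Layout Instability API. A minimal sketch; note that the official CLS definition uses session windows rather than the plain running sum shown here:

```ts
// Log layout shifts; shifts caused by recent user input are excluded,
// as in the CLS definition. (Casts: the DOM typings lack these fields.)
let shiftSum = 0;
new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as any[]) {
    if (!entry.hadRecentInput) {
      shiftSum += entry.value;
      // entry.sources points at the DOM nodes that moved.
      console.log("Shift:", entry.value.toFixed(4), "running sum:", shiftSum.toFixed(4));
    }
  }
}).observe({ type: "layout-shift", buffered: true });
```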

Focus on optimizing the CLS metric continuously, not just reactively when a problem arises.

Summary

Evaluating Watchdog speed reports, along with the monitoring itself, is a crucial part of improving or maintaining good website speed.

It's important for the team to decide who will monitor and evaluate the reports regularly. Technical skills and knowledge of metrics and the web are required.

If you haven't done so yet, we recommend studying how Watchdog works and how to properly set notifications via email or Slack.

Schedule a Monitoring Demo