Aligning technology origins and adoption metric between the crux and the crawl data #42
What would the adoption share be if we included CrUX data in the total, i.e. if we required origins to be present in both datasets?
I'm assuming you're talking about total adoption, but let's check category share as well. In 2024-11 we have:
CMS / WordPress stats
SELECT
category_total,
tech_adoption,
tech_adoption / category_total AS category_share
FROM (
SELECT
adoption.mobile AS tech_adoption
FROM `httparchive.reports.cwv_tech_adoption`
WHERE date = "2024-11-01"
AND technology = 'WordPress'
AND rank = 'ALL'
AND geo = 'ALL'
)
CROSS JOIN (
SELECT
origins AS category_total
FROM `httparchive.reports.cwv_tech_categories`
WHERE category = 'CMS'
)
We lose 1.9M origins when we don't include origins from part
SELECT
category_total,
tech_adoption,
tech_adoption / category_total AS category_share
FROM (
SELECT
COUNT(DISTINCT root_page) AS tech_adoption
FROM crawl.pages
WHERE date = '2024-11-01'
AND client = 'mobile'
AND 'WordPress' IN UNNEST(technologies.technology)
)
CROSS JOIN (
SELECT
origins AS category_total
FROM `httparchive.reports.cwv_tech_categories`
WHERE category = 'CMS'
)
WITH crux AS (
SELECT DISTINCT
CONCAT(origin, '/') AS root_page
FROM `chrome-ux-report.materialized.device_summary`
WHERE
date = '2024-11-01'
AND device IN ('phone')
), pages AS (
SELECT DISTINCT
root_page
FROM crawl.pages,
UNNEST(technologies) AS tech
WHERE
date = '2024-11-01'
AND client = 'mobile'
AND 'CMS' IN UNNEST(tech.categories)
)
SELECT
category_total,
tech_adoption,
tech_adoption / category_total AS category_share
FROM (
SELECT
adoption.mobile AS tech_adoption
FROM `httparchive.reports.cwv_tech_adoption`
WHERE date = "2024-11-01"
AND technology = 'WordPress'
AND rank = 'ALL'
AND geo = 'ALL'
)
CROSS JOIN (
SELECT
COUNT(DISTINCT root_page) AS category_total
FROM crux
INNER JOIN pages
USING (root_page)
)
Email / MailChimp
Shrinking the sample of origins will have a more noticeable impact on less popular technologies.
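To make the effect concrete, here is a minimal sketch with made-up toy data. The origins, the technology assignments, and the `adoption` helper are all hypothetical and not taken from the HTTP Archive or CrUX tables. It compares the origin count for a technology across all crawled origins with the count restricted to origins that also appear in CrUX.

```python
# Toy crawl data: root page -> detected technologies (hypothetical values).
crawl = {
    "https://a.example/": {"WordPress", "Mailchimp"},
    "https://b.example/": {"WordPress"},
    "https://c.example/": {"WordPress"},
    "https://d.example/": {"WordPress"},
    "https://e.example/": {"WordPress"},
    "https://f.example/": {"WordPress"},
    "https://g.example/": {"Mailchimp"},
    "https://h.example/": {"Drupal"},
    "https://i.example/": {"Drupal"},
    "https://j.example/": {"Drupal"},
}

# Toy set of root pages that also have CrUX data (hypothetical values).
crux_origins = {
    "https://a.example/",
    "https://b.example/",
    "https://c.example/",
    "https://d.example/",
    "https://e.example/",
    "https://h.example/",
    "https://i.example/",
}


def adoption(pages, technology, restrict_to=None):
    """Count distinct root pages using `technology`.

    If `restrict_to` is given, only root pages in that set are counted,
    which mimics an INNER JOIN against CrUX origins.
    """
    return sum(
        1
        for root_page, techs in pages.items()
        if technology in techs
        and (restrict_to is None or root_page in restrict_to)
    )


for tech in ("WordPress", "Mailchimp"):
    full = adoption(crawl, tech)
    matched = adoption(crawl, tech, crux_origins)
    lost = (full - matched) / full
    print(f"{tech}: crawl={full}, crawl+CrUX={matched}, lost={lost:.0%}")
```

In this toy sample both technologies lose exactly one origin, but that is a far larger relative drop for the less popular one.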
Quoting @rviscomi:
I see 2 issues here:
`tablet` and `NULL` clients from CrUX - so more unmatched origins (1.9M).
No `geo` and `rank` available for aggregation.

A promising analysis logic
Calculate adoption with crawl data, as it's the original source.
This will help us to solve adoption with the most complete set of origins, including the CrUX's `tablet` and `NULL` clients.
But only the global ones, `geo` dimension is part of CrUX and thus unavailable. We could still use INNER JOIN there.