IMPORTXML is a built-in Google Sheets function that pulls data from a web page using an XPath query, writing the result straight into a cell. It is free, needs no setup, and is genuinely useful for a handful of static pages. But it breaks in three predictable ways: it returns nothing on JavaScript-rendered sites, it fails after roughly 50 calls per spreadsheet, and it slows the whole sheet once you have more than a dozen formulas. A server-side scraper avoids all three because it fetches pages outside the spreadsheet, renders dynamic content, and retries failures. This guide shows how IMPORTXML works, exactly where it breaks, and when to switch.
What is IMPORTXML in Google Sheets?
IMPORTXML imports data from a structured page by locating elements with an XPath query. You give it two things: a URL and an XPath expression that points at the node you want. The result lands in the cell, and Sheets re-checks it roughly every hour while the document is open. A basic call looks like this:
=IMPORTXML("https://example.com/about", "//h1")
That //h1 is an XPath query meaning "every level-one heading on the page"; if the page has more than one, the matches fill the cells below. You can target deeper nodes with a longer path expression, for instance //article//div for a div nested inside an article. The sibling function IMPORTHTML grabs a whole table or list by index instead of an XPath:
=IMPORTHTML("https://example.com/pricing", "table", 1)
For a clean, static page, both work and cost nothing.
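To see what an expression like //h1 actually matches, here is a minimal Python sketch using the standard library's ElementTree, which supports only a subset of XPath. The page content is hypothetical, and ".//h1" is ElementTree's closest equivalent to the //h1 you would pass to IMPORTXML; this illustrates the selection logic, not how Sheets runs the query internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page used only for illustration.
PAGE = """
<html>
  <body>
    <h1>About us</h1>
    <article>
      <div><h1>Our history</h1></div>
    </article>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# ".//h1" matches every h1 below the root, not only the first one.
headings = [h.text for h in root.findall(".//h1")]
print(headings)  # ['About us', 'Our history']

# A longer path narrows the match to the h1 inside article > div.
nested = [h.text for h in root.findall(".//article/div/h1")]
print(nested)  # ['Our history']
```

The broad query returns both headings, which is why a formula that seemed to target one element can fill several cells.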
How do you write an IMPORTXML formula?
Three steps. First, open the target page and find the element you want. Second, write an XPath that selects it. Simple expressions like //h1, //title, or //span grab common elements; more specific ones filter on attributes, for example //span[@class='price'] (the class name here is illustrative). Third, put the URL and the XPath into the formula as its two arguments. If the page is static and the XPath is correct, the value appears.
The friction is the XPath itself. You need to know the query syntax, and a small change to the page's structure can break a formula that worked yesterday, returning an error with no explanation. That brittleness is fine for one page and painful across hundreds.
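The brittleness described above can be sketched in a few lines of Python. Both page versions, the class name, and the two paths are hypothetical; ElementTree's limited XPath stands in for the full XPath that IMPORTXML accepts, where the rough equivalents would be /html/body/div/span and //span[@class='price'].

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of one page: the redesign wraps the price in an extra div.
BEFORE = "<html><body><div><span class='price'>19.99</span></div></body></html>"
AFTER = "<html><body><div><div><span class='price'>19.99</span></div></div></body></html>"

POSITIONAL = "./body/div/span"            # depends on the exact nesting
BY_ATTRIBUTE = ".//span[@class='price']"  # depends only on the class attribute


def extract(page: str, path: str) -> list[str]:
    """Return the text of every element in `page` that matches `path`."""
    return [el.text for el in ET.fromstring(page).findall(path)]


print(extract(BEFORE, POSITIONAL))    # ['19.99']
print(extract(AFTER, POSITIONAL))     # []  (the path no longer matches)
print(extract(AFTER, BY_ATTRIBUTE))   # ['19.99']
```

Anchoring on an attribute tends to survive layout changes better than a positional path, although it still breaks if the site renames the class.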
Where does IMPORTXML break?
It breaks in three places, and all three are common.
- JavaScript-rendered pages return nothing. IMPORTXML reads the raw HTML the server sends, before any scripts run. Most large, modern sites build their content with JavaScript, so the data you want is not in that raw HTML. The function returns an empty cell, a perpetual "Loading..." state, or an error. This alone rules out a huge share of the web.
- Roughly 50 calls per spreadsheet. Each sheet runs only about 50 import calls before they start failing, and each call returns up to 50,000 characters. A list of 500 URLs cannot be done with native formulas.
- The whole sheet slows down. Once you have more than a dozen import formulas, the document recalculates erratically and crawls, because every edit can trigger a refresh of every import. Behaviour becomes unpredictable.
On top of those, there is no retry, no rendering, and almost no error handling. When a page is slow, blocks the request, or needs a header, the formula simply fails. Wrapping it in IFERROR can swap the error for a placeholder, but nothing lets you try again or change the request. Hammer a site with formulas and Google rate-limits you, returning errors that look as if the data does not exist.
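The JavaScript problem is easiest to see by parsing raw HTML the way a non-rendering fetcher would. This is a minimal sketch with a hypothetical server response and made-up ids; the helper `TextById` is written for this illustration only and simplifies real HTML handling.

```python
from html.parser import HTMLParser

# Hypothetical server response: the price span is empty until the script runs.
RAW_HTML = """
<html><body>
  <h1 id="title">Widget</h1>
  <span id="price"></span>
  <script>document.getElementById("price").textContent = "19.99";</script>
</body></html>
"""


class TextById(HTMLParser):
    """Collect the text found inside elements carrying a given id.

    Simplified: assumes every tag opened inside the target is also closed.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_by_id(html, target_id):
    parser = TextById(target_id)
    parser.feed(html)
    return "".join(parser.chunks).strip()


title = text_by_id(RAW_HTML, "title")  # static content is in the raw HTML
price = text_by_id(RAW_HTML, "price")  # the script never ran, so this is empty
print(repr(title), repr(price))
```

The title comes back because it is in the markup the server sent; the price does not, because only a browser (or a scraper that renders the page) would ever execute the script that fills it in.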
IMPORTXML vs a real scraper: what is the difference?
The core difference is where the work runs. IMPORTXML runs inside the spreadsheet on Google's import infrastructure, with its tight limits. A real scraper runs the requests on a server outside the sheet, which changes everything downstream.
- Rendering. A scraper can render JavaScript and read the page a browser would build. IMPORTXML only sees raw HTML.
- Scale. A scraper has no 50-call ceiling because the work never touches the spreadsheet. Runs of thousands of URLs are normal.
- Resilience. A scraper retries slow pages, backs off on rate limits, and skips dead links without crashing the run. IMPORTXML just fails.
- No six-minute wall. Writing your own Apps Script to loop over URLs hits Google's six-minute execution cap and dies mid-run. A server-side scraper has no such limit.
The trade is simplicity. IMPORTXML is free and instant for one static page. A scraper is the tool the moment you scale or hit a dynamic site. For the full method comparison, see web scraping in Google Sheets, and for the definition of the category, what a web scraper is.
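The resilience point can be sketched as a retry loop with exponential backoff. This is an illustrative Python outline, not any particular scraper's implementation: `fetch_with_retry`, `scrape_all`, and their parameters are hypothetical names, and `fetch` is whatever function you supply to download a URL.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    url: str,
    fetch: Callable[[str], str],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch(url) up to `attempts` times, doubling the wait after each failure.

    Returns the page body, or None if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError:
            # Network-style failures; urllib's URLError is an OSError subclass.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return None


def scrape_all(urls, fetch, **kwargs):
    """Fetch every URL; a dead link is recorded as None instead of stopping the run."""
    return {url: fetch_with_retry(url, fetch, **kwargs) for url in urls}
```

In practice `fetch` could be a thin wrapper around `urllib.request.urlopen`. Passing it in as a parameter keeps the retry logic separate from the download code and easy to test without touching the network.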
When should I keep using IMPORTXML?
Keep IMPORTXML when the job is small and the page is friendly. It is the right tool when you need one value from a static page, a single table from a simple site, or a quick check you will run once or twice. No install, no cost, no account. For a one-off lookup, reaching for a scraper is overkill.
Switch to a scraper when any of these are true: the target site renders content with JavaScript, you have more than a few dozen URLs, you need the run to retry and resume reliably, or you want to feed the output straight into an AI step. At that point the native function stops being a shortcut and starts being a bottleneck.
How does a scraper handle this in the same spreadsheet?
ReplyLabs runs as a sidebar inside Google Sheets, so you stay in the spreadsheet but the fetching moves to a server. You select a column of URLs, choose an engine, and results stream back into a new column. The in-house engine starts at $0.002 per URL and automatically falls back to another engine when a page resists; Jina ($0.005) returns clean text and Firecrawl ($0.0075) handles the hardest JavaScript pages. Only successful URLs are charged, so dead links cost nothing, and new accounts get $20 of free credit. To compare a full run against your current native-formula approach, use the cost calculator.
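The pricing above turns into simple arithmetic. The sketch below uses only the per-URL prices and the $20 credit quoted in this section; it assumes a single engine for the whole run and ignores whatever a fallback engine might charge, so treat the results as rough estimates. The function and dictionary names are hypothetical.

```python
from decimal import Decimal

# Per-URL prices and free credit as quoted above (USD).
PRICE_PER_URL = {
    "in-house": Decimal("0.002"),
    "jina": Decimal("0.005"),
    "firecrawl": Decimal("0.0075"),
}
FREE_CREDIT = Decimal("20")


def run_cost(succeeded_urls: int, engine: str) -> Decimal:
    """Cost of a run on one engine; only URLs that succeeded are billed."""
    return PRICE_PER_URL[engine] * succeeded_urls


def urls_covered_by_credit(engine: str) -> int:
    """How many succeeded URLs the free credit pays for on one engine."""
    return int(FREE_CREDIT / PRICE_PER_URL[engine])


# 500 URLs of which 50 are dead links: only 450 are charged.
print(run_cost(450, "in-house"))           # 0.900
print(urls_covered_by_credit("in-house"))  # 10000
print(urls_covered_by_credit("firecrawl")) # 2666
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly.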
Common questions
Can IMPORTXML scrape JavaScript pages?
No. IMPORTXML reads the raw HTML the server returns before any JavaScript runs, so content built in the browser is not visible to it. The formula returns an empty cell, a stuck "Loading..." state, or an error. A server-side scraper that renders the page reaches that content.
What is the IMPORTXML limit in Google Sheets?
About 50 import calls per spreadsheet before they start failing, with each call returning up to 50,000 characters. Performance also degrades once you have more than a dozen import formulas in one sheet.
Why does IMPORTXML return an error or #N/A?
Usually the page is JavaScript-rendered, you have exceeded the per-sheet import limit, or Google is rate-limiting your requests. The native functions give you no way to retry or render, which is why scale and dynamic pages need a different tool.
Is a scraper worth it over the free function?
For one static page, no. The moment you have dozens of URLs, a JavaScript site, or a need for retries, a scraper that runs outside the sheet is the only thing that finishes the job. See web scraping in Google Sheets or start with ReplyLabs.