Collect data from the web
Turn any public website into structured data: describe what you want and the agent goes and gets it.
The idea
You want structured data from the web. Company info, product pricing, job postings, real estate listings, news — anything that exists on a public website. Instead of copying and pasting, you describe what you need and SheetOS collects it for you.
The result is a table with typed columns and real data, ready to sort, filter, and build on.
Describe what you want
Open the agent and write a prompt. Good prompts have three parts:
- What — the type of data you’re after
- Where — a specific site, URL, or topic to search
- Which fields — the columns you want (optional — the agent will choose if you don’t specify)
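To make the three-part structure concrete, here is a minimal sketch in Python that assembles a prompt from the what, where, and which-fields parts. The `build_prompt` helper is hypothetical and purely illustrative. It is not part of SheetOS, and you would normally just type the prompt into the agent yourself.

```python
from typing import Optional, Sequence


def build_prompt(what: str,
                 where: Optional[str] = None,
                 fields: Optional[Sequence[str]] = None) -> str:
    """Compose a prompt from the three parts: what, where, which fields.

    Hypothetical helper for illustration only; SheetOS takes free-form text.
    """
    # "What": the type of data, with any trailing period removed.
    sentence = what.strip().rstrip(".")

    # "Where": an optional site, URL, or topic.
    if where:
        sentence += f" from {where}"
    prompt = sentence + "."

    # "Which fields": optional; the agent picks columns if omitted.
    if fields:
        if len(fields) == 1:
            listed = fields[0]
        elif len(fields) == 2:
            listed = " and ".join(fields)
        else:
            listed = ", ".join(fields[:-1]) + ", and " + fields[-1]
        prompt += f" Include {listed}."
    return prompt


print(build_prompt(
    what="Get all job postings",
    where="stripe.com/jobs",
    fields=["job title", "team", "location", "a link to the posting"],
))
```

Leaving out `fields` produces a shorter prompt and lets the agent choose the columns.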
Examples
Find the top 20 SaaS companies by revenue. Include company name, website, estimated revenue, and employee count.
Get all job postings from stripe.com/jobs. Include job title, team, location, and a link to the posting.
Extract the pricing plans from https://example.com/pricing. Include plan name, monthly price, and what’s included.
The more specific you are — especially with URLs — the faster and more accurate the results.
Results
The agent creates a table with:
- Named columns matching the data you asked for, each with an appropriate type (text, number, URL, date, etc.)
- Rows filled with real data extracted from the web
- A new page tab at the bottom of the workbook so you can see the results immediately
If the agent can’t find data for a field, it leaves the cell empty rather than guessing.
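One way to picture the result is as a list of typed columns plus rows in which a missing value is simply empty. The sketch below is an assumed, simplified model in Python, not SheetOS's actual storage format. The `Column` class, the `missing_cells` helper, and the sample rows are made-up placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

# A cell holds a value, or None when the agent found nothing for that field.
Cell = Optional[Union[str, int, float]]


@dataclass
class Column:
    name: str
    type: str  # e.g. "text", "number", "url", "date"


# Placeholder schema and data, for illustration only.
columns = [
    Column("Company", "text"),
    Column("Website", "url"),
    Column("Employee count", "number"),
]
rows: List[List[Cell]] = [
    ["Acme Corp", "https://acme.example", 500],
    ["Example Inc", "https://example.com", None],  # left empty, not guessed
]


def missing_cells(columns: List[Column],
                  rows: List[List[Cell]]) -> List[Tuple[int, str]]:
    """Return (row index, column name) for every empty cell."""
    return [
        (i, columns[j].name)
        for i, row in enumerate(rows)
        for j, cell in enumerate(row)
        if cell is None
    ]


print(missing_cells(columns, rows))
```

Because gaps stay empty instead of being filled with guesses, they are easy to spot and to target with a follow-up request.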
Refine the results
The table is live — you can keep talking to the agent to adjust it:
- Add columns — “add a column for founding year”
- Add more rows — “find 10 more companies”
- Filter what’s there — “remove rows where revenue is under $1M”
- Fix specific values — “the employee count for Acme Corp should be 500”
- Change the source — “also check crunchbase.com for the funding data”
Each request updates the existing table. The agent doesn’t start over unless you ask it to.
Tips
Give URLs when you have them. “Get data from https://example.com/products” is faster and more accurate than “find products from Example company.” The agent doesn’t have to search — it goes straight to the source.
Be specific about fields. “Find companies” is vague. “Find companies with name, website, employee count, and last funding round” gives the agent a clear schema to work with.
Start small, then expand. Ask for 10 rows first. If the results look good, ask for more. This lets you catch issues early before the agent does a lot of work.
Use follow-ups for corrections. If a column has wrong data, tell the agent what’s wrong: “the prices in column C are in euros, convert them to USD.” It’s faster than re-collecting everything.
Note
The agent searches the open web, so it only works with publicly accessible data. It can’t log into accounts or access paywalled content.
Next
Tables
Learn more about table structure, column types, and sorting.
Automations
Keep your collected data fresh with scheduled updates.