# Component test

This page is not linked from any user-facing surface; it exists so the U8 smoke check can verify every Mintlify-style MDC component renders without prerender errors. Not listed in `_nav.yml`.

## Card / CardGroup

::card-group{cols="2"}
:::card{icon="rocket" title="One"}
Linkable card with icon.
:::
:::card{icon="code-2" title="Two"}
Plain card.
:::
::

## Callouts

::callout{type="note"}
Neutral callout.
::

::callout{type="tip"}
Brand-tinted tip callout.
::

::callout{type="info"}
Info callout.
::

::callout{type="warning"}
Signal-orange warning callout.
::

## Steps

::steps
:::step{title="First step"}
Initial action.
:::
:::step{title="Second step"}
Follow-up.
:::
::

## Tabs

::tabs
:::tab{title="One"}
Tab content one.
:::
:::tab{title="Two"}
Tab content two.
:::
::

## Accordion

::accordion{title="Click me"}
Hidden content.
::

## Code group

::code-group
```bash [cURL]
echo hello
```
```js [JavaScript]
console.log('hello')
```
::

## Frame

Plain text inside a frame wrapper.

# Account settings

Your account is you as a person: your name, your email, and your password. It is separate from [teams](https://trawley.ai/docs/users/teams-and-members), which own scrapers and billing. One account can belong to many teams.

## What you can manage

- **Profile.** Your name and personal details.
- **Email address.** The address you sign in with and receive notifications at.
- **Password.** Your sign-in credentials.

![Account settings](https://trawley.ai/docs/users/account-settings.png)

## Changing your email

When you change your email address, Trawley asks you to verify that the new address belongs to you before the change takes effect. This keeps your account secure if someone else ever gets access to a session.

::callout{type="tip"}
Use an email you will keep access to long term, especially if you are the administrator for a team's billing. Account recovery and important notices go to this address.
::

## Account vs team

If you are looking for plan limits, invoices, or members, those live in [billing and plans](https://trawley.ai/docs/users/billing-and-plans) and [teams and members](https://trawley.ai/docs/users/teams-and-members), not here. Account settings are just about you.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/teams-and-members
icon: users
title: Teams & members
---
Manage the teams you belong to.
:::
:::card
---
href: https://trawley.ai/docs/users/billing-and-plans
icon: gauge
title: Billing & plans
---
Plans, usage, and invoices.
:::
::

# Billing & plans

Billing happens at the [team](https://trawley.ai/docs/users/teams-and-members) level. A team is on a plan, and that plan sets how much you can do each month.

## What a plan covers

Plans include monthly allowances for the work Trawley does, such as:

- **Scrape runs** that collect data from sites.
- **Search API calls** when you query results from your own apps.
- **Results API calls** when you read results programmatically.

When you reach an allowance, that activity pauses until the next billing period or until you upgrade. Searching from the dashboard and building scrapers are part of normal use; the metered allowances mainly affect heavy or programmatic usage.

![A team's plan and usage](https://trawley.ai/docs/users/billing-and-plans.png)

## Seeing your usage

A team's billing area shows your current plan and how much of each allowance you have used this period. Check it if a [scheduled run](https://trawley.ai/docs/users/scheduling) stops happening or an API call returns a limit error, since you may have reached an allowance.

## Upgrading and managing billing

Upgrades and payment details are handled through a secure billing portal. From there you can change plan, update your card, and view invoices.

::callout{type="tip"}
If you mostly hit limits from frequent [schedules](https://trawley.ai/docs/users/scheduling), slowing a scraper's cadence can be cheaper than upgrading. Match each scraper's schedule to how often its source actually changes.
::

## For developers

API allowances and what a `429` looks like are covered in [rate limits and quotas](https://trawley.ai/docs/developers/rate-limits-and-quotas).

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/scheduling
icon: zap
title: Scheduling
---
Balance freshness against usage.
:::
:::card
---
href: https://trawley.ai/docs/users/account-settings
icon: settings
title: Account settings
---
Manage your personal account.
:::
::

# Browser settings

Trawley scrapes by loading a site in a real browser, the same way you would. A scraper's browser settings control how that page is loaded, which matters for sites that are slow, heavy, or particular about how they render.

## When to adjust them

Most scrapers never need changes here. Reach for browser settings when a site:

- Loads content slowly or in stages, so the scraper grabs the page before it is ready.
- Behaves differently depending on how it is visited.
- Returns empty results even though the data is clearly visible to you.

![Browser settings](https://trawley.ai/docs/users/browser-settings.png)

## A good order to try things

::steps
:::step{title="Confirm it is a loading issue"}
If results are empty but the page looks fine in your own browser, the scraper may be reading the page too early or being treated as a bot.
:::
:::step{title="Adjust browser settings"}
Tune how the page is loaded so the content has time to appear.
:::
:::step{title="Consider a proxy"}
If the site is actively blocking automated visits, browser settings will not help. A [proxy](https://trawley.ai/docs/users/proxies) is the right tool.
:::
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/proxies
icon: shield
title: Proxies
---
Get past sites that block automated traffic.
:::
:::card
---
href: https://trawley.ai/docs/users/the-doctor
icon: wrench
title: The doctor
---
Diagnose a scraper that stopped working.
:::
::

# Choosing fields

Fields are the pieces of data Trawley captures from each item. Choosing them well makes your results easier to read, search, and filter.

## What makes a good field

- **One value per field.** Keep `price` and `bedrooms` separate rather than a single "details" blob. Separate fields are searchable on their own.
- **Capture a stable identifier.** A link or title that uniquely identifies each item lets Trawley track changes between runs and power [diffs](https://trawley.ai/docs/developers/diff-endpoint).
- **Only what you need.** Every field is something to maintain. Skip data you will not use.

## Data types

Each field has a type. The type controls how the value is stored and, crucially, how you can search it later.

| Type    | Use for                         | Lets you                                |
| ------- | ------------------------------- | --------------------------------------- |
| Text    | Titles, descriptions, locations | Search by meaning and keywords          |
| Number  | Prices, counts, sizes           | Filter by ranges (under £500k, 3+ beds) |
| Date    | Listed dates, deadlines         | Filter and sort by time                 |
| Boolean | Yes/no flags (has parking)      | Filter to true or false                 |

::callout{type="tip"}
Getting number and date types right matters most. Trawley uses them to build exact filters during search, so "under £500k" only works if `price` is a number, not text. The wizard suggests types for you, but it is worth a quick check.
::

## Letting the wizard help

You do not assign types by hand from a blank slate. As you describe your fields in the [setup wizard](https://trawley.ai/docs/users/the-setup-wizard), it proposes a sensible type for each one based on the real values on the page.
You confirm or correct.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/pagination
icon: boxes
title: Pagination
---
Make sure the scraper collects every page of items.
:::
:::card
---
href: https://trawley.ai/docs/users/preview-and-validate
icon: check
title: Preview and validate
---
Check field values against the real page.
:::
::

# Exporting

When you want to work with your data outside Trawley, export it. You get a single file containing every result from the scraper, in the format you choose.

## Formats

- **CSV** opens directly in spreadsheets like Excel or Google Sheets. Each field is a column.
- **JSON** is best for feeding another tool or a developer workflow.

![Exporting results](https://trawley.ai/docs/users/exporting.png)

## What you get

An export includes all of the scraper's current results, not just the page you are viewing. The file is named after your scraper, and the columns are your field names, so it is ready to use without cleanup.

## Exporting from code

Developers can trigger the same export programmatically and choose the format with a single request. See the [export endpoint](https://trawley.ai/docs/developers/export-endpoint).

::callout{type="tip"}
For a one-off analysis, export and open in a spreadsheet. If you need the data to stay in sync with each run, call the [results endpoint](https://trawley.ai/docs/developers/results-endpoint) or [search](https://trawley.ai/docs/developers/hybrid-search) from your app instead of re-exporting by hand.
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/searching-results
icon: search
title: Search instead
---
Often you want an answer, not the whole file.
:::
:::card
---
href: https://trawley.ai/docs/developers/export-endpoint
icon: file-text
title: Export endpoint
---
Automate exports from your own code.
:::
::

# How Trawley works

Trawley has a small vocabulary. Learn these five words and the rest of the product makes sense.
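
For readers who think in code, the five terms can be pictured as a small data model. This is only an illustrative sketch: the class names mirror the vocabulary on this page, but the attribute names (`url`, `fields`, `values`, `status`) and the example data are assumptions made for the example, not Trawley's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Result:
    """One captured item, with a value for each field."""
    values: dict[str, Any]


@dataclass
class Job:
    """One run of a scraper."""
    status: str = "pending"  # pending, running, completed, or failed
    results: list[Result] = field(default_factory=list)


@dataclass
class Scraper:
    """Points at a site and defines the fields to capture."""
    url: str
    fields: list[str]  # field names such as "price" (hypothetical attribute)
    jobs: list[Job] = field(default_factory=list)


@dataclass
class Team:
    """Owns scrapers; billing and plan limits live at this level."""
    name: str
    scrapers: list[Scraper] = field(default_factory=list)


# Hypothetical example data, for illustration only.
team = Team(name="Lettings research")
scraper = Scraper(
    url="https://example.com/listings",
    fields=["price", "bedrooms", "location"],
)
team.scrapers.append(scraper)

job = Job(status="completed")
job.results.append(
    Result(values={"price": 450000, "bedrooms": 3, "location": "Kendal"})
)
scraper.jobs.append(job)
```

Each result carries exactly the fields the scraper defines, which is the relationship the sections below describe in words.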

## Scraper

A **scraper** points at a website and extracts a **list of items** from it, for example every property on a listings page or every product in a catalogue. A scraper is the thing you build once and reuse.

::callout{type="info"}
Every Trawley scraper works on lists. You give it a page that shows many items, and it captures each one as a record.
::

## Field

A **field** is a single piece of data you want from each item, like `price`, `bedrooms`, or `location`. You choose the fields when you build the scraper, and the wizard helps you map them to the right part of the page.

## Job

A **job** is one run of a scraper. Each time the scraper collects data, whether you run it by hand or on a schedule, that is a job. Jobs move through a simple lifecycle: pending, running, then completed or failed.

## Result

A **result** is one item the scraper captured during a job, with a value for each field. A run over a listings page produces one result per listing.

## Team

Your scrapers belong to a **team**. Teams let you share scrapers and collaborate, and they are where billing and plan limits live. You can belong to more than one team and switch between them.

## How they fit together

```text
Team
└── Scraper (points at a site, defines fields)
    └── Job (one run)
        └── Result (one captured item, with field values)
```

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/the-setup-wizard
icon: wand-2
title: Build your first scraper
---
Use the wizard to define a scraper by chatting.
:::
:::card
---
href: https://trawley.ai/docs/users/running-a-scrape
icon: play
title: Run and schedule
---
Run a scraper and keep it up to date automatically.
:::
::

# Job logs & status

Every run keeps a record. The job's status tells you the outcome, and its logs tell you the story of what happened along the way.

## Reading a job

Open a job from a scraper's run history to see its status and logs.
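
The status you see follows the lifecycle these docs describe: pending, running, then completed or failed. The sketch below models that as a small state machine. The `JobStatus` enum, the `ALLOWED` transition table, and both helper functions are hypothetical names for illustration; in particular, whether a pending job can fail without ever running is not stated in these docs, so the table assumes it cannot.

```python
from enum import Enum


class JobStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transitions, inferred from "pending, running, then completed or failed".
ALLOWED = {
    JobStatus.PENDING: {JobStatus.RUNNING},
    JobStatus.RUNNING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def can_transition(current: JobStatus, new: JobStatus) -> bool:
    """Return True if a job may move from `current` to `new`."""
    return new in ALLOWED[current]


def is_finished(status: JobStatus) -> bool:
    """A job is finished once no further transitions are possible."""
    return not ALLOWED[status]
```

Read this way, a job that is still pending or running has no outcome to inspect yet, while a completed or failed job is final and its logs tell the whole story of that run.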

![A job's logs](https://trawley.ai/docs/users/job-logs-and-status.png)

The logs show the steps the run took: the pages it visited, the items it captured, and any problems it hit. When a run does not produce what you expected, the logs are the first place to look.

## When a job fails

A failed job usually points at one of a few causes:

- **The site changed.** The page layout moved and the scraper's mapping no longer matches. The [doctor](https://trawley.ai/docs/users/the-doctor) is built for this.
- **The site blocked the request.** Some sites refuse automated visits. Trying a [proxy](https://trawley.ai/docs/users/proxies) often gets past this.
- **A temporary glitch.** The site was briefly down or slow.

## Retrying

For a transient failure, retry the run. If you suspect the site blocked Trawley, you can also retry with or without a proxy to quickly test whether the proxy was the issue.

::callout{type="tip"}
If the same job fails repeatedly in the same way, stop retrying and run the doctor instead. Repeated identical failures mean something needs fixing, not another attempt.
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/the-doctor
icon: wrench
title: The doctor
---
Diagnose and repair a broken scraper.
:::
:::card
---
href: https://trawley.ai/docs/users/warnings
icon: flag
title: Warnings
---
Catch problems before they fail a run.
:::
::

# Nested pages

A listing page often shows a summary of each item, while the full detail lives on the item's own page. Nested scraping follows the link for each item and pulls fields from that detail page too.

## When you need it

Use nested pages when the data you want is not on the list. For example, a property listings page might show price and address, but bedrooms, floor area, and the full description only appear when you open a specific listing.
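
To picture the end result, the sketch below combines list-level fields with detail-page fields into one record per item. It illustrates the idea only, not how Trawley implements nested scraping: the function name `merge_detail_fields`, the `url` link field, and the sample listings are all assumptions made for the example.

```python
def merge_detail_fields(list_items, detail_pages, link_field="url"):
    """Combine each list item with the fields found on its detail page.

    list_items:   records captured from the listing page.
    detail_pages: mapping of item link -> fields captured from that item's page.
    """
    merged = []
    for item in list_items:
        # Items whose detail page was not captured keep only list-level fields.
        detail = detail_pages.get(item[link_field], {})
        merged.append({**item, **detail})
    return merged


# Hypothetical data: the list shows price and address, the detail page adds the rest.
list_items = [
    {"url": "/listing/1", "price": 450000, "address": "1 High St"},
    {"url": "/listing/2", "price": 320000, "address": "2 Low Rd"},
]
detail_pages = {
    "/listing/1": {"bedrooms": 3, "floor_area_m2": 92},
}

merged = merge_detail_fields(list_items, detail_pages)
```

The first record ends up with both sets of fields, while the second has only what the list showed. That kind of gap is worth looking for when you preview a nested scraper.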

## How it works

When the assistant detects that each item links to its own page, it can open those pages and suggest extra fields from them. Your scraper then captures both the list-level fields and the detail-page fields for every item.

## Setting it up

In the [setup wizard](https://trawley.ai/docs/users/the-setup-wizard), tell the assistant which fields come from the detail page: "open each listing and also grab the floor area and the full description." It analyses an item's page and proposes those fields, testing them against the real page like any other field.

::callout{type="warning"}
Nested scraping visits one extra page per item, so runs take longer and use more of your plan's allowance than a list-only scraper. Only add detail-page fields you actually need.
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/preview-and-validate
icon: check
title: Preview and validate
---
Check that detail-page fields populate correctly.
:::
:::card
---
href: https://trawley.ai/docs/users/running-a-scrape
icon: play
title: Running a scrape
---
Run the scraper and watch the job complete.
:::
::

# Pagination

Most listing pages show items in batches. Pagination is how Trawley moves through all of them so your scraper captures the full set, not just what is on the first screen.

## How sites paginate

Trawley handles the common patterns:

- **Next button.** The site has a "Next" or numbered page links. Trawley clicks through them.
- **Load more.** A "Load more" button reveals additional items each time it is clicked. Trawley keeps clicking until there are no more.
- **Infinite scroll.** New items load as you scroll down. Trawley scrolls to pull them in.

## Setting it up

You do not need to know which pattern a site uses. During the [setup wizard](https://trawley.ai/docs/users/the-setup-wizard), the assistant checks the page for pagination and tells you what it found.
If it detects a "Next" button or a "Load more" control, it configures the scraper to follow it.

::callout{type="tip"}
If the assistant misses pagination, point it out: "there's a Load more button at the bottom" or "results continue on page 2". It will re-check and wire it up.
::

## Why it matters

Pagination decides how complete your data is. A scraper that only reads the first page silently misses everything after it. It is worth confirming during setup that the preview includes items you know appear further down.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/nested-pages
icon: link
title: Nested pages
---
Capture extra detail from each item's own page.
:::
:::card
---
href: https://trawley.ai/docs/users/preview-and-validate
icon: check
title: Preview and validate
---
Confirm the scraper reaches every item.
:::
::

# Preview and validate

Before you save a scraper, Trawley shows you a preview of the data it would capture from the real page. Validating here is the difference between a scraper that just works and one you have to fix after its first run.

## The preview

As you build, the assistant pulls a sample of real results and shows them with your fields as columns. You see actual values, not placeholders, so you can spot problems immediately.

## What to check

::steps
:::step{title="Every field has a value"}
Scan for empty columns. A field that is blank across the preview is mapped to the wrong place. Tell the assistant and it will re-map it.
:::
:::step{title="Values look right"}
Check that prices are numbers, dates are dates, and text is the intended text. A price showing `Offers over £400,000` may need cleaning to a plain number.
:::
:::step{title="The right number of items"}
If the page shows 24 listings and the preview has 12, pagination or the item selector needs attention.
:::
:::step{title="Detail fields populate"}
If you added [nested page](https://trawley.ai/docs/users/nested-pages) fields, confirm they are filled in, not empty.
:::
::

## Fixing problems

Every problem is fixed by describing it in the chat. "The bedrooms field is empty", "only the first page is showing", or "the price includes the currency symbol" all give the assistant enough to correct the configuration and refresh the preview.

::callout{type="tip"}
Validate against an item you know well. Pick a listing you can open yourself, then confirm Trawley captured the same values you see.
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/running-a-scrape
icon: play
title: Run your scraper
---
Save, run, and collect the full dataset.
:::
:::card
---
href: https://trawley.ai/docs/users/the-doctor
icon: wrench
title: If something breaks
---
Diagnose a scraper that stops returning good data.
:::
::

# Proxies

Some websites try to block automated visits. A proxy routes your scraper's requests through a different network path, which often gets past that blocking so the scraper can read the page.

## When you need a proxy

Consider a proxy when:

- A scraper that works on other sites returns nothing on this one.
- A [job fails](https://trawley.ai/docs/users/job-logs-and-status) in a way that suggests the site refused the request.
- The site loads fine for you in a normal browser but not for the scraper.

![Proxy settings](https://trawley.ai/docs/users/proxies.png)

## Testing whether the proxy is the issue

When a run fails, you can retry it with or without a proxy. That is the fastest way to find out whether the proxy is helping or whether the problem lies elsewhere. If a run only succeeds with a proxy enabled, leave it on for that scraper.

::callout{type="tip"}
Do not enable a proxy by default. Proxied requests can be slower, so use one only for sites that actually need it.
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/browser-settings
icon: globe
title: Browser settings
---
The other lever for sites that will not load cleanly.
:::
:::card
---
href: https://trawley.ai/docs/users/warnings
icon: flag
title: Warnings
---
See when Trawley flags a scraper as unhealthy.
:::
::

# Quickstart

Welcome to Trawley. You point it at a website, describe what you want in plain language, and it builds a scraper that turns that site into structured, searchable data. No code, no CSS selectors to hand-write.

## What you'll need

A web browser and the address of a page that lists the things you care about, for example a property listings page or a product catalogue. That's it.

## Steps

::steps
:::step{title="Create an account"}
Sign up at `app.trawley.ai`. You start on a free plan, so you can build and test a scraper right away.
:::
:::step{title="Start a new scraper"}
Click **New scraper** and paste the URL of the page you want to scrape. Trawley opens the page and starts looking at it for you.
:::
:::step{title="Tell the wizard what to extract"}
You chat with an assistant. Describe the data you want ("the price, number of bedrooms, and location for each listing") and it suggests fields, tries them against the real page, and shows you a preview.
:::
:::step{title="Save and run"}
When the preview looks right, save the scraper and run it. Trawley collects the data and indexes it for search.
:::
::

![The Trawley setup wizard](https://trawley.ai/docs/users/the-setup-wizard.png)

## What happens next

Once a scraper has run, you can browse and search its results, schedule it to re-run automatically, and query it from your own apps through the API.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/how-trawley-works
icon: lightbulb
title: How Trawley works
---
The handful of ideas behind scrapers, fields, jobs, and results.
:::
:::card
---
href: https://trawley.ai/docs/users/the-setup-wizard
icon: wand-2
title: The setup wizard
---
A closer look at building a scraper by chatting.
:::
::

# Running a scrape

Once a scraper is saved, running it collects the data and indexes it for search. Each run is called a job.

## Running on demand

Open a scraper from your dashboard and start a run. Trawley queues the work, opens the site, and captures every item. You can leave the page; the run continues in the background.

![A scraper's job history](https://trawley.ai/docs/users/running-a-scrape.png)

## The job lifecycle

A job moves through a few states:

| State     | Meaning                                                    |
| --------- | ---------------------------------------------------------- |
| Pending   | The job is queued and waiting to start.                    |
| Running   | Trawley is actively collecting data from the site.         |
| Completed | The run finished and results are ready to view and search. |
| Failed    | Something stopped the run. Open the job to see why.        |

::callout{type="tip"}
Large sites with lots of pages or [nested detail pages](https://trawley.ai/docs/users/nested-pages) take longer. The job keeps running in the background, so you do not need to wait on the page.
::

## After a run

A completed job updates the scraper's results. From there you can [browse and search](https://trawley.ai/docs/users/viewing-results) them, or set the scraper to re-run on a [schedule](https://trawley.ai/docs/users/scheduling) so the data stays fresh.

If a run fails or the data looks off, the [job logs](https://trawley.ai/docs/users/job-logs-and-status) and the [doctor](https://trawley.ai/docs/users/the-doctor) help you work out what happened.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/scheduling
icon: zap
title: Schedule it
---
Keep your data current automatically.
:::
:::card
---
href: https://trawley.ai/docs/users/job-logs-and-status
icon: terminal
title: Job logs & status
---
See exactly what a run did.
:::
::

# Scheduling

Data goes stale. Scheduling re-runs a scraper on a regular cadence so its results stay current without you remembering to press a button.

## Setting a schedule

You describe the cadence in plain language, for example "every morning at 7am" or "once a week on Monday". Trawley turns that into a recurring schedule for you, so you never write a cron expression by hand.

![Scheduling a scraper](https://trawley.ai/docs/users/scheduling.png)

## Choosing a cadence

Match the schedule to how often the source actually changes:

- **Daily** for fast-moving listings like property or job boards.
- **Weekly** for catalogues that change less often.
- **Hourly** only when you genuinely need near-real-time data, since each run uses your plan's allowance.

::callout{type="tip"}
More frequent is not always better. Every scheduled run is a job that counts toward your usage. Pick the slowest cadence that still keeps your data fresh enough.
::

## Managing scheduled runs

Each scheduled run is a normal [job](https://trawley.ai/docs/users/running-a-scrape), so it shows up in the scraper's history with its own status and logs. You can change or remove a schedule at any time, and still run the scraper on demand whenever you want.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/job-logs-and-status
icon: terminal
title: Job logs & status
---
Review what each scheduled run did.
:::
:::card
---
href: https://trawley.ai/docs/users/billing-and-plans
icon: gauge
title: Plans & limits
---
How runs count against your plan.
:::
::

# Searching results

Trawley does not just store your data; it makes it searchable in plain language. You can ask for what you want the way you would say it out loud, and Trawley works out the filters for you.

## Searching in the dashboard

From a scraper's results, search using natural language: "3 bed houses under £500k with a garden", "flats added this week", or "anything in Kendal with parking".
Trawley combines an understanding of your meaning with exact filters on numbers and dates to return the right items.

![Natural language search over results](https://trawley.ai/docs/users/searching-results.png)

## Why it understands you

Behind the scenes, Trawley reads your query and decides which parts are exact constraints (under £500k, 3 beds) and which are fuzzy (near Kendal, nice garden). It filters precisely on the first and matches by meaning on the second. That is why you do not have to phrase queries in any special way.

::callout{type="tip"}
Searches lean on your [field types](https://trawley.ai/docs/users/choosing-fields). "Under £500k" only works as a precise filter if `price` is a number. If a range filter is not behaving, check that the field's type is right.
::

## Searching from your own apps

The same natural language search is available through the API, so you can build it into your own product or an AI agent. See [hybrid search](https://trawley.ai/docs/developers/hybrid-search) for developers.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/exporting
icon: file-text
title: Export your data
---
Take results into a spreadsheet or another tool.
:::
:::card
---
href: https://trawley.ai/docs/developers/hybrid-search
icon: search
title: Search from code
---
Use the same search in your own apps.
:::
::

# Teams & members

Teams are how Trawley groups scrapers and people. Your scrapers belong to a team, and the people in that team can work on them together. Billing and plan limits also live at the team level.

## Creating a team

You can create a team for a project, a client, or an organisation. You can belong to several teams at once, for example a personal team and a shared work team, and switch between them whenever you like.

![Team settings and members](https://trawley.ai/docs/users/teams-and-members.png)

## Inviting members

Invite people to a team by sending them an invitation.
When they accept, they join the team and can see and work on its scrapers. This is how you share a scraper with colleagues without handing over your own login.

::steps
:::step{title="Open the team's settings"}
Go to the team you want to add someone to.
:::
:::step{title="Send an invite"}
Invite the person. They receive a link to join the team.
:::
:::step{title="They accept"}
Once they join, they appear in the member list with a role.
:::
::

## Roles

Members have roles that control what they can do in the team. Use roles to give collaborators the access they need without making everyone an administrator.

::callout{type="tip"}
Keep the number of administrators small. Give most collaborators a role that lets them do their work, and reserve admin access for the few people who manage the team and its billing.
::

## Switching teams

A team switcher lets you move between the teams you belong to. The scrapers, results, and billing you see always belong to the team you currently have active.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/billing-and-plans
icon: gauge
title: Billing & plans
---
Manage a team's plan and usage.
:::
:::card
---
href: https://trawley.ai/docs/users/transfer-between-teams
icon: boxes
title: Transfer a scraper
---
Move a scraper into a shared team.
:::
::

# The doctor

Websites change. When a site you scrape moves things around, a scraper that worked yesterday can start returning empty or wrong data. The doctor is the tool for putting it right: it is the same conversational assistant you used to build the scraper, now focused on fixing it.

## When to use it

Run the doctor when:

- A job [failed](https://trawley.ai/docs/users/job-logs-and-status) or returned far fewer items than usual.
- A field that used to populate is now empty.
- A [warning](https://trawley.ai/docs/users/warnings) flags that a scraper looks unhealthy.
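
The checklist above can be written down as a simple rule of thumb. This is a hedged sketch, not Trawley's own health logic: the function `doctor_reasons`, its parameters, and the `drop_ratio` threshold of 0.5 are all assumptions, since these docs do not put a number on "far fewer items than usual".

```python
def doctor_reasons(job_status, item_count, usual_item_count,
                   empty_fields, has_warning, drop_ratio=0.5):
    """Return the reasons (if any) that suggest running the doctor.

    drop_ratio is an arbitrary illustrative threshold: a run counts as
    "far fewer than usual" when it captures less than this share of the
    usual number of items.
    """
    reasons = []
    if job_status == "failed":
        reasons.append("job failed")
    if usual_item_count > 0 and item_count < usual_item_count * drop_ratio:
        reasons.append("far fewer items than usual")
    if empty_fields:
        reasons.append("empty fields: " + ", ".join(sorted(empty_fields)))
    if has_warning:
        reasons.append("scraper has a warning")
    return reasons


# Hypothetical run: completed, but only 8 of the usual 40 items and no prices.
reasons = doctor_reasons(
    job_status="completed",
    item_count=8,
    usual_item_count=40,
    empty_fields={"price"},
    has_warning=False,
)
```

An empty list means none of the listed signals fired. Any entry at all is a reasonable prompt to open the doctor rather than simply retrying.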

## How it works

The doctor re-opens the site, compares it against your scraper's configuration, and works out what changed. Because it looks at the live page, it can tell the difference between "the site is down" and "the price moved to a different place" and propose a fix for the latter.

![The doctor diagnosing a scraper](https://trawley.ai/docs/users/the-doctor.png)

## Fixing things

You work with the doctor the same way you built the scraper: describe what looks wrong and it re-maps the affected fields, tests them against the real page, and shows you a fresh preview. When the preview is healthy again, save and re-run.

::callout{type="tip"}
After a fix, run the scraper once by hand and confirm the results look right before relying on the next scheduled run. A quick manual check saves a stale overnight job.
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/warnings
icon: flag
title: Warnings
---
Spot health problems before they fail a run.
:::
:::card
---
href: https://trawley.ai/docs/users/proxies
icon: shield
title: Proxies
---
When the fix is getting past a block, not re-mapping fields.
:::
::

# The setup wizard

You build a Trawley scraper by talking to an assistant. You describe what you want in plain language, and it inspects the real page, proposes how to extract your data, tests it, and shows you a preview. You never touch a CSS selector.

## Starting out

::steps
:::step{title="Paste a start URL"}
Click **New scraper** and enter the address of a page that lists the items you want. The assistant opens that page and begins reading its structure.
:::
:::step{title="Describe what you want"}
Tell the assistant the data to capture, for example: "Grab the title, price, number of bedrooms, and location for each property." It works out which part of the page each field comes from.
:::
:::step{title="Watch it work"}
As it goes, the assistant shows what it is doing: reading the page, testing a field, checking for pagination, and pulling a preview. You can follow along and step in any time.
:::
::

![The Trawley setup wizard](https://trawley.ai/docs/users/the-setup-wizard.png)

## What the assistant can do

The assistant is not just generating guesses. It works against the live page and can:

- **Read the page structure** to find the repeating items in a list.
- **Test a field** by extracting it from the real page and showing you the value.
- **Check pagination** so the scraper collects every page, not just the first.
- **Preview results** so you see real captured data before saving.

## Refining by conversation

If a field is wrong, say so. "The price is picking up the old price, use the discounted one" or "split the location into town and county" are the kinds of instructions it understands. The configuration updates as you chat, and you can keep refining until the preview is right.

::callout{type="tip"}
Be specific about edge cases you have already spotted: missing prices, items that are sold, or fields that sometimes have two values. Mentioning them early saves a round trip later.
::

## Saving

When the preview looks correct, save the scraper. From there you can [run it](https://trawley.ai/docs/users/running-a-scrape), put it on a [schedule](https://trawley.ai/docs/users/scheduling), and [search the results](https://trawley.ai/docs/users/searching-results).

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/users/choosing-fields
icon: list
title: Choosing fields
---
Pick the right fields and data types for clean results.
:::
:::card
---
href: https://trawley.ai/docs/users/preview-and-validate
icon: check
title: Preview and validate
---
Make sure the data is right before you save.
:::
::

# Transfer between teams

Scrapers belong to a [team](https://trawley.ai/docs/users/teams-and-members).
If you need a scraper to live under a different team, for example moving it from a personal team to a shared work team, you can transfer it. ## Why transfer - You built a scraper in your own team and now want colleagues to share it. - You are reorganising work across teams. - Billing for the scraper's runs should sit with a different team. ![Transferring a scraper to another team](https://trawley.ai/docs/users/transfer-between-teams.png) ## What moves with it Transferring a scraper moves the scraper and its configuration to the destination team. From then on, the destination team's members can manage it, and its runs count toward that team's [plan](https://trawley.ai/docs/users/billing-and-plans). ::callout{type="warning"} You can only transfer a scraper into a team you belong to. Make sure you are a member of the destination team first. :: ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/users/teams-and-members icon: users title: Teams & members --- Create teams and invite people to them. ::: :::card --- href: https://trawley.ai/docs/users/billing-and-plans icon: gauge title: Billing & plans --- How a team's usage and limits work. ::: :: # Viewing results After a scraper runs, its results are ready to browse in the dashboard. Each result is one item the scraper captured, with a value for every field you defined. ## Browsing Open a scraper to see its results in a table, one row per item and one column per field. Page through them to get a feel for what was captured. ![A scraper's results table](https://trawley.ai/docs/users/viewing-results.png) ## Results come from the latest run The results you see reflect the most recent completed [job](https://trawley.ai/docs/users/running-a-scrape). Each run refreshes the data, so what you browse is always the latest snapshot Trawley collected. To compare two runs and see what changed, developers can use the [diff endpoint](https://trawley.ai/docs/developers/diff-endpoint). 
## From browsing to answers Scrolling a table is fine for a quick look, but the real power is asking questions. Instead of paging through hundreds of listings, you can [search your results](https://trawley.ai/docs/users/searching-results) in plain language, or [export them](https://trawley.ai/docs/users/exporting) to work with elsewhere. ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/users/searching-results icon: search title: Search results --- Ask questions in plain language. ::: :::card --- href: https://trawley.ai/docs/users/exporting icon: file-text title: Export --- Download your data as CSV or JSON. ::: :: # Warnings Warnings are Trawley's early signals that a scraper may not be healthy. They flag problems while you can still fix them, rather than letting a scheduled run quietly collect bad data. ## What triggers a warning A scraper can be flagged when something looks off, for example: - A run captured far fewer items than usual. - A field that normally has values came back mostly empty. - The site appears to have changed in a way that affects the scraper. ![A scraper's health and warnings](https://trawley.ai/docs/users/warnings.png) ## What to do about one A warning is a prompt to look, not always a failure. When you see one: ::steps :::step{title="Open the scraper and read the warning"} It tells you what looked wrong, such as a drop in results or an empty field. ::: :::step{title="Check recent results"} Confirm whether the data is genuinely degraded or whether the source simply had fewer items this time. ::: :::step{title="Run the doctor if needed"} If the scraper is genuinely broken, the [doctor](https://trawley.ai/docs/users/the-doctor) will diagnose and fix it. ::: :: ::callout{type="tip"} Do not dismiss a warning on a scheduled scraper without checking. The whole point is to catch a problem before the next overnight run captures a page of empty records. 
:: ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/users/the-doctor icon: wrench title: The doctor --- Repair a scraper a warning flagged. ::: :::card --- href: https://trawley.ai/docs/users/job-logs-and-status icon: terminal title: Job logs & status --- Dig into what a specific run did. ::: :: # API overview The Trawley API is a small, read-oriented HTTP API for querying the data your scrapers collect. If you can make an HTTP request, you can use it. No SDK required. ## Base URL ```text https://api.trawley.ai ``` Every endpoint is scoped to a single scraper: ```text /v1/scrapers/{scraperId}/... ``` Your `scraperId` is the last segment of the scraper's dashboard URL (for example `scr_8f2a1c9e`). ## Conventions - **Requests** are HTTP `GET` unless noted, with parameters in the query string. - **Responses** are JSON. - **Search-style endpoints** share a common envelope: a `data` array, a `pagination` object, and a `meta` object. ```json { "data": [ /* records */ ], "pagination": { "total": 0, "page": 1, "take": 10, "totalPages": 0, "hasMore": false }, "meta": { /* endpoint-specific */ } } ``` ## Authentication ::callout{type="warning"} **There is no API key yet.** Endpoints are currently reached by scraper ID, and a scraper's results are returned to anyone who has its ID. Authentication and per-key access control are on the roadmap. :: What this means in practice today: - Treat a `scraperId` like a shared secret. Anyone with it can read that scraper's results. - Do not embed scraper IDs in public client-side code you would not want copied. - When authentication ships, existing endpoints keep working and gain an `Authorization` header. We will document the migration here. ## Rate limits Search API calls are metered against your team's plan. When you exceed your monthly allowance the API responds with `429 Too Many Requests`. Upgrade your plan or wait for the next billing period to reset. 
See [billing and plans](https://trawley.ai/docs/users/billing-and-plans) for current limits. ## Endpoints at a glance | Endpoint | Purpose | | ---------------------------------------------------------------------------------- | ------------------------------------- | | [`GET /v1/scrapers/{id}/hybrid`](https://trawley.ai/docs/developers/hybrid-search) | Natural language search (recommended) | | `GET /v1/scrapers/{id}/results` | Raw paginated results, no query | | `GET /v1/scrapers/{id}/export` | Bulk export of all results | | `GET /v1/scrapers/{id}/diff` | Differences between scrape runs | | `POST /v1/scrapers/{id}/chat` | Conversational queries over results | ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/developers/hybrid-search icon: search title: Hybrid search --- The endpoint you will use most. Full parameter and response reference. ::: :::card --- href: https://trawley.ai/docs/developers/tools-overview icon: plug title: Integration guides --- Turn the API into a tool your AI agent can call. ::: :: # Authentication ::callout{type="warning"} **API keys are not available yet.** This page documents the current state honestly so you can plan around it. :: ## Today The public search endpoints (hybrid, results, export, diff) are reached by scraper ID and do not require an API key: ```text GET https://api.trawley.ai/v1/scrapers/{scraperId}/hybrid?search=... ``` A scraper's results are returned to anyone who has its ID. In practice: - **Treat a `scraperId` like a shared secret.** Do not publish it anywhere you would not publish read access to that data. - **Avoid embedding scraper IDs in public client-side code.** Proxy requests through your own backend so the ID is not exposed in a browser bundle. The [chat endpoint](https://trawley.ai/docs/developers/chat-endpoint) is the exception. It is session-authenticated and currently serves the Trawley web app rather than external callers. 
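As a rough illustration of the proxy advice above, a backend can hold the scraper ID in server-side configuration and accept only a search query from the browser. This is a minimal sketch, not an official client: `buildHybridUrl`, `searchFromBackend`, and the `FetchLike` type are hypothetical names, and only the endpoint path and the `search` and `take` parameters come from these docs.

```ts
// Hypothetical server-side helper: the scraper ID never reaches the browser.
const API_BASE = 'https://api.trawley.ai'

// Build the documented hybrid search URL for one scraper.
function buildHybridUrl(scraperId: string, query: string, take = 10): string {
  const params = new URLSearchParams({ search: query, take: String(take) })
  return `${API_BASE}/v1/scrapers/${encodeURIComponent(scraperId)}/hybrid?${params}`
}

// A minimal fetch-like signature so the helper can be exercised without network access.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>

// Call this from your own backend route; the browser sends only `query`.
async function searchFromBackend(
  query: string,
  scraperId: string,
  fetchImpl: FetchLike = fetch,
): Promise<unknown> {
  const res = await fetchImpl(buildHybridUrl(scraperId, query))
  if (!res.ok) throw new Error(`Trawley API responded with ${res.status}`)
  const body = (await res.json()) as { data?: unknown }
  // An empty array is a normal result, not an error.
  return body.data ?? []
}
```

Keeping every Trawley request behind one function like this also leaves a single place to change if the request format changes later.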
## What is coming API key authentication is on the roadmap (tracked internally as issue #45). When it ships: - You will generate keys from your dashboard. - Requests will carry an `Authorization: Bearer ` header. - Existing endpoints keep working. The migration will be documented here. ::callout{type="info"} Building an integration now? Write your request layer so an `Authorization` header is easy to add later. Centralise your fetch calls in one place so adding auth is a one-line change when keys arrive. :: ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/developers/api-overview icon: book title: API overview --- Base URL, conventions, and the endpoint map. ::: :::card --- href: https://trawley.ai/docs/developers/rate-limits-and-quotas icon: gauge title: Rate limits --- How usage is metered today. ::: :: # Chat endpoint The chat endpoint runs a small agent over a scraper's results. You send a conversation, and it replies in natural language, calling search and aggregation tools internally to ground its answer in real records. It is what powers the in-dashboard "chat with your results" panel. ::callout{type="warning"} The chat endpoint is **session-authenticated** today. Unlike hybrid search and the results endpoint (which are reached by scraper ID), chat requires a logged-in session, so it is currently usable from the Trawley web app rather than as a standalone public API. External programmatic access waits on API authentication (issue #45). For server-to-server natural language querying today, call [hybrid search](https://trawley.ai/docs/developers/hybrid-search) directly. :: ## Request ```text POST https://api.trawley.ai/v1/scrapers/{scraperId}/chat ``` ### Body ```json { "messages": [ { "role": "user", "content": "Which listings dropped in price this week?" } ] } ``` The `messages` array follows the AI SDK UI message format. The response is a streamed text reply. 
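Because the reply is streamed, a caller reads it incrementally instead of waiting for a single JSON body. The sketch below shows one way to collect a streamed text body with the standard Web Streams API. It assumes the stream carries UTF-8 text chunks, which this page does not spell out, and `readTextStream` and `onChunk` are hypothetical names. Since the endpoint is session-authenticated, this only applies in a context that already holds a logged-in session.

```ts
// Collect a streamed text body chunk by chunk.
// Assumption: the body is UTF-8 text; the exact wire format is not documented here.
async function readTextStream(
  body: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void = () => {},
): Promise<string> {
  const reader = body.getReader()
  const decoder = new TextDecoder()
  let full = ''
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    // `stream: true` keeps multi-byte characters intact across chunk boundaries.
    const text = decoder.decode(value, { stream: true })
    full += text
    onChunk(text)
  }
  // Flush any bytes the decoder was still holding.
  full += decoder.decode()
  return full
}
```

Passing `response.body` from a `fetch` call and an `onChunk` callback that appends to the UI would render the reply as it arrives.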
## What it does under the hood The endpoint gives its model three tools and lets it choose: - **search** results using hybrid search for "find" and "filter" questions. - **aggregate** results for "how many", "average", "min", and "max" questions. - **browse** results with pagination when no query is needed. It always grounds answers in tool output rather than inventing data, and returns a readable summary. ::callout{type="tip"} If you are building your own chat experience, you usually do not need this endpoint. Wrap [hybrid search](https://trawley.ai/docs/developers/hybrid-search) as a tool in your own agent instead. See the [Vercel AI SDK guide](https://trawley.ai/docs/developers/vercel-ai-sdk) for that pattern, which gives you full control over the model and prompt. :: ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/developers/vercel-ai-sdk icon: sparkles title: Build your own agent --- The recommended pattern for conversational access. ::: :::card --- href: https://trawley.ai/docs/developers/authentication icon: lock title: Authentication status --- Where API auth stands and what is coming. ::: :: # Claude (Anthropic SDK) This guide gives Claude a `search_listings` tool backed by Trawley's [hybrid search](https://trawley.ai/docs/developers/hybrid-search) endpoint, using the Anthropic TypeScript SDK and its tool-use loop. ```bash npm install @anthropic-ai/sdk ``` ## Define the tool A tool is a name, a description, and a JSON Schema for its input. ```ts const searchListings = { name: 'search_listings', description: 'Search live property listings from acmehomes.co.uk. Accepts a natural ' + 'language query; constraints like price, bedrooms, or date are understood. ' + 'Returns structured records.', input_schema: { type: 'object', properties: { query: { type: 'string', description: 'What to find, e.g. 
"3 bed houses under £500k with a garden"', }, }, required: ['query'], }, } as const const TRAWLEY_SCRAPER_ID = 'scr_8f2a1c9e' async function runSearch(query: string) { const params = new URLSearchParams({ search: query, take: '10' }) const res = await fetch( `https://api.trawley.ai/v1/scrapers/${TRAWLEY_SCRAPER_ID}/hybrid?${params}`, ) const { data } = await res.json() return data } ``` ## Run the tool-use loop Claude decides when to call the tool. When it does, run the search and send the result back so it can finish its answer. ```ts import Anthropic from '@anthropic-ai/sdk' const client = new Anthropic() const messages: Anthropic.MessageParam[] = [ { role: 'user', content: 'Find a 3 bed house near Kendal with a garden under £500k.' }, ] while (true) { const response = await client.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, tools: [searchListings], messages, }) messages.push({ role: 'assistant', content: response.content }) const toolUse = response.content.find((block) => block.type === 'tool_use') if (response.stop_reason !== 'tool_use' || !toolUse) { // No tool call — Claude has answered. const text = response.content.find((b) => b.type === 'text') console.log(text?.text) break } const results = await runSearch((toolUse.input as { query: string }).query) messages.push({ role: 'user', content: [ { type: 'tool_result', tool_use_id: toolUse.id, content: JSON.stringify(results), }, ], }) } ``` ::callout{type="note"} The shape is always the same: send the tools, watch for a `tool_use` block, run your function, and return a `tool_result` with the matching `tool_use_id`. Claude loops until it has what it needs to answer. :: ## Tips - **Trim the records** you return to the fields Claude needs. Smaller tool results are cheaper and keep the answer focused. - **One query input is enough.** Hybrid search interprets price, bedrooms, and dates from the query string, so you do not need a parameter per filter. 
- **Return an empty array gracefully** when a scraper has no completed run, so Claude can say nothing was found. ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/developers/tools-overview icon: plug title: The pattern --- Why one tool is all you need. ::: :::card --- href: https://trawley.ai/docs/developers/hybrid-search icon: search title: Hybrid search --- The endpoint behind the tool. ::: :: # Diff endpoint The diff endpoint compares the results of two completed scrape runs of the same scraper and reports what changed. Use it to detect new listings, dropped items, or updated values between runs, for example to power a "what's new today" feed or a price-change alert. ## Request ```text GET https://api.trawley.ai/v1/scrapers/{scraperId}/diff?jobId1={older}&jobId2={newer} ``` ### Query parameters | Parameter | Type | Description | | --------- | ------ | ------------------------------------ | | `jobId1` | string | The first run to compare. Required. | | `jobId2` | string | The second run to compare. Required. | Both jobs must belong to the scraper and must have completed. ::code-group ```bash [cURL] curl "https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/diff?\ jobId1=job_a1&jobId2=job_b2" ``` ```js [JavaScript] const params = new URLSearchParams({ jobId1: 'job_a1', jobId2: 'job_b2' }) const res = await fetch( `https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/diff?${params}`, ) const diff = await res.json() ``` :: ## How records are matched Records are matched across the two runs using the scraper's first text field as a stable key (for example a title or a listing URL). A record present in one run but not the other is an addition or removal; a matched record whose contents changed is reported as a change. ::callout{type="info"} If the scraper has no text field, there is no stable key to match on and the endpoint returns `400`. Make sure your scraper captures something unique and stable, like a listing URL, to use diffs. 
:: ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/users/viewing-results icon: database title: Find job IDs --- Each run has a job ID. List a scraper's runs in the dashboard. ::: :::card --- href: https://trawley.ai/docs/developers/error-responses icon: bug title: Error responses --- What `400` and `404` mean here. ::: :: # Error responses The API uses standard HTTP status codes. Errors return a JSON body with a `statusCode` and a human-readable `statusMessage`. ## Status codes | Code | Meaning | Common cause | | ----- | ----------------- | -------------------------------------------------------------------------------------------------------------------------- | | `200` | Success | The request succeeded. | | `400` | Bad request | A required parameter is missing or invalid (for example an unknown `format`, or a diff request missing `jobId1`/`jobId2`). | | `404` | Not found | The scraper ID does not exist, or a referenced job is not found or has not completed. | | `429` | Too many requests | You exceeded your plan's monthly allowance for that endpoint. | ## The empty result case A request can succeed with no data. If a scraper exists but has never completed a run, search and results endpoints return `200` with an empty `data` array: ```json { "data": [], "pagination": { "total": 0, "page": 1, "take": 10, "totalPages": 0, "hasMore": false } } ``` ::callout{type="tip"} Treat empty `data` as a normal state, not an error. In an agent, return "I could not find any matching records" rather than throwing. It usually means the query matched nothing, or the scraper has not run yet. :: ## Handling errors in code ```js const res = await fetch(url) if (res.status === 429) { // Back off or tell the user the quota is exhausted. 
} if (!res.ok) { const { statusMessage } = await res.json() throw new Error(`Trawley API ${res.status}: ${statusMessage}`) } const { data } = await res.json() ``` ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/developers/rate-limits-and-quotas icon: gauge title: Rate limits --- What triggers a `429` and how allowances work. ::: :::card --- href: https://trawley.ai/docs/developers/authentication icon: lock title: Authentication --- Current status of API auth. ::: :: # Export endpoint The export endpoint streams all of a scraper's results as a downloadable file. It pages through the full result set for you, so a single request returns everything rather than one page. ## Request ```text GET https://api.trawley.ai/v1/scrapers/{scraperId}/export?format=json ``` ### Query parameters | Parameter | Type | Default | Description | | --------- | ------ | ------- | -------------------------------------- | | `format` | string | `json` | Output format. One of `json` or `csv`. | ::code-group ```bash [JSON] curl -L "https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/export?format=json" \ -o results.json ``` ```bash [CSV] curl -L "https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/export?format=csv" \ -o results.csv ``` :: The response is sent as a file attachment named after your scraper (for example `acme_listings_results.csv`). CSV columns are your scraper's field names. ::callout{type="warning"} Passing any `format` other than `json` or `csv` returns `400 Bad Request`. 
:: ## When to use export vs results | Use export when | Use the results endpoint when | | ----------------------------------------------- | -------------------------------------------- | | You want the entire dataset in one file | You want to page through results in your app | | You are loading into a spreadsheet or warehouse | You need a small, fast slice | | A download is the end product | The data feeds further requests | ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/developers/results-endpoint icon: database title: Results endpoint --- Paginated access when you do not need the whole set. ::: :::card --- href: https://trawley.ai/docs/developers/diff-endpoint icon: boxes title: Compare two runs --- See what changed between scrapes. ::: :: # Hybrid search Hybrid search is the endpoint you will reach for most. You send a natural language query, and Trawley combines three techniques in a single call: 1. **AI-generated structured filters** turn precise constraints in your query (bedrooms, price, dates) into exact filters. 2. **Vector similarity** matches fuzzy, semantic parts of the query (locations, descriptions, names) by meaning, not exact words. 3. **Keyword matching** catches literal term overlaps. The result: "3 bed houses near Kendal under £500k with a garden" returns the right records without you writing any query logic. ## Request ```text GET https://api.trawley.ai/v1/scrapers/{scraperId}/hybrid ``` ### Query parameters | Parameter | Type | Default | Description | | --------- | ------ | -------- | --------------------------- | | `search` | string | required | The natural language query. | | `page` | number | `1` | Page of results to return. | | `take` | number | `10` | Number of results per page. 
| ::code-group ```bash [cURL] curl "https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/hybrid?\ search=3+bed+houses+with+a+garden&page=1&take=10" ``` ```js [JavaScript] const params = new URLSearchParams({ search: '3 bed houses with a garden', page: '1', take: '10', }) const res = await fetch( `https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/hybrid?${params}`, ) const { data, pagination, meta } = await res.json() ``` ```python [Python] import requests res = requests.get( "https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/hybrid", params={"search": "3 bed houses with a garden", "page": 1, "take": 10}, ) body = res.json() ``` :: ## Response ```json { "data": [ { "title": "3 bed semi-detached house", "price": 425000, "bedrooms": 3, "location": "Kendal", "url": "https://acmehomes.co.uk/listings/3-bed-semi-kendal" } ], "pagination": { "total": 18, "page": 1, "take": 10, "totalPages": 2, "hasMore": true }, "meta": { "filter": "bedrooms = 3" } } ``` | Field | Description | | ----------------------- | ---------------------------------------------------------------------------------------- | | `data` | Array of matching records. Each record's fields are the ones you defined on the scraper. | | `pagination.total` | Estimated total number of matches across all pages. | | `pagination.totalPages` | Total pages given the current `take`. | | `pagination.hasMore` | `true` when more pages remain after this one. | | `meta.filter` | The structured filter Trawley's AI derived from your query. | ## Debugging with `meta.filter` `meta.filter` shows the exact structured filter the AI built from your natural language query before similarity ranking. It is the single most useful field for understanding why you got the results you did. ::callout{type="tip"} If a query returns too few or unexpected results, check `meta.filter`. A query like "cheap flats" might produce no filter at all (price thresholds are fuzzy), while "flats under £200k" produces `price < 200000`. 
Seeing the filter tells you whether the constraint was interpreted as exact or left to similarity search.
::

Trawley deliberately leaves fuzzy aspects (approximate locations, names, free text) out of the filter and resolves them through vector similarity instead, so you do not need exact-match values for them.

## Notes

- Only fields with a non-text data type (number, date, boolean) are eligible for structured filters. Text fields are always matched by similarity.
- Results come from the most recent completed scrape run.
- If the scraper has never completed a run, `data` is an empty array.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/developers/api-overview
icon: gauge
title: Rate limits & errors
---
What a `429` looks like and how quotas work.
:::

:::card
---
href: https://trawley.ai/docs/developers/vercel-ai-sdk
icon: sparkles
title: Use it as an agent tool
---
Wrap this endpoint as a function your LLM can call.
:::
::

# LangChain

This guide wraps Trawley's [hybrid search](https://trawley.ai/docs/developers/hybrid-search) endpoint as a LangChain tool, so a LangChain agent can call it.

```bash
npm install langchain @langchain/core @langchain/langgraph @langchain/openai zod
```

## Define the tool

LangChain's `tool` helper takes a function plus a Zod schema describing its input.

```ts
import { tool } from '@langchain/core/tools'
import { z } from 'zod'

const TRAWLEY_SCRAPER_ID = 'scr_8f2a1c9e'

export const searchListings = tool(
  async ({ query }) => {
    const params = new URLSearchParams({ search: query, take: '10' })
    const res = await fetch(
      `https://api.trawley.ai/v1/scrapers/${TRAWLEY_SCRAPER_ID}/hybrid?${params}`,
    )
    const { data } = await res.json()
    // Tools must return a string; the model reads it as the observation.
    return JSON.stringify(data)
  },
  {
    name: 'search_listings',
    description:
      'Search live property listings from acmehomes.co.uk. Accepts a natural ' +
      'language query; constraints like price, bedrooms, or date are understood. ' +
      'Returns structured records.',
    schema: z.object({
      query: z.string().describe('What to find, e.g. "3 bed houses under £500k with a garden"'),
    }),
  },
)
```

## Use it in an agent

```ts
import { ChatOpenAI } from '@langchain/openai'
// The messages-based agent used below is the LangGraph prebuilt one
// (package: @langchain/langgraph), not the helper in 'langchain/agents'.
import { createReactAgent } from '@langchain/langgraph/prebuilt'
import { searchListings } from './search-listings'

const model = new ChatOpenAI({ model: 'gpt-4.1' })

const agent = createReactAgent({
  llm: model,
  tools: [searchListings],
})

const result = await agent.invoke({
  messages: [
    { role: 'user', content: 'Find a 3 bed house near Kendal with a garden under £500k.' },
  ],
})

console.log(result.messages.at(-1)?.content)
```

::callout{type="note"}
A LangChain tool must return a string, so serialise the records with `JSON.stringify`. The agent reads that string as the tool observation and uses it to answer.
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/developers/tools-overview
icon: plug
title: The pattern
---
Why a single query tool beats one tool per filter.
:::

:::card
---
href: https://trawley.ai/docs/developers/hybrid-search
icon: search
title: Hybrid search
---
The endpoint behind the tool.
:::
::

# OpenAI function calling

This guide defines a `search_listings` function that OpenAI models can call, backed by Trawley's [hybrid search](https://trawley.ai/docs/developers/hybrid-search) endpoint.

```bash
npm install openai
```

## Define the function

```ts
const tools = [
  {
    type: 'function',
    function: {
      name: 'search_listings',
      description:
        'Search live property listings from acmehomes.co.uk. Accepts a natural ' +
        'language query; constraints like price, bedrooms, or date are understood.',
      parameters: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description: 'What to find, e.g. "3 bed houses under £500k with a garden"',
          },
        },
        required: ['query'],
      },
    },
  },
] as const

const TRAWLEY_SCRAPER_ID = 'scr_8f2a1c9e'

async function runSearch(query: string) {
  const params = new URLSearchParams({ search: query, take: '10' })
  const res = await fetch(
    `https://api.trawley.ai/v1/scrapers/${TRAWLEY_SCRAPER_ID}/hybrid?${params}`,
  )
  const { data } = await res.json()
  return data
}
```

## Run the call loop

```ts
import OpenAI from 'openai'

const client = new OpenAI()

const messages: OpenAI.ChatCompletionMessageParam[] = [
  { role: 'user', content: 'Find a 3 bed house near Kendal with a garden under £500k.' },
]

while (true) {
  const completion = await client.chat.completions.create({
    model: 'gpt-4.1',
    messages,
    tools,
  })

  const message = completion.choices[0].message
  messages.push(message)

  // No tool calls means the model has answered in plain text.
  if (!message.tool_calls?.length) {
    console.log(message.content)
    break
  }

  // Run every requested call and return each result under its own ID.
  for (const call of message.tool_calls) {
    const { query } = JSON.parse(call.function.arguments)
    const results = await runSearch(query)
    messages.push({
      role: 'tool',
      tool_call_id: call.id,
      content: JSON.stringify(results),
    })
  }
}
```

::callout{type="note"}
After running a function, push a `role: 'tool'` message with the matching `tool_call_id`. The model reads the result and continues until it can answer in plain text.
::

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/developers/tools-overview
icon: plug
title: The pattern
---
One tool, natural language in, structured records out.
:::

:::card
---
href: https://trawley.ai/docs/developers/hybrid-search
icon: search
title: Hybrid search
---
Parameters and response shape.
:::
::

# Quickstart

Trawley turns any website into a natural language search API. You build a scraper once in the dashboard, and Trawley keeps it indexed so your code can ask questions like "3 bed houses under £500k with parking" and get structured JSON back.

This quickstart gets you from a finished scraper to your first API response.

::callout{type="info"}
You need a scraper that has run at least once. If you have not built one yet, start with the [no-code setup wizard](https://trawley.ai/docs/users/the-setup-wizard), then come back here.
::

## Steps

::steps
:::step{title="Grab your scraper ID"}
Open your scraper in the dashboard at `app.trawley.ai`. The ID is the last segment of the URL:

```text
https://app.trawley.ai/scrapers/scr_8f2a1c9e
                                ^^^^^^^^^^^^
```
:::

:::step{title="Call the search endpoint"}
The recommended endpoint is **hybrid search**. Pass a natural language query in the `search` parameter:

::::code-group
```bash [cURL]
curl "https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/hybrid?search=3+bed+houses+with+a+garden"
```

```js [JavaScript]
const res = await fetch(
  'https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/hybrid?' +
    new URLSearchParams({ search: '3 bed houses with a garden' }),
)
const { data, pagination } = await res.json()
```

```python [Python]
import requests

res = requests.get(
    "https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/hybrid",
    params={"search": "3 bed houses with a garden"},
)
data = res.json()["data"]
```
::::
:::

:::step{title="Read the results"}
You get back structured records matching the query, plus pagination metadata:

```json
{
  "data": [
    { "title": "3 bed semi in Kendal", "price": 425000, "bedrooms": 3 }
  ],
  "pagination": { "total": 18, "page": 1, "take": 10, "totalPages": 2, "hasMore": true },
  "meta": { "filter": "bedrooms = 3" }
}
```
:::
::

That is the whole loop: one GET request, structured results, no HTML parsing.

::callout{type="warning"}
The API is currently reached by scraper ID and does not yet require an API key. Authentication is coming. See the [API overview](https://trawley.ai/docs/developers/api-overview) for what that means for you today.
:: ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/developers/hybrid-search icon: search title: Hybrid search reference --- Every query parameter, the full response shape, and how `meta.filter` works. ::: :::card --- href: https://trawley.ai/docs/developers/tools-overview icon: sparkles title: Wire it into an agent --- Expose Trawley as a tool your LLM can call. Start with the Vercel AI SDK. ::: :: # Rate limits & quotas Trawley meters API usage against your team's plan. Each plan includes a monthly allowance of API calls, counted per billing period. ## How metering works - **Search calls** (the [hybrid endpoint](https://trawley.ai/docs/developers/hybrid-search)) count against your monthly search allowance. - **Results calls** (the [results endpoint](https://trawley.ai/docs/developers/results-endpoint)) count against a separate monthly results allowance. - **Exports** are counted as a fractional call, so bulk downloads do not burn a full request each. Allowances reset at the start of each billing period. ## Hitting a limit When you exceed an allowance, the API responds with `429 Too Many Requests` and a message naming the limit: ```json { "statusCode": 429, "statusMessage": "Monthly search API limit reached (1000). Upgrade your plan for more calls." } ``` ::callout{type="tip"} Handle `429` by backing off until the next billing period or upgrading the plan. In an agent, catch it and return a graceful "search is temporarily unavailable" message rather than failing the whole turn. :: ## Raising your limits Limits scale with your plan. See [billing and plans](https://trawley.ai/docs/users/billing-and-plans) for current allowances and how to upgrade. You can view current usage against your allowance in the dashboard. ## What's next ::card-group{cols="2"} :::card --- href: https://trawley.ai/docs/developers/error-responses icon: bug title: Error responses --- The full set of status codes the API returns. 
:::

:::card
---
href: https://trawley.ai/docs/users/billing-and-plans
icon: gauge
title: Plans & billing
---
Allowances per plan and how to upgrade.
:::
::

# Results endpoint

The results endpoint returns a scraper's records directly, with pagination and sorting but no natural language interpretation. Reach for it when you want raw data rather than relevance ranking. For "answer this question" style access, use [hybrid search](https://trawley.ai/docs/developers/hybrid-search) instead.

## Request

```text
GET https://api.trawley.ai/v1/scrapers/{scraperId}/results
```

### Query parameters

| Parameter | Type   | Default | Description                 |
| --------- | ------ | ------- | --------------------------- |
| `page`    | number | `1`     | Page of results to return.  |
| `take`    | number | `10`    | Number of results per page. |

::code-group
```bash [cURL]
curl "https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/results?page=1&take=50"
```

```js [JavaScript]
const params = new URLSearchParams({ page: '1', take: '50' })
const res = await fetch(
  `https://api.trawley.ai/v1/scrapers/scr_8f2a1c9e/results?${params}`,
)
const { data, pagination } = await res.json()
```
::

## Response

```json
{
  "data": [
    { "title": "3 bed semi-detached house", "price": 425000, "bedrooms": 3 }
  ],
  "pagination": { "total": 240, "page": 1, "take": 50, "totalPages": 5, "hasMore": true }
}
```

Records contain the fields you defined on the scraper, drawn from its most recent completed run.

::callout{type="tip"}
To pull every record, page until `pagination.hasMore` is `false`. For a one-shot download of all results, the [export endpoint](https://trawley.ai/docs/developers/export-endpoint) does this for you and streams a file.
::

## Rate limits

The results endpoint is metered separately from search. Each plan has a monthly results call allowance; exceeding it returns `429`. See [rate limits and quotas](https://trawley.ai/docs/developers/rate-limits-and-quotas).
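
Because every page fetched is one metered results call, it can help to see the "page until `pagination.hasMore` is `false`" loop written out. The following is a minimal Python sketch, not an official client: the helper names `fetch_page` and `iter_records`, the injectable `fetch` argument, and the `max_pages` safety cap are all assumptions made for illustration. Only the URL shape, the `page` and `take` parameters, and the `data` and `pagination.hasMore` fields come from this page.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.trawley.ai/v1/scrapers"


def fetch_page(scraper_id, page, take):
    """Fetch one page from the results endpoint and return the parsed JSON body."""
    query = urllib.parse.urlencode({"page": page, "take": take})
    url = f"{BASE_URL}/{scraper_id}/results?{query}"
    # urlopen raises urllib.error.HTTPError for non-2xx responses, including 429.
    with urllib.request.urlopen(url) as res:
        return json.load(res)


def iter_records(scraper_id, take=50, fetch=fetch_page, max_pages=1000):
    """Yield every record, requesting pages until pagination.hasMore is false.

    max_pages is a hypothetical safety cap so a bad response cannot loop forever.
    """
    for page in range(1, max_pages + 1):
        body = fetch(scraper_id, page, take)
        yield from body["data"]
        if not body["pagination"]["hasMore"]:
            return


# Example use (makes real, metered requests):
# for record in iter_records("scr_8f2a1c9e"):
#     print(record)
```

A larger `take` means fewer calls for the same number of records, which matters when each call counts against the monthly results allowance. If the allowance runs out mid-loop, the `429` surfaces as an `HTTPError` from `fetch_page`, so callers can catch it and stop cleanly.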
## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/developers/hybrid-search
icon: search
title: Natural language search
---
Let users ask in plain language instead of paging raw data.
:::

:::card
---
href: https://trawley.ai/docs/developers/export-endpoint
icon: file-text
title: Bulk export
---
Download every record as JSON or CSV in one call.
:::
::

# Tools overview

Trawley is built to be an agent's data source. An LLM is good at understanding a user's intent but has no live, structured view of a website. Trawley gives it one: a single function the model can call with a natural language query and get back clean, structured records.

## The pattern

Every integration in this section is the same shape. You expose one tool to your model:

::steps
:::step{title="Define a tool"}
Give it a name like `searchListings`, a description of what data it returns, and a single `query` input. The model decides when to call it.
:::

:::step{title="Call hybrid search"}
Inside the tool, make a GET request to the [hybrid endpoint](https://trawley.ai/docs/developers/hybrid-search) with the model's query.
:::

:::step{title="Return the results"}
Hand the `data` array back to the model. It uses the records to answer the user, cite specifics, or take the next step.
:::
::

That is the entire integration. The model handles phrasing and reasoning; Trawley handles retrieval.

## Why this works well

- **One tool, not ten.** The model does not need a tool per filter. Hybrid search interprets constraints from plain language, so a single `query` input covers "cheap flats", "3 beds near Kendal", and "listings added this week".
- **Structured output.** The model receives typed fields, not a wall of HTML, so it can reason about prices and dates reliably.
- **Fresh data.** Results come from your scheduled scrapes, so the agent answers from current data without you maintaining a pipeline.

## A reusable description

Tool descriptions matter more than tool code.
A good one tells the model exactly what is behind the tool so it calls it at the right moments:

```text
Search live {your domain} data from {source site}. Accepts a natural language query describing what to find (constraints like price, bedrooms, or date are understood). Returns structured records. Use this whenever the user asks about specific listings, availability, or current data.
```

## Pick your framework

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/developers/vercel-ai-sdk
icon: sparkles
title: Vercel AI SDK
---
TypeScript. Wrap hybrid search as a `tool()` and let the model call it.
:::

:::card{icon="puzzle" title="More frameworks"}
Claude Agent SDK, OpenAI function calling, and LangChain follow the same pattern. Guides are being added here.
:::
::

# Vercel AI SDK

This guide wires a Trawley scraper into an agent built with the [Vercel AI SDK](https://sdk.vercel.ai){rel="nofollow"}. The model gets one tool, `searchListings`, that calls Trawley's [hybrid search](https://trawley.ai/docs/developers/hybrid-search) endpoint. When a user asks a question about your data, the model calls the tool, reads the structured results, and answers.

## Prerequisites

- A Trawley scraper that has completed at least one run, and its scraper ID.
- An AI SDK project with a model provider configured (this example uses `@ai-sdk/openai`, but any provider works).

```bash
npm install ai @ai-sdk/openai zod
```

## Define the tool

The tool has one input, `query`, and calls the hybrid endpoint inside `execute`.

```ts
import { tool } from 'ai'
import { z } from 'zod'

const TRAWLEY_SCRAPER_ID = 'scr_8f2a1c9e'

export const searchListings = tool({
  description:
    'Search live property listings from acmehomes.co.uk. Accepts a natural ' +
    'language query; constraints like price, bedrooms, or date are understood. ' +
    'Returns structured records. Use whenever the user asks about specific ' +
    'listings, availability, or current data.',
  inputSchema: z.object({
    query: z
      .string()
      .describe('A natural language description of what to find, e.g. "3 bed houses under £500k with a garden"'),
  }),
  execute: async ({ query }) => {
    const params = new URLSearchParams({ search: query, take: '10' })
    const res = await fetch(
      `https://api.trawley.ai/v1/scrapers/${TRAWLEY_SCRAPER_ID}/hybrid?${params}`,
    )
    if (!res.ok) {
      return { error: `Search failed with status ${res.status}` }
    }
    const { data } = await res.json()
    return { results: data }
  },
})
```

::callout{type="note"}
The AI SDK uses `inputSchema` (not `parameters`) to describe a tool's input. The `describe()` text on each field is sent to the model, so spend a sentence making it clear.
::

## Give it to the model

Pass the tool to `generateText` and let the model run. `stopWhen: stepCountIs(5)` lets the model call the tool and then continue to a final answer in the same call.

```ts
import { generateText, stepCountIs } from 'ai'
import { openai } from '@ai-sdk/openai'
import { searchListings } from './search-listings'

const { text } = await generateText({
  model: openai('gpt-4.1'),
  tools: { searchListings },
  stopWhen: stepCountIs(5),
  prompt: 'Find me a 3 bedroom house near Kendal with a garden under £500k.',
})

console.log(text)
```

## What happens

::steps
:::step{title="The model reads the request"}
It sees the `searchListings` tool and decides the question needs live data.
:::

:::step{title="It calls the tool"}
The model generates a `query` argument such as `"3 bedroom house Kendal garden under 500000"`. Your `execute` function calls hybrid search.
:::

:::step{title="Trawley returns structured records"}
The `data` array comes back as the tool result and is handed to the model.
:::

:::step{title="The model answers"}
It uses the records to write a grounded reply, citing real prices and locations.
:::
::

## Streaming to a UI

For a chat interface, swap `generateText` for `streamText` and return a UI message stream. The tool definition is identical.

```ts
import { streamText, stepCountIs } from 'ai'
import { openai } from '@ai-sdk/openai'
import { searchListings } from './search-listings'

const result = streamText({
  model: openai('gpt-4.1'),
  tools: { searchListings },
  stopWhen: stepCountIs(5),
  messages,
})

return result.toUIMessageStreamResponse()
```

## Tips

- **Return less, not more.** Trim each record to the fields the model needs. Smaller tool results mean cheaper, faster, more focused answers.
- **Let hybrid search do the filtering.** Do not add a tool input per filter. One `query` string covers price, bedrooms, dates, and location because the endpoint interprets them. See [tools overview](https://trawley.ai/docs/developers/tools-overview).
- **Handle the empty case.** If a scraper has no completed run, `data` is empty. Returning `{ results: [] }` lets the model tell the user nothing was found.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/developers/hybrid-search
icon: search
title: Hybrid search reference
---
Every parameter, the response shape, and `meta.filter` debugging.
:::

:::card
---
href: https://trawley.ai/docs/developers/tools-overview
icon: plug
title: The integration pattern
---
The shape every framework guide follows.
:::
::

# ChatGPT custom GPT

You can build a custom GPT that searches your Trawley scraper's live data and answers in plain language. This needs no code: ChatGPT calls Trawley's [hybrid search](https://trawley.ai/docs/developers/hybrid-search) endpoint through a GPT Action.

::callout{type="info"}
Custom GPTs require a ChatGPT Plus, Team, or Enterprise plan. The steps below use the GPT Builder's **Actions** feature.
::

## Steps

::steps
:::step{title="Get your scraper ID"}
Open your scraper in the dashboard. The ID is the last part of the URL, for example `scr_8f2a1c9e`.
:::

:::step{title="Create a new GPT"}
In ChatGPT, go to **Explore GPTs → Create**, then open the **Configure** tab.
:::

:::step{title="Add an Action"}
Under **Actions**, add a new action and paste the schema below, replacing the scraper ID with yours.
:::

:::step{title="Test it"}
Ask your GPT something like "find 3 bed houses with a garden". It calls the action and answers from the live results.
:::
::

## The action schema

This OpenAPI schema describes the hybrid endpoint to ChatGPT. Replace `scr_8f2a1c9e` with your scraper ID.

```yaml
openapi: 3.1.0
info:
  title: Trawley scraper search
  version: 1.0.0
servers:
  - url: https://api.trawley.ai
paths:
  /v1/scrapers/scr_8f2a1c9e/hybrid:
    get:
      operationId: searchListings
      summary: Search the scraper's results in natural language.
      parameters:
        - name: search
          in: query
          required: true
          description: A natural language query describing what to find.
          schema:
            type: string
        - name: take
          in: query
          required: false
          description: How many results to return.
          schema:
            type: integer
            default: 10
      responses:
        "200":
          description: Matching results.
          content:
            application/json:
              schema:
                type: object
```

::callout{type="warning"}
The hybrid endpoint currently has no authentication, so anyone you share the GPT with can search that scraper's data. Only share GPTs for scrapers whose data you are comfortable exposing. See [authentication](https://trawley.ai/docs/developers/authentication) for what is changing.
::

## Give it instructions

In the GPT's **Instructions**, tell it how to behave:

```text
You help people search live listings from acmehomes.co.uk. When asked about properties, call the searchListings action with the user's request as the search query. Summarise the results clearly, including price and location. If nothing matches, say so.
```

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/developers/hybrid-search
icon: search
title: How hybrid search works
---
What the action is calling under the hood.
:::

:::card
---
href: https://trawley.ai/docs/developers/vercel-ai-sdk
icon: sparkles
title: Build a real integration
---
For a product, wrap the endpoint in your own agent instead.
:::
::

# Grounding an assistant

"Grounding" means giving an assistant the real source material so its answers come from Trawley's actual docs rather than guesswork. Here are the ways to do it, from quickest to most thorough.

## One page, one click

On any doc page, use the header control:

::steps
:::step{title="Open the page you care about"}
For example, the [hybrid search reference](https://trawley.ai/docs/developers/hybrid-search).
:::

:::step{title="Use Open in Claude or Open in ChatGPT"}
The **Copy page** dropdown opens your assistant with a prompt that points it at the page. The assistant reads the page and you can ask questions about it.
:::
::

## A whole topic

To brief an assistant on a broader area, paste the relevant section of [`/llms.txt`](https://trawley.ai/llms.txt){rel="nofollow"} into the chat and ask it to fetch the pages it needs. The index is grouped by audience, so you can hand it just the Developers section, for example.

## Everything at once

For deep questions, give the assistant the full corpus:

```text
Here is the complete Trawley documentation:
https://trawley.ai/llms-full.txt

Using only this, help me {your task}.
```

::callout{type="tip"}
Grounding works best when you tell the assistant to answer **only** from the provided material and to say when something is not covered. That keeps it from filling gaps with invented behaviour.
::

## Keeping answers current

Because `llms.txt` and `llms-full.txt` regenerate from the docs on every build, an assistant you ground today reads the current state of Trawley. Re-fetch the artifact at the start of a session to pick up the latest.

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/agents/llms-txt
icon: file-text
title: The artifacts
---
What llms.txt and llms-full.txt contain.
:::

:::card
---
href: https://trawley.ai/docs/agents/chatgpt-custom-gpt
icon: bot
title: Search live data
---
Let an assistant query your scraper, not just the docs.
:::
::

# llms.txt and friends

Trawley publishes its documentation in formats built for language models, so an agent can read about Trawley without scraping HTML.

## The two artifacts

Both live at the site root, following the convention used by Anthropic, Stripe, and Vercel:

| Artifact | What it contains |
| -------- | ---------------- |
| [`/llms.txt`](https://trawley.ai/llms.txt){rel="nofollow"} | A curated index of every doc page, segmented by audience (Users, Developers, Agents), with titles, links, and short descriptions. |
| [`/llms-full.txt`](https://trawley.ai/llms-full.txt){rel="nofollow"} | The full markdown of every page concatenated into one file. |

Use `llms.txt` to discover what exists and fetch only the pages you need. Use `llms-full.txt` when you want the entire corpus in a single request.

```bash
curl https://trawley.ai/llms.txt
curl https://trawley.ai/llms-full.txt
```

::callout{type="info"}
These files are generated from the docs at build time. Every new page added to the documentation appears in them automatically, so they never go stale.
::

## Per-page actions

Every documentation page has a **Copy page** control in its header:

- **Copy page** copies a short markdown summary of the page, with a link back to the full version, ready to paste into any assistant.
- **Open in Claude** and **Open in ChatGPT** (in the dropdown) open that assistant with a prompt that points it at the current page so it can read and answer from it.

This makes it a one-click action to bring any Trawley doc into a chat.
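
For an agent that follows the "discover with `llms.txt`, then fetch only what you need" flow described under the two artifacts, a rough Python sketch might look like the following. It rests on an assumption this page does not confirm: that each audience section in `llms.txt` starts with a `## ` heading (for example `## Developers`) and lists pages as markdown links. The helper names `links_in_section` and `fetch_text` are hypothetical, so check the real file and adjust the parsing to match.

```python
import re
import urllib.request

# Matches markdown links of the form [Title](https://example.com/page).
LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def links_in_section(index_text, section):
    """Return (title, url) pairs listed under one '## <section>' heading.

    Assumes audience sections are level-2 markdown headings, which is an
    assumption about the llms.txt layout rather than a documented guarantee.
    """
    found, in_section = [], False
    for line in index_text.splitlines():
        if line.startswith("## "):
            in_section = line[3:].strip().lower() == section.lower()
        elif in_section:
            found.extend(LINK.findall(line))
    return found


def fetch_text(url):
    """Download a URL and decode the body as UTF-8 text."""
    with urllib.request.urlopen(url) as res:
        return res.read().decode("utf-8")


# Example use (makes a real request):
# index = fetch_text("https://trawley.ai/llms.txt")
# for title, url in links_in_section(index, "Developers"):
#     print(title, url)
```

Filtering to a single section keeps the context an agent loads small. When the whole corpus is wanted, fetching `llms-full.txt` in one request is the simpler route.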
## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/agents/grounding-an-assistant
icon: wand-2
title: Ground an assistant
---
Practical ways to feed Trawley docs into a chat.
:::

:::card
---
href: https://trawley.ai/docs/agents/chatgpt-custom-gpt
icon: bot
title: Query live data
---
Go beyond docs and search your scraper from an AI tool.
:::
::

# Quickstart

This tree is for two audiences:

- **AI agents** that want to read Trawley's documentation in a clean, machine-friendly form to ground their answers.
- **People** who want to point a personal AI tool (like ChatGPT) at the data a Trawley scraper collects, without writing code.

If you are a developer building Trawley into your own product, the [Developers tree](https://trawley.ai/docs/developers/quickstart) is where the API reference and integration guides live.

## For agents: read the docs as text

Every Trawley doc page is available as clean markdown, and the whole site is indexed in two artifacts at the site root:

- `https://trawley.ai/llms.txt`: a curated index of every page.
- `https://trawley.ai/llms-full.txt`: the full text of every page in one file.

See [llms.txt and friends](https://trawley.ai/docs/agents/llms-txt) for how to use them.

## For people: connect your AI tool

You can give a ChatGPT custom GPT the ability to search your scraper's live data, no code required. See [ChatGPT custom GPT](https://trawley.ai/docs/agents/chatgpt-custom-gpt).

## What's next

::card-group{cols="2"}
:::card
---
href: https://trawley.ai/docs/agents/llms-txt
icon: file-text
title: LLM-ready artifacts
---
The llms.txt index and per-page markdown.
:::

:::card
---
href: https://trawley.ai/docs/agents/chatgpt-custom-gpt
icon: bot
title: Connect ChatGPT
---
Point a custom GPT at your scraper.
:::
::

# Docs site shipping

We shipped the first cut of `trawley.ai/docs` covering all three audiences: Users (no-code wizard), Developers (API/SDK), and Agents (LLM-consumable artifacts + AI-tool recipes).
The site uses Nuxt Content v3 with build-time SQLite (no runtime DB), ships a Mintlify-style component vocabulary, and emits both `/llms.txt` and `/llms-full.txt` at the site root for AI agents to consume.

# Documentation content

The docs site is no longer just scaffolding. All three trees now have real content:

- **Users**: a full no-code path from the setup wizard through scheduling, results, search, export, scraper settings, teams, and billing.
- **Developers**: the public `/v1` search API reference (hybrid, results, export, diff, chat), plus integration guides for wiring Trawley into an agent with the Vercel AI SDK, the Anthropic SDK, OpenAI function calling, and LangChain.
- **Agents**: the `llms.txt` artifacts and a no-code recipe for pointing a ChatGPT custom GPT at a scraper's live data.

Pages document what is shipped today and flag what is coming (API authentication), so nothing reads as available before it is.