Exa

Connect to Exa for AI-powered semantic web search, content enrichment, finding similar pages, website crawling, direct answers, structured research, and large-scale URL discovery.

Supports authentication: API Key

Set up the agent connector

Register your Exa API key with Scalekit so it can authenticate and proxy requests on behalf of your users. Unlike OAuth connectors, Exa uses API key authentication — there is no redirect URI or OAuth flow.

Generate an Exa API key
- Sign in to dashboard.exa.ai/api-keys. Under Management, click API Keys.
- Click + Create Key, enter a name (e.g., Agent Auth), and confirm.
- In the Secret Key column, click the eye icon to reveal the key and copy it. Store it somewhere safe — you will not be able to view it again.
Create a connection in Scalekit
- In Scalekit dashboard, go to Agent Auth → Create Connection. Find Exa and click Create.
- Note the Connection name — you will use this as connection_name in your code (e.g., exa).
Add a connected account

Connected accounts link a specific user identifier in your system to an Exa API key. You can add them via the dashboard for testing, or via the Scalekit API in production.

Via dashboard (for testing)
- Open the connection you created and click the Connected Accounts tab → Add account.
- Fill in:
  - Your User’s ID — a unique identifier for this user in your system (e.g., user_123)
  - API Key — the Exa API key you copied in step 1
- Click Save.
Via API (for production)
- Python
1 scalekit_client.actions.upsert_connected_account( 2 connection_name="exa", 3 identifier="user_123", # your user's unique ID 4 credentials={"api_key": "your-exa-api-key"} 5 )
In production, call upsert_connected_account when a user connects their Exa account — for example, after they paste their API key into a settings page in your app.

Usage

Once a connected account is set up, make API calls through the Scalekit proxy. Scalekit injects the Exa API key automatically — you never handle credentials in your application code.

Python

1
import scalekit.client, os
2
from dotenv import load_dotenv
3
load_dotenv()
4

5
connection_name = "exa"    # connection name from your Scalekit dashboard
6
identifier = "user_123"    # your user's unique identifier
7

8
# Get your credentials from app.scalekit.com → Developers → Settings → API Credentials
9
scalekit_client = scalekit.client.ScalekitClient(
10
    client_id=os.getenv("SCALEKIT_CLIENT_ID"),
11
    client_secret=os.getenv("SCALEKIT_CLIENT_SECRET"),
12
    env_url=os.getenv("SCALEKIT_ENV_URL"),
13
)
14
actions = scalekit_client.actions
15

16
# Make a request via Scalekit proxy — no API key needed here
17
result = actions.request(
18
    connection_name=connection_name,
19
    identifier=identifier,
20
    path="/search",
21
    method="POST",
22
    json={"query": "LLM observability tools 2025", "num_results": 5}
23
)
24
print(result)

Tool list

`exa_search`

Search the web with a semantic or keyword query. Returns ranked results with titles, URLs, published dates, and optional full-page content. Semantic search (type: neural) finds conceptually similar pages even when your exact words don’t appear in the source. Keyword search (type: keyword) uses traditional exact-match scoring.

Credits: 1 credit per request + 1 credit per result when contents.text or contents.summary is requested.

Name	Type	Required	Description
`query`	string	Yes	Search query — phrase it as a statement or question for best semantic results
`num_results`	integer	No	Number of results to return (default `10`, max `100`)
`type`	string	No	Search mode: `neural` for semantic similarity (default), `keyword` for exact-match
`category`	string	No	Filter by content type: `company`, `research paper`, `news`, `github`, `tweet`, `personal site`, `pdf`, `linkedin profile`
`include_domains`	array	No	Restrict results to these domains (e.g., `["techcrunch.com", "arxiv.org"]`)
`exclude_domains`	array	No	Exclude results from these domains
`start_published_date`	string	No	ISO 8601 date — only return pages published after this date (e.g., `2025-01-01`)
`end_published_date`	string	No	ISO 8601 date — only return pages published before this date
`start_crawl_date`	string	No	ISO 8601 date — only return pages crawled by Exa after this date
`end_crawl_date`	string	No	ISO 8601 date — only return pages crawled by Exa before this date
`include_text`	array	No	Strings that must appear in the page text
`exclude_text`	array	No	Strings that must not appear in the page text
`location`	string	No	ISO country code for location-biased results (e.g., `US`, `GB`, `DE`)
`contents.text`	boolean	No	Include full page text in each result (costs 1 extra credit per result)
`contents.highlights`	object	No	Include highlighted snippets — set `num_sentences` and `highlights_per_url`
`contents.summary`	object	No	Include an AI-generated page summary — optionally set `query` to focus the summary

`exa_find_similar`

Find web pages that are semantically similar to a given URL. Exa uses its neural index to surface pages with matching content, tone, and topic — useful for competitive research, discovering similar products, or finding alternative sources on a topic.

Credits: 1 credit per request + 1 credit per result when contents.text or contents.summary is requested.

Name	Type	Required	Description
`url`	string	Yes	URL to find similar pages for (e.g., `https://example.com/product`)
`num_results`	integer	No	Number of results to return (default `10`, max `100`)
`include_domains`	array	No	Restrict results to these domains (e.g., `["techcrunch.com", "wired.com"]`)
`exclude_domains`	array	No	Exclude results from these domains
`start_published_date`	string	No	ISO 8601 date — only return pages published after this date
`end_published_date`	string	No	ISO 8601 date — only return pages published before this date
`start_crawl_date`	string	No	ISO 8601 date — only return pages crawled by Exa after this date
`end_crawl_date`	string	No	ISO 8601 date — only return pages crawled by Exa before this date
`contents.text`	boolean	No	Include full page text in each result (costs 1 extra credit per result)
`contents.highlights`	object	No	Include highlighted snippets — set `num_sentences` and `highlights_per_url`
`contents.summary`	object	No	Include an AI-generated page summary — optionally set `query` to focus the summary

`exa_crawl`

Crawl all pages of a website starting from a given URL and extract their text content. Follows internal links up to the configured depth. Use this to index documentation sites, company knowledge bases, or competitor content for downstream processing.

Credits: Credits vary based on the number of pages crawled and content extracted.

Name	Type	Required	Description
`url`	string	Yes	Starting URL to crawl (e.g., `https://docs.example.com`)
`max_depth`	integer	No	Maximum number of link hops to follow from the starting URL (default `1`)
`max_pages`	integer	No	Maximum number of pages to retrieve (default `100`) — set this to control credit usage
`include_subdomains`	boolean	No	Whether to follow links to subdomains of the starting URL (default `false`)
`exclude_paths`	array	No	URL path patterns to skip (e.g., `["/archive/", "/blog/"]`)

`exa_get_contents`

Retrieve enriched content — full text, highlighted snippets, or AI-generated summaries — for one or more specific URLs without running a search. Use this when you already have URLs (from a prior search, a CRM, or an external list) and want to extract structured content from them.

Credits: 1 credit per URL + 1 credit per content item retrieved (text, highlights, or summary each count separately).

Name	Type	Required	Description
`urls`	array	Yes*	List of URLs to retrieve content for (e.g., `["https://example.com/about"]`)
`ids`	array	Yes*	List of Exa result IDs from a prior `exa_search` or `exa_find_similar` call — use instead of `urls` when you have Exa IDs
`text`	boolean	No	Include full page text for each URL (costs 1 extra credit per page)
`text.max_characters`	integer	No	Truncate page text to this many characters
`text.include_html_tags`	boolean	No	Preserve HTML tags in the returned text — helps LLMs interpret page structure
`highlights`	object	No	Extract relevant text snippets from each page
`highlights.query`	string	No	Custom query to guide which snippets are highlighted
`summary`	object	No	Generate an AI summary for each page (costs 1 extra credit per page)
`summary.query`	string	No	Focus the summary on a specific question or topic
`summary.schema`	object	No	JSON schema for structured summary output
`subpages`	integer	No	Number of subpages to also crawl from each URL (default `0`)
`subpage_target`	string	No	Keyword or path pattern to target specific subpages (e.g., `"pricing"`)
`max_age_hours`	integer	No	Maximum age of cached content in hours — set a lower value to force a fresher crawl

*Provide either urls or ids — at least one is required.

`exa_answer`

Get a concise natural language answer to a question, synthesized from live web search results. Returns the answer text and a list of source URLs used to generate it. Use this when you need a direct factual response rather than a list of links.

Credits: Credits vary based on the number of sources retrieved.

Name	Type	Required	Description
`query`	string	Yes	The question to answer (e.g., `"What are the rate limits for GPT-4o?"`)
`num_results`	integer	No	Number of web sources used to generate the answer (default `5`)
`text`	boolean	No	Include the source text snippets alongside the answer
`include_domains`	array	No	Restrict source lookup to these domains
`exclude_domains`	array	No	Exclude these domains from the source lookup
`start_published_date`	string	No	ISO 8601 date — only use sources published after this date
`end_published_date`	string	No	ISO 8601 date — only use sources published before this date

`exa_research`

Run in-depth, multi-angle research on a topic. Exa decomposes the topic into multiple sub-queries, runs each in parallel, and synthesizes a structured output. Optionally provide a JSON schema to control the output shape. Use this for competitive intelligence, market research, or any question that requires aggregating information across many sources.

Credits: Significantly more expensive than exa_search — cost scales with num_subqueries. Each sub-query consumes search and content retrieval credits.

Name	Type	Required	Description
`topic`	string	Yes	Research topic or question (e.g., `"Competitive landscape of AI coding assistants in 2025"`)
`num_subqueries`	integer	No	Number of parallel sub-queries to run (default `5`, max `15`) — higher values improve coverage but increase cost
`output_schema`	object	No	JSON schema defining the structure of the output — omit for free-form markdown output
`include_domains`	array	No	Restrict all sub-query sources to these domains

`exa_websets`

Execute a complex natural language query designed to return a large set of matching URLs — potentially thousands. Each result is vetted against your criteria before being returned. Use this for bulk data collection tasks such as building lead lists, aggregating industry news, or sourcing research datasets.

Credits: Credits are consumed per vetted result. Cost scales with count.

Name	Type	Required	Description
`query`	string	Yes	Natural language description of the set to find (e.g., `"YC-backed AI startups founded after 2022"`)
`count`	integer	No	Target number of matching URLs to return (default `10`, can be thousands)
`entity_type`	string	No	Type of entity to find: `company`, `person`, `article`, `research paper`, `job listing`, `event`, `product`
`criteria`	array	No	Additional filter criteria — each item has a `description` and `options.type` (`must`, `should`, or `must_not`)
`include_domains`	array	No	Restrict results to these domains (e.g., `["linkedin.com", "crunchbase.com"]`)

Exa

Set up the agent connector

Generate an Exa API key

Create a connection in Scalekit

Add a connected account

Usage

Tool list

exa_search

exa_find_similar

exa_crawl

exa_get_contents

exa_answer

exa_research

exa_websets