extract() pulls structured data from the current page according to a schema you define. Given an instruction and a schema, it returns data that matches that schema.
For TypeScript, the extract schemas are defined using zod schemas. For Python, the extract schemas are defined using pydantic models.
Extract a single object
Here is how an extract call might look for a single object:
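A minimal sketch of such a call, under stated assumptions: the option names `instruction` and `schema`, the `ExtractPage` interface (a local stand-in for a Stagehand page so the snippet type-checks without a browser), and the item fields are all illustrative rather than taken from this page.

```typescript
import { z } from "zod";

// Illustrative schema for one object; the field names are made up.
const itemSchema = z.object({
  name: z.string(),
  price: z.number(),
});
type Item = z.infer<typeof itemSchema>;

// Hypothetical stand-in for a Stagehand page, declared locally so the
// sketch type-checks without launching a browser.
interface ExtractPage {
  extract(options: {
    instruction: string;
    schema: typeof itemSchema;
  }): Promise<Item>;
}

// Assumes the options are named `instruction` and `schema`.
async function extractItem(page: ExtractPage): Promise<Item> {
  return page.extract({
    instruction: "extract the name and price of the item on the page",
    schema: itemSchema,
  });
}
```

The resolved value has the shape described by `itemSchema`, so `item.name` is typed as a string and `item.price` as a number.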
Extract a link
To extract links or URLs, in the TypeScript version of Stagehand, you’ll need to define the relevant field as z.string().url(). In Python, you’ll need to define it as HttpUrl. Here is how an extract call might look for extracting a link or URL:
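As a sketch of the TypeScript case, the field below is declared with `z.string().url()` as described above. The `ExtractPage` interface, the `link` field name, the instruction text, and the option names `instruction` and `schema` are assumptions made for illustration.

```typescript
import { z } from "zod";

// The link field uses z.string().url(), so only well-formed URLs validate.
const contactSchema = z.object({
  link: z.string().url(),
});
type ContactLink = z.infer<typeof contactSchema>;

// Hypothetical stand-in for a Stagehand page (no browser needed to type-check).
interface ExtractPage {
  extract(options: {
    instruction: string;
    schema: typeof contactSchema;
  }): Promise<ContactLink>;
}

// Assumes the options are named `instruction` and `schema`.
async function extractContactLink(page: ExtractPage): Promise<ContactLink> {
  return page.extract({
    instruction: "extract the link to the 'contact us' page",
    schema: contactSchema,
  });
}
```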
Extract a list of objects
Here is how an extract call might look for a list of objects.
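One way to sketch this: because the schema type is constrained to a zod object, the list is wrapped in a top-level object with an array field. The `apartments` field, its item fields, the `ExtractPage` interface, and the option names `instruction` and `schema` are hypothetical.

```typescript
import { z } from "zod";

// The top-level schema stays an object; the list lives in an array field.
const apartmentListSchema = z.object({
  apartments: z.array(
    z.object({
      address: z.string(),
      price: z.string(),
    }),
  ),
});
type ApartmentList = z.infer<typeof apartmentListSchema>;

// Hypothetical stand-in for a Stagehand page (no browser needed to type-check).
interface ExtractPage {
  extract(options: {
    instruction: string;
    schema: typeof apartmentListSchema;
  }): Promise<ApartmentList>;
}

// Assumes the options are named `instruction` and `schema`.
async function extractApartments(page: ExtractPage): Promise<ApartmentList> {
  return page.extract({
    instruction: "extract all apartment listings with their address and price",
    schema: apartmentListSchema,
  });
}
```

The caller would then iterate over `result.apartments` like any typed array.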
Extract with additional context
You can provide additional context to your schema to help the model extract the data more accurately.
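A sketch of one way to attach such context in TypeScript, assuming it is supplied through zod's `.describe()` on each field (this page does not name the mechanism). The field names, descriptions, `ExtractPage` interface, and option names are illustrative.

```typescript
import { z } from "zod";

// Each field carries a description that can serve as extra context.
const apartmentSchema = z.object({
  address: z.string().describe("the street address of the apartment"),
  price: z.string().describe("the monthly rent, including the currency symbol"),
  squareFeet: z.string().describe("the floor area in square feet"),
});

const listingsSchema = z.object({
  apartments: z.array(apartmentSchema),
});
type Listings = z.infer<typeof listingsSchema>;

// Hypothetical stand-in for a Stagehand page (no browser needed to type-check).
interface ExtractPage {
  extract(options: {
    instruction: string;
    schema: typeof listingsSchema;
  }): Promise<Listings>;
}

// Assumes the options are named `instruction` and `schema`.
async function extractListings(page: ExtractPage): Promise<Listings> {
  return page.extract({
    instruction: "extract all apartment listings and their details",
    schema: listingsSchema,
  });
}
```

Descriptions do not change validation; they only annotate the schema, so the returned type is the same as without them.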
Arguments: ExtractOptions<T extends z.AnyZodObject>
Provides instructions for extraction
Defines the structure of the data to extract (TypeScript only)
Set iframes: true if the extraction content exists within an iframe.
This field is now deprecated and has no effect.
An xpath that can be used to reduce the scope of an extraction. If an xpath is passed in, extract will only process the contents of the HTML element that the xpath points to. Useful for reducing input tokens and increasing extraction accuracy.
Specifies the model to use
Configuration options for the model client. See ClientOptions.
Timeout in milliseconds for waiting for the DOM to settle
Returns: Promise<ExtractResult<T extends z.AnyZodObject>>
Resolves to the structured data as defined by the provided schema.
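The XPath scoping described under Arguments could be sketched as follows. The option name `selector`, the exact XPath string format, the `rows` field, and the `ExtractPage` interface are assumptions; this page does not preserve the parameter names.

```typescript
import { z } from "zod";

// Illustrative schema: the text of each table row.
const tableSchema = z.object({
  rows: z.array(z.string()),
});
type TableData = z.infer<typeof tableSchema>;

// Hypothetical stand-in for a Stagehand page (no browser needed to type-check).
interface ExtractPage {
  extract(options: {
    instruction: string;
    schema: typeof tableSchema;
    selector?: string; // assumed name of the XPath scoping option
  }): Promise<TableData>;
}

async function extractTable(page: ExtractPage): Promise<TableData> {
  return page.extract({
    instruction: "extract the text of every row in the table",
    schema: tableSchema,
    // Hypothetical XPath; only this element's contents would be processed.
    selector: "/html/body/main/table",
  });
}
```

Narrowing the scope this way keeps unrelated page content out of the model input, which is the token and accuracy benefit the argument description mentions.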
