Web

HTMLLoader

A loader that loads HTML, optionally converting it to Markdown or stripping tags.
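To illustrate what "stripping tags" can mean in practice, here is a minimal sketch built only on Python's standard `html.parser` module. It is not HTMLLoader's actual implementation; the names `strip_tags` and `_TextExtractor` are hypothetical, and dropping the contents of `<script>` and `<style>` is an assumption about what a reader would want.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join text nodes with a space, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_tags(html: str) -> str:
    """Return the visible text of an HTML string (illustrative only)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text()
```

Joining text nodes with a space keeps adjacent block elements from running together, at the cost of occasionally adding a space inside a word that is split by inline tags.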

SitemapLoader

A loader that loads URLs from a sitemap.

Attributes:

include: A list of strings or regular expressions. Only URLs that match one of these will be included.

exclude: A list of strings or regular expressions. URLs that match one of these will be excluded.

url_loader: The loader to use for loading the URLs.

create_excerpts: Whether to split documents into excerpts. Defaults to True.
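The include and exclude rules can be sketched as a small standalone filter. This is an illustration of the described behavior, not SitemapLoader's own code: `filter_urls` and `_matches` are hypothetical names, and two details are assumptions the description does not settle, namely that plain strings are treated as substring matches and that an empty or missing `include` list lets every URL through.

```python
import re
from typing import Iterable, List, Optional, Pattern, Union

Matcher = Union[str, Pattern[str]]


def _matches(url: str, matcher: Matcher) -> bool:
    # Assumption: a plain string matches as a substring,
    # a compiled pattern matches anywhere in the URL via search().
    if isinstance(matcher, str):
        return matcher in url
    return matcher.search(url) is not None


def filter_urls(
    urls: Iterable[str],
    include: Optional[List[Matcher]] = None,
    exclude: Optional[List[Matcher]] = None,
) -> List[str]:
    """Keep URLs that match some include rule and no exclude rule."""
    kept = []
    for url in urls:
        if include and not any(_matches(url, m) for m in include):
            continue  # include rules exist and none matched
        if exclude and any(_matches(url, m) for m in exclude):
            continue  # at least one exclude rule matched
        kept.append(url)
    return kept
```

For example, `filter_urls(urls, include=["/docs/"], exclude=[re.compile(r"draft-")])` would keep documentation pages while dropping any whose URL contains `draft-`.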

URLLoader

Loads the content found at each URL in a given list.

Attributes:

Name Type Description

urls list[str] The URLs to load from.

create_excerpts bool Whether to split documents into excerpts. Defaults to True.

response_to_document async

Convert an HTTP response to a Document.
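As a rough picture of what such a conversion might involve, the sketch below defines its own stand-in types. Everything here is hypothetical: `FakeResponse` and `Document` are not the library's classes, the metadata keys are invented for illustration, and the real method's signature and error handling may differ.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP client's response object."""

    url: str
    status: int
    body: bytes
    content_type: str = "text/html; charset=utf-8"

    async def text(self) -> str:
        # Assumption: the body is UTF-8 encoded.
        return self.body.decode("utf-8")


@dataclass
class Document:
    """Hypothetical document type: text plus free-form metadata."""

    text: str
    metadata: dict = field(default_factory=dict)


async def response_to_document(response: FakeResponse) -> Document:
    """Illustrative conversion: read the body and record where it came from."""
    if response.status >= 400:
        raise ValueError(f"Cannot load {response.url}: HTTP {response.status}")
    text = await response.text()
    return Document(
        text=text,
        metadata={"source": response.url, "content_type": response.content_type},
    )
```

The function is a coroutine because reading a response body is typically an awaitable operation in async HTTP clients; from synchronous code it can be driven with `asyncio.run(response_to_document(response))`.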