Browsertrix User Guide
Welcome to the Browsertrix User Guide. This page covers the basics of using Browsertrix, Webrecorder's high-fidelity web archiving system.
Getting Started
To get started crawling with Browsertrix:
- Create an account and join an Organization as described here.
- After being redirected to the organization's Overview page, click the Create New button in the top right and select Crawl Workflow to begin configuring your first crawl!
- For a simple crawl, choose the Seeded Crawl option and enter a page URL in the Crawl Start URL field. By default, the crawler will archive all pages under the starting path.
- Next, click Review & Save, and ensure the Run on Save option is selected. Then click Save Workflow.
- Wait a moment for the crawler to start and watch as it archives the website!
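To make "all pages under the starting path" concrete, here is a minimal Python sketch of a path-prefix scope check. It is an illustration only, not Browsertrix's actual scoping code: the function names `path_prefix` and `in_prefix_scope` are hypothetical, and the sketch assumes that "under the starting path" means the same host plus the start URL's path up to its last slash.

```python
from urllib.parse import urlsplit


def path_prefix(path: str) -> str:
    """Return the path up to and including its last slash ("/" if there is none)."""
    return path[: path.rfind("/") + 1] or "/"


def in_prefix_scope(start_url: str, candidate_url: str) -> bool:
    """Illustrative check: is candidate_url on the same host and under the start path?

    Assumption: scope is defined by hostname plus path prefix. The real
    crawler may treat schemes, ports, or query strings differently.
    """
    start = urlsplit(start_url)
    candidate = urlsplit(candidate_url)
    if start.hostname != candidate.hostname:
        return False
    # An empty path (e.g. "https://example.com") is treated as "/".
    return (candidate.path or "/").startswith(path_prefix(start.path))
```

Under these assumptions, a crawl started at `https://example.com/docs/intro` would treat `https://example.com/docs/setup/install` as in scope and `https://example.com/blog/post` as out of scope.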
After running your first crawl, check out the following to learn more about Browsertrix's features:
- A detailed list of crawl workflow setup options.
- Adding exclusions to limit your crawl's scope and evading crawler traps by editing exclusion rules while crawling.
- Best practices for crawling with browser profiles to capture content only available when logged in to a website.
- Managing archived items, including uploading previously archived content.
- Organizing and combining archived items with collections for sharing and export.
- If you're an admin: Inviting collaborators to your org.
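As a rough illustration of how exclusion rules narrow a crawl and help avoid crawler traps, the sketch below filters a list of discovered URLs against regular-expression patterns. Everything in it is hypothetical: the `is_excluded` helper, the example patterns, and the URLs are invented for illustration and do not reflect how Browsertrix evaluates exclusions internally.

```python
import re

# Hypothetical exclusion patterns: an endlessly paginated calendar
# and links that only re-sort the same listing.
EXCLUSIONS = [r"/calendar/", r"[?&]sort="]


def is_excluded(url: str, patterns: list[str]) -> bool:
    """Return True if any pattern matches anywhere in the URL."""
    return any(re.search(pattern, url) for pattern in patterns)


def filter_queue(urls: list[str], patterns: list[str]) -> list[str]:
    """Keep only the URLs that no exclusion pattern matches."""
    return [url for url in urls if not is_excluded(url, patterns)]
```

A calendar page that links to "next month" indefinitely is a typical crawler trap; a pattern like the first one above would drop those URLs from the queue while leaving the rest of the site untouched.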
Have more questions?
While our aim is to create intuitive interfaces, sometimes the complexities of web archiving require a little more explanation. If there's something that you found especially confusing or frustrating, please get in touch!