
Web Scraping Import works on any public website, with no API keys or setup. It always shows you a live preview of one extracted product before you commit, so you can confirm the data looks right first.
How it works
Pick a source mode
Category page walks a listing page and its pagination to find every product (best for a supplier or competitor catalog). Sitemap starts from one product URL and finds similar pages across the site. Manual list takes a list of product URLs you paste in, one per line.
Add the URL and limits
Paste the seed URL. Optionally set a URL pattern (to include only the right pages) and caps on how many listing pages and products to pull, so a first run stays small.
Preview one product
Run the preview. WISEPIM reports how many product URLs it matched, the pattern it detected, a few sample URLs, and one fully extracted product so you can check the fields landed correctly.
Controls you can set
You shape each scrape with a few optional overrides. The defaults work for most sites, so reach for these only when a run needs a nudge:
- Sitemap URL override: point WISEPIM at the right sitemap when a site doesn’t declare one in its robots.txt. Use this if the sitemap mode can’t find product URLs on its own.
- Product URL pattern override: tell WISEPIM which URLs count as products (for example /p/ or /products/) when the auto-detected pattern picks up the wrong pages.
- Max listing pages: how many pagination pages of a category to walk. Raise it for large catalogs, keep it low for a quick test.
- Max products: an upper bound on how many products a run imports. A safety cap that keeps a first run small and predictable.
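To decide whether you need the sitemap URL override, you can check for yourself whether a site declares a sitemap in its robots.txt. The sketch below uses Python's standard urllib.robotparser on robots.txt text you have already downloaded. The function name and the example.com contents are made up for illustration, and this is not a description of how WISEPIM performs its own lookup.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared in a robots.txt body (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when the file has no "Sitemap:" line.
    return parser.site_maps() or []


# Hypothetical robots.txt contents, for illustration only.
with_sitemap = """User-agent: *
Disallow: /cart
Sitemap: https://example.com/sitemap.xml
"""
without_sitemap = """User-agent: *
Disallow: /cart
"""

print(declared_sitemaps(with_sitemap))
print(declared_sitemaps(without_sitemap))
```

If the list comes back empty, the site is not advertising a sitemap in robots.txt, which is the situation the sitemap URL override is meant for.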
Reading the preview
The preview exists so you never import blind:
- Matched URL count tells you whether the crawl found roughly the number of products you expected. Zero or far too few means the pattern or seed URL needs adjusting.
- The detected pattern shows which URLs will be treated as products. If it is catching category or blog pages, tighten the pattern with the product URL pattern override.
- The extracted sample is the real test: check that name, price, images, and key attributes mapped correctly before you commit to the full run.
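If you copy the extracted sample into a script, the field check described above can be written as a small helper. This is a minimal sketch under assumptions: the dictionary shape, the field names, and the list of required fields are hypothetical and do not reflect WISEPIM's actual data format.

```python
# Hypothetical set of fields you expect every product to carry.
REQUIRED_FIELDS = ("name", "price", "images")

# Values treated as "nothing was extracted" for a field.
EMPTY_VALUES = (None, "", [], {})


def missing_fields(product: dict, required=REQUIRED_FIELDS) -> list[str]:
    """Return the required fields that are absent or empty in a product."""
    return [field for field in required if product.get(field) in EMPTY_VALUES]


# Illustrative sample: the images list came back empty.
sample = {
    "name": "Blue Mug",
    "price": "12.50",
    "images": [],
    "attributes": {"color": "blue"},
}

print(missing_fields(sample))
```

Comparing against an explicit list of empty values, rather than relying on truthiness, keeps a legitimate price of 0 from being reported as missing.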
Act on what you find
The preview matched 0 (or far too few) products
The seed URL or pattern is off. For a category page, make sure you pasted the listing page (not a single product); for sitemap mode, paste a real product URL so WISEPIM can learn the pattern. Adjust the pattern override and preview again. Outcome: the crawl finds the full set before you spend an import run on it.
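Before previewing again, it can help to test a candidate pattern against a handful of URLs from the site. The sketch below assumes the pattern is a plain path fragment such as /products/ and that a URL matches when its path contains that fragment. WISEPIM's real matching rules may differ, and the function name, the max_products parameter, and the sample URLs are illustrative only.

```python
from urllib.parse import urlparse


def match_product_urls(urls, pattern, max_products=None):
    """Keep URLs whose path contains `pattern`, capped at `max_products`."""
    matched = [url for url in urls if pattern in urlparse(url).path]
    # Apply the cap only when one was given.
    return matched if max_products is None else matched[:max_products]


# Hypothetical URLs collected from a category page or sitemap.
sample_urls = [
    "https://example.com/products/blue-mug",
    "https://example.com/products/red-mug",
    "https://example.com/blog/mug-care",
    "https://example.com/category/mugs?page=2",
]

for pattern in ("/p/", "/products/"):
    count = len(match_product_urls(sample_urls, pattern))
    print(f"{pattern}: {count} matched")
```

A pattern that matches nothing in your sample (here /p/) is likely to match nothing in the preview either, while one that also catches blog or category pages is too loose.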
The sample product is missing fields
Some sites bury data in scripts or images. Re-preview to confirm it is consistent, import what extracts cleanly, then fill the gaps with Enriching Products (AI can read the product images to recover attributes). Outcome: a complete catalog even when the source page was thin.
You will import from the same site again
Note the settings that worked: the source mode, the seed or category URL, and any pattern or sitemap overrides. Next time the supplier updates, enter the same values to pull the changes. For sources you re-import often, a structured feed is the more reliable long-term option when one is available. Outcome: repeatable supplier onboarding.
You need a feed, not a scrape
If the source can give you an XML or CSV feed, prefer Feed Hub import or file import: structured feeds are faster and more reliable than crawling. Use scraping when no feed is available. Outcome: the right tool for each source.
How it compares
| | Web Scraping Import | File / Feed import | Web Research |
|---|---|---|---|
| Input | A live website URL | An XML / CSV / JSON file or feed | A search query or competitor URL |
| Best for | Sites with no feed available | Suppliers and channels that publish a feed | Gathering facts to enrich existing products |
| Output | Products in your catalog | Products in your catalog | Research you apply to content |
| AI does | Extracts fields from the page | Maps columns to fields | Searches and summarizes |
Related
Importing Products
File-based import (CSV, Excel) when you have structured data.
Feed Hub
Import from and publish to XML / feed sources.
Web Research
Research products on the web to enrich what you already have.
Enriching Products
Fill any gaps the scrape left, with AI.


