
What is a Text Document? How to crawl websites, documents, and text corpora?

HelpKnow.ai supports crawling website links, uploading local documents, and adding text online. After a successful crawl, you can view the individual slices and edit, add, or delete them.

 

Click "+Knowledge Base" on the [Knowledge] page

Select "Text Document" and choose one import type. You can add more import types later at any time.

 

Online data

[Import URL] supports "Site-wide import" and "Batch import".

Site-wide import crawls all webpages related to the link you enter.

Batch import crawls only the links you enter, one per line, without fetching other related links.

* Supports dynamic website recognition
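
The difference between the two import modes can be sketched as follows. This is an illustration only, not HelpKnow.ai's actual crawler: the `LINKS` table, the function names, and the reading of "related" as "same-host pages reachable from the start link" are all assumptions made for the example.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

# Hypothetical link table standing in for real pages: page URL -> hrefs found on it.
LINKS = {
    "https://example.com/": ["/docs", "/pricing", "https://other.example.org/"],
    "https://example.com/docs": ["/docs/setup", "/"],
    "https://example.com/docs/setup": [],
    "https://example.com/pricing": [],
}


def batch_import(text):
    """Batch import: one URL per line; return exactly those URLs and follow nothing."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def site_wide_import(start_url, links=LINKS):
    """Site-wide import (assumed behavior): follow same-host links breadth-first."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for href in links.get(url, []):
            target = urljoin(url, href)  # resolve relative hrefs against the page URL
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return order
```

In this sketch, `batch_import` returns only the two or three links you type in, while `site_wide_import("https://example.com/")` also discovers `/docs`, `/pricing`, and `/docs/setup` but skips the link to the other host.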

 

After the links are crawled, the website content is split into slices, which you can edit, add to, or delete.

 

Local Documents

Upload local files and convert them to text documents.

* Supports PDF, TXT, DOCX, and MD format files.
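
If you prepare a folder of files before uploading, a small check like the one below can separate files in the supported formats from the rest. It is a local convenience sketch, not part of HelpKnow.ai; the `SUPPORTED` set simply mirrors the formats listed above, and `split_uploadable` is a hypothetical helper name.

```python
from pathlib import Path

# Extensions taken from the supported-format list above.
SUPPORTED = {".pdf", ".txt", ".docx", ".md"}


def split_uploadable(paths):
    """Return (supported, unsupported) file names based on the file extension."""
    supported, unsupported = [], []
    for name in paths:
        # Compare case-insensitively so "guide.PDF" is treated like "guide.pdf".
        if Path(name).suffix.lower() in SUPPORTED:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported
```

The check looks only at the file extension, so it cannot tell whether a file's content actually matches its name.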

 

Add Online

Supports writing text directly here to add new training corpus.

 

FAQs

1. What's the difference between creating an "Online Document" and a "Text Document"? How should I choose?

If you don't need to share online documents for user viewing and the documents are only used as AI training corpus, it is recommended to create "Text Documents".

 

2. Why does garbled text appear when scraping website links?

If scraping a website link produces garbled text, try clicking "Website Dynamic Recognition" and scrape the link again.

Last modified: 2025-10-27
SaleSmartly