Using Algolia DocSearch
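Docusaurus has official support for Algolia DocSearch, and the Docusaurus preset ships with it.
Register with Algolia
First, register an Algolia account (a quick search, for example on Baidu, will find the sign-up page). After registering you will have an appId, an apiKey, and an indexName.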
Theme configuration: automatic indexing
Add the following to the docusaurus.config.js configuration file:
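themeConfig: {
  algolia: {
    // Application ID from the Algolia dashboard
    appId: 'xxx',
    // Public, search-only API key (not the Admin API Key)
    apiKey: 'xxx',
    indexName: 'dev.lichenghao.cn',
    // Optional: restrict results to the current language and version
    contextualSearch: true,
    // Optional: extra Algolia search parameters
    searchParameters: {},
    // Optional: path of the search page
    searchPagePath: 'search',
  },
}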
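As an optional variation that the setup above does not use, the credentials could be read from environment variables instead of being written into the file. This is a minimal sketch, assuming hypothetical variable names ALGOLIA_APP_ID and ALGOLIA_API_KEY and a hypothetical helper buildAlgoliaConfig; the remaining options are copied from the configuration above.

```javascript
// Sketch: build the `algolia` theme options from environment variables.
// ALGOLIA_APP_ID and ALGOLIA_API_KEY are hypothetical variable names.
function buildAlgoliaConfig(env) {
  const appId = env.ALGOLIA_APP_ID;
  const apiKey = env.ALGOLIA_API_KEY;
  if (!appId || !apiKey) {
    throw new Error('ALGOLIA_APP_ID and ALGOLIA_API_KEY must be set');
  }
  return {
    appId,
    apiKey,
    indexName: 'dev.lichenghao.cn',
    contextualSearch: true,
    searchParameters: {},
    searchPagePath: 'search',
  };
}

// Example with placeholder values. A real docusaurus.config.js would pass
// process.env and place the result under themeConfig.algolia.
const themeConfig = {
  algolia: buildAlgoliaConfig({ ALGOLIA_APP_ID: 'xxx', ALGOLIA_API_KEY: 'xxx' }),
};
console.log(themeConfig.algolia.indexName);
```

Failing early on a missing variable makes a misconfigured build obvious instead of producing a search box that silently returns nothing.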
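Then go to the DocSearch site (DocSearch: Search made for documentation | DocSearch, algolia.com) and submit your website and email address. After that, the crawler runs once every 24 hours, crawling your site to build the index data. Alternatively, you can push the index yourself.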
Pushing the index: manual indexing
On a CentOS 7 server, use Docker to run the official scraper.
This depends on jq, a command-line JSON processor.
Install the EPEL repository:
sudo yum install epel-release -y
Install jq:
sudo yum install jq -y
Verify the installation:
jq --version
Then create two files, env and config.json, in any folder. They hold the Algolia API key and the index-push configuration, respectively.
env (note: API_KEY must be the Admin API Key):
APPLICATION_ID=xxx
API_KEY=xxx
config.json (you need to change index_name and start_urls):
{
  "index_name": "dev.lichenghao.cn",
  "start_urls": ["https://dev.lichenghao.cn"],
  "sitemap_urls": ["https://dev.lichenghao.cn/sitemap.xml"],
  "stop_urls": ["/search"],
  "selectors": {
    "lvl0": {
      "selector": "(//ul[contains(@class,'menu__list')]//a[contains(@class, 'menu__link menu__link--sublist menu__link--active')]/text() | //nav[contains(@class, 'navbar')]//a[contains(@class, 'navbar__link--active')]/text())[last()]",
      "type": "xpath",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": "header h1, article h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "lvl5": "article h5, article td:first-child",
    "lvl6": "article h6",
    "text": "article p, article li, article td:last-child"
  },
  "custom_settings": {
    "attributesForFaceting": [
      "type",
      "lang",
      "language",
      "version",
      "docusaurus_tag"
    ],
    "attributesToRetrieve": [
      "hierarchy",
      "content",
      "anchor",
      "url",
      "url_without_anchor",
      "type"
    ],
    "attributesToHighlight": ["hierarchy", "content"],
    "attributesToSnippet": ["content:10"],
    "camelCaseAttributes": ["hierarchy", "content"],
    "searchableAttributes": [
      "unordered(hierarchy.lvl0)",
      "unordered(hierarchy.lvl1)",
      "unordered(hierarchy.lvl2)",
      "unordered(hierarchy.lvl3)",
      "unordered(hierarchy.lvl4)",
      "unordered(hierarchy.lvl5)",
      "unordered(hierarchy.lvl6)",
      "content"
    ],
    "distinct": true,
    "attributeForDistinct": "url",
    "customRanking": [
      "desc(weight.pageRank)",
      "desc(weight.level)",
      "asc(weight.position)"
    ],
    "ranking": [
      "words",
      "filters",
      "typo",
      "attribute",
      "proximity",
      "exact",
      "custom"
    ],
    "highlightPreTag": "<span class='algolia-docsearch-suggestion--highlight'>",
    "highlightPostTag": "</span>",
    "minWordSizefor1Typo": 3,
    "minWordSizefor2Typos": 7,
    "allowTyposOnNumericTokens": false,
    "minProximity": 1,
    "ignorePlurals": true,
    "advancedSyntax": true,
    "attributeCriteriaComputedByMinProximity": true,
    "removeWordsIfNoResults": "allOptional",
    "separatorsToIndex": "_",
    "synonyms": [
      ["js", "javascript"],
      ["ts", "typescript"]
    ]
  }
}
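Since index_name and start_urls are the two fields that must be edited by hand, it can help to sanity-check them before starting a crawl. The following is only a sketch: checkScraperConfig is a hypothetical helper, not part of any Algolia tool, and it assumes start_urls entries are either URL strings or objects with a url field.

```javascript
// Sketch: sanity-check the two hand-edited fields of a scraper config.
// checkScraperConfig is a hypothetical helper, not an Algolia API.
function checkScraperConfig(config) {
  const problems = [];
  if (typeof config.index_name !== 'string' || config.index_name === '') {
    problems.push('index_name must be a non-empty string');
  }
  if (!Array.isArray(config.start_urls) || config.start_urls.length === 0) {
    problems.push('start_urls must be a non-empty array');
  } else {
    for (const entry of config.start_urls) {
      // Assumption: an entry is a URL string or an object carrying a `url` field.
      const value = typeof entry === 'string' ? entry : entry && entry.url;
      try {
        new URL(value); // throws unless the value is an absolute URL
      } catch (err) {
        problems.push(`start_urls entry is not an absolute URL: ${String(value)}`);
      }
    }
  }
  return problems;
}

const sampleConfig = {
  index_name: 'dev.lichenghao.cn',
  start_urls: ['https://dev.lichenghao.cn'],
};
console.log(checkScraperConfig(sampleConfig)); // empty array when nothing is wrong
```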
Then run:
docker run -it --env-file=env -e "CONFIG=$(cat config.json | jq -r tostring)" algolia/docsearch-scraper:v1.16.0
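The jq step in the command above turns the multi-line config.json into a single-line JSON string so it can be passed as the CONFIG environment variable. The sketch below illustrates roughly the same transformation in JavaScript and assembles the same docker arguments without executing them; compactJson and buildDockerArgs are hypothetical names used only for illustration.

```javascript
// Sketch of what `jq -r tostring` contributes: the multi-line config.json
// becomes a single-line JSON string that fits in one environment variable.
function compactJson(text) {
  return JSON.stringify(JSON.parse(text)); // throws if the text is not valid JSON
}

// Assemble the same docker arguments as the command above (not executed here).
function buildDockerArgs(configText) {
  return [
    'run', '-it',
    '--env-file=env',
    '-e', `CONFIG=${compactJson(configText)}`,
    'algolia/docsearch-scraper:v1.16.0',
  ];
}

const exampleConfig = `{
  "index_name": "dev.lichenghao.cn",
  "start_urls": ["https://dev.lichenghao.cn"]
}`;
console.log(buildDockerArgs(exampleConfig).join(' '));
```

Parsing before re-serializing also means a syntax error in config.json surfaces before the container starts.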
Wait for it to finish. Output like the following indicates success:
......
> DocSearch: https://dev.lichenghao.cn/docs/react/zkJ8XlOBb4b1azdWe962P 30 records)
> DocSearch: https://dev.lichenghao.cn/docs/springcloud/C7UvQ2pvTdgduS9A5seg 83 records)
Nb hits: 8302
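To double-check a long run, the captured output can be totalled per page. This is a rough sketch, assuming the line formats match the sample output above (they may differ between scraper versions); summarizeScraperLog is a hypothetical helper.

```javascript
// Sketch: summarize scraper output. The line formats are taken from the
// sample output above and may differ between scraper versions.
function summarizeScraperLog(log) {
  let pages = 0;
  let records = 0;
  let nbHits = null;
  for (const line of log.split('\n')) {
    const page = line.match(/^> DocSearch: (\S+) (\d+) records\)/);
    if (page) {
      pages += 1;
      records += Number(page[2]);
      continue;
    }
    const hits = line.match(/^Nb hits: (\d+)/);
    if (hits) nbHits = Number(hits[1]);
  }
  return { pages, records, nbHits };
}

const sampleLog = [
  '> DocSearch: https://dev.lichenghao.cn/docs/react/zkJ8XlOBb4b1azdWe962P 30 records)',
  '> DocSearch: https://dev.lichenghao.cn/docs/springcloud/C7UvQ2pvTdgduS9A5seg 83 records)',
  'Nb hits: 8302',
].join('\n');
console.log(summarizeScraperLog(sampleLog));
```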