diff --git a/content/notes/2022/08/03h08m44s51.md b/content/notes/2022/08/03h08m44s51.md index 14aefde4..abd62f79 100644 --- a/content/notes/2022/08/03h08m44s51.md +++ b/content/notes/2022/08/03h08m44s51.md @@ -3,7 +3,7 @@ date: 2022-08-03T08:44:51+02:00 context: "https://roytang.net/2022/08/twenty-years/" --- -Excellent summary Roy, cheers! I dug around in your archives and discovered you were into MtG [way back in 2001](https://roytang.net/archives/ancient/tripod/ffmagic/)---that's exactly the same year as I started playing! Since you regularly post updates on your digital Arene grinds, I was wondering how you migrated from analog to digital MtG. I only play with "the real stuff", but as a consequence, I regularly have trouble finding buddies to play with. +Excellent summary Roy, cheers! I dug around in your archives and discovered you were into MtG [way back in 2001](https://roytang.net/archives/ancient/tripod/ffmagic/)---that's exactly the same year as I started playing! Since you regularly post updates on your digital Arena grinds, I was wondering how you migrated from analog to digital MtG. I only play with "the real stuff", but as a consequence, I regularly have trouble finding buddies to play with. Since discovering Commander, I much prefer playing it like that: more chaos and politics, more crazy cards, and it's not always the player with the most expensive deck that wins. Most of my stuff is geared towards a budget anyway. diff --git a/content/notes/2022/08/03h11m10s41.md b/content/notes/2022/08/03h11m10s41.md new file mode 100644 index 00000000..98978add --- /dev/null +++ b/content/notes/2022/08/03h11m10s41.md @@ -0,0 +1,10 @@ +--- +date: 2022-08-03T11:10:41+02:00 +context: "https://fundor333.com/social/2022/08/03/1659516036/" +--- + +Fundor 333 asked: + +> In your opinion something like Gitea with a syndication like Mastodon will solve some of the problems and move more people on this “Gitea with Syndication”? + +I'd answer: yes and no. 
Yes, it will solve some problems---hopefully easier collaboration across different instances. With GitHub, that's not a problem, provided that everyone uses GitHub. And no, I don't think it will move more people towards Gitea, since syndication and self-hosting are usually two "complicated" solutions. Note that I didn't say complex. Most people will still find it too troublesome to move. Just look at Mastodon vs. Twitter. The `@user@mastodoninstance` thing already trips most people up. diff --git a/content/post/2022/08/implementing-searching-in-static-websites.md b/content/post/2022/08/implementing-searching-in-static-websites.md new file mode 100644 index 00000000..2bc3d8ee --- /dev/null +++ b/content/post/2022/08/implementing-searching-in-static-websites.md @@ -0,0 +1,78 @@ +--- +title: Implementing Searching In Static Websites +date: 2022-08-04T10:59:00+02:00 +categories: + - webdesign +tags: + - hugo + - searching +--- + +In my monthly [July 2022 overview](/post/2022/08/july-2022) write-up, I wrote: + +> This website got a new search engine! The baked archives page used to be powered by Lunr.js, which has been replaced by Pagefind.app. I guess this is worth its own blog post, I’ll save the details for later. + +It's time for those juicy details. + +Last month's first HugoConf revealed a lot of interesting JAMStack-related tooling to boost your statically generated blog. For the uninitiated, a "JAMStack" is a _JavaScript, API, and Markup stack_ that (almost) enables static websites to be just as dynamic as true blogging engines such as WordPress. For example, [a Webmention-based commenting system](/post/2021/05/beyond-webmention-io/) with a queryable API, a few pre- and post-processor scripts like [YouTube link to image converters](/post/2021/06/youtube-play-image-links-in-hugo/), or **search functionality**. + +One of those new search tools mentioned during the conference is [Pagefind](https://pagefind.app/). 
Since I was looking into throwing out [Lunr.js](https://lunrjs.com/) anyway, it was a good opportunity to try out new things. The result is the simple but very fast [search bar in the /archives page](/archives). + +How do these tools work? + +1. You generate some content in Markdown. Your static site processor, in my case Hugo, converts it to static HTML, ready to be served to visitors. +2. A script needs to be run to create **an index** of your content---either by processing the `.md` source, or the `.html` target. The result is usually a fairly large `.js` file. +3. On a search page, you include two `<script>` tags: the index file and the tool that uses it. Users who enter a query search **client-side** in JS code, as opposed to submitting a real form like in search engines or with WordPress. + +The problem with step 2 is the index file itself, as it can quickly grow in size. Furthermore, I automatically checked in the changes to the `brainbaking-index.js` file, needlessly cluttering the git repository. Even with gzip-compression in mind, I found Lunr.js not to be the best approach. + +Instead, Pagefind uses **fragmentation**. It never requires the inclusion of a single huge index file. Instead, it ships a tiny JS file (`8.05 kB`) that only fetches a minimal index file after you start typing (`45.66 kB`). For each result to be displayed (usually limited to five), it then fetches a _fragment_ of the indexed content (between `5` and `8.5 kB`), and, optionally, a (currently non-optimized) thumbnail. The result is a blazing fast search-as-you-type system that's still self-hosted, highly optimized, and doesn't require a page submit. + +Try it out yourself [at /archives](/archives). + +There are a few obvious disadvantages of using Pagefind. For one, it's very bleeding edge, currently at [version 0.5.3](https://github.com/CloudCannon/pagefind/releases). 
Custom placeholder text, proper internationalization support, and more custom options are currently missing, but it is possible to use the lower-level API and come up with something cool yourself. I took a stab at it but decided that most of the default stuff is just fine. + +The other downside is that it still requires you to run another executable---this is the JAMStack part, so to speak---after Hugo is done generating. I have a simple shell script that is triggered every hour: + +```sh +#!/bin/bash + +sites=( brainbaking jefklakscodex redzuurdesem ) +export WEBMENTION_TOKEN="supersecret" + +echo "building at $(date)... with $1" + +for site in "${sites[@]}" +do + echo "building site $site" + cd /var/dev/$site + git reset --hard + RESULT=$(git pull | grep 'Already up to date') + if [[ -z "$RESULT" ]] || [[ $1 == "--force" ]] + then + /usr/local/bin/hugo --cleanDestinationDir --destination docs + /usr/local/bin/pagefind --source docs + rsync --archive --delete docs/ /var/www/$site/ + yarn install + yarn run postdeploy + else + echo "nothing to do for $site" + fi +done +echo "done building." +``` + +This boils down to: + +1. Execute `hugo`, dump HTML output in `docs/` +2. Execute `pagefind`, scour through `docs/` and dump index/JS/fragments in there as well +3. Copy over new files using `rsync` to the deployed location for Nginx to pick up +4. Run `yarn` for an optional post-deploy step. This contains webmention sending. + +Pagefind is a self-contained Rust binary, but I had to install it from source for my MacBook as there's no released ARM64 artifact available. You also have to install it on your web server, although that is optional: you can also run the `pagefind` command locally and simply check in all changes. I did that before with Lunr.js, but do not recommend it: even the slightest change to your blog triggers a commit of the index file. + +--- + +Is all this trouble worth it? I'm not sure. 
[Rubenerd's Archives page](https://rubenerd.com/archives/) resorts to another technique: simply let a _real_ search engine do the searching. By embedding a DuckDuckGo `<form>`, you delegate all the above to another party, decreasing the complexity of your build pipeline and website theme code. It's worth noting that this alternative works _even with JavaScript disabled in the browser!_ I had to put in a `<noscript>` tag to bring JS-haters bad news: they can't search. + +On the other hand, DuckDuckGo doesn't immediately index new posts, and you still route users away from your site with a form submit. In the end, Ruben's approach is probably the easiest, albeit the less immersive, option. You'll have to decide for yourself whether or not to go for it. I still like Pagefind's relative simplicity and even [implemented it on my other sites](https://jefklakscodex.com/tags/) that didn't have a search option before.