My "hello world" of notetaking is keeping track of what books I've read.
There are two ways to do this:
1. Keep a list of the books I've read on a single page.
2. Create a tag called read by me and tag each book I've read with it.
A natural response to these options might be, "YOLO pick one and keep moving".
I think that's a great response for people who want to get stuff done. But there's something interesting here. It's a decision almost every user will hit about two minutes into using a notetaking program. The choices seem arbitrary, but the implications for permissions, discoverability, and so on, when multiplied across tens of thousands of notes, might start to add up. I'd like to understand it better.
Bring out the Microscope of Pedantry!
Static collections like the list in (1) are easy to annotate by adding child bullet points. Thus they shine as semi-structured data, serving as a document and a collection at once. However they don't scale well.
Dynamic collections like the tags-and-backlinks strategy in (2) are actually best implemented with properties-and-a-query. In this form they're structured data. Thus they scale well and provide easy programmatic interaction, but are more rigid than static collections since they don't allow annotation.
Static collections are precise. They're also ordered (assuming your notetaking program doesn't provide a static set type).
They can be structured (such as a list or a table). Here's an example list:
They can also be semi-structured (such as an outline or just 'text with links'). Here's an example outline:
Static collections get first-class versioning and permissions. Whatever versioning and permissions system you have for notes is inherited by them exactly.
Dynamic collections have variable precision and are unordered by default.
For example, read the backlinks at this page: read by me (example tag)
This collection has a false positive: this page itself shows up in the backlinks, but it's not a book that I've read. Dynamic collections of this type are imprecise.
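Here's a minimal sketch of how that false positive arises. It assumes a hypothetical wiki where links are written as [[page name]]; the link syntax, the page titles, and the `backlinks` helper are all made up for illustration and are not how this site works.

```python
import re

# Hypothetical pages: title -> body text. The [[...]] link syntax is an assumption.
pages = {
    "Book A": "Some notes on the book. [[read by me]]",
    "Book B": "More notes. [[read by me]]",
    "Book C": "A book I haven't read yet.",
    "Static vs. dynamic collections": "For example, read the backlinks at [[read by me]].",
}

LINK = re.compile(r"\[\[(.+?)\]\]")


def backlinks(target, pages):
    """Return the titles of every page whose body links to `target`, sorted by title."""
    return sorted(
        title for title, body in pages.items()
        if target in LINK.findall(body)
    )


collection = backlinks("read by me", pages)
print(collection)
```

The essay page lands in the collection alongside the two books, because a backlink only records that a page mentions the tag, not why.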
The tag+backlinks strategy has no permissioning: anyone who can add pages to the system can add entries to the collection.
For a precise dynamic collection you can use a property like read_by: me.
You then need a query since backlinks won't show only pages with that property/value combination. This website doesn't support queries so I won't give an example, but for apps that do there are two possible levels of support:
For apps that support level (2), the properties-and-query strategy can produce a collection with an ID, just like a static collection, by creating a page that contains only the live query. This collection is structured and precise. It can be ordered and reordered however you want.
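As a rough sketch of what a "page with just a live query in it" could look like under the hood: the `Note` and `LiveQuery` classes below are hypothetical, not any real app's API. The point is that the query is stored once, re-evaluated on every read, and carries its own ordering.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Note:
    title: str
    props: dict


@dataclass
class LiveQuery:
    """A saved query: it stores a rule, not a list of members."""
    predicate: Callable[[Note], bool]
    sort_key: Callable[[Note], object]

    def results(self, store):
        # Evaluated against the store as it is right now.
        matching = filter(self.predicate, store)
        return [note.title for note in sorted(matching, key=self.sort_key)]


store = [
    Note("Book B", {"read_by": "me"}),
    Note("Book A", {"read_by": "me"}),
    Note("Essay about tags", {}),  # mentions the collection but lacks the property
]

read_by_me = LiveQuery(
    predicate=lambda note: note.props.get("read_by") == "me",
    sort_key=lambda note: note.title,
)

before = read_by_me.results(store)
store.append(Note("Book C", {"read_by": "me"}))
after = read_by_me.results(store)
print(before, after)
```

The essay never shows up because it lacks the property, and the new book appears without anyone editing the collection. Swapping `sort_key` reorders the results without touching the notes.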
Permissioning is interesting for query-based collections since it depends on what queries can be made, not what the first-class permissioning system supports. For instance, you might be able to restrict the results to ones made by users whose name starts with a vowel and only if posting on weekdays.
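To make the silly example concrete, here is a sketch where the "permission" is just a predicate inside the query. The entry fields (`author`, `posted`) and the `allowed` function are assumptions for illustration; a real app would expose whatever metadata it happens to track.

```python
from datetime import date

# Hypothetical entries: each records who posted it and when.
entries = [
    {"title": "Book A", "author": "alice", "posted": date(2024, 1, 1)},  # Monday
    {"title": "Book B", "author": "bob", "posted": date(2024, 1, 2)},    # Tuesday
    {"title": "Book C", "author": "erin", "posted": date(2024, 1, 6)},   # Saturday
    {"title": "Book D", "author": "Oscar", "posted": date(2024, 1, 3)},  # Wednesday
]

VOWELS = set("aeiou")


def allowed(entry):
    """Admit an entry only if its author's name starts with a vowel and it was posted on a weekday."""
    starts_with_vowel = entry["author"][:1].lower() in VOWELS
    on_weekday = entry["posted"].weekday() < 5  # Monday is 0, Sunday is 6
    return starts_with_vowel and on_weekday


visible = [entry["title"] for entry in entries if allowed(entry)]
print(visible)
```

Nothing here goes through a permissions system. The rule lives entirely in the query, so it is only as expressive as the query language allows.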
[Section still WIP]
This comes down to structured vs. semi-structured data.
Static collections can be semi-structured, letting you include arbitrary comments and observations within them. This makes them good for small scale and exploratory work.
Tag-based dynamic collections are also semi-structured and not that useful.
However, property-based dynamic collections are structured and scale great. They let you mix-and-match: you can have a reading_status: finished property as well as a type: book property. Then you can query for everything you've finished including papers, or everything that's a book even if you haven't finished it.
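A small sketch of that mix-and-match, assuming notes are stored as plain dictionaries of properties. The `query` helper and the sample notes are hypothetical.

```python
# Hypothetical notes: title -> properties.
notes = {
    "Book A": {"type": "book", "reading_status": "finished"},
    "Book B": {"type": "book", "reading_status": "unread"},
    "Paper A": {"type": "paper", "reading_status": "finished"},
}


def query(notes, **wanted):
    """Return titles of notes whose properties include every wanted key/value pair."""
    return [
        title for title, props in notes.items()
        if all(props.get(key) == value for key, value in wanted.items())
    ]


finished = query(notes, reading_status="finished")  # includes papers
books = query(notes, type="book")                   # includes unfinished books
finished_books = query(notes, type="book", reading_status="finished")
print(finished, books, finished_books)
```

Each property is an independent axis, so one set of notes yields as many collections as there are combinations worth asking about.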
| | relational database | philosophy | 💾 CritLink terminology |
|---|---|---|---|
| static collection | table | extensional definition | intrinsic link |
| dynamic collection | view | intensional definition | extrinsic link |
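The relational-database column can be tried directly with Python's built-in sqlite3 module. This is only a sketch: the table, view, and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notes (title TEXT PRIMARY KEY, type TEXT, reading_status TEXT);
    -- Static collection: membership is enumerated row by row (extensional).
    CREATE TABLE books_i_have_read (title TEXT PRIMARY KEY);
    -- Dynamic collection: membership is defined by a rule (intensional).
    CREATE VIEW finished_books AS
        SELECT title FROM notes
        WHERE type = 'book' AND reading_status = 'finished';
""")
conn.execute("INSERT INTO notes VALUES ('Book A', 'book', 'finished')")
conn.execute("INSERT INTO books_i_have_read VALUES ('Book A')")

# A newly finished book joins the view automatically; the table would need its own insert.
conn.execute("INSERT INTO notes VALUES ('Book B', 'book', 'finished')")

static_rows = [row[0] for row in conn.execute(
    "SELECT title FROM books_i_have_read ORDER BY title")]
dynamic_rows = [row[0] for row in conn.execute(
    "SELECT title FROM finished_books ORDER BY title")]
print(static_rows, dynamic_rows)
```

The table only knows about the rows someone put there, while the view picks up the second book because it satisfies the rule.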
You'd think "static vs. dynamic collections" would be an obvious name, but it wasn't to me. I was going to call these "enumerated vs. query-based collections". Thanks to DTLow and atomicnotes for the better suggestion.