Hierarchical tags - Neuron

Welcome to the Functional Programming Zulip Chat Archive. You can join the chat here.

felko

https://github.com/srid/neuron/issues/49
That's a good idea, this can avoid creating too many tags.

Should querying for tag=journal include journal/work zettels as well? Or would this behaviour be specified explicitly by the user somehow?

I would prefer that tag queries only match the exact tag. In my personal zettelkasten I would like to make "portals", kinda like Wikipedia. The only workaround I found is to make a note tag and then put zquery://search?tag=topic&tag=note in the portal. Maybe a new query like zquery://search?category=journal could be useful?

If a zettel has the following metadata --- title: Some zettel tags: - journal/work --- Should querying for tag=journal include journal/work zettels as well? Or would this behaviour be specified exp...
Sridhar Ratnakumar

I'm not sure I understand your portal example. Could you elaborate with some example?

Sridhar Ratnakumar

What's "topic" and "note"? What's the relation between them?

Sridhar Ratnakumar

I would prefer that tag queries only match the exact tag.

Okay, yea. Let's keep it this way. "Less magic" is good. For matching the whole tag tree, perhaps one can use tag=journal&tag=journal/* EDIT: Actually the query does an AND. So we have to revamp the query format to support OR'ing.

felko

What's "topic" and "note"? What's the relation between them?

Let's say topic = calculus and I want to make a zettel that shows all topics I have written about related to mathematics (but without linking to the actual notes). Here's how I do this currently:

+ math (contains zquery://search?tag=math&tag=portal)
|---+ calculus (tags: math, calculus, portal, contains zquery://search?tag=calculus&tag=note)
|   |--- note1 (tags: math, calculus, note)
|   |--- note2 (tags: math, calculus, note)
|---+ algebra (tags: math, algebra, portal, contains zquery://search?tag=algebra&tag=note)
|   |--- note1 (tags: math, algebra, note)
|   |--- note2 (tags: math, algebra, note)

I need to make the portal and note tags to avoid the calculus and algebra zettels linking to themselves.
With hierarchical tags I could do tag=math/*/ in the math zettel and tag=math/calculus/* in the calculus zettel.
Also, recursive globs like math/** could be useful.

Sridhar Ratnakumar

In that scheme, if I want to include everything under one tag, including that tag, I'd do: tag=journal OR tag=journal/**. I wonder if that can be simplified somehow.

Sridhar Ratnakumar

Maybe tagFrom=journal. And for your case, tagUnder=math/calculus (excludes math/calculus itself). I'm not too happy about this, but it is one idea.

felko

I said I wasn't going to be able to contribute this week but actually I need hierarchical tags to start taking my course notes so I decided to implement them.
I went for the globbing mechanism but with tagFrom/tagUnder syntactic sugar, are you ok with that?

Sridhar Ratnakumar

Let me think about it. I'd like to see that simplified, because tagFrom/tagUnder is something no other software uses, so it is a new thing one needs to learn and remember (in fact, I had to re-read my own message to understand what it actually means)

Sridhar Ratnakumar

@felko Do you need both tagFrom and tagUnder? What's the use case?

Sridhar Ratnakumar

And what does your tagFrom sugar expand to (in terms of Query value)?

Sridhar Ratnakumar

Here's an idea. We can eliminate tagUnder, and in its place use tagFrom=journal/ (i.e., with a slash at the end).

Sridhar Ratnakumar

Need a better name than "tagFrom" still

felko

I don't need both, tagUnder is enough for me, but most importantly I would really like globs. I can see how tagFrom could be useful though, but I don't think it's worth implementing it.

And what does your tagFrom sugar expand to (in terms of Query value)?

Indeed tagFrom doesn't reduce to a tag glob, but I can reuse the globbing mechanism to implement isSubTag and isStrictSubTag

Sridhar Ratnakumar

Does tagUnder=math/calculus include only math/calculus/foo or also math/calculus/foo/bar?

Sridhar Ratnakumar

So the entire sub-tree of tags

felko

actually both tagFrom and tagUnder can be reduced to globs

Sridhar Ratnakumar

What does tagUnder=math/calculus map to in terms of glob or the Query type?

felko

tagFrom=math/calculus --> tagGlob=math/calculus/** (** can match nothing)
tagUnder=math/calculus --> tagGlob=math/calculus/**/*

Sridhar Ratnakumar

What's the meaning of "*"? What does "**" mean?

Sridhar Ratnakumar

I'm particularly confused by "/**/*"

Sridhar Ratnakumar

I thought "**" generally meant a recursive glob. So /foo/** should match both /foo/bar/ and /foo/bar/baz (but not /foo because there is a slash before the *)

Sridhar Ratnakumar

It looks to me like the main feature you need right now is: foo/** (which could be aliased to something like tagUnder=foo)

felko

* will match a single tag component (like "math" or "calculus", but not "math/calculus")
** will non-eagerly match as few tag components as possible
I could make ** match at least one component instead, which I think is less confusing. But matching nothing is how Python's recursive globs work, and it allows tagFrom to be interpreted as a glob

Sridhar Ratnakumar

In Haskell, it is a bit different:

* matches part of a path component, excluding any separators.
** as a path component matches an arbitrary number of path components.

https://hackage.haskell.org/package/filepattern

Sridhar Ratnakumar
**/*.c matches all .c files anywhere on the filesystem, so file.c, dir/file.c,
dir1/dir2/file.c and /path/to/file.c all match, but file.h and dir/file.h don't.
Sridhar Ratnakumar

So: math/*/** is what you would do if you want to match math/calculus and math/calculus/sub.

Sridhar Ratnakumar

If math should be matched as well in addition, one would do: math/**.

Sridhar Ratnakumar

To me, that looks like a clearer interface.

Sridhar Ratnakumar

And it allows a third possibility: if you want to match only the immediate sub tags: math/*.

Sridhar Ratnakumar

Also, we could re-use the tag key for this. The parser would construct the appropriate type.

data Query = ByTag TagSpec
data TagSpec
  = LiteralMatch Text
  | TreeMatch SomeTypeForGlob
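
To make the idea concrete, here is a minimal sketch of a parser that picks the constructor from the value of the tag key. It is an illustration only, not code from neuron: it uses String instead of Text to stay within base, TagPattern is a hypothetical stand-in for SomeTypeForGlob, and parseTagQuery is an invented name. It assumes that any value containing a star is meant as a pattern.

```haskell
-- Hypothetical stand-in for SomeTypeForGlob: the raw pattern text.
newtype TagPattern = TagPattern String
  deriving (Eq, Show)

data TagSpec
  = LiteralMatch String   -- exact tag, e.g. "journal"
  | TreeMatch TagPattern  -- pattern with wildcards, e.g. "journal/**"
  deriving (Eq, Show)

data Query = ByTag TagSpec
  deriving (Eq, Show)

-- | Build a query from the value of the tag key (invented name).
-- Assumption: a value is a pattern exactly when it contains a '*'.
parseTagQuery :: String -> Query
parseTagQuery s
  | '*' `elem` s = ByTag (TreeMatch (TagPattern s))
  | otherwise    = ByTag (LiteralMatch s)
```

With this sketch, tag=journal keeps producing a literal match, while tag=journal/* and tag=journal/** produce a TreeMatch.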
Sridhar Ratnakumar

So tag=journal will continue to work as before; but now users can do one of the following as well:

  • tag=journal/*: include journal/<whatever> (only one level)
  • tag=journal/*/**: recursively include all under journal/
  • tag=journal/**: recursively include all under journal/ including journal itself
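
The three cases above can be sketched with a small backtracking matcher over tag components. This is a hedged illustration, not the matcher from the PR and not the filepattern library: it assumes that * matches exactly one whole component (partial matches like a* are not handled) and that ** matches zero or more components. The names splitOnSlash, matchComponents and tagMatches are invented for this sketch.

```haskell
-- | Split a tag or pattern such as "math/calculus" into its components.
splitOnSlash :: String -> [String]
splitOnSlash s = case break (== '/') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitOnSlash rest

-- | Match pattern components against tag components, with backtracking.
matchComponents :: [String] -> [String] -> Bool
matchComponents [] [] = True
matchComponents ("**" : ps) ts =
  -- "**" either consumes nothing, or consumes one component and retries
  matchComponents ps ts || case ts of
    []        -> False
    (_ : ts') -> matchComponents ("**" : ps) ts'
-- "*" consumes exactly one component, whatever it is
matchComponents ("*" : ps) (_ : ts) = matchComponents ps ts
-- any other pattern component must equal the tag component
matchComponents (p : ps) (t : ts) = p == t && matchComponents ps ts
matchComponents _ _ = False

-- | Does the pattern (first argument) match the tag (second argument)?
tagMatches :: String -> String -> Bool
tagMatches pat tag = matchComponents (splitOnSlash pat) (splitOnSlash tag)
```

Under these assumptions, journal/* matches journal/work but not journal or journal/work/2020; journal/*/** matches journal/work and journal/work/2020 but not journal; and journal/** matches all three.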
Sridhar Ratnakumar

This way we use an existing semantics, instead of creating a new one (that users need to learn)

felko

just to give some context: in most clusters I don't need a complicated set of tags, so tagUnder/tagFrom are enough
but in some cases, it's preferable to have something more granular by separating e.g. definitions from theorems or examples.
What recursive globs would allow is to query all definitions at once, by doing **/definition, and if I just want definitions in calculus, I could do math/calculus/**/definition (here an empty match on ** would be useful because some definitions are nested, and some aren't)

Sridhar Ratnakumar

Sure. What do you think of the proposal above?

Sridhar Ratnakumar

In the proposal, math/calculus/**/definition would be automatically supported if we use System.FilePattern.match

felko

I like it. I already implemented a (backtracking) globbing algorithm that matches this specification (not that complicated, it's about 10 LOC). Would you still prefer that I use System.FilePattern.match?

Sridhar Ratnakumar

I have no strong preference. One thing we will need is a good number of cases to check against in the unit test.

felko

I already wrote a bunch (for both parsing and globbing), I can write more

Sridhar Ratnakumar

You can put them in this module I added yesterday: https://github.com/srid/neuron/blob/master/test/Neuron/Zettelkasten/QuerySpec.hs

Haskell meets Zettelkasten, for your plain-text delight. - srid/neuron
felko

Here's what I wrote so far, if you have any suggestion: https://github.com/felko/neuron/blob/hierarchical-tags/test/Neuron/Zettelkasten/TagSpec.hs
(or comments about the interface)
I'll need to change the it "fails to match end of tag" test

Sridhar Ratnakumar
newtype Tag = Tag
  {tagComponents :: NonEmpty Text}

Is this to represent that exact tag (without patterns)?

felko

yes, I initially went for a messy GADT to store exact tags and tag patterns in the same type but it was a bit overkill

Sridhar Ratnakumar
      it "cannot contain stars" $ do
        shouldNotParse $ Z.parseTag "algorithms/a*"

Maybe not necessary to do in your PR, but we could eventually accept this pattern as well (the filepattern library will make this possible). Telling the user that the "*" semantics is the same as the file system matching used by the filepattern library should be easier than giving a custom explanation (with a list of restrictions like this).

felko

if we didn't take into account the custom explanation, I think it would be better to restrict this because it would allow pretty weird globs (unix globbing also supports the ? wildcard which matches a single char)

Sridhar Ratnakumar

About the context "tag pattern", I think it would be simpler if you wrote the cases like this:

let cases =
    [ ("course/*/note", ["course/foo/note", "course/bar/note"], ["course/foo/bar/note"] )
    ]

i.e., a list of (search text, shouldMatch list, shouldNotMatch list)

felko

good idea, thanks

Sridhar Ratnakumar

Actually, add one more item to that tuple to indicate the reason, which you can put in the it description. Kind of like how it's done here: https://github.com/felko/neuron/blob/hierarchical-tags/test/Neuron/Zettelkasten/Link/ActionSpec.hs#L26-L66
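
As a rough illustration of the table-driven layout discussed above, each case could be a 4-tuple of (search text, shouldMatch list, shouldNotMatch list, reason), checked by one generic function. This is a sketch using only base rather than the project's actual test code; Case, cases and failures are invented names, the reason text is made up, and the matcher is passed in as a parameter because the real one is not shown in this thread.

```haskell
-- (search text, tags that should match, tags that should not match, reason)
type Case = (String, [String], [String], String)

cases :: [Case]
cases =
  [ ( "course/*/note"
    , ["course/foo/note", "course/bar/note"]
    , ["course/foo/bar/note"]
    , "* matches exactly one component"  -- illustrative reason text
    )
  ]

-- | Run every case against a matcher (pattern first, tag second) and
-- collect one message per broken expectation; an empty list means success.
failures :: (String -> String -> Bool) -> [Case] -> [String]
failures matches cs =
  [ reason ++ ": expected " ++ show tag ++ " to match " ++ show pat
  | (pat, yes, _, reason) <- cs, tag <- yes, not (matches pat tag) ]
  ++
  [ reason ++ ": expected " ++ show tag ++ " not to match " ++ show pat
  | (pat, _, no, reason) <- cs, tag <- no, matches pat tag ]
```

In a real spec, each tuple would more likely become one test whose description is the reason, so that a failing case is reported by name.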

felko

wait filepattern doesn't seem to handle anything besides * and **, so that might not be a problem

Sridhar Ratnakumar

Yea, give it a try in a timebox maybe. You may be able to simplify a lot of what's in Tag.hs

Sridhar Ratnakumar

Less code is always better :-)

felko

I'm just a bit uncomfortable with the fact that files have a more general format than hierarchical tags, however:

All matching is O(n)

this may convince me to use filepatterns

felko

I meant filepaths (and file patterns), because they can contain more characters, wildcards can be placed anywhere in a component, and they can have file extensions (maybe the extensions don't change anything about the globbing algorithm actually, I'm not sure)

Sridhar Ratnakumar

If we generate an .html page for each tag, wouldn't it basically be like filepaths, i.e., tags/math/calculus.html, tags/math/calculus/foo/definitions.html, tags/math.html, etc.?

Sridhar Ratnakumar

(which by the way is an interesting idea to consider for really large zettelkastens; to generate individual tag index pages).

felko

What would you put inside those tag indices? If they would list zettels with that tag, then I'm not sure I understand the purpose, since the web search can already do that. However, I think it would be interesting to let the user say "redirect to this zettel instead of running the web search when clicking on that specific tag"

Sridhar Ratnakumar

A tag index could be a z-index corresponding to the, hmm, "sub zettelkasten"? With its own tree view, clusters, etc.

Sridhar Ratnakumar

"redirect to this zettel instead of running the web search when clicking on that specific tag"

If you are referring to portal zettels, then a tag z-index should display it prominently in the category tree. So it will be two clicks to get there. Anyway, this is a distant idea.

felko

With its own tree view, clusters, etc.

ok then it's a great idea i think

felko

anyway i'll try to make hierarchical tags first, with filepatterns

felko

Sridhar Ratnakumar said:

Also, we could re-use the tag key for this. The parser would construct the appropriate type.

data Query = ByTag TagSpec
data TagSpec
  = LiteralMatch Text
  | TreeMatch SomeTypeForGlob

Is this really needed? A literal match is also a glob without wildcards

felko

Also, neuron query --list-tags could output the tags as a tree. If someday I want to implement a 'select tag' feature in zettel-mode, then instead of showing all existing tags like math/calculus and math/topology in a flat list, it could group them into pseudo-directories à la counsel-find-file

Sridhar Ratnakumar

By the way, there is also +ivy/projectile-find-file which allows you to fuzzy match anywhere on the path. So you can type "calc" and directly get to "math/calculus" without typing "math" first.