Hierarchical tags - Neuron

Welcome to the Functional Programming Zulip Chat Archive. You can join the chat here.

2020-04-05 22:32:29

https://github.com/srid/neuron/issues/49
That's a good idea, this can avoid creating too many tags.

Should querying for tag=journal include journal/work zettels as well? Or would this behaviour be specified explicitly by the user somehow?

I would prefer that tag queries only match the exact tag. In my personal zettelkasten I would like to make "portals", kind of like Wikipedia's. The only workaround I found is to create a note tag and then, in the portal, query zquery://search?tag=topic&tag=note. Maybe a new query like zquery://search?category=journal could be useful?

Sridhar Ratnakumar

2020-04-06 03:09:54

I'm not sure I understand your portal example. Could you elaborate with some example?

Sridhar Ratnakumar

2020-04-06 03:10:27

What's "topic" and "note"? What's the relation between them?

Sridhar Ratnakumar

2020-04-06 03:11:43

I would prefer that tag queries only match the exact tag.

Okay, yea. Let's keep it this way. "Less magic" is good. For matching the whole tag tree, perhaps one can use tag=journal&tag=journal/*. EDIT: Actually the query does an AND. So we have to revamp the query format to support OR'ing.
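
To make the AND problem concrete, here is a minimal sketch. It assumes a zettel's tags are a plain list of strings and a query is a list of tag values compared by exact equality. The names matchesAll and matchesAny are hypothetical and are not taken from neuron.

```haskell
-- Hypothetical helpers for illustration; not neuron's actual query code.

-- | Behaviour as described above: every tag in the query must be present
-- on the zettel (AND), compared by exact string equality.
matchesAll :: [String] -> [String] -> Bool
matchesAll queryTags zettelTags = all (`elem` zettelTags) queryTags

-- | What an OR query would need: at least one query tag is present.
matchesAny :: [String] -> [String] -> Bool
matchesAny queryTags zettelTags = any (`elem` zettelTags) queryTags
```

For a zettel tagged only journal/work, matchesAll ["journal", "journal/work"] ["journal/work"] is False, because the zettel does not also carry the journal tag, while matchesAny with the same arguments is True.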

2020-04-06 09:14:30

What's "topic" and "note"? What's the relation between them?

Let's say topic = calculus and I want to make a zettel that shows all topics I have written about related to mathematics (but without linking to the actual notes). Here's how I do this currently:

+ math (contains zquery://search?tag=math&tag=portal)
|---+ calculus (tags: math, calculus, portal, contains zquery://search?tag=calculus&tag=note)
|   |--- note1 (tags: math, calculus, note)
|   |--- note2 (tags: math, calculus, note)
|---+ algebra (tags: math, algebra, portal, contains zquery://search?tag=algebra&tag=note)
|   |--- note1 (tags: math, algebra, note)
|   |--- note2 (tags: math, algebra, note)

I need to make the portal and note tags to avoid the calculus and algebra zettels linking to themselves.
With hierarchical tags I could do tag=math/*/ in the math zettel and tag=math/calculus/* in the calculus zettel.
Also, recursive globs like math/** could be useful.

Sridhar Ratnakumar

2020-04-07 22:06:54

In that scheme, if I want to include everything under one tag, including that tag, I'd do: tag=journal OR tag=journal/**. I wonder if that can be simplified somehow.

Sridhar Ratnakumar

2020-04-07 22:08:03

Maybe tagFrom=journal. And for your case, tagUnder=math/calculus (excludes math/calculus itself). I'm not too happy about this, but it is one idea.

2020-04-16 14:57:08

I said I wasn't going to be able to contribute this week but actually I need hierarchical tags to start taking my course notes so I decided to implement them.
I went for the globbing mechanism but with tagFrom/tagUnder syntactic sugar, are you ok with that?

Sridhar Ratnakumar

2020-04-16 14:59:18

Let me think about it. I'd like to see that simplified, because tagFrom/tagUnder is something no other software uses, so it is a new thing one needs to learn and remember (in fact, I had to re-read my own message to understand what it actually means)

Sridhar Ratnakumar

2020-04-16 15:00:27

@felko Do you need both tagFrom and tagUnder? What's the use case?

Sridhar Ratnakumar

2020-04-16 15:01:41

And what does your tagFrom sugar expand to (in terms of Query value)?

Sridhar Ratnakumar

2020-04-16 15:03:31

Here's an idea. We can eliminate tagUnder, and in its place use tagFrom=journal/ (i.e., with a slash at the end).

Sridhar Ratnakumar

2020-04-16 15:03:59

Need a better name than "tagFrom" still

2020-04-16 15:05:11

I don't need both; tagUnder is enough for me, but most importantly I would really like globs. I can see how tagFrom could be useful, though I don't think it's worth implementing.

And what does your tagFrom sugar expand to (in terms of Query value)?

Indeed tagFrom doesn't reduce to a tag glob, but I can reuse the globbing mechanism to implement isSubTag and isStrictSubTag

Sridhar Ratnakumar

2020-04-16 15:06:18

Does tagUnder=math/calculus include only math/calculus/foo or also math/calculus/foo/bar?

2020-04-16 15:06:41

both

Sridhar Ratnakumar

2020-04-16 15:07:17

So the entire sub-tree of tags

2020-04-16 15:07:21

actually both tagFrom and tagUnder can be reduced to globs

Sridhar Ratnakumar

2020-04-16 15:07:39

What does tagUnder=math/calculus map to in terms of glob or the Query type?

2020-04-16 15:09:11

tagFrom=math/calculus --> tagGlob=math/calculus/** (** can match nothing)
tagUnder=math/calculus --> tagGlob=math/calculus/**/*

Sridhar Ratnakumar

2020-04-16 15:11:02

What's the meaning of "*"? What does "**" mean?

Sridhar Ratnakumar

2020-04-16 15:11:39

I'm particularly confused by "/**/*"

Sridhar Ratnakumar

2020-04-16 15:12:55

I thought "**" generally meant a recursive glob. So /foo/** should match both /foo/bar/ and /foo/bar/baz (but not /foo because there is a slash before the *)

Sridhar Ratnakumar

2020-04-16 15:13:37

It looks to me like the main feature you need right now is: foo/** (which could be aliased to something like tagUnder=foo)

2020-04-16 15:15:53

* will match a single tag component (like "math" or "calculus", but not "math/calculus")
** will non-eagerly match as few tag components as possible
I can make ** match at least one component, though. I think that would be less confusing too, but matching zero components is how Python's recursive globs work, and it allows tagFrom to be interpreted as a glob
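
The semantics described here, together with the tagFrom/tagUnder expansion given earlier, can be sketched as a small backtracking matcher. This is only an illustration under stated assumptions: tags and patterns are plain strings split on '/', wildcards only stand for whole components, and every name below (splitOnSlash, matchComponents, matchGlob, tagFromGlob, tagUnderGlob) is hypothetical rather than the code that was actually written for neuron.

```haskell
-- | Split a tag or pattern on '/' into its components.
splitOnSlash :: String -> [String]
splitOnSlash s = case break (== '/') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitOnSlash rest

-- | Backtracking match of pattern components against tag components.
matchComponents :: [String] -> [String] -> Bool
matchComponents [] [] = True
matchComponents ("**" : ps) cs =
  -- Non-eager: try the empty match first, then let ** absorb one more
  -- component and try again.
  matchComponents ps cs || case cs of
    []       -> False
    _ : rest -> matchComponents ("**" : ps) rest
-- A single * stands for exactly one component.
matchComponents ("*" : ps) (_ : cs) = matchComponents ps cs
-- Any other pattern component must equal the tag component.
matchComponents (p : ps) (c : cs) = p == c && matchComponents ps cs
matchComponents _ _ = False

-- | Match a glob such as "math/**/definition" against a tag.
matchGlob :: String -> String -> Bool
matchGlob pat tag = matchComponents (splitOnSlash pat) (splitOnSlash tag)

-- | tagFrom=t: the tag itself plus its whole subtree (** may match nothing).
tagFromGlob :: String -> String
tagFromGlob t = t ++ "/**"

-- | tagUnder=t: the subtree without t itself; the trailing * forces at
-- least one extra component.
tagUnderGlob :: String -> String
tagUnderGlob t = t ++ "/**/*"
```

With these definitions, matchGlob (tagFromGlob "math/calculus") "math/calculus" is True, while matchGlob (tagUnderGlob "math/calculus") "math/calculus" is False; both match math/calculus/foo and math/calculus/foo/bar.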

Sridhar Ratnakumar

2020-04-16 15:19:57

In Haskell, it is a bit different:

* matches part of a path component, excluding any separators.
** as a path component matches an arbitrary number of path components.

https://hackage.haskell.org/package/filepattern

Sridhar Ratnakumar

2020-04-16 15:20:21

**/*.c matches all .c files anywhere on the filesystem, so file.c, dir/file.c,
dir1/dir2/file.c and /path/to/file.c all match, but file.h and dir/file.h don't.

Sridhar Ratnakumar

2020-04-16 15:21:52

So: math/*/** is what you would do if you want to match math/calculus and math/calculus/sub.

Sridhar Ratnakumar

2020-04-16 15:22:05

If math should be matched as well in addition, one would do: math/**.

Sridhar Ratnakumar

2020-04-16 15:22:26

To me, that looks like a clearer interface.

Sridhar Ratnakumar

2020-04-16 15:23:00

And it allows a third possibility: if you want to match only the immediate sub tags: math/*.

Sridhar Ratnakumar

2020-04-16 15:24:40

Also, we could re-use the tag key for this. The parser would construct the appropriate type.

data Query = ByTag TagSpec
data TagSpec
  = LiteralMatch Text
  | TreeMatch SomeTypeForGlob
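
One way the parser could pick the constructor is sketched below. It is an assumption-laden illustration: SomeTypeForGlob is left undefined above, so a hypothetical TagGlob newtype (a list of '/'-separated components) stands in for it, String replaces Text to keep the example dependency-free, and parseTagSpec and parseQueryValue are invented names.

```haskell
-- | Hypothetical stand-in for SomeTypeForGlob: the pattern kept as its
-- '/'-separated components.
newtype TagGlob = TagGlob [String]
  deriving (Eq, Show)

data TagSpec
  = LiteralMatch String
  | TreeMatch TagGlob
  deriving (Eq, Show)

data Query = ByTag TagSpec
  deriving (Eq, Show)

-- | Decide from the value of the tag key which kind of match is meant:
-- anything containing a star is treated as a tree pattern.
parseTagSpec :: String -> TagSpec
parseTagSpec s
  | '*' `elem` s = TreeMatch (TagGlob (components s))
  | otherwise    = LiteralMatch s
  where
    components str = case break (== '/') str of
      (c, [])       -> [c]
      (c, _ : rest) -> c : components rest

-- | Wrap the parsed value of a tag=... parameter in a Query.
parseQueryValue :: String -> Query
parseQueryValue = ByTag . parseTagSpec
```

Under this sketch tag=journal still yields a LiteralMatch, so existing queries keep their meaning, while tag=journal/** yields a TreeMatch.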

Sridhar Ratnakumar

2020-04-16 15:25:32

So tag=journal will continue to work as before; but now users can do one of the following as well:

tag=journal/*: include journal/<whatever> (only one level)
tag=journal/*/**: recursively include all under journal/
tag=journal/**: recursively include all under journal/ including journal itself

Sridhar Ratnakumar

2020-04-16 15:28:24

And we may be able to re-use this function to implement the actual matching: https://hackage.haskell.org/package/filepattern-0.1.2/docs/System-FilePattern.html#v:match

Sridhar Ratnakumar

2020-04-16 15:28:40

This way we use an existing semantics, instead of creating a new one (that users need to learn)

2020-04-16 15:29:18

just to give some context: in most clusters I don't need a complicated set of tags, so tagUnder/tagFrom are enough
but in some cases, it's preferable to have something more granular by separating e.g. definitions from theorems or examples.
What recursive globs would allow is to query all definitions at once, by doing **/definition, and if I just want definitions in calculus, I could do math/calculus/**/definition (here empty match on ** would be useful because some definitions are nested, and some aren't)

Sridhar Ratnakumar

2020-04-16 15:30:19

Sure. What do you think of the proposal above?

Sridhar Ratnakumar

2020-04-16 15:31:17

In the proposal, math/calculus/**/definition would be automatically supported if we use System.FilePattern.match

2020-04-16 15:34:07

I like it. I already implemented a (backtracking) globbing algorithm that matches this specification (not that complicated, it's about 10 LOC). Would you still prefer that I use System.FilePattern.match?

Sridhar Ratnakumar

2020-04-16 15:34:44

I have no strong preference. One thing we will need is a good number of cases to check against in the unit test.

2020-04-16 15:35:17

I already wrote a bunch (for both parsing and globbing), I can write more

Sridhar Ratnakumar

2020-04-16 15:35:26

You can put them in this module I added yesterday: https://github.com/srid/neuron/blob/master/test/Neuron/Zettelkasten/QuerySpec.hs

Haskell meets Zettelkasten, for your plain-text delight. - srid/neuron

2020-04-16 15:39:09

Here's what I wrote so far, if you have any suggestion: https://github.com/felko/neuron/blob/hierarchical-tags/test/Neuron/Zettelkasten/TagSpec.hs
(or comments about the interface)
I'll need to change the it "fails to match end of tag" test

Haskell meets Zettelkasten, for your plain-text delight. - felko/neuron

Sridhar Ratnakumar

2020-04-16 15:41:24

newtype Tag = Tag
  {tagComponents :: NonEmpty Text}

Is this to represent that exact tag (without patterns)?

2020-04-16 15:42:08

yes, I initially went for a messy GADT to store exact tags and tag patterns in the same type but it was a bit overkill

Sridhar Ratnakumar

2020-04-16 15:43:58

      it "cannot contain stars" $ do
        shouldNotParse $ Z.parseTag "algorithms/a*"

Maybe this doesn't need to be done in your PR, but we could eventually accept this pattern as well (the filepattern library will make this possible). Telling the user that the "*" semantics are the same as the file system matching used by the filepattern library should be easier than giving a custom explanation (with a list of restrictions like this).
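
For reference, the exact-tag type and the "cannot contain stars" restriction discussed here could look roughly like the sketch below. It assumes String instead of Text and an Either-returning parseTag; the real Z.parseTag in the linked branch may have a different signature and different error handling.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- | An exact tag (no patterns): at least one component.
newtype Tag = Tag
  {tagComponents :: NonEmpty String}
  deriving (Eq, Show)

-- | Split on '/'; the result always has at least one (possibly empty) part.
splitOnSlash :: String -> NonEmpty String
splitOnSlash s = case break (== '/') s of
  (c, [])       -> c :| []
  (c, _ : rest) -> NE.cons c (splitOnSlash rest)

-- | Hypothetical parser: rejects empty components and any star.
parseTag :: String -> Either String Tag
parseTag s
  | any null parts         = Left "empty tag component"
  | any ('*' `elem`) parts = Left "exact tags cannot contain stars"
  | otherwise              = Right (Tag parts)
  where
    parts = splitOnSlash s
```

Here parseTag "algorithms/a*" is a Left, matching the test case above, and parseTag "math/calculus" is Right (Tag ("math" :| ["calculus"])).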

2020-04-16 15:47:15

Setting the custom explanation aside, I think it would be better to restrict this, because it would otherwise allow pretty weird globs (Unix globbing also supports the ? wildcard, which matches a single char)

Sridhar Ratnakumar

2020-04-16 15:47:19

About the context "tag pattern", I think it would be simpler if you wrote the cases like this:

let cases =
    [ ("course/*/note", ["course/foo/note", "course/bar/note"], ["course/foo/bar/note"] )
    ]

i.e., a list of (search text, shouldMatch list, shouldNotMatch list)
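
A table like this can be checked with a small generic runner. The sketch below is only an illustration: it takes the matcher as a parameter, so it makes no claim about which globbing implementation is used, and the names Case, caseFailures and exampleCases are hypothetical. In the real test suite the same table would presumably be fed to hspec instead.

```haskell
-- | (search text, tags that should match, tags that should not match)
type Case = (String, [String], [String])

-- | Describe every expectation in the table that the given matcher
-- violates; an empty result means all cases pass.
caseFailures :: (String -> String -> Bool) -> [Case] -> [String]
caseFailures matches table =
  concat
    [ [pat ++ " should match " ++ t | t <- yes, not (matches pat t)]
        ++ [pat ++ " should not match " ++ t | t <- no, matches pat t]
    | (pat, yes, no) <- table
    ]

-- | The table from the message above.
exampleCases :: [Case]
exampleCases =
  [ ("course/*/note", ["course/foo/note", "course/bar/note"], ["course/foo/bar/note"])
  ]
```

Calling caseFailures with a candidate matcher and exampleCases returns the list of violated expectations, which keeps each new pattern down to a single line in the table.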

2020-04-16 15:48:26

good idea, thanks

Sridhar Ratnakumar

2020-04-16 15:49:28

Actually, add one more item to that tuple to indicate the reason, which you can put in the it description, kind of like how it is done here: https://github.com/felko/neuron/blob/hierarchical-tags/test/Neuron/Zettelkasten/Link/ActionSpec.hs#L26-L66

2020-04-16 15:49:54

wait filepattern doesn't seem to handle anything besides * and **, so that might not be a problem

Sridhar Ratnakumar

2020-04-16 15:50:30

Yea, give it a try in a timebox maybe. You may be able to simplify a lot of what's in Tag.hs

Sridhar Ratnakumar

2020-04-16 15:50:59

Less code is always better :-)

2020-04-16 15:56:56

I'm just a bit uncomfortable with the fact that files have a more general format than hierarchical tags, however:

All matching is O(n)

this may convince me to use filepatterns

Sridhar Ratnakumar

2020-04-16 15:57:31

More general format?

2020-04-16 16:00:14

I meant filepaths (and file patterns), because they can contain more characters, wildcards can be placed anywhere in a component, and they can have file extensions (maybe the extensions don't change anything about the globbing algorithm actually, I'm not sure)

Sridhar Ratnakumar

2020-04-16 16:03:03

If we generate a .html page for each tag, wouldn't it basically be like filepaths, i.e., tags/math/calculus.html, tags/math/calculus/foo/definitions.html, tags/math.html, etc.?

Sridhar Ratnakumar

2020-04-16 16:04:26

(which by the way is an interesting idea to consider for really large zettelkastens; to generate individual tag index pages).

2020-04-16 16:11:29

what would you put inside those tag indices? If they would list zettels with that tag then I'm not sure I understand the purpose, since the web search can already do that. However I think it would be interesting to let the user say "redirect to this zettel instead of running the web search when clicking on that specific tag"

Sridhar Ratnakumar

2020-04-16 16:19:41

A tag index could be a z-index corresponding to the, hmm, "sub zettelkasten"? With its own tree view, clusters, etc.

Sridhar Ratnakumar

2020-04-16 16:21:07

"redirect to this zettel instead of running the web search when clicking on that specific tag"

If you are referring to portal zettels, then a tag z-index should display it prominently in the category tree. So it will be two clicks to get there. Anyway, this is a distant idea.

2020-04-16 16:22:09

With its own tree view, clusters, etc.

ok then it's a great idea i think

2020-04-16 16:23:19

anyway i'll try to make hierarchical tags first, with filepatterns

2020-04-17 14:33:50

Sridhar Ratnakumar said:

Also, we could re-use the tag key for this. The parser would construct the appropriate type.
data Query = ByTag TagSpec
data TagSpec
  = LiteralMatch Text
  | TreeMatch SomeTypeForGlob

Is this really needed? A literal match is also a glob without wildcards

Sridhar Ratnakumar

2020-04-17 14:58:42

Good point

2020-04-17 15:01:01

Also, neuron query --list-tags could output the tags as a tree. If someday I want to implement a 'select tag' feature in zettel-mode, then instead of showing all existing tags flat, like math/calculus and math/topology, it could group them into pseudo-directories a la counsel-find-file
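
Grouping tags into such pseudo-directories could be sketched as follows. This is a hypothetical illustration, not neuron's output format: tags are assumed to be already split into components, and TagTree, tagForest and renderForest are invented names.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- | A tag component together with the components nested below it.
data TagTree = TagTree String [TagTree]
  deriving (Eq, Show)

-- | Group tags by their first component, recursively. Note that this
-- sketch does not record whether an inner node is itself a used tag.
tagForest :: [[String]] -> [TagTree]
tagForest tags =
  [ TagTree name (tagForest (map snd grp))
  | grp@((name, _) : _) <- groupBy ((==) `on` fst) (sortOn fst pairs)
  ]
  where
    -- Pair each non-empty tag with the rest of its path; empty paths
    -- (tags that end at the current level) are dropped here.
    pairs = [(c, rest) | c : rest <- tags]

-- | Render the forest as indented lines, two spaces per level.
renderForest :: [TagTree] -> [String]
renderForest = concatMap go
  where
    go (TagTree name kids) = name : map ("  " ++) (renderForest kids)
```

For the tags math/calculus, math/topology and journal, renderForest produces the lines journal, math, and then calculus and topology indented under math.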

Sridhar Ratnakumar

2020-04-17 16:05:57

By the way, there is also +ivy/projectile-find-file which allows you to fuzzy match anywhere on the path. So you can type "calc" and directly get to "math/calculus" without typing "math" first.