Proposal: Card browser search: Change default search behaviour with respect to accents (nc: syntax as default)

Thorsten

25 May, 2020 09:22 PM

As I noticed recently, beginning with Anki 2.1.24 the card browser supports a new search notation that ignores diacritic signs on letters (like é, ü, ç, etc.). Therefore the plugin 'ignore accents in browser search' is now obsolete.

I appreciate this new functionality very much. However, I'd like to propose a slightly different search syntax for words with diacritic letters, i.e. the reverse behaviour, because the new search is much less convenient than the old plugin:

  • the default search (without any additional syntax) should ignore diacritic signs
  • if somebody needs to search for the exact spelling, including diacritic signs, a special syntax (or better: a checkbox, as in the plugin) should be available (as is currently the case with 'nc:...')

The reason is this: someone who is learning a new language does not necessarily know the correct spelling when searching for words in the browser. For example, if I search for the word 'tree' in Portuguese, imagine what I would try first:
- arvore or
- árvore?

Of course I would write arvore first, because it is easier (faster) to type and because I always forget the accent on the first letter. I would expect the browser to find the card with the correct spelling despite my (wrong) spelling. But at the moment the browser won't find anything. I am forced to write nc:arvore to get a result.

IMO this makes no sense. In practice it means I would have to put nc: in front of every search to be sure of getting results. I am sure that nobody wants behaviour like that, and that other people have similar difficulties with accents. Therefore I request that you change the default search behaviour as described above. (Of course, I always had the checkbox of the former plugin 'ignore accents in browser search' checked.)

The few people who need 'stricter' search criteria could use a special (reverse) search syntax, or - much easier - a simple checkbox like in the former plugin 'ignore accents in browser search'. This would give the best of both worlds and would satisfy most people.

BTW, the very few people who need to combine the 'wider' search and the 'narrow' search in one query could still use the reverse search syntax.

  1. Support Staff 1 Posted by Damien Elmes on 26 May, 2020 09:15 AM

    Hi Thorsten,

    While faster than the old add-on, it unfortunately still slows searches down on larger collections. Many users don't use accents, so I'm afraid I'm not sure it makes sense to make things slower for everyone for a feature that only some users need. In the future, when the text can be better indexed and it no longer has an extra cost, I agree it would be the more sensible default. And in the meantime, it should be fairly simple for someone to write an add-on that automatically adds "nc:" to words, as there's a hook in the code intended for such things.
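
A minimal sketch of the query rewriting such an add-on might perform. Everything here is an assumption for illustration: add_nc_prefix is a hypothetical helper, the thread does not name the hook, and the wiring into Anki is therefore left out. Only the text transformation is shown, and it deliberately leaves anything that looks like search syntax untouched.

```python
import re

# Bare words only: letters, digits and underscores, nothing that looks like syntax.
PLAIN_WORD = re.compile(r"\w+")

# Boolean keywords that must stay as they are.
KEYWORDS = {"and", "or"}


def add_nc_prefix(query: str) -> str:
    """Prefix every bare word of a search string with 'nc:'.

    Hypothetical helper. Tokens that already carry a prefix (deck:, tag:, nc:),
    negations, parentheses and quoted phrases are passed through unchanged.
    Runs of whitespace are collapsed to single spaces.
    """
    result = []
    in_quotes = False
    for token in query.split():
        quote_count = token.count('"')
        is_bare_word = (
            not in_quotes
            and quote_count == 0
            and PLAIN_WORD.fullmatch(token) is not None
            and token.lower() not in KEYWORDS
        )
        result.append("nc:" + token if is_bare_word else token)
        # An odd number of quotes opens or closes a quoted phrase.
        if quote_count % 2 == 1:
            in_quotes = not in_quotes
    return " ".join(result)
```

With this sketch, add_nc_prefix("arvore") gives "nc:arvore", while a token such as deck:Portuguese or a quoted phrase is passed through unchanged. The cost Damien mentions still applies: every rewritten word becomes a slower nc: search.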

  2. 2 Posted by Thorsten on 26 May, 2020 03:38 PM

    Hi Damien,

    I see. I had not considered that search time is noticeable to (and therefore important for) some people - but OK, I only have 7,000 different cards and not 100,000! Personally, I do not notice any difference between the two search methods on my 8-year-old Windows 7 notebook.

    It would be interesting to know how many people have collections so big that they avoid searching with nc: because it takes too long to wait for the result. But experiments like this also depend heavily on the hardware used (SSD, CPU).

    Do you have a big test collection that I could import to see whether I notice any difference between the two search methods? It would be interesting to test it on different hardware (old notebook, new PC).

  3. 3 Posted by Aleksej on 27 May, 2020 02:48 AM

    I have over 80000 cards (not publishing them), many of which are huge articles or even books for incremental reading. Very few cards with optional accent marks, and many cards with German phrases/words (Damien, how important are combining characters in Japanese?). I intend to write an addon to show forecasts for each card in the Browser; the version for Anki 2.0 has always been slow.

  4. Support Staff 4 Posted by Damien Elmes on 27 May, 2020 11:03 AM

    Thorsten: I've made a suggestion about the add-on on https://github.com/ankitects/help-wanted/issues/8

    Aleksej: I think it would be rare to want to ignore combining characters when searching for Japanese

  5. 5 Posted by Thorsten on 28 May, 2020 09:08 AM

    @Aleksej: Just curious - why are you storing books in Anki and not in an application made for that purpose, e.g. Google Books, Kindle or any other book reader that makes it comfortable to read page by page? (BTW, is this exactly what you mean by 'incremental reading', i.e. normal reading?) I imagine that an application like Anki is not designed for storing whole books or huge articles because of its database design. What is the advantage over other programs when it comes to large texts?

    Or are you trying to memorize whole books and huge articles, the way other people memorize foreign vocabulary?

    IMO even Evernote is better suited to storing large texts than a 'card' in Anki.

  6. 6 Posted by Dominik on 05 Jun, 2020 04:33 PM

    I'm here because I have the same request as Thorsten. In my Latin deck I often use diacritic signs, and I always had the old "ignore accents" add-on activated. I hope we can get this function back very soon.

  7. 7 Posted by Thorsten on 05 Jun, 2020 06:37 PM

    Hello Damien,

    you wrote: "Many users don't use accents"

    How do you know? Do you have statistics? All Romance languages use accents (Portuguese, Spanish (!), French, Italian). German has its famous umlauts (ö, ä, ü). Many, many people are learning exactly those languages, especially Spanish.

    Not everybody who uses Anki is learning Asian languages like Chinese or Japanese, which have no accents. Or do you think that exactly those people form the majority of Anki users, and that an 'ignore accents' search is therefore not important for this majority?

    Personally I think, without having statistics to prove it: many Anki users use letters with diacritic signs / accents. And of course, nearly all of them want searching to be as easy as possible, i.e. without typing diacritic signs. Example: c instead of ç.

    BTW, what about Danish (ø, å) or Swedish (å)?
    BTW2, I think it's not necessary to handle special cases like ss=ß. It would be nice, but it is not necessary.

  8. 8 Posted by addons_zz on 05 Jun, 2020 11:15 PM

    "How do you know? Do you have statistics? All Romance languages use accents"

    He meant that many users search for cards without using accents, not that Anki users do not use accents at all.

    "Not everybody who uses Anki is learning Asian languages like Chinese or Japanese, which have no accents"

    Languages like Chinese or Japanese also have accents.

    "Personally I think, without having statistics to prove it: many Anki users use letters with diacritic signs / accents. And of course, nearly all of them want searching to be as easy as possible, i.e. without typing diacritic signs. Example: c instead of ç."

    You can search ignoring accents just by prefixing your search with nc: https://docs.ankiweb.net/#/searching?id=ignoring-accentscombining-c...

    You can use nc: to remove combining characters ("no combining"). For example:

    nc:uber
    matches notes with "uber", "über", "Über" and so on.

    nc:は
    matches "は", "ば", and "ぱ"

    Searches that ignore combining characters are slower than regular searches.
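
A rough way to picture what "removing combining characters" means, sketched with Python's standard unicodedata module. This is an illustration of the Unicode mechanism only, under the assumption that the search does something comparable; it is not Anki's actual code, and strip_combining is a hypothetical name.

```python
import unicodedata


def strip_combining(text: str) -> str:
    """Return text with combining marks removed.

    The string is first decomposed (NFD), so that a letter such as 'ü' becomes
    'u' followed by a separate combining diaeresis. Every character with a
    non-zero canonical combining class is then dropped, and the remainder is
    recomposed (NFC).
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


def matches_ignoring_combining(needle: str, haystack: str) -> bool:
    """Case-insensitive substring test after stripping combining marks."""
    return strip_combining(needle).casefold() in strip_combining(haystack).casefold()
```

Under this scheme "über" and "Über" both reduce to a form that matches "uber", and "ば" and "ぱ" reduce to "は". Letters that Unicode does not decompose, such as "ø" or "ß", are left as they are, so a plain "o" would not match "ø" with this approach.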

  9. 9 Posted by Dominik on 06 Jun, 2020 09:23 AM

    The nc: method isn't practical. You can't type nc: every single time you search for something. We need this implemented as the standard search.
    We, the users who need it, notice that 2.1.22 made things much more complicated, but those who don't need it won't notice if nc: is always activated.
    So it's no disadvantage for anyone, but a huge advantage for the many people who work with accents.

  10. 10 Posted by Thorsten on 06 Jun, 2020 09:48 AM

    "You can search ignoring accents just by prefixing your search with nc:"

    Thank you very much for your hint. I think I described this syntax already in my first posting.

    This syntax is inconvenient and not intuitive. It is not what a typical user expects.

    Imagine you had to type nc: every time you used Google. Do you see what I mean? This would be very inconvenient for somebody living in a country whose language uses many diacritic signs, wouldn't it? If you use google.es, it's enough to type 'senor' to find every 'señor'. Nobody wants to type nc:senor every time.

    Instead, if you need a more precise search, it should be possible to restrict the search result with filters. E.g., if I need to search for accents etc., I could use a search term like rd: (rd = respect diacritic signs). Or better: just a checkbox like in the old (and now deprecated) plugin.

    BTW, does the database Anki uses not support functions for ignoring accents in a search? Is this the underlying problem?
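
To make the question concrete, here is a small sketch using Python's built-in sqlite3 module. The table cards(id, text) and the function strip_marks are made up for illustration and are not Anki's schema or code. A database can be given a custom accent-stripping function, but the function then has to run on every row, which is one plausible reason such a search costs more than a plain one.

```python
import sqlite3
import unicodedata


def strip_marks(text):
    """Drop combining marks after canonical decomposition (hypothetical helper)."""
    if text is None:
        return None
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_ignoring_accents(conn, term):
    """Return all card texts containing term, ignoring combining marks."""
    pattern = "%" + strip_marks(term) + "%"
    rows = conn.execute(
        # strip_marks() is evaluated for each row; no ordinary index can help here.
        "SELECT text FROM cards WHERE strip_marks(text) LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]


# Toy in-memory database standing in for a collection.
conn = sqlite3.connect(":memory:")
conn.create_function("strip_marks", 1, strip_marks)
conn.execute("CREATE TABLE cards (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO cards (text) VALUES (?)",
    [("árvore",), ("senor",), ("señor",), ("casa",)],
)
```

In this toy setup, searching for "arvore" finds the card "árvore", and searching for "senor" finds both "senor" and "señor".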

  11. 11 Posted by Thorsten on 06 Jun, 2020 09:58 AM

    "We, the users who need it, notice that 2.1.22 made things much more complicated, but those who don't need it won't notice if nc: is always activated."

    Of course one would notice, because users would get a longer result list. But in that case a quick checkbox filter could easily be offered for an 'exact search', which would either search for letters with diacritic signs directly or just narrow down the existing result, which would be faster.

  12. 12 Posted by Aleksej on 12 Jun, 2020 01:35 AM

    @Thorsten, Anki 2.1 seems to work well with books (unlike 2.0), and to be slow for some other reason. The books are either textbooks, or are supposed to affect MorphMan.

  13. 13 Posted by Thorsten on 13 Jun, 2020 12:21 PM

    @Aleksej: I don't know MorphMan yet. I just read in the description that it has to do with learning an unknown word within a sentence. Sounds interesting. Could you please describe in a little more detail what you are doing with MorphMan in combination with textbooks? Are you reading texts in Anki so that you can feed sentences from those texts into MorphMan?

  14. 14 Posted by Aleksej on 13 Jun, 2020 09:32 PM

    @Thorsten: Since I have a huge backlog, I haven't yet benefited much from MorphMan.

    MorphMan sorts not just by the unknown word, but also by other criteria. See the "point system" in https://massimmersionapproach.com/table-of-contents/anki/morphman/#...

    To make it more useful for me, I changed the "Language with spaces" morphemizer locally to return also all pairs of words.

    As to why have books in Anki, see Incremental reading: https://ankiweb.net/shared/info/935264945

    I also made MorphMan use different maximum length settings for incremental reading cards.
