(backend) search user on her email and name

Compute Trigram similarity on user's name, and sum it up
with existing one based on user's email.

This approach is inspired by Contact search feature, which
computes a Trigram similarity score on first name and last
name, to sum up their scores.

With a similarity score influenced by both email and name,
API results would reflect both email and name user's attributes.

As we sum up similarities, I increased the similarity threshold.
Its value is empirical, and was finetuned to avoid breaking
existing tests. Please note, the updated value is closer to the
threshold used to search contacts.

Email or Name can be None. Summing two similarity scores with
one of them None, results in a None total score. To mitigate
this issue, I added a default empty string value, to replace
None values. Thus, the similarity score on this default empty
string value is equal to 0 and not to None anymore.
This commit is contained in:
Lebaud Antoine
2024-03-06 23:32:25 +01:00
committed by aleb_the_flash
parent b2d68df646
commit 7d65de1938
2 changed files with 120 additions and 9 deletions

View File

@@ -2,6 +2,7 @@
from django.contrib.postgres.search import TrigramSimilarity
from django.db.models import Func, Max, OuterRef, Prefetch, Q, Subquery, Value
from django.db.models.functions import Coalesce
from rest_framework import (
decorators,
@@ -18,9 +19,7 @@ from core import models
from . import permissions, serializers
EMAIL_SIMILARITY_THRESHOLD = 0.01
# TrigramSimilarity threshold is lower for searching email than for names,
# to improve matching results
SIMILARITY_THRESHOLD = 0.04
class NestedGenericViewSet(viewsets.GenericViewSet):
@@ -212,13 +211,21 @@ class UserViewSet(
if query := self.request.GET.get("q", ""):
similarity = Max(
TrigramSimilarity(
Func("identities__email", function="unaccent"),
Coalesce(
Func("identities__email", function="unaccent"), Value("")
),
Func(Value(query), function="unaccent"),
)
+ TrigramSimilarity(
Coalesce(
Func("identities__name", function="unaccent"), Value("")
),
Func(Value(query), function="unaccent"),
)
)
queryset = (
queryset.annotate(similarity=similarity)
.filter(similarity__gte=EMAIL_SIMILARITY_THRESHOLD)
.filter(similarity__gte=SIMILARITY_THRESHOLD)
.order_by("-similarity")
)