MSC4258: Federated User Directory #4258

150 changes: 150 additions & 0 deletions proposals/4258-federated-user-directory.md
Review comment (Member):

Implementation requirements:

  • Client
  • Server

@@ -0,0 +1,150 @@
# MSC4258: Federated User Directory

Currently, user search can only be performed locally, which at best returns the users already known to the local server.

This proposal introduces a federation endpoint that lets a server broadcast a search to the servers it currently federates with and collect results from all the users those servers know.

It also proposes changes to the client API to accommodate results arriving asynchronously and to let users manage their visibility in search results.


## Proposal

### Federation endpoint

We first propose a new federation endpoint similar to the [current client API](https://spec.matrix.org/v1.12/client-server-api#post_matrixclientv3user_directorysearch).
It would be authenticated and rate limited.

#### `POST /_matrix/federation/v3/user_directory/search`

Review comment (Member):

What are the valid error conditions?

Review comment (Contributor):

Tagging on to this, the profile federation API has a 403 to let server admins deny profile look-up. This might be good to have on the user directory API as well.


#### Request
```json
{
  "limit": 10,
  "search_term": "foo"
}
```

Review comment (Member), on `search_term`:

Is there guidance on how the search term is used? Is it the same as the current API?

#### Response
```json
{
  "limited": false,
  "results": [
    {
      "avatar_url": "mxc://bar.com/foo",
      "display_name": "Foo",
      "m.user_directory.visibility": "local",
      "user_id": "@foo:bar.com"
    }
  ]
}
```

All profile fields (cf [MSC4133](https://github.com/matrix-org/matrix-spec-proposals/pull/4133)) should be returned here.

When a user calls the client user search API, the server should send a federated user search request to all known servers. It then receives the results and returns them to the user.

Review comment (Member):

This sounds really really expensive for the server.

Review comment (Contributor):

This could benefit from #4259 🙂

Servers must not forward this request to other servers and must only return results known locally. This avoids infinite loops between servers that know each other.
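
As an illustration of the fan-out described above, here is a minimal Python sketch of what the originating server might do. It is a simplification under stated assumptions: `search_remote` is a hypothetical transport callable (not part of any existing library), the timeout value is arbitrary, and results are collected in one pass rather than streamed as the client API changes below propose.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical transport: takes (server_name, request_body) and returns the
# parsed JSON response of the remote federation search endpoint.
SearchFn = Callable[[str, dict], Awaitable[dict]]


async def fan_out_search(
    servers: list[str],
    body: dict,
    search_remote: SearchFn,
    timeout: float = 5.0,  # assumed value, not defined by the proposal
) -> list[dict]:
    """Query every known server once and merge whatever comes back in time."""

    async def query(server: str) -> list[dict]:
        try:
            response = await asyncio.wait_for(search_remote(server, body), timeout)
        except Exception:
            # Slow, unreachable or failing servers simply contribute nothing.
            return []
        return response.get("results", [])

    batches = await asyncio.gather(*(query(s) for s in servers))

    merged: dict[str, dict] = {}
    for batch in batches:
        for user in batch:
            # De-duplicate on user_id; the first answer wins.
            merged.setdefault(user["user_id"], user)
    return list(merged.values())[: body.get("limit", 10)]
```

The receiving side never calls anything like `fan_out_search` while handling a federation request, which is what prevents loops.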

However, this raises a problem: we cannot know when, or even whether, remote servers will answer the search request.

We therefore propose changes to the client API so that the server can stream new results to the client.

Note that `m.user_directory.visibility` is defined further down this proposal.

### Client endpoint changes

We propose to introduce a reactive mechanism to allow the server to stream new results to the client.

#### `POST /_matrix/client/v3/user_directory/search`

Review comment (Member):

Clients cannot control if the request is only local or not then?


#### Request
```json
{
  "limit": 10,
  "search_term": "foo",
  "search_token": "a1d29g4f73"
}
```

To that end, we add a `search_token` field to the request. Its value is the `search_token` returned in the response to a previous search. A request containing a `search_token` blocks until new results are available to the server. If more results are still expected, the response may include another `search_token`, and so on.

`search_token` is optional in the request, so the proposed changes are backward compatible.

#### Response
```json
{
  "search_token": "a1d29g4f73",
  "limited": true,
  "results": [
    {
      "avatar_url": "mxc://bar.com/foo",
      "display_name": "Foo",
      "m.user_directory.visibility": "local",
      "user_id": "@foo:bar.com"
    }
  ]
}
```
`search_token`: a unique identifier indicating that more results can be retrieved by querying again with this token. `limited` should be `true` whenever a `search_token` is returned.
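
From the client's point of view, the token mechanism amounts to a long-polling loop. The sketch below is one possible reading of it, not a reference implementation: `post` is a hypothetical helper that sends the JSON body to the search endpoint and returns the parsed response, and `max_rounds` is a safety bound added for the example.

```python
from typing import Callable, Iterator, Optional

# Hypothetical HTTP helper: sends the JSON body to
# POST /_matrix/client/v3/user_directory/search and returns the parsed response.
PostFn = Callable[[dict], dict]


def stream_search(
    post: PostFn, term: str, limit: int = 10, max_rounds: int = 20
) -> Iterator[dict]:
    """Yield users as they arrive, following search_token until it disappears."""
    token: Optional[str] = None
    for _ in range(max_rounds):  # safety bound, not part of the proposal
        body = {"search_term": term, "limit": limit}
        if token is not None:
            # Each follow-up request carries the token from the previous response.
            body["search_token"] = token
        response = post(body)
        yield from response.get("results", [])
        token = response.get("search_token")
        if token is None:
            # No token in the response: the server expects no further results.
            break
```

A client that omits `search_token` entirely and ignores it in responses behaves exactly as it does today, which matches the compatibility claim above.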

#### New profile field to control user visibility in the directory

We propose to add a new field in the profile (MSC4133) `m.user_directory.visibility` to give the user the ability to control their visibility in the user directory.

Review comment (Member):

This seems like an odd piece of data to allow other users to query.

Review comment (Contributor):

Could this maybe go into account data instead? It only needs to be accessed by the local homeserver, right?


The possible values are:
- `hidden`: not visible to anyone
- `local`: visible only to users on the local homeserver
- `restricted`: visible to any user who shares a room with them
- `remote` (or `federated` or `public`?): visible to users on local and remote homeservers

If no value is provided (or it is null), the user has not set a preference and the server should follow the current expected behavior (visible to users who share a room with them, or if they are in a public room).

Review comment (Contributor):

"visible if sharing a room in common or in public room" is actually only the minimum requirement.


```json
{
  "avatar_url": "…",
  "displayname": "…",
  "m.user_directory.visibility": "local"
}
```
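
The visibility values listed above can be read as a small decision function. The following Python sketch shows one possible interpretation. The function name and parameters are hypothetical, and treating unknown values like an unset preference is an assumption of this example, not something the proposal states.

```python
from typing import Optional


def is_visible(
    visibility: Optional[str],
    requester_is_local: bool,
    shares_room: bool,
    in_public_room: bool = False,
) -> bool:
    """Decide whether a user may appear in a given requester's search results."""
    if visibility == "hidden":
        return False
    if visibility == "local":
        return requester_is_local
    if visibility == "restricted":
        return shares_room
    if visibility == "remote":
        return True
    # Unset or null (and, by assumption, unknown values): fall back to the
    # current behaviour, which servers are free to make more permissive.
    return shares_room or in_public_room
```

For example, `is_visible("local", requester_is_local=False, shares_room=True)` is `False`: under this reading a `local` user stays hidden from remote users even when they share a room.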

## Potential issues

Requests may be lost or timed out by intermediary network equipment, especially since this design relies on a form of long polling.
We believe that using a `search_token` that changes on each request allows the server to track correctly whether the client has already received a given batch of search results.
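
To make the tracking argument concrete, here is a hedged server-side sketch in Python. The `SearchSession` class, its attributes, and the choice to re-send an unacknowledged batch are all assumptions made for illustration. The proposal does not prescribe this bookkeeping, and the sketch leaves out blocking and end-of-search handling.

```python
import secrets
from typing import Optional


class SearchSession:
    """Tracks which results the client has provably received."""

    def __init__(self) -> None:
        self.pending: list[dict] = []     # received from remotes, not yet sent
        self.in_flight: list[dict] = []   # sent, but not yet acknowledged
        self.token: Optional[str] = None  # token handed out with in_flight

    def add_results(self, users: list[dict]) -> None:
        self.pending.extend(users)

    def poll(self, token: Optional[str]) -> dict:
        if token == self.token:
            # The client presents the latest token, so it did receive the
            # previous response: that batch is acknowledged.
            self.in_flight = []
        # With a stale or missing token the previous batch is kept and re-sent.
        self.in_flight += self.pending
        self.pending = []
        self.token = secrets.token_hex(5)  # fresh token on every response
        return {
            "search_token": self.token,
            "limited": True,
            "results": list(self.in_flight),
        }
```

If a response is dropped on the way to the client, the retry arrives with the old token, the server notices the mismatch, and the lost batch is included again.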

## Alternatives

We first considered using account data. However, it has a major caveat: remote servers cannot access it, so they would be unable to honor the visibility setting when returning users from other servers that are already visible to them.

Review comment (Member):

An alternative to this is to include the requesting user in the request sent to the federated servers and let them decide whether to include each user in the results?


Rather than using a `search_token`, we could use a `search_id` that stays the same for all subsequent calls.
This solution gives the server less information about how far the client has progressed through the search (see the `Potential issues` section).

## Security considerations

### Sensitive Data Exposure

A malicious server could enumerate the Matrix IDs of all users whose visibility is set to `remote` or `restricted`.

#### Data Exposure Mitigation recommendations

The federation search endpoint should be rate limited.

We recommend not answering requests whose `search_term` is shorter than 3 characters, such as "a" or "at".
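
The two mitigations above could be combined in a single guard on the federation endpoint. The Python sketch below is illustrative only: the function name, the sliding-window approach, and the `MAX_REQUESTS` and `WINDOW_SECONDS` values are assumptions, since the proposal recommends rate limiting without specifying a policy.

```python
import time
from collections import defaultdict, deque
from typing import Optional

MIN_TERM_LENGTH = 3    # from the recommendation above
MAX_REQUESTS = 30      # assumed budget per origin server...
WINDOW_SECONDS = 60.0  # ...per sliding window; both values are illustrative

# Timestamps of recently answered requests, keyed by origin server name.
_history: dict[str, deque] = defaultdict(deque)


def should_answer(origin: str, search_term: str, now: Optional[float] = None) -> bool:
    """Return True if the federated search request should be processed."""
    if len(search_term.strip()) < MIN_TERM_LENGTH:
        return False
    now = time.monotonic() if now is None else now
    recent = _history[origin]
    # Forget requests that have fallen out of the sliding window.
    while recent and now - recent[0] >= WINDOW_SECONDS:
        recent.popleft()
    if len(recent) >= MAX_REQUESTS:
        return False
    recent.append(now)
    return True
```

A server that rejects a request this way would still need to pick an error response, which the proposal leaves open.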

#### Trust & Safety recommendations

We recommend logging requests (or at least their count) from each server, in order to identify and ban malicious servers that try to scrape all visible user profiles in the federation (including `restricted` ones).

Previously, a server had to join a room to list its users (the `restricted` case), and that join is recorded in the room state.
With this change, it becomes possible to list all the restricted users of other servers without leaving any trace at the protocol level.


## Unstable prefix

`fr.tchap.user_directory.visibility` should be used as an unstable identifier for the profile field.

`/_matrix/federation/unstable/fr.tchap/user_directory/search` should be used as an unstable federation endpoint.


## Dependencies

This MSC builds on [MSC4133](https://github.com/matrix-org/matrix-spec-proposals/pull/4133).