Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/matrix-synapse: Add UNIX domain socket listener support #286172

Merged
merged 2 commits into from
Feb 9, 2024

Conversation

mweinelt
Copy link
Member

@mweinelt mweinelt commented Feb 3, 2024

Exposes two options, path and mode, to configure the location and permissions on the socket file.

Adds an assertion, that either path or bind_addresses and port are configured on every listener.

Migrates the default replication listener of the main instance to a UNIX domain socket, because it is more efficient.

cc NixOS/infra#336

Description of changes

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

Copy link
Contributor

@999eagle 999eagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM. I've deployed this on my server now and it works fine. Had to manually change the replication listener and instance map to use the new path though because I'm setting those options myself.

nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
Copy link
Contributor

@999eagle 999eagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay no, I found a big issue with the mode setting. When not changing the mode from its default of 660 (which is 1224 in octal), the socket is created with s-w--w-r-T permissions which is completely wrong (but apparently enough for replication to still work?). I had to set the mode to 438 (which is 666 in octal) to get srw-rw-rw- permissions on the socket.
Apparently, Synapse interprets the mode as octal even if it's a (decimal) integer, so we would need to serialize this as mode: 0o660 or something similar in the generated yaml file.

EDIT: also the final homeserver.yaml still contains port: null for all UDS listeners and mode: 660,path: null for all TCP listeners.

@mweinelt
Copy link
Member Author

mweinelt commented Feb 4, 2024

Thanks for testing.

I initially had the mode as nullOr str, but os.chmod complained about that. So that's when I made it nullOr int. Not sure how to represent file permissions properly in NixOS modules.

I'm a bit surprised, that the final config still has null values, since the renderer filters out null values. Have to recheck the code path.

@Ma27
Copy link
Member

Ma27 commented Feb 4, 2024

I initially had the mode as nullOr str, but os.chmod complained about that. So that's when I made it nullOr int. Not sure how to represent file permissions properly in NixOS modules.

Because it had quotes in the final config and was thus interpreted as a string? Would kinda suck though if that got propagated all the way down to the chmod call.

How about types.coercedTo or something like this instead? Apparently YAML also accepts 0X as base8, perhaps we can make use of that somehow.

Apparently, Synapse interprets the mode as octal even if it's a (decimal) integer, so we would need to serialize this as mode: 0o660 or something similar in the generated yaml file.

I don't really understand where this happens though, I skimmed a bit through the code of synapse and twisted and to me it seems that no more conversion hacks are done, so apparently a correct representation is assumed.

Anyways, I think it's wrong that we even have to care and I think it's a bad idea to use a base10 integer when we need a base8 one. I think that a string with coercion (and a very strict regex check) is a better option here.

@mweinelt
Copy link
Member Author

mweinelt commented Feb 4, 2024

This is the code path up to the os.chmod call in twisted.

Error during startup:
Traceback (most recent call last):
  File "/nix/store/gxma1wp6xc25vp8lza0bkhm30abrl1l1-matrix-synapse-1.100.0/lib/python3.11/site-packages/synapse/app/_base.py", line 258, in wrapper
    await cb(*args, **kwargs)
  File "/nix/store/gxma1wp6xc25vp8lza0bkhm30abrl1l1-matrix-synapse-1.100.0/lib/python3.11/site-packages/synapse/app/homeserver.py", line 370, in start
    await _base.start(hs)
  File "/nix/store/gxma1wp6xc25vp8lza0bkhm30abrl1l1-matrix-synapse-1.100.0/lib/python3.11/site-packages/synapse/app/_base.py", line 594, in start
    hs.start_listening()
  File "/nix/store/gxma1wp6xc25vp8lza0bkhm30abrl1l1-matrix-synapse-1.100.0/lib/python3.11/site-packages/synapse/app/homeserver.py", line 258, in start_listening
    self._listener_http(self.config, listener)
  File "/nix/store/gxma1wp6xc25vp8lza0bkhm30abrl1l1-matrix-synapse-1.100.0/lib/python3.11/site-packages/synapse/app/homeserver.py", line 147, in _listener_http
    ports = listen_http(
            ^^^^^^^^^^^^
  File "/nix/store/gxma1wp6xc25vp8lza0bkhm30abrl1l1-matrix-synapse-1.100.0/lib/python3.11/site-packages/synapse/app/_base.py", line 441, in listen_http
    ports = listen_unix(
            ^^^^^^^^^^^^
  File "/nix/store/gxma1wp6xc25vp8lza0bkhm30abrl1l1-matrix-synapse-1.100.0/lib/python3.11/site-packages/synapse/app/_base.py", line 388, in listen_unix
    cast(Port, reactor.listenUNIX(path, factory, backlog, mode, wantPID))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/gmygp534ydqbs1iw63090m9vixkcydrq-python3-3.11.7-env/lib/python3.11/site-packages/twisted/internet/posixbase.py", line 265, in listenUNIX
    p.startListening()
  File "/nix/store/gmygp534ydqbs1iw63090m9vixkcydrq-python3-3.11.7-env/lib/python3.11/site-packages/twisted/internet/unix.py", line 416, in startListening
    os.chmod(self.port, self.mode)
TypeError: 'str' object cannot be interpreted as an integer

@999eagle
Copy link
Contributor

999eagle commented Feb 4, 2024

From what I've gathered the mode actually has to be an integer but its bits are directly set as the file mode. There is no special handling anywhere because python uses an octal literal as default for the mode, so the default value is actually 0o666, which is 438 in decimal.
The config parser does not seem to do any kind of type conversions so the expectation is that the yaml already contains an integer. I think yaml supports octal literals using the same 0o-syntax, so that's how we should specify the mode.

@Ma27
Copy link
Member

Ma27 commented Feb 4, 2024

There is no special handling anywhere because python uses an octal literal as default for the mode, so the default value is actually 0o666, which is 438 in decimal.

Yeah, can confirm that code-wise.

Rereading your previous comment, I think I misunderstood the "interprets it as octal" part and thought that you specified the decimal number which lead to the error not the other way round, sorry for that!

I think yaml supports octal literals using the same 0o-syntax, so that's how we should specify the mode.

Quickly checked that you can indeed write 0666 in YAML to get a base8 int, so that's probably what we'll want to output.

But without being able to express octal numbers in Nix (and I'm not entirely sure if I'll like it then) I strongly dislike the mode to be a number, so I stand by my suggestion to use a string with coercedTo and strict validation.

@mweinelt
Copy link
Member Author

mweinelt commented Feb 4, 2024

My understanding is as follows:

  • Nix basically supports only base 10 integers and strings.
  • Synapse uses pyyaml to read the config.
  • pyyaml only supports YAML 1.1, where octal numbers need to be specified with leading zeros.
  • Python's os.chmod only supports integers, but respects different representations, like 0o660
  • I don't think there is a way to represent an int in Nix, that has leading zeros.
  • I don't understand how coercing the type can save us here

I see two possible solutions:

  • Hardcode the option in its decimal form
  • Write a function that converts octal to decimal

@mweinelt
Copy link
Member Author

mweinelt commented Feb 4, 2024

Write a function that converts octal to decimal

This happened. The function name toIntBase8 sounds weird, but it follows the naming in lib.

@mweinelt
Copy link
Member Author

mweinelt commented Feb 5, 2024

I have absolutely no clue, why we now have null values in the resulting configuration. In my opinion, the recursive filtering should deal with that, but it does not.

Maybe the custom listenerType causes problems.

@mweinelt
Copy link
Member Author

mweinelt commented Feb 5, 2024

The problem is apparently, that filterAttrsRecursive does not recurse into a list.

Copy link
Contributor

@999eagle 999eagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should at least fix the listener to have the correct null values.

nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
nixos/modules/services/matrix/synapse.nix Show resolved Hide resolved
@mweinelt
Copy link
Member Author

mweinelt commented Feb 5, 2024

I think I addressed all the feedback. Ready for another review.

Copy link
Contributor

@999eagle 999eagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few documentation changes. Otherwise LGTM, I'm running this on my server now and it seems to work!

I'm still not entirely happy to see mode: 432 in my homeserver.yaml without any comment as to what this value means but I guess this isn't something we can really fix in any good way, so I guess we can keep it like that.

nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
@mweinelt
Copy link
Member Author

mweinelt commented Feb 5, 2024

@ofborg test matrix-synapse
@ofborg test matrix-synapse-workers

nixos/modules/services/matrix/synapse.nix Outdated Show resolved Hide resolved
@Ma27
Copy link
Member

Ma27 commented Feb 8, 2024

So I just tried to replace the http client/federation listener with a UDS (it's reverse-proxied by nginx anyways) and this fails with error: value is null while a Boolean was expected while evaluating matrix-synapse-register_new_matrix_user.

@Ma27
Copy link
Member

Ma27 commented Feb 8, 2024

Replication via uds rather than tcp/ip appears to work though 👍

@mweinelt
Copy link
Member Author

mweinelt commented Feb 8, 2024

So I just tried to replace the http client/federation listener with a UDS (it's reverse-proxied by nginx anyways) and this fails with error: value is null while a Boolean was expected while evaluating matrix-synapse-register_new_matrix_user.

😱

Two new options lead me into the shittiest rabbit holes in a long time. It's barely funny anymore.

@mweinelt
Copy link
Member Author

mweinelt commented Feb 8, 2024

The script has no support for UNIX domain sockets. It would need to be extended with https://github.com/msabramo/requests-unixsocket to support the http+unix:// proto.

@mweinelt
Copy link
Member Author

mweinelt commented Feb 8, 2024

So we now drop the registration script, if the clientListener has no bind addresses. 🤷

@mweinelt mweinelt force-pushed the matrix-uds-listeners branch 2 times, most recently from 4cd351c to b9a7d8d Compare February 8, 2024 22:12
@Ma27
Copy link
Member

Ma27 commented Feb 9, 2024

Two new options lead me into the shittiest rabbit holes in a long time. It's barely funny anymore.

That's the third escalating PR against Matrix from you that I'm reviewing, so perhaps it's my fault after all. In that case I'm sorry %)

So we now drop the registration script, if the clientListener has no bind addresses. 🤷

That's fine for me, I'm using OIDC anyways.
I'm happy to re-check the current state now.

However, I think it's a bit surprising if the script disappears when using UDS, so we should do e.g.

  • warn about this in the manual: iirc the registration script is mentioned there anyways.
  • perhaps even raise a warning? I.e. enableRegistrationScript is true by default. If the client listener uses UDS and the option is still true, a warning should be raised.

I'm fine with either option.

@mweinelt
Copy link
Member Author

mweinelt commented Feb 9, 2024

perhaps even raise a warning? I.e. enableRegistrationScript is true by default. If the client listener uses UDS and the option is still true, a warning should be raised.

As long as this leads to an evaluation error, this should probably be a hard assertion. WDYT?

Exposes two options, `path` and `mode`, to configure the location and
permissions on the socket file.

The `mode` needs to be specified as string in octal and will be converted
into a decimal integer, so it correctly passes through the YAML parser
and arrives at the `os.chmod` call in the Twisted codebase. What a fun
detour.

Adds an assertion, that either `path` or `bind_addresses` and `port` are
configured on every listener.

Migrates the default replication listener of the main instance to a UNIX
domain socket, because it is more efficient.

Introduces the `enableRegistrationScript` option, to gracefully disable
the user registration script, when the client listener listens on a UNIX
domain socket, which is something the script does not support.
Using `filterAttrsRecursive` is not sufficient to account for a nested
attribute set with list values, like used for listeners.
@mweinelt
Copy link
Member Author

mweinelt commented Feb 9, 2024

Updated the manual and added both an option and an assertion in the last iteration.

@Ma27 Ma27 merged commit bd8acd0 into NixOS:master Feb 9, 2024
21 checks passed
@mweinelt mweinelt deleted the matrix-uds-listeners branch February 11, 2024 01:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants