Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RHEL-15110: Fix issue with registration using gsd-subman #3350

Merged
merged 2 commits into from
Jan 5, 2024

Conversation

jirihnidek
Copy link
Contributor

  • We were too agresive, when we fixed CVE in this PR: 2225446: Hotfix of D-Bus policy #3317
  • It is still safe to allow non-root user to create abstract socket using Start() on interface com.redhat.RHSM1.RegisterServer and destroy it later using Stop(). This abstract socket could be later used by root user for calling e.g. Register() on interface com.redhat.RHSM1.Register. This is way how it works for gsd-subman (run by non-root user) and gsd-subman-helper (run by root user).

@cnsnyder cnsnyder requested review from a team and DuckBoss and removed request for a team October 31, 2023 16:47
Copy link

github-actions bot commented Oct 31, 2023

Coverage

Coverage (computed on Fedora latest) •
FileStmtsMissCoverMissing
rhsmlib/dbus
   dbus_utils.py988018%31–37, 44–45, 47–50, 54–56, 58, 65–66, 68–72, 78–82, 89–90, 92–96, 98, 104, 108, 112–120, 129–138, 148, 150, 152–153, 162, 164, 190, 194, 206, 208–209, 211–213, 218–219, 221–222, 227–230
   server.py20912341%44–45, 65–66, 88, 90, 93, 95–100, 102–111, 113–114, 116, 119–123, 125, 128–129, 131–150, 152–156, 158, 167, 171, 178–186, 189–191, 199–200, 202, 210, 213, 215, 222–223, 225–226, 240, 243–246, 258–262, 285, 290–294, 298–302, 304–305, 307, 314, 349–351, 362–368, 370–372, 376
rhsmlib/dbus/objects
   register.py2086071%36, 89–90, 103–104, 114–115, 117–118, 128–129, 131, 145, 148, 161, 164, 203, 212, 250–251, 258, 294, 340–342, 344–345, 364–367, 369–371, 373, 375, 377, 392, 396, 415–420, 422–424, 426, 428, 440–444, 446–448, 450, 452
TOTAL18221461974% 

Tests Skipped Failures Errors Time
2640 14 💤 0 ❌ 0 🔥 49.493s ⏱️

@jirihnidek
Copy link
Contributor Author

Link to issue: https://issues.redhat.com/browse/RHELMISC-1500

@owtaylor
Copy link

I spent some time reviewing the impact of this change:

  • Start() creates a private d-bus server using dbus.Server() - which is a binding for libdbus's dbus_server_listen()
    • I verified by code review that the private server created dbus_server_listen() initially has a security policy of "same UID or root". (See auth_via_defaullt_rules). So just starting the server and knowing the address does not allow you to register the system to a hostile update server.
    • @jirihnidek and I also tested that this was the case in practice - that it isn't possible for a non-root user to connect to that private server, even if they know the address.
    • Start() is idempotent - if it's called multiple times, the existing server is returned, so a caller can't create an arbitrary memory leak
  • Stop() stops the current server.
    • Allowing this to be called by arbitrary users does allow any user on the system to break an in-progress registration through the GUI if they call Stop() in a loop.
    • The breakage will be apparent - the user trying to register will get an error message
    • It seems unlikely that a not-yet-registered system will have hostile users on it trying to sabotage registration.

I think the patch is correct as is. If there was serious concern about allowing the Stop() method to non-root, Stop() could also be made a no-op.

@jirihnidek
Copy link
Contributor Author

I can confirm that I wasn't able to connect to abstract socket created by Start() method as another non-root users. All that attempts were rejected.

@halfline
Copy link

halfline commented Oct 31, 2023

I think it still might be a good idea to add a check of some sort on the user for start and stop... It's not critical, but if anything it at least might fend off a DoS CVE report in N months from an overzealous security researcher or something... maybe something like

 class RegisterDBusObject(base_object.BaseObject):
     default_dbus_path = constants.REGISTER_DBUS_PATH
     interface_name = constants.REGISTER_INTERFACE
 
     def __init__(self, conn=None, object_path=None, bus_name=None):
         super().__init__(conn=conn, object_path=object_path, bus_name=bus_name)
         self.impl = RegisterDBusImplementation()
+        self.caller_uids = set()
+        self.bus_proxy = self.conn.get_object('org.freedesktop.DBus', '/org/freedesktop/DBus')
+        self.bus_interface = dbus.Interface(self.bus_proxy, 'org.freedesktop.DBus')
 
     @util.dbus_service_method(
         constants.REGISTER_INTERFACE,
         in_signature="s",
         out_signature="s",
     )
     @util.dbus_handle_sender
     @util.dbus_handle_exceptions
     def Start(self, locale, sender=None):
         locale = dbus_utils.dbus_to_python(locale, expected_type=str)
         Locale.set(locale)
 
+        caller_uid = self.bus_interface.GetConnectionUnixUser(sender)
+
         address: str = self.impl.start(sender)
+
+        self.caller_uids.add(caller_uid)
+
         return address
 
     @util.dbus_service_method(
         constants.REGISTER_INTERFACE,
         in_signature="s",
         out_signature="b",
     )
     @util.dbus_handle_sender
     @util.dbus_handle_exceptions
     def Stop(self, locale, sender=None):
         locale = dbus_utils.dbus_to_python(locale, expected_type=str)
         Locale.set(locale)
 
-        self.impl.stop()
+        caller_uid = self.bus_interface.GetConnectionUnixUser(sender)
+
+        self.caller_uids.discard(caller_uid)
+
+        if not self.caller_uids or caller_uid == 0:
+             self.impl.stop()
 
 

(untested)

@jirihnidek
Copy link
Contributor Author

I think it still might be a good idea to add a check of some sort on the user for start and stop... It's not critical, but if anything it at least might fend off a DoS CVE report in N months from an overzealous security researcher or something... maybe something like

Very good point. Only non-root user that called Start() method should be able to call Stop() method (no other non-root user should be able to do that). Of course, the root user still should be able to call Stop() method.

I will update PR.


# When somebody else started domain socket listener, then
if self.impl.server is not None:
log.warning(f"domain socket listener already running, opened by: {self.impl.server.sender}")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be better to include _caller_uid here. it's probably a little more recognizable in a log than a d-bus unique name.

Copy link

@halfline halfline Nov 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, just to summarize a slack conversation between @jirihnidek @owtaylor @m-horky and me. This causes a few other problems:

  1. There's still the DoS problem because one user can call Start() and prevent another user from calling Start()
  2. If the user session crashes during login before Stop() is called, they'll have to reboot to register instead of just log back in
  3. gnome-settings-daemon currently has unpaired Start() in its unregister path. So if user unregisters they'll have to reboot to register again

So I think Start() needs to silently succeed if it's already started to maintain backward compatibility. The two easiest ways to do that are:

  1. go back to tracking a set of caller uids like the sketch from RHEL-15110: Fix issue with registration using gsd-subman #3350 (comment)
  2. Allow Start() from multiple users but make Stop() a no-op unless it's called by root (as mentioned earlier in RHEL-15110: Fix issue with registration using gsd-subman #3350 (comment) )

@jirihnidek
Copy link
Contributor Author

jirihnidek commented Nov 3, 2023

I reworked the code. Long story short: The policy should not be about users, but it should be about applications using RegisterServer. The server could be stopped only in the situation, when it is not needed by any application.

@jirihnidek
Copy link
Contributor Author

BTW: I will have to fix some unit tests (probably add more mocks). Tests do not work, when you try to run these tests in container, because there is no system bus.

@owtaylor
Copy link

owtaylor commented Nov 3, 2023

As far as I can tell, the code seems to work as advertised. I'm still not sure that complicated tracking of Start/Stop is warranted. Can you explain a bit more about why you want do that?

Experimentally, stopping the server saves only a tiny fraction of resident memory:

# cat /proc/$(pgrep rhsm-service)/status  | grep -E 'VmSize|VmRSS'
VmSize:	  376436 kB
VmRSS:	   57852 kB
# dbus-send --system --print-reply --dest=com.redhat.RHSM1 /com/redhat/RHSM1/RegisterServer com.redhat.RHSM1.RegisterServer.Start string:en_US.UTF-8
method return time=1699019633.267880 sender=:1.567 -> destination=:1.568 serial=28 reply_serial=2
   string "unix:abstract=/run/dbus-dFZyd0E7MU,guid=95a9babc9487888aab569b1b6544fb71"
# cat /proc/$(pgrep rhsm-service)/status  | grep -E 'VmSize|VmRSS'
VmSize:	  376436 kB
VmRSS:	   57888 kB
# dbus-send --system --print-reply --dest=com.redhat.RHSM1 /com/redhat/RHSM1/RegisterServer com.redhat.RHSM1.RegisterServer.Stop string:en_US.UTF-8
method return time=1699019645.095060 sender=:1.567 -> destination=:1.569 serial=31 reply_serial=2
   boolean true
# cat /proc/$(pgrep rhsm-service)/status  | grep -E 'VmSize|VmRSS'
VmSize:	  376436 kB
VmRSS:	   57868 kB

One thing that your approach definitely does improve compared to just leaving running is tracking of the "sender", which is used to retrieve the command line information. But it still isn't very reliable - any user could make it whatever they want by calling Start() without Stop() first. What about instead doing something like:

From b6f401d70438f528c278e2e02ccc1c041da10c67 Mon Sep 17 00:00:00 2001
From: "Owen W. Taylor" <otaylor@fishsoup.net>
Date: Fri, 3 Nov 2023 10:58:41 -0400
Subject: [PATCH] registration: get the sender command line from the peer
 connection

It's hard to reliably associate Start/Stop methods with the calls to the
actual registration methods, so instead retrieve the PID from the
caller of the private method and look up the command line in /proc.
---
 src/rhsmlib/client_info.py           | 17 +++++++++++++++++
 src/rhsmlib/dbus/objects/register.py | 10 ++++++----
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/rhsmlib/client_info.py b/src/rhsmlib/client_info.py
index a7132266d..981024e8f 100644
--- a/src/rhsmlib/client_info.py
+++ b/src/rhsmlib/client_info.py
@@ -74,6 +74,23 @@ class DBusSender:
             self.cmd_line = cmd_line
         log.debug("D-Bus sender: %s (cmd-line: %s)" % (sender, self.cmd_line))
 
+    def set_cmd_line_from_peer_connection(self, dbus_connection):
+        """
+        Set the sender's command line in the singleton object, by looking it up from /proc
+        """
+
+        peer_pid = dbus_connection.get_peer_unix_process_id()
+        if peer_pid is None:
+            log.debug("Can't get peer pid for connection")
+            return
+
+        try:
+            with open("/proc/%d/cmdline" % peer_pid, "rb") as f:
+                self.cmd_line = f.read().split(b"\0")[0].decode("utf-8")
+            log.debug("D-Bus sender pid: %d (cmd-line: %s)" % (peer_pid, self.cmd_line))
+        except (OSError, UnicodeDecodeError) as e:
+            log.debug("Can't read cmd-line for pid %d: %s" % (peer_pid, str(e)))
+
     def reset_cmd_line(self):
         """
         Reset sender's command line
diff --git a/src/rhsmlib/dbus/objects/register.py b/src/rhsmlib/dbus/objects/register.py
index c31151ddf..91b8df5f7 100644
--- a/src/rhsmlib/dbus/objects/register.py
+++ b/src/rhsmlib/dbus/objects/register.py
@@ -384,9 +384,10 @@ class DomainSocketRegisterDBusObject(base_object.BaseObject):
         dbus_interface=constants.PRIVATE_REGISTER_INTERFACE,
         in_signature="sssa{sv}a{sv}s",
         out_signature="s",
+        connection_keyword="dbus_connection"
     )
     @util.dbus_handle_exceptions
-    def Register(self, org, username, password, options, connection_options, locale):
+    def Register(self, org, username, password, options, connection_options, locale, dbus_connection):
         """
         This method registers the system using basic auth
         (username and password for a given org).
@@ -405,7 +406,7 @@ class DomainSocketRegisterDBusObject(base_object.BaseObject):
         locale = dbus_utils.dbus_to_python(locale, expected_type=str)
 
         with DBusSender() as dbus_sender:
-            dbus_sender.set_cmd_line(sender=self.sender, cmd_line=self.cmd_line)
+            dbus_sender.set_cmd_line_from_peer_connection(dbus_connection)
             Locale.set(locale)
 
             consumer: dict = self.impl.register_with_credentials(org, options, connection_options)
@@ -416,9 +417,10 @@ class DomainSocketRegisterDBusObject(base_object.BaseObject):
         dbus_interface=constants.PRIVATE_REGISTER_INTERFACE,
         in_signature="sasa{sv}a{sv}s",
         out_signature="s",
+        connection_keyword="dbus_connection"
     )
     @util.dbus_handle_exceptions
-    def RegisterWithActivationKeys(self, org, activation_keys, options, connection_options, locale):
+    def RegisterWithActivationKeys(self, org, activation_keys, options, connection_options, locale, dbus_connection):
         """
         Note this method is registration ONLY.  Auto-attach is a separate process.
         """
@@ -429,7 +431,7 @@ class DomainSocketRegisterDBusObject(base_object.BaseObject):
         locale = dbus_utils.dbus_to_python(locale, expected_type=str)
 
         with DBusSender() as dbus_sender:
-            dbus_sender.set_cmd_line(sender=self.sender, cmd_line=self.cmd_line)
+            dbus_sender.set_cmd_line_from_peer_connection(dbus_connection)
             Locale.set(locale)
 
             consumer: dict = self.impl.register_with_activation_keys(org, options, connection_options)
-- 
2.41.0

@jirihnidek
Copy link
Contributor Author

I try to minimize changes of functionality. I don't want to keep RegisterServer running, when a client application called Stop() and there is no good reason for keeping server running.

I don't think that it is necessary to get information about application calling Register or RegisterWithActivationKeys, because most of applications calling Start()/Stop() also calls Register or RegisterWithActivationKeys. gsd-subman is only one exception and we can easily distinguish this exception. Your proposal would also break backward compatibility and I would like to avoid that, because there is lot of applications using this D-Bus API.

BTW: we already have method called command_of_pid(pid):

https://github.com/candlepin/subscription-manager/blob/main/src/rhsmlib/dbus/dbus_utils.py#L29

@owtaylor
Copy link

owtaylor commented Nov 6, 2023

I try to minimize changes of functionality. I don't want to keep RegisterServer running, when a client application called Stop() and there is no good reason for keeping server running.

My first idea when reading your code was to suggest using the org.freedesktop.DBus.NameOwnerChanged D-Bus signal to track senders in more consistent and easier to understand way - but that seemed like possibly adding even more lines of code for a goal that I don't really understand (yet).

The functionality that clients depend upon is being able to get the server address and call the registration methods on it. Stop() lets subscription-manager know that the client is done and doesn't need the server any more. But it should be a fully compatible approach for subscription-manager to to respond with Stop with "OK, thank you - have a good day!" and do nothing.

I don't think that it is necessary to get information about application calling Register or RegisterWithActivationKeys, because most of applications calling Start()/Stop() also calls Register or RegisterWithActivationKeys. gsd-subman is only one exception and we can easily distinguish this exception

Yeah, the point is not that it's typically different, it's that when we start getting into the corner cases of Start/Stop, we don't reliably know how to connect Start calls to Register* calls so there is confusing and hard-to-review behavior. If we simply use the caller of Register* instead, this goes away and we don't need to worry about who sent the Start call at all.

. Your proposal would also break backward compatibility and I would like to avoid that, because there is lot of applications using this D-Bus API.

Can you explain how my suggestion breaks backwards compat? I don't think it would require any changes to any callers. I guess you would start getting gsd-subman-helper rather than gsd-subman in logs, but that seems easy to deal with when analyzing log data.

BTW: we already have method called command_of_pid(pid):

Ah, that could simplify things a bit further.

@jirihnidek
Copy link
Contributor Author

. Your proposal would also break backward compatibility and I would like to avoid that, because there is lot of applications using this D-Bus API.

Can you explain how my suggestion breaks backwards compat? I don't think it would require any changes to any callers. I guess you would start getting gsd-subman-helper rather than gsd-subman in logs, but that seems easy to deal with when analyzing log data.

Oh, I'm sorry. I overlooked that dbus_connection is not another argument of D-Bus methods Regitster and RegisterWithActivationKeys. It is only argument of Python method.

BTW: we already have method called command_of_pid(pid):

Ah, that could simplify things a bit further.

I don't want to make more changes in this PR than necessary.

@halfline
Copy link

halfline commented Nov 6, 2023

My first idea when reading your code was to suggest using the org.freedesktop.DBus.NameOwnerChanged D-Bus signal to track senders in more consistent and easier to understand way

Yea that was my first reaction too. Using NameOwnerChanged is more idiomatic. Sender unique names are guaranteed to never be recycled, so NameOwnerChanged is kind of meant for this use case. I actually don't think it's much added code either...maybe something like

bus.add_signal_receiver(
    lambda name, *args: self._senders.discard(name),
    dbus_interface='org.freedesktop.DBus',
    signal_name='NameOwnerChanged')

(in addition to discarding on Stop())

It actually might be less code, since are_other_senders_still_running could be dropped and just replaced with if self._senders:

It's certainly more efficient, but maybe in practice it doesn't matter that much since in practice we're talking about a sender list of size 1, so iterating over the entire list isn't likely to be a problem.

I guess you would start getting gsd-subman-helper rather than gsd-subman in logs, but that seems easy to deal with when analyzing log data.

In fact, that change would be fixing a bug right? It would be somewhat bad if a registration attempt got attributed to a process from a command from a different user for instance.

It seems like this is another somewhat independent bug that should probably get fixed lest a security researcher files a CVE about it in M months, right?

I do have to say after several rounds of changes of increasing complexity, though, my bear detector is starting to go off, and that again is making me like your Stop-is-a-nop approach better (I realize @jirihnidek isn't a fan of that, so I'm just giving my two cents).

Of course, even with the Stop-is-a-nop approach, the auditing fix you mention above is still relevant.

* We were too agresive, when we fixed CVE in this PR:
  #3317
* It is still safe to allow non-root user to create abstract
  socket using Start() on interface com.redhat.RHSM1.RegisterServer
  and destroy it later using Stop(). This abstract socket
  could be later used by root user for calling e.g. Register()
  on interface com.redhat.RHSM1.Register. This is way how
  it works for gsd-subman (run by non-root user) and
  gsd-subman-helper (run by root user).
* When some application starts RegisterServer using Start(),
  then it returns address. When this method is called multiple
  times and RegisterServer is still running, then the same
  address is returned. It means, when user starts multiple
  applications, then all of these applications should be able
  to use the same address to keep backward compatibility.
  The RegisterServer should be terminated only in the case,
  when the last application call Stop() and there is no running
  proccess using this RegisterServer.
* This change does not allow to stop RegisterServer by some other
  user or application using Stop() method, when RegisterServer
  is still needed.
* Non-root user can stop the RegisterServer only in the situation,
  when it is not needed by any application.
* Implementation of Stop() method did not return anything explicitly,
  but the D-Bus method was designed to return boolean value. Thus,
  None object was interpreted as "False" value. It was fixed and it
  returns "True", when it was really stoped and it return "False",
  when it was not possible to stop it, because some application
  still uses RegisterServer.
* It looks like that there was probably intention to use similar
  approach, but it has never been finished.
* Modified some unit tests
@ptoscano ptoscano force-pushed the jhnidek/RHEL-15110 branch from 0f4b79d to bc33776 Compare January 5, 2024 11:05
Copy link
Contributor

@ptoscano ptoscano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks folks for the code & the discussion on this!

LGTM, so merging it.

@ptoscano ptoscano merged commit 47d2729 into main Jan 5, 2024
16 checks passed
@ptoscano ptoscano deleted the jhnidek/RHEL-15110 branch January 5, 2024 11:52
jirihnidek added a commit that referenced this pull request Jan 9, 2024
* Backport to 1.28 branch. Original PR: #3350
  * Original commit: ba1e0e4
* We were too agresive, when we fixed CVE in this PR:
  #3318
* It is still safe to allow non-root user to create abstract
  socket using Start() on interface com.redhat.RHSM1.RegisterServer
  and destroy it later using Stop(). This abstract socket
  could be later used by root user for calling e.g. Register()
  on interface com.redhat.RHSM1.Register. This is way how
  it works for gsd-subman (run by non-root user) and
  gsd-subman-helper (run by root user).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants