DHCP proxy - no available capacity / crash when using external DHCP service #1024

Open
jjzazuet opened this issue Jul 12, 2024 · 2 comments

@jjzazuet

jjzazuet commented Jul 12, 2024

Hi, I have a few standalone containers running under podman, using a macvlan network to make them reachable on an internal LAN. I'm observing the following:

1: Every time the external router assigns a DHCP host configuration to a container, netavark logs this message:

dhcp-proxy: [ERROR netavark::commands::dhcp_proxy] no available capacity

2: About once a week netavark crashes hard with the following logs, after which the containers are no longer reachable at their statically leased IP addresses:

Jul 11 21:29:36 build-00 user.notice dhcp-proxy: thread '<unnamed>' panicked at library/std/src/sys/pal/unix/stack_overflow.rs:158:13:
Jul 11 21:29:36 build-00 user.notice dhcp-proxy: failed to set up alternative stack guard page: Out of memory (os error 12)
Jul 11 21:29:36 build-00 user.notice dhcp-proxy: note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Jul 11 21:29:36 build-00 user.notice dhcp-proxy: thread '<unnamed>' panicked at library/std/src/sys/pal/unix/stack_overflow.rs:154:13:
Jul 11 21:29:36 build-00 user.notice dhcp-proxy: failed to allocate an alternative stack: Out of memory (os error 12)
Jul 11 21:29:36 build-00 daemon.err /etc/init.d/netavark-dhcp-proxy[38874]: start-stop-daemon: failed to start `/usr/libexec/podman/netavark'
Jul 11 21:29:36 build-00 user.notice dhcp-proxy:  * start-stop-daemon: failed to start `/usr/libexec/podman/netavark'

I just tried building and upgrading to the latest upstream version of netavark, but I'm seeing the same log messages. I'll wait to see if I get a crash with this version.
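The panics above report `Out of memory (os error 12)` while allocating a thread's alternative stack, which may indicate that the proxy process itself is running into a resource limit (threads or address space) rather than the host running out of RAM. One way to check this between crashes is to sample the thread count and memory of the dhcp-proxy process over time. The following is a minimal sketch, assuming a Linux `/proc` filesystem; the function names `parse_proc_status` and `sample` are hypothetical helpers, not part of netavark or podman.

```python
import sys
from pathlib import Path


def parse_proc_status(text: str) -> dict:
    """Extract thread count and memory figures from /proc/<pid>/status text.

    Returns a dict with whichever of Threads, VmSize and VmRSS are present.
    Memory values are in kB, as reported by the kernel.
    """
    wanted = {"Threads", "VmSize", "VmRSS"}
    out = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in wanted:
            # Values look like "42" or "2048 kB"; keep the leading integer.
            out[key] = int(value.split()[0])
    return out


def sample(pid: int) -> dict:
    """Read and parse /proc/<pid>/status for a running process."""
    return parse_proc_status(Path(f"/proc/{pid}/status").read_text())


if __name__ == "__main__":
    # Usage (hypothetical): python3 sample.py <pid-of-dhcp-proxy>
    if len(sys.argv) == 2:
        print(sample(int(sys.argv[1])))
```

Running this periodically (for example from cron) and logging the output would show whether `Threads` or `VmSize` grows steadily in the days before a crash.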

# /opt/netavark version
{
  "version": "1.12.0-dev",
  "commit": "e182147b6aea964f572a4ca981bc000698d59539",
  "build_time": "2024-07-12T03:39:23.741229481+00:00",
  "target": "x86_64-alpine-linux-musl",
  "default_fw_driver": "iptables"
}
# podman network inspect podman30
[
     {
          "name": "podman30",
          "id": "71d03f55fc0de8a074d5e5de88759269e32da004c568f99bb94a420f2e7f31a2",
          "driver": "macvlan",
          "network_interface": "br30",
          "created": "2024-06-23T04:40:07.724065801Z",
          "ipv6_enabled": false,
          "internal": false,
          "dns_enabled": false,
          "ipam_options": {
               "driver": "dhcp"
          },
          "containers": {
               "2852850d2afb74e1e27aa46c2ab25a887f3abb9989d41c8e9699b4fea60d3f51": {
                    "name": "excalidraw",
                    "interfaces": {
                         "eth0": {
                              "subnets": [
                                   {
                                        "ipnet": "172.16.30.115/24",
                                        "gateway": "172.16.30.1"
                                   }
                              ],
                              "mac_address": "ee:b7:eb:0b:b1:2b"
                         }
                    }
               }
          }
     }
]
/home/gopher # podman info
host:
  arch: amd64
  buildahVersion: 1.35.4
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-r0
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: unknown'
  cpuUtilization:
    idlePercent: 97.52
    systemPercent: 2.08
    userPercent: 0.4
  cpus: 24
  databaseBackend: sqlite
  distribution:
    distribution: alpine
    version: 3.20.1
  eventLogger: file
  freeLocks: 2031
  hostname: build-00
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.6.34-1-lts
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 29579984896
  memTotal: 33634459648
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-r0
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-r0
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.15-r0
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/crun
      spec: 1.0.0
      +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-2024.06.07-r0
    version: |
      pasta unknown version
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 0
  swapTotal: 0
  uptime: 191h 35m 22.00s (Approximately 7.96 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 11
    paused: 0
    running: 11
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev
  graphRoot: /media/md1/containers/storage
  graphRootAllocated: 146415128576
  graphRootUsed: 4874047488
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 11
  runRoot: /media/md1/containers-runroot/storage
  transientStore: false
  volumePath: /media/md1/containers/storage/volumes
version:
  APIVersion: 5.0.3
  Built: 1717594599
  BuiltTime: Wed Jun  5 13:36:39 2024
  GitCommit: ""
  GoVersion: go1.22.4
  Os: linux
  OsArch: linux/amd64
  Version: 5.0.3

Let me know if more info is needed. Thanks!

@Luap99
Member

Luap99 commented Jul 12, 2024

It is likely the same cause as #811, but given that your error is different we cannot be sure. Once #811 is fixed you should definitely retest.

@jjzazuet
Author

jjzazuet commented Jul 15, 2024

One more crash today with the latest netavark build, in case it helps:

Jul 15 22:16:10 build-00 user.notice dhcp-proxy: thread 'thread 'tokio-runtime-worker<unnamed>' panicked at ' panicked at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/mod.rslibrary/std/src/sys/pal/unix/stack_overflow.rs::683168::2913:
Jul 15 22:16:10 build-00 user.notice dhcp-proxy: :
Jul 15 22:16:10 build-00 user.notice dhcp-proxy: failed to spawn thread: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }failed to set up alternative stack guard page: Out of memory (os error 12)
Jul 15 22:16:10 build-00 user.notice dhcp-proxy: stack backtrace:
Jul 15 22:16:10 build-00 user.notice dhcp-proxy: memory allocation of 3072 bytes failed
Jul 15 22:16:11 build-00 daemon.err /etc/init.d/netavark-dhcp-proxy[17725]: start-stop-daemon: failed to start `/opt/netavark'
Jul 15 22:16:11 build-00 user.notice dhcp-proxy:  * start-stop-daemon: failed to start `/opt/netavark'

Thanks!
