Redash won't start up: connecting to redash-redis-master:6379. Connection refused & WORKER TIMEOUT #127

Open
Rusiecki opened this issue Jun 30, 2022 · 5 comments

Comments

@Rusiecki

Hello,
I am trying to install Redash with the Helm chart via Terraform on our corporate cluster.
I've read that the initial deployment takes a while, so I let it run for several hours, but still no luck.

Locally my setup runs fine. It only fails on the corporate cluster.

1.) I suspect some kind of proxy hiccup, which is why I disabled most of the query runners and destinations. This at least reduced the number of WORKER TIMEOUT messages inside Redash itself, but it didn't solve my problem.

2.) What is also odd is that I get a connection refused for redash-redis-master, although it is reachable within the k8s network.

I'd be grateful for any help.

What I've already tried:

  1. Deactivating the liveness/readiness probes: the pods turn green but nothing is reachable, so the setup is effectively dead.
  2. Passing http_proxy/https_proxy/no_proxy to the setup via the env.

Here are the logs of the pods if it helps:

Logs(dev/redash-adhocworker-7b59b97db6-ldggj:redash-adhocworker)[1m]

Using Database: postgresql://redash:******@redash-postgresql:5432/redash
Using Redis: redis://:******@redash-redis-master:6379/0
Starting 2 workers for queues: queries...
[2022-06-30 08:39:29,473][PID:6][DEBUG][redash.query_runner] Registering PostgreSQL (pg) query runner.
[2022-06-30 08:39:29,473][PID:6][DEBUG][redash.query_runner] Registering Redshift (redshift) query runner.
[2022-06-30 08:39:29,474][PID:6][DEBUG][redash.query_runner] Registering CockroachDB (cockroach) query runner.
[2022-06-30 08:39:29,475][PID:6][DEBUG][redash.destinations] Registering Mattermost (mattermost) destinations.
[2022-06-30 08:39:29,682][PID:6][DEBUG][passlib.utils.compat] loaded lazy attr 'SafeConfigParser': <class ConfigParser.SafeConfigParser at 0x7fe971def460>
[2022-06-30 08:39:29,683][PID:6][DEBUG][passlib.utils.compat] loaded lazy attr 'NativeStringIO': <built-in function StringIO>
[2022-06-30 08:39:29,683][PID:6][DEBUG][passlib.utils.compat] loaded lazy attr 'BytesIO': <built-in function StringIO>
 
 -------------- celery@redash-adhocworker-7b59b97db6-ldggj v4.3.0 (rhubarb)
---- **** ----- 
--- * ***  * -- Linux-5.15.37-051537-generic-x86_64-with-debian-10.0 2022-06-30 08:39:30
-- * - **** --- 
- ** ---------- [config]
- ** ---------- .> app:         redash:0x7fe96dc78e50
- ** ---------- .> transport:   redis://:**@redash-redis-master:6379/0
- ** ---------- .> results:     redis://:**@redash-redis-master:6379/0
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> queries          exchange=queries(direct) key=queries
                

[tasks]
  . redash.tasks.check_alerts_for_query
  . redash.tasks.cleanup_query_results
  . redash.tasks.empty_schedules
  . redash.tasks.execute_query
  . redash.tasks.record_event
  . redash.tasks.refresh_queries
  . redash.tasks.refresh_schema
  . redash.tasks.refresh_schemas
  . redash.tasks.send_aggregated_errors
  . redash.tasks.send_mail
  . redash.tasks.subscribe
  . redash.tasks.sync_user_details
  . redash.tasks.version_check

[2022-06-30 08:39:32,086][PID:6][ERROR][MainProcess] consumer: Cannot connect to redis://:**@redash-redis-master:6379/0: Error 111 connecting to redash-redis-master:6379. Connection refused..
Trying again in 2.00 seconds...

[2022-06-30 08:39:35,093][PID:6][ERROR][MainProcess] consumer: Cannot connect to redis://:**@redash-redis-master:6379/0: Error 111 connecting to redash-redis-master:6379. Connection refused..
Trying again in 4.00 seconds...

Logs(dev/redash-c5bbb5cc4-d9gj9:redash-server)[1m]


Using Database: postgresql://redash:******@redash-postgresql:5432/redash
Using Redis: redis://:******@redash-redis-master:6379/0
[2022-06-30 08:39:28 +0000] [6] [INFO] Starting gunicorn 19.7.1
[2022-06-30 08:39:28 +0000] [6] [INFO] Listening at: http://0.0.0.0:5000 (6)
[2022-06-30 08:39:28 +0000] [6] [INFO] Using worker: sync
[2022-06-30 08:39:28 +0000] [10] [INFO] Booting worker with pid: 10
[2022-06-30 08:39:28 +0000] [12] [INFO] Booting worker with pid: 12
[2022-06-30 08:39:28 +0000] [14] [INFO] Booting worker with pid: 14
[2022-06-30 08:39:28 +0000] [16] [INFO] Booting worker with pid: 16
[2022-06-30 08:39:29,387][PID:10][DEBUG][redash.query_runner] Registering PostgreSQL (pg) query runner.
[2022-06-30 08:39:29,387][PID:10][DEBUG][redash.query_runner] Registering Redshift (redshift) query runner.
[2022-06-30 08:39:29,388][PID:10][DEBUG][redash.query_runner] Registering CockroachDB (cockroach) query runner.
[2022-06-30 08:39:29,390][PID:10][DEBUG][redash.destinations] Registering Mattermost (mattermost) destinations.
[2022-06-30 08:39:29,394][PID:12][DEBUG][redash.query_runner] Registering PostgreSQL (pg) query runner.
[2022-06-30 08:39:29,395][PID:12][DEBUG][redash.query_runner] Registering Redshift (redshift) query runner.
[2022-06-30 08:39:29,395][PID:12][DEBUG][redash.query_runner] Registering CockroachDB (cockroach) query runner.
[2022-06-30 08:39:29,395][PID:12][DEBUG][redash.destinations] Registering Mattermost (mattermost) destinations.
[2022-06-30 08:39:29,424][PID:16][DEBUG][redash.query_runner] Registering PostgreSQL (pg) query runner.
[2022-06-30 08:39:29,424][PID:16][DEBUG][redash.query_runner] Registering Redshift (redshift) query runner.
[2022-06-30 08:39:29,424][PID:16][DEBUG][redash.query_runner] Registering CockroachDB (cockroach) query runner.
[2022-06-30 08:39:29,424][PID:16][DEBUG][redash.destinations] Registering Mattermost (mattermost) destinations.
[2022-06-30 08:39:29,461][PID:16][DEBUG][passlib.utils.compat] loaded lazy attr 'SafeConfigParser': <class ConfigParser.SafeConfigParser at 0x7f56f5af5600>
[2022-06-30 08:39:29,461][PID:16][DEBUG][passlib.utils.compat] loaded lazy attr 'NativeStringIO': <built-in function StringIO>
[2022-06-30 08:39:29,461][PID:16][DEBUG][passlib.utils.compat] loaded lazy attr 'BytesIO': <built-in function StringIO>
[2022-06-30 08:39:29,465][PID:10][DEBUG][passlib.utils.compat] loaded lazy attr 'SafeConfigParser': <class ConfigParser.SafeConfigParser at 0x7f56f5af4600>
[2022-06-30 08:39:29,465][PID:10][DEBUG][passlib.utils.compat] loaded lazy attr 'NativeStringIO': <built-in function StringIO>
[2022-06-30 08:39:29,466][PID:10][DEBUG][passlib.utils.compat] loaded lazy attr 'BytesIO': <built-in function StringIO>
[2022-06-30 08:39:29,482][PID:14][DEBUG][redash.query_runner] Registering PostgreSQL (pg) query runner.
[2022-06-30 08:39:29,482][PID:14][DEBUG][redash.query_runner] Registering Redshift (redshift) query runner.
[2022-06-30 08:39:29,482][PID:14][DEBUG][redash.query_runner] Registering CockroachDB (cockroach) query runner.
[2022-06-30 08:39:29,482][PID:14][DEBUG][redash.destinations] Registering Mattermost (mattermost) destinations.
[2022-06-30 08:39:29,495][PID:12][DEBUG][passlib.utils.compat] loaded lazy attr 'SafeConfigParser': <class ConfigParser.SafeConfigParser at 0x7f56f5af4600>
[2022-06-30 08:39:29,496][PID:12][DEBUG][passlib.utils.compat] loaded lazy attr 'NativeStringIO': <built-in function StringIO>
[2022-06-30 08:39:29,496][PID:12][DEBUG][passlib.utils.compat] loaded lazy attr 'BytesIO': <built-in function StringIO>
[2022-06-30 08:39:29,535][PID:14][DEBUG][passlib.utils.compat] loaded lazy attr 'SafeConfigParser': <class ConfigParser.SafeConfigParser at 0x7f56f5af5600>
[2022-06-30 08:39:29,535][PID:14][DEBUG][passlib.utils.compat] loaded lazy attr 'NativeStringIO': <built-in function StringIO>
[2022-06-30 08:39:29,536][PID:14][DEBUG][passlib.utils.compat] loaded lazy attr 'BytesIO': <built-in function StringIO>
[2022-06-30 08:40:16 +0000] [6] [CRITICAL] WORKER TIMEOUT (pid:12)
[2022-06-30 08:40:16 +0000] [12] [INFO] Worker exiting (pid: 12)
[2022-06-30 08:40:16 +0000] [22] [INFO] Booting worker with pid: 22
[2022-06-30 08:40:17,368][PID:22][DEBUG][redash.query_runner] Registering PostgreSQL (pg) query runner.
[2022-06-30 08:40:17,369][PID:22][DEBUG][redash.query_runner] Registering Redshift (redshift) query runner.
[2022-06-30 08:40:17,369][PID:22][DEBUG][redash.query_runner] Registering CockroachDB (cockroach) query runner.
[2022-06-30 08:40:17,369][PID:22][DEBUG][redash.destinations] Registering Mattermost (mattermost) destinations.
[2022-06-30 08:40:17,399][PID:22][DEBUG][passlib.utils.compat] loaded lazy attr 'SafeConfigParser': <class ConfigParser.SafeConfigParser at 0x7f56f5af5600>
[2022-06-30 08:40:17,399][PID:22][DEBUG][passlib.utils.compat] loaded lazy attr 'NativeStringIO': <built-in function StringIO>
[2022-06-30 08:40:17,399][PID:22][DEBUG][passlib.utils.compat] loaded lazy attr 'BytesIO': <built-in function StringIO>
[2022-06-30 08:40:25 +0000] [6] [CRITICAL] WORKER TIMEOUT (pid:16)
[2022-06-30 08:40:25 +0000] [16] [INFO] Worker exiting (pid: 16)
[2022-06-30 08:40:26 +0000] [25] [INFO] Booting worker with pid: 25

Logs(dev/redash-genericworker-65bb9df79d-tt8kw:redash-genericworker)[1m]

Using Redis: redis://:******@redash-redis-master:6379/0
Starting 1 workers for queues: periodic,emails,default...
[2022-06-30 08:39:29,513][PID:6][DEBUG][redash.query_runner] Registering PostgreSQL (pg) query runner.
[2022-06-30 08:39:29,513][PID:6][DEBUG][redash.query_runner] Registering Redshift (redshift) query runner.
[2022-06-30 08:39:29,514][PID:6][DEBUG][redash.query_runner] Registering CockroachDB (cockroach) query runner.
[2022-06-30 08:39:29,515][PID:6][DEBUG][redash.destinations] Registering Mattermost (mattermost) destinations.
[2022-06-30 08:39:29,714][PID:6][DEBUG][passlib.utils.compat] loaded lazy attr 'SafeConfigParser': <class ConfigParser.SafeConfigParser at 0x7f7f08980460>
[2022-06-30 08:39:29,714][PID:6][DEBUG][passlib.utils.compat] loaded lazy attr 'NativeStringIO': <built-in function StringIO>
[2022-06-30 08:39:29,714][PID:6][DEBUG][passlib.utils.compat] loaded lazy attr 'BytesIO': <built-in function StringIO>
 
 -------------- celery@redash-genericworker-65bb9df79d-tt8kw v4.3.0 (rhubarb)
---- **** ----- 
--- * ***  * -- Linux-5.15.37-051537-generic-x86_64-with-debian-10.0 2022-06-30 08:39:30
-- * - **** --- 
- ** ---------- [config]
- ** ---------- .> app:         redash:0x7f7f04809f10
- ** ---------- .> transport:   redis://:**@redash-redis-master:6379/0
- ** ---------- .> results:     redis://:**@redash-redis-master:6379/0
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> default          exchange=default(direct) key=default
                .> emails           exchange=emails(direct) key=emails
                .> periodic         exchange=periodic(direct) key=periodic

[tasks]
  . redash.tasks.check_alerts_for_query
  . redash.tasks.cleanup_query_results
  . redash.tasks.empty_schedules
  . redash.tasks.execute_query
  . redash.tasks.record_event
  . redash.tasks.refresh_queries
  . redash.tasks.refresh_schema
  . redash.tasks.refresh_schemas
  . redash.tasks.send_aggregated_errors
  . redash.tasks.send_mail
  . redash.tasks.subscribe
  . redash.tasks.sync_user_details
  . redash.tasks.version_check

[2022-06-30 08:39:31,798][PID:6][ERROR][MainProcess] consumer: Cannot connect to redis://:**@redash-redis-master:6379/0: Error 111 connecting to redash-redis-master:6379. Connection refused..
Trying again in 2.00 seconds...

[2022-06-30 08:39:34,805][PID:6][ERROR][MainProcess] consumer: Cannot connect to redis://:**@redash-redis-master:6379/0: Error 111 connecting to redash-redis-master:6379. Connection refused..
Trying again in 4.00 seconds...

Logs(dev/redash-redis-master-0:redis)[1m]

redis 08:39:31.47 INFO  ==> ** Starting Redis **
1:C 30 Jun 2022 08:39:31.491 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 30 Jun 2022 08:39:31.491 # Redis version=6.0.8, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 30 Jun 2022 08:39:31.491 # Configuration loaded
1:M 30 Jun 2022 08:39:31.493 * Running mode=standalone, port=6379.
1:M 30 Jun 2022 08:39:31.493 # Server initialized
1:M 30 Jun 2022 08:39:31.494 * Ready to accept connections

Logs(dev/redash-scheduledworker-5dc5c9c89f-mm8qf:redash-scheduledworker)[1m]

[2022-06-30 08:41:48,921][PID:6][ERROR][MainProcess] consumer: Cannot connect to redis://:**@redash-redis-master:6379/0: Error 110 connecting to redash-redis-master:6379. Connection timed out..
Trying again in 6.00 seconds...

Logs(dev/redash-scheduler-7b5865c4-sbsfj:redash-scheduler)[1m]

[2022-06-30 08:41:48,918][PID:6][ERROR][MainProcess] consumer: Cannot connect to redis://:**@redash-redis-master:6379/0: Error 110 connecting to redash-redis-master:6379. Connection timed out..
Trying again in 6.00 seconds...

[2022-06-30 08:42:13,493][PID:11][ERROR][Beat] beat: Connection error: Error 110 connecting to redash-redis-master:6379. Connection timed out.. Trying again in 2.0 seconds...
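
Two different error numbers show up in the logs above, and they point at different failure modes. A minimal sketch, assuming the Linux errno values (111 = ECONNREFUSED, 110 = ETIMEDOUT), that tallies them from pasted log text. The names `classify`, `ERROR_RE`, and `LINUX_ERRNO` are made up for this illustration and are not part of Redash or Celery.

```python
import re
from collections import Counter

# Linux errno values seen in the logs above (assumption: pods run on Linux).
LINUX_ERRNO = {
    110: "ETIMEDOUT: no reply arrived before the timeout",
    111: "ECONNREFUSED: the connection was actively rejected",
}

# Matches e.g. "Error 111 connecting to redash-redis-master:6379."
ERROR_RE = re.compile(r"Error (\d+) connecting to ([\w.-]+):(\d+)")


def classify(log_text):
    """Count connection errors per (host, port, errno) in raw log text."""
    counts = Counter()
    for match in ERROR_RE.finditer(log_text):
        code = int(match.group(1))
        host = match.group(2)
        port = int(match.group(3))
        counts[(host, port, code)] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "consumer: Cannot connect to redis://:**@redash-redis-master:6379/0: "
        "Error 111 connecting to redash-redis-master:6379. Connection refused..\n"
        "beat: Connection error: Error 110 connecting to redash-redis-master:6379. "
        "Connection timed out..\n"
    )
    for (host, port, code), n in sorted(classify(sample).items()):
        label = LINUX_ERRNO.get(code, "unknown errno")
        print(f"{host}:{port} errno {code} x{n} -> {label}")
```

A refusal means something answered and rejected the connection, which is typically also what a Kubernetes Service returns while it has no ready endpoints (at least with kube-proxy in iptables mode). A timeout means no reply came back at all, which would fit packets being dropped somewhere on the path.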

Here are my settings.

main.tf

resource "helm_release" "redash" {
  name       = "redash"
  repository = "https://getredash.github.io/contrib-helm-chart/"
  chart      = "redash"
  version    = "3.0.0"
  namespace  = "dev"

  set {
    name  = "postgresql.postgresqlPassword"
    value = "password123verystrong"
  }

  set {
    name  = "redash.cookieSecret"
    value = "secret"
  }

  set {
    name  = "redash.secretKey"
    value = "key"
  }

  set {
    name  = "image.tag"
    value = "latest"
  }

  set {
    name  = "postgresql.image.repository"
    value = "bitnami/postgresql"
  }

  set {
    name  = "postgresql.image.tag"
    value = "9.6.17-debian-10-r3"
  }

  set {
    name  = "redis.image.repository"
    value = "bitnami/redis"
  }

  set {
    name  = "redis.image.tag"
    value = "6.0.8-debian-10-r0"
  }

  set {
    name  = "server.podSecurityContext.runAsNonRoot"
    value = true
  }

  set {
    name  = "server.podSecurityContext.runAsUser"
    value = 1000
  }

  set {
    name  = "adhocWorker.podSecurityContext.runAsNonRoot"
    value = true
  }

  set {
    name  = "adhocWorker.podSecurityContext.runAsUser"
    value = 1000
  }

  set {
    name  = "scheduledWorker.podSecurityContext.runAsNonRoot"
    value = true
  }

  set {
    name  = "scheduledWorker.podSecurityContext.runAsUser"
    value = 1000
  }

  set {
    name  = "scheduler.podSecurityContext.runAsNonRoot"
    value = true
  }

  set {
    name  = "scheduler.podSecurityContext.runAsUser"
    value = 1000
  }

  set {
    name  = "genericWorker.podSecurityContext.runAsNonRoot"
    value = true
  }

  set {
    name  = "genericWorker.podSecurityContext.runAsUser"
    value = 1000
  }

  set {
    name  = "hookInstallJob.podSecurityContext.runAsNonRoot"
    value = true
  }

  set {
    name  = "hookInstallJob.podSecurityContext.runAsUser"
    value = 1000
  }

  set {
    name  = "hookUpgradeJob.podSecurityContext.runAsNonRoot"
    value = true
  }

  set {
    name  = "hookUpgradeJob.podSecurityContext.runAsUser"
    value = 1000
  }

  set {
    name  = "ingress.enabled"
    value = "true"
  }

  set {
    name  = "ingress.hosts[0].host"
    value = "someurl"
  }

  set {
    name  = "ingress.hosts[0].paths[0]"
    value = "/"
  }

  set {
    name  = "ingress.annotations.traefik\\.ingress\\.kubernetes\\.io\\/router\\.tls"
    value = "true"
    type  = "string"
  }

  set {
    name  = "redash.enabledQueryRunners"
    value = "redash.query_runner.pg"
  }

  set {
    name  = "redash.enabledDestinations"
    value = "redash.destinations.mattermost"
  }
}
@Rusiecki
Author

Rusiecki commented Jul 4, 2022

@grugnog any idea?

@grugnog
Collaborator

grugnog commented Jul 7, 2022

@Rusiecki nothing really from the above - it seems like a cluster network issue of some kind, but I can't come up with any ideas from the information above, particularly if the same config works locally. My guess is that there is some additional network layer or policy enforcement on the corporate cluster that is denying the connection?
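
One way to narrow this down is to probe the Redis Service from a pod in the same namespace and see whether the name resolves, and whether the port is open, refused, or silently dropped. The following is only a rough sketch, assuming a Python 3 interpreter is available in some pod in the `dev` namespace (the log output above suggests the Redash image itself may be running Python 2, so it may not run there as-is). `probe` is a hypothetical helper, not something shipped with the chart.

```python
import errno
import socket


def probe(host, port, timeout=5.0):
    """Try one TCP connection and return a short label for the outcome."""
    try:
        # Resolve first so DNS problems are reported separately.
        addr = socket.gethostbyname(host)
    except socket.gaierror as exc:
        return f"dns-failure: {exc}"
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return f"open ({addr})"
    except ConnectionRefusedError:
        # Something answered and rejected the connection (errno 111 on Linux).
        return f"refused ({addr})"
    except (socket.timeout, TimeoutError):
        # No answer within the timeout; packets may be dropped on the way.
        return f"timeout ({addr})"
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, "unknown")
        return f"error {name} ({addr})"


if __name__ == "__main__":
    # Host and port are taken from the logs in this issue.
    print(probe("redash-redis-master", 6379))
```

If the result differs between a pod scheduled next to Redis and one on another node, that would point further towards a network layer or policy on the cluster rather than the chart configuration.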

@manvindar

Can we bump the Redis Helm chart version from 10 to 17? There are breaking changes if we do.

@grugnog
Collaborator

grugnog commented Jul 22, 2022

@manvindar PR would be welcome - not sure if that is related to this issue though.

@jversolatocreditas

Any idea?
