
Timeout during COPY command #2

Open
urish opened this issue Dec 23, 2022 · 17 comments

@urish

urish commented Dec 23, 2022

Thanks for the detailed instructions!

I'm trying this on my Supabase database acting as a primary server (Pro subscription with Small add-on), and a local testing database acting as the replica.

This goes fine until we get to the largest table, which is ~2GB in size. Then the replication fails due to a timeout on the primary side:

Log ID
f6bfc5c7-0bed-4dad-8cb7-0ecdebf7548a

Log Timestamp (UTC)
2022-12-23T16:50:06.306Z

Log Event Message
canceling statement due to statement timeout

Log Metadata
[
  {
    "file": "/var/log/postgresql/postgresql.csv",
    "host": "db-<redacted>",
    "parsed": [
      {
        "application_name": "pg_428762_sync_428374_7177698832877711389",
        "backend_type": "walsender",
        "command_tag": "COPY",
        "connection_from": "<redacted>:61499",
        "context": null,
        "database_name": "postgres",
        "detail": null,
        "error_severity": "ERROR",
        "hint": null,
        "internal_query": null,
        "internal_query_pos": null,
        "leader_pid": null,
        "location": null,
        "process_id": 2434,
        "query": "COPY public.\"ProjectFile\" TO STDOUT",
        "query_id": -4671124835743753000,
        "query_pos": null,
        "session_id": "63a5dbc4.982",
        "session_line_num": 5,
        "session_start_time": "2022-12-23 16:48:04 UTC",
        "sql_state_code": "57014",
        "timestamp": "2022-12-23 16:50:06.306 UTC",
        "transaction_id": 0,
        "user_name": "postgres",
        "virtual_transaction_id": "9/2447"
      }
    ],
    "parsed_from": null,
    "project": "<redacted>",
    "source_type": "file"
  }
]

Any advice on how to make this work?

@burggraf
Owner

Maybe try changing the timeout value(s): https://supabase.com/docs/guides/database/timeouts. But honestly, I've never seen this come up before, no matter how big the database is. Is this error coming from your Supabase (primary) database or your local database?
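
For example, that doc sets the timeout per role; since your earlier log shows the COPY running as the postgres user, something along these lines might be worth a try (the value is just an example, and 0 would disable the timeout entirely):

alter role postgres set statement_timeout = '10min';

Role-level settings only apply to new sessions, so the replication would need to reconnect afterwards.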

@burggraf
Owner

You might also look at the debugging page and adjust the number of workers or slots.
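
A quick way to see the current values on either side is the standard SHOW command (nothing Supabase-specific, just plain PostgreSQL):

show max_replication_slots;
show max_worker_processes;
show max_logical_replication_workers;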

@urish
Author

urish commented Dec 23, 2022

Thanks for the quick reply!

Primary, as far as I can tell. Looking at the log entry, it seems like it's written just a bit more than 2 minutes after the session begins:

        "session_start_time": "2022-12-23 16:48:04 UTC",
        "timestamp": "2022-12-23 16:50:06.306 UTC",

This seems to correspond with the global limit of 2 minutes mentioned in the timeouts doc.

Is there a way to temporarily lift this global limit so we can figure out if that is the reason or not?
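
(For reference, I assume the effective values can be inspected with standard queries like the ones below, though I'm not sure which role the replication connection ends up using:)

show statement_timeout;
select rolname, rolconfig from pg_roles where rolconfig is not null;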

@burggraf
Owner

That 2-minute limit refers to "users", as in users going through the JavaScript client (anon or authenticated users). That doesn't apply here, since this process runs inside the database.

This thread mentions possibly adjusting the wal_sender_timeout and wal_receiver_timeout settings, but I've never had to mess with those.
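
If you do want to experiment with them, the standard commands would be something like this (the 1h value is just an example) - wal_sender_timeout on the primary, wal_receiver_timeout on the replica:

alter system set wal_sender_timeout = '1h';    -- on the primary
alter system set wal_receiver_timeout = '1h';  -- on the replica
select pg_reload_conf();                       -- reload (or restart) so the change is picked up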

@urish
Author

urish commented Dec 23, 2022

Thanks for clarifying!

I'm actually not sure how to change these settings. Here's what I'm trying:

image

So it seems like I'm at least able to modify wal_sender_timeout, but when querying it through the Supabase admin panel, it still shows 1min:

image
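
(To double-check whether a change like that actually took effect, I believe this standard query shows the live value and where it came from:)

select name, setting, unit, source, pending_restart
from pg_settings
where name in ('wal_sender_timeout', 'wal_receiver_timeout');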

I'm going to try setting up the replication without Supabase (just two local PostgreSQL servers) to see whether it's something with my local setup or something related to Supabase.

P.S. I'm a newbie when it comes to PostgreSQL, so I hope I'm not making some trivial mistake and wasting your time here ;-)

@burggraf
Owner

You need to do a quick reset of the database after making changes. From the dashboard, go to the Settings page, then click the drop-down to the right of Restart Project and choose the Fast Database Reboot option. This effectively runs systemctl restart postgresql on the server and only takes a few seconds.

@urish
Author

urish commented Dec 23, 2022

Thanks! Meanwhile, I figured out that my local postgres server is at version 14.2, while my Supabase one seems to be on 15.1. I'll try to see if upgrading the local version helps.

@urish
Author

urish commented Dec 23, 2022

Update: It works locally (even when using 14.2 for the replica and 15.1 for the primary), so probably not a version issue.

I'll try playing around with the max_replication_slots / max_worker_processes / timeout settings in Supabase and see if this helps. At least now we have one example of a working setup!

@urish
Author

urish commented Dec 23, 2022

After running the following commands:

alter system set max_worker_processes to '8';
alter system set max_replication_slots to '8';

I restarted my database, and now it won't come up :-(
Also tried restarting the instance. Seems like I'm having bad luck!

Anyway, I contacted Supabase support to see if they could assist with bringing the instance up again.

@burggraf
Owner

I see you have a support thread for this going and that the current status of the database is RESTORING. So it looks like the db is being restored from a backup.

@urish
Author

urish commented Dec 23, 2022

Thanks for looking into this! I'll patiently wait then.

@burggraf
Owner

It's best to follow up with the support ticket and not here.

@urish
Author

urish commented Dec 23, 2022

Unfortunately, setting max_worker_processes and max_replication_slots (both to 8), plus wal_sender_timeout and wal_receiver_timeout (both to 1h), didn't help.

The replication keeps failing every two minutes in the COPY command for the 2GB table with "canceling statement due to statement timeout".
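
One thing I haven't tried yet: from what I read, the statement timeout could in principle be disabled just for the replication connection by passing it through the subscription's connection string, roughly like this (the subscription/publication names and credentials below are placeholders, and I'm not sure this is the right approach):

create subscription my_subscription
  connection 'host=db.<redacted>.supabase.co port=5432 dbname=postgres user=postgres password=<redacted> options=-cstatement_timeout=0'
  publication my_publication;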

Any other suggestions?

@burggraf
Owner

It looks like you have under 2GB of space available on your primary server, so maybe it's running out of space during the initial replication? The error you're getting doesn't make sense in this case, but maybe it's trying to copy the entire table during the initial process?
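
To see how big that table actually is on disk on the primary, something like this (standard PostgreSQL size functions) should tell you:

select pg_size_pretty(pg_total_relation_size('public."ProjectFile"'));
select pg_size_pretty(pg_database_size(current_database()));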

@urish
Author

urish commented Dec 23, 2022

The Supabase dashboard claims that I have around 5GB of space available on the primary server:

image

Looking at the network traffic, I can see the data being downloaded to the replica up to the point where the query times out. The replica log even shows how many records had been read before it timed out - 645082:

2022-12-23 20:56:52.101 UTC [30] LOG:  logical replication table synchronization worker for subscription "wokwi_prod_subscription", table "ProjectFile" has started
2022-12-23 20:58:16.872 UTC [22] LOG:  checkpoint starting: wal
2022-12-23 20:58:55.650 UTC [30] ERROR:  could not receive data from WAL stream: ERROR:  canceling statement due to statement timeout 
2022-12-23 20:58:55.650 UTC [30] CONTEXT:  COPY ProjectFile, line 645082

@burggraf
Owner

That dashboard figure may lag a bit. Looking at your server, there's 1.6GB available on the data partition. You can ask support to increase your disk size a bit and try again to see if that helps. Are you trying to replicate to a local system, say your laptop, or to another hosted server somewhere, or what?

@urish
Author

urish commented Dec 23, 2022

Trying to replicate to a local system (desktop), running PostgreSQL inside Docker (the official image, alpine flavor, latest 15.1). I managed to get it working between two local PostgreSQL servers running inside different Docker containers, using the same dataset.

I'll try contacting support about a larger data partition, thanks for the suggestion!
