PoolCluster stops queuing and fails with ETIMEDOUT · mysqljs/mysql#904

56 comments · 0 reactions · 0 assignees · JavaScript · 2,502 forks · batch import

help wanted, needs investigation

Repository metrics

Stars: 18,137
PR merge metrics: no merged PRs in 30d

Description

Hello everybody, I'm still investigating this headache, but I'd greatly appreciate any suggestions.

I'm using a PoolCluster with a limit of 10 connections per pool, selecting a pool and running N queries on it. I'm inserting a sample of 300k rows, at 1k rows per query (using the multiple queries feature). I already checked the generated SQL and it is fine. Everything works as long as I send 30 (exactly 30) consecutive queries (30k rows): 10 connections get pulled out of the pool, and the others get queued and are performed as soon as a connection is available. Up to this point everything is just great :)
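For reference, the access pattern described above looks roughly like the sketch below. The function name `runBatches`, the pool name and the `batches` array of ready-made SQL strings are placeholders for illustration, not the actual code; only `getConnection()`, `query()` and `release()` are the library's own calls.

```javascript
// Sketch of the pattern: request one connection per batch, all at once.
// `cluster` is a PoolCluster; `poolName` and `batches` (SQL strings) are placeholders.
function runBatches(cluster, poolName, batches, done) {
  let pending = batches.length;
  let failed = false;
  const results = new Array(batches.length);

  if (pending === 0) return done(null, results);

  batches.forEach((sql, i) => {
    // With connectionLimit 10, the first 10 calls get a connection right away;
    // the remaining calls are expected to wait in the pool's queue.
    cluster.getConnection(poolName, (err, connection) => {
      if (err) return finish(err);
      connection.query(sql, (queryErr, result) => {
        connection.release(); // hand the connection back so a queued call can proceed
        if (queryErr) return finish(queryErr);
        results[i] = result;
        finish(null);
      });
    });
  });

  // Report the first error, or all results once every batch has completed.
  function finish(err) {
    if (failed) return;
    if (err) {
      failed = true;
      return done(err);
    }
    if (--pending === 0) done(null, results);
  }
}
```

It would be called as `runBatches(cluster, 'POOL1', batches, callback)`, with 30 entries in `batches` for the working case.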

Running a test with 31 queries (30,500 rows), everything falls apart :/ The whole cluster goes down: the first getConnection() call receives an ETIMEDOUT error and all subsequent getConnection() calls receive an Error: Pool is closed error. Here you can find a sample error log with stack traces. I added a console.log() to the library (marked as mysql lib stack: Error: in the log), as suggested by this answer (the case looked quite similar to mine).
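To make the numbers concrete: at 1k rows per query, 30,000 rows give 30 queries and 30,500 rows give 31, the last one carrying 500 rows. A small sketch of that split, where `chunkRows` and the generated sample rows are hypothetical stand-ins for the real data:

```javascript
// Split an array of rows into batches of at most `size` rows each.
function chunkRows(rows, size) {
  const batches = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Hypothetical sample data: only the row count matters here.
const rows = Array.from({ length: 30500 }, (_, i) => ({ id: i }));
const batches = chunkRows(rows, 1000);

console.log(batches.length);     // 31
console.log(batches[30].length); // 500
```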

Increasing config values such as connectionLimit, acquireTimeout and removeNodeErrorCount, and explicitly setting waitForConnections and canRetry to true, didn't help either.
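For anyone reproducing this, a sketch of where these options go: some belong to the cluster and some to each pool. The values shown are, as far as I know, the documented defaults; the host, credentials, node id `'POOL1'` and the helper name `buildCluster` are placeholders. `restoreNodeTimeout` is included on the assumption that the installed release supports it, which older releases may not.

```javascript
// Cluster-level options, passed to mysql.createPoolCluster().
const clusterOptions = {
  canRetry: true,          // try to reconnect when a connection fails
  removeNodeErrorCount: 5, // a node is removed once its error count exceeds this
  restoreNodeTimeout: 0,   // 0: a removed node is never re-used
  defaultSelector: 'RR'    // round-robin node selection
};

// Pool-level options, passed per node to cluster.add().
// Host and credentials are placeholders.
const poolOptions = {
  host: 'localhost',
  user: 'app',
  password: 'secret',
  database: 'test',
  connectionLimit: 10,      // at most 10 open connections in this pool
  waitForConnections: true, // queue getConnection() calls when the pool is busy
  acquireTimeout: 10000,    // milliseconds allowed for acquiring a connection
  queueLimit: 0,            // 0: no limit on queued requests
  multipleStatements: true  // needed for several statements per query
};

// `mysql` is the required module, i.e. const mysql = require('mysql');
function buildCluster(mysql) {
  const cluster = mysql.createPoolCluster(clusterOptions);
  cluster.add('POOL1', poolOptions); // 'POOL1' is a placeholder node id
  return cluster;
}
```

Passing the module in as an argument keeps the sketch free of side effects: nothing connects until `buildCluster(require('mysql'))` is called.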

Checking the server's configuration, the numbers seem fine:

mysql> SHOW VARIABLES LIKE "%connect%";
+-----------------------------------------------+--------------------------------------------+
| Variable_name                                 | Value                                      |
+-----------------------------------------------+--------------------------------------------+
| character_set_connection                      | utf8                                       |
| collation_connection                          | utf8_general_ci                            |
| connect_timeout                               | 10                                         |
| disconnect_on_expired_password                | ON                                         |
| init_connect                                  | SET collation_connection = utf8_general_ci |
| max_connect_errors                            | 100                                        |
| max_connections                               | 151                                        |
| max_user_connections                          | 0                                          |
| performance_schema_session_connect_attrs_size | 512                                        |
+-----------------------------------------------+--------------------------------------------+
9 rows in set (0.00 sec)

Shouldn't the pool/PoolCluster feature handle this kind of situation by itself? What would be the best way to handle this kind of error and reconnect the lost pool?
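One possible stopgap on the application side, not a confirmed fix, is to retry getConnection() when it fails with ETIMEDOUT. The sketch below assumes the error exposes `err.code === 'ETIMEDOUT'`; the wrapper name `getConnectionWithRetry` and its `retries` and `delayMs` options are hypothetical and not part of the library.

```javascript
// Retry cluster.getConnection() a few times when it fails with ETIMEDOUT.
// `options.retries` (extra attempts after the first) and `options.delayMs`
// (pause between attempts) are hypothetical knobs of this wrapper.
function getConnectionWithRetry(cluster, pattern, options, callback) {
  const retries = options.retries;
  const delayMs = options.delayMs;
  let attempt = 0;

  function tryOnce() {
    cluster.getConnection(pattern, (err, connection) => {
      if (!err) return callback(null, connection);
      if (err.code === 'ETIMEDOUT' && attempt < retries) {
        attempt++;
        return setTimeout(tryOnce, delayMs);
      }
      callback(err); // other errors, or retries exhausted
    });
  }

  tryOnce();
}
```

Once the node has been removed from the cluster, retrying alone will presumably not bring it back, so this only covers the first ETIMEDOUT and not the Pool is closed errors that follow.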

As I said, I'm still investigating this and I'll add more information as soon as it comes up. I'm sorry I'm not able to attach a fiddle test case at this time.

Contributor guide

Research direction: Investigate the PoolCluster's connection queue logic and timeout handling, specifically the getConnection method. Trace how the queue builds up and why ETIMEDOUT errors occur when the number of connections exceeds the pool limit. Look at the ensureConnection and enqueueCallbacks functions. Consider the impact of the 'acquireTimeout' and 'waitForConnections' options.
Tech stack: JavaScript, Node.js
Domain: backend, database, API
Issue type: Bug
Difficulty: 2
Estimated time: 1-3 hours
Activity status: Active
Clarity: Mostly clear
Prerequisites: Git, Node.js, MySQL
Newbie friendliness: 75
