问题详情
上海计费做故障切换测试,业务通过lvs+udal连接,shutdown teledb主机后,日志显示,dbproxy侧感知到backend connection关闭,但业务侧未感知连接异常,并且持续时间较久。其中,业务侧使用c++的长连接,代码连接的逻辑为:
客户异常测试分三个场景:
(1)业务正常连接lvs->dbproxy->teledb执行业务,将teledb主机宕机,观察到业务侧长时间未感知异常,上述截图的代码没有获取到异常,并且在将近20分钟后,才感知到异常。异常持续时间过久,不符合预期;
(2) 业务正常连接,然后将teledb 的主实例down掉服务(非机器),此时业务立即收到lost connection 的错误,业务马上感知到异常,符合预期;
(3)业务正常连接,手动点击teledb控制台的主从切换,此时业务立即收到异常,符合预期
在上述三个故障场景测试中,场景一的宕机业务长时间未感知,不符合客户预期,需要定位原因,并给出切实可行的方案,从而宕机时,客户能在一个较短时间内感知到异常。
解决方案
配置socket 套接字超时参数,
Java:socketTimeout参数,参考测试代码
private static void testJdbcMysqlPrepare() {
Connection connection = null;
PreparedStatement preparedStatement = null;
String JDBC_URL = "jdbc:mysql://xxxxxx/local1_1?useServerPrepStmts=true&autoReconnect=true&connectTimeout=6000&socketTimeout=6000";
String USERNAME = "xxxx";
String PASSWORD = "xxxx";
try {
connection = DriverManager.getConnection(JDBC_URL, USERNAME, PASSWORD);
connection.setAutoCommit(false); // 设置手动提交事务
String insertQuery = "INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES (?, ?, ?, '2022-09-05 00:00:00', 1)";
preparedStatement = connection.prepareStatement(insertQuery);
int i = 1;
while (true) {
preparedStatement.setString(1, String.valueOf(i));
preparedStatement.setString(2, "email" + i + "@example.com");
preparedStatement.setString(3, "name" + i);
preparedStatement.executeUpdate();
i = i + 2;
}
} catch (SQLException e) {
e.printStackTrace();
try {
if (connection != null) {
connection.rollback();
}
} catch (SQLException rollbackEx) {
rollbackEx.printStackTrace();
}
}
}
C++:MYSQL_OPT_READ_TIMEOUT和MYSQL_OPT_WRITE_TIMEOUT参数
参考代码:
mysql_options(conn, MYSQL_OPT_CONNECT_TIMEOUT, &connect_timeout);
mysql_options(conn, MYSQL_OPT_READ_TIMEOUT, &read_timeout);
mysql_options(conn, MYSQL_OPT_WRITE_TIMEOUT, &write_timeout);
// 连接到数据库
if (mysql_real_connect(conn, "101.34.85.247", "your_username", "your_password", "local1_1", 3306, NULL, 0) == NULL) {
std::cerr << "mysql_real_connect() failed\n";
mysql_close(conn);
return EXIT_FAILURE;
}
问题复现
在业务复现异常测试时,可以在本地分别对上述场景做复现,查看问题的表现。
本地模拟长连接的代码为:
try {
connection = DriverManager.getConnection(JDBC_URL, USERNAME, PASSWORD);
connection.setAutoCommit(false); // 设置手动提交事务
String insertQuery = "INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES (?, ?, ?, '2022-09-05 00:00:00', 1)";
preparedStatement = connection.prepareStatement(insertQuery);
int i = 1;
while (true) {
preparedStatement.setString(1, String.valueOf(i));
preparedStatement.setString(2, "email" + i + "@example.com");
preparedStatement.setString(3, "name" + i);
preparedStatement.executeUpdate();
i = i + 2;
}
}
在dbproxy
(1)本地应用连接dbproxy,同时将Mysql服务kill掉,观察业务应用和报文表现,
抓包显示,
在11:06:57最后一次发送交互后,将Mysqlkill掉,
瞬间,客户端感受到连接断开,发送Fin报文。
即Mysql服务down掉、主从切换时,TCP连接正常,客户端有感知。
此外,此时JDBC服务端报错:
java.sql.SQLNonTransientConnectionException: 127.0.0.1:3306/LOCAL_2 | Server shutdown in progress
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:108)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:95)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.StatementImpl.executeUpdateInternal(StatementImpl.java:1393)
at com.mysql.cj.jdbc.StatementImpl.executeLargeUpdate(StatementImpl.java:2353)
at com.mysql.cj.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1303)
at Main.insertLocalDbproxy(Main.java:128)
at Main.main(Main.java:55)
其中的createSQLException显示有异常连接的catch处理。
(2)本地应用连接dbproxy,将Mysql服务器断网/关机,观察报文表现:
在17:07进行断网:
直到17:12,发生tcp的fin报文交互
问题完成复现,客户端间隔很久才收到proxy层的fin 报文,进而触发断开连接的结果。
问题分析
在了解情况复现后,进一步地,同时抓取dbproxy和Mysql的报文,此时dbproxy即作为应用的服务端,同时作为底层Mysql的客户端,在此情况下观察报文的表现。
再次完成两次故障模拟,表现一样。
7月31日14:05,发生断网/关机的操作,此时,dbproxy作为服务端,报文表现
14:05发送execute sql stamt的request:
直到14:08才发生fin挥手的操作。
dbproxy作为客户端,报文表现为:
即14:05分后未收到tcp报文
此时,dbproxy日志的表现为:
2024-07-31 14:05:52.804 INFO [$_NIOREACTOR-2-RW] (ServerPrepareHandler.java:139) preparestatement execute id:1
2024-07-31 14:05:52.880 INFO [$_NIOREACTOR-2-RW] (ServerPrepareHandler.java:139) preparestatement execute id:1
2024-07-31 14:05:54.007 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 47/s, consume tps 47/s, events queue 0
2024-07-31 14:06:04.006 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:06:13.528 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=20,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:14.001 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:06:18.549 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=25,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:23.554 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=30,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:24.001 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:06:28.561 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=35,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:33.563 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=40,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:34.004 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:06:38.568 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=45,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:43.572 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=50,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:44.000 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:06:48.574 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=55,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:53.577 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=60,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:06:54.003 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:06:58.584 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=65,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:03.591 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=70,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:03.999 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:07:08.592 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=75,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:13.599 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=80,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:14.002 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:07:18.601 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=85,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:23.603 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=90,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:24.001 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:07:28.608 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=95,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:33.614 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=100,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:33.998 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:07:38.617 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=105,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:43.622 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=110,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:44.000 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:07:48.629 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=115,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:53.634 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=120,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:07:53.998 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:07:58.639 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=125,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:08:03.642 INFO [rssd-0] (RunningSlowSqlDetector.java:76) Detect running slow sql:connectionId=13,start=2024-07-31T14:05:52.880+0800,executed=130,user=zh@127.0.0.1:64535/LOCAL1,sql=INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)
2024-07-31 14:08:03.998 INFO [monitor-status-meter-0] (MonitorStatusMeter.java:51) Produce tps 0/s, consume tps 0/s, events queue 0
2024-07-31 14:08:08.176 INFO [$_NIOREACTOR-0-RW] (AbstractConnection.java:397) close connection,reason: program err:Operation timed out : MySQLConnection [id=1044, lastTime=1722405952882, schema=LOCAL1_2, old shema=LOCAL1_2, borrowed=true, fromSlaveDB=false, threadId=656592, charset=utf8mb4, txIsolation=2, autocommit=false, attachment=local1_2{INSERT INTO employee1 (employee_id, name, email, create_date, area_id)
VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)},buckets={{}}, respHandler=SingleNodeHandler [node=local1_2{INSERT INTO employee1 (employee_id, name, email, create_date, area_id)
VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)},buckets={{}}, packetId=0], host=101.34.85.247, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=true]
2024-07-31 14:08:08.190 INFO [$_NIOREACTOR-0-RW] (AbstractConnection.java:397) close connection,reason: Backend Database has been closed, because of the reason: program err:Operation timed out : ServerConnection [id=13, client=127.0.0.1:64535, user=zh, txIsolation=2, autocommit=false, schema=LOCAL1, writeQueueSize=0]
2024-07-31 14:08:08.200 ERROR [$_NIOREACTOR-0-RW] (NIOReactor.java:84) program err: con=MySQLConnection [id=1044, lastTime=1722405952882, schema=LOCAL1_2, old shema=LOCAL1_2, borrowed=true, fromSlaveDB=false, threadId=656592, charset=utf8mb4, txIsolation=2, autocommit=false, attachment=local1_2{INSERT INTO employee1 (employee_id, name, email, create_date, area_id)
VALUES ('14717', 'email14717@example.com', 'name14717', '2022-09-05 00:00:00', 1)},buckets={{}}, respHandler=null, host=101.34.85.247, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=true]
java.io.IOException: Operation timed out
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_351]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_351]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_351]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_351]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378) ~[?:1.8.0_351]
at com.ctg.udal.dbproxy.net.NIOSocketWR.asynRead(NIOSocketWR.java:182) ~[ctg-udal-dbproxy-2.8.6.jar:?]
at com.ctg.udal.dbproxy.net.AbstractConnection.asynRead(AbstractConnection.java:220) ~[ctg-udal-dbproxy-2.8.6.jar:?]
at com.ctg.udal.dbproxy.net.NIOReactor$RW.run(NIOReactor.java:81) ~[ctg-udal-dbproxy-2.8.6.jar:?]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_351]
直到14:08分,dbproxy才作为mysql的客户端,主动关闭连接,并且同时关闭后端、前端连接,并主动发送 fin报文到应用,应用感知到连接关闭
进一步地,查阅资料得知,
当宕机、网络故障时,正常交互的tcp连接不会主动发生异常:
由于应用程序通过 TCP 协议读写网络数据包,都是通过 TCP/IP 协议栈的 socket api 进行的,所以常规的套接字超时 socket timeout 同样适用于 JDBC 应用程序。虽然 TCP是面向连接的协议,但这里的连接是虚拟的,是动态的,也是不对等的,所以应用程序需要通过 socket timeout 来检测和感知网络层面 TCP 连接的异常,从而避免僵死连接造成的无限等待;(对这块感兴趣的朋友,可以关注笔者对 tcp/ip 协议栈,对 tcpdump/wireshark/packetdrill 工具的相关分享文章);
- 套接字连接建立后,对 socket 中数据的读写操作都是阻塞的(涉及到 CPU 用户态和内核态的切换以及系统调用),套接字超时即是读写 socket 底层数据时的阻塞超时时间;
- 调用 Socket.write()对 socket 进行写操作时,应用通过系统调用将数据传给本地操作系统内核的缓冲区之后就可以立即返回(控制权立即回到应用上),通过网络对底层数据进行远程传输的操作是由操作系统进行的,所以一般应用代码的 socket 写操作很快就会返回,一般不会发生长时间的阻塞(当然如果系统内核缓冲区由于网络故障满了的话,Socket.write()也会进入waiting阻塞状态,此时操作系统会尝试重新发包,当达到重试的次数时就会产生系统异常错误);
- 调用 Socket.read() 对 socket 进行读操作时,由于首先需要通过网络将 socket 底层的远程数据传输到本地,然后才能经由操作系统将底层数据返回给用户态的应用程序,所以一般应用代码的 socket 读操作会消耗一段时间,可能会因为长时间的阻塞而发生超时异常;
- 在网络连接发生异常或服务器异常崩溃宕机时,因为 TCP/IP 的工作机制, socket 无法检测到底层网络的异常,因此应用系统也就无法检测到跟 DBMS 之间的 TCP 连接是否处于断开状态,所以应用端如果没有配置套接字超时,应用就会无期限地等待 返回结果(这种连接也被称为死亡连接或僵尸连接 "dead connection.") ,为了避免这种僵尸连接,同样需要配置套接字超时;
在标黑处,并不会无限等待,而是操作系统、内核也会处理此种异常情况。
至此,分析出套接字超时参数,可以有效防止该异常情况。
新的测试代码:
Java测试的:
private static void testJdbcMysqlPrepare() {
Connection connection = null;
PreparedStatement preparedStatement = null;
//String JDBC_URL = "jdbc:mysql://101.34.85.247/local1_1?useServerPrepStmts=true&autoReconnect=true";
String JDBC_URL = "jdbc:mysql://xxxxxx/local1_1?useServerPrepStmts=true&autoReconnect=true&connectTimeout=6000&socketTimeout=6000";
String USERNAME = "xxxx";
String PASSWORD = "xxxx";
try {
connection = DriverManager.getConnection(JDBC_URL, USERNAME, PASSWORD);
connection.setAutoCommit(false); // 设置手动提交事务
String insertQuery = "INSERT INTO employee1 (employee_id, name, email, create_date, area_id) VALUES (?, ?, ?, '2022-09-05 00:00:00', 1)";
preparedStatement = connection.prepareStatement(insertQuery);
int i = 1;
while (true) {
preparedStatement.setString(1, String.valueOf(i));
preparedStatement.setString(2, "email" + i + "@example.com");
preparedStatement.setString(3, "name" + i);
preparedStatement.executeUpdate();
i = i + 2;
// Thread.sleep(100);
}
} catch (SQLException e) {
e.printStackTrace();
try {
if (connection != null) {
connection.rollback();
}
} catch (SQLException rollbackEx) {
rollbackEx.printStackTrace();
}
} finally {
// 关闭连接,释放资源
try {
if (preparedStatement != null) {
preparedStatement.close();
}
if (connection != null) {
connection.close();
}
} catch (SQLException e) {
e.printStackTrace();
}
}
}
配置socketTimeout超时后,同样的情况再次模拟,发现客户端在6s内马上感知到异常。
后续:
客户配合抓取了同样异常测试的报文:在异常时间点的报文显示如下:
在数据批量插入时,一个statement经历的报文交互有:
客户端(应用)发送Execute Statement的Request-> 服务端(proxy)发送ACK包-> 服务端(Proxy)发送 Response ok;
但是,Mysql主机关闭后,tcp连接没有返回预期的response ok ,完全符合本地模拟的情况。
业务使用了socket套接字超时后,可根据配置的时间感知到异常,问题得到解决。