pingcap/tidb
ddl: give me the original trace stack when cancelling a ddl job
Open
#27679 opened on Aug 30, 2021
help wanted, sig/sql-infra, type/enhancement
Description
Enhancement
Currently, when a DDL job encounters an error, TiDB stores the error message in the job and retries until the limit is reached. At that point, the DDL owner actively cancels the job and returns the error message, but without the trace stack info.
Example
So all you can see in the DDL owner's log is something like the following:
[2021/08/30 17:22:17.713 +08:00] [INFO] [ddl_worker.go:701]
["[ddl] DDL job is cancelled normally"] [worker="worker 1, tp general"] [error="pd unavailable"]
On the client side, all you get is:
[2021/08/30 17:22:17.714 +08:00] [INFO] [conn.go:997] ["command dispatched failed"] [conn=3] [connInfo="id:3, addr:127.0.0.1:58552 status:10, collation:utf8mb4_unicode_ci, user:root"] [command=Query] [status="inTxn:0, autocommit:1"] [sql=" create placement policy x PRIMARY_REGION=\"cn-east-1\" REGIONS=\"cn-east-1,cn-east-2\" LEADER_CONSTRAINTS=\"[+zone=cn-west-1]\" CONSTRAINTS=\"[+disk=ssd]\""] [txn_mode=OPTIMISTIC] [err="[ddl:-1]pd unavailable\ngithub.com/pingcap/errors.AddStack
/home/arenatlx/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20210425183316-da1aaba5fb63/errors.go:174\ngithub.com/pingcap/errors.Trace
/home/arenatlx/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20210425183316-da1aaba5fb63/juju_adaptor.go:15\ngithub.com/pingcap/tidb/ddl.(*ddl).doDDLJob
/home/arenatlx/go/src/github.com/pingcap/tidb/ddl/ddl.go:610\ngithub.com/pingcap/tidb/ddl.(*ddl).CreatePlacementPolicy
/home/arenatlx/go/src/github.com/pingcap/tidb/ddl/ddl_api.go:6208\ngithub.com/pingcap/tidb/executor.(*DDLExec).executeCreatePlacementPolicy
/home/arenatlx/go/src/github.com/pingcap/tidb/executor/ddl.go:912\ngithub.com/pingcap/tidb/executor.(*DDLExec).Next
/home/arenatlx/go/src/github.com/pingcap/tidb/executor/ddl.go:237\ngithub.com/pingcap/tidb/executor.Next
/home/arenatlx/go/src/github.com/pingcap/tidb/executor/executor.go:286\ngithub.com/pingcap/tidb/executor.(*ExecStmt).handleNoDelayExecutor
/home/arenatlx/go/src/github.com/pingcap/tidb/executor/adapter.go:581\ngithub.com/pingcap/tidb/executor.(*ExecStmt).handleNoDelay
/home/arenatlx/go/src/github.com/pingcap/tidb/executor/adapter.go:462\ngithub.com/pingcap/tidb/executor.(*ExecStmt).Exec
/home/arenatlx/go/src/github.com/pingcap/tidb/executor/adapter.go:411\ngithub.com/pingcap/tidb/session.runStmt
/home/arenatlx/go/src/github.com/pingcap/tidb/session/session.go:1674\ngithub.com/pingcap/tidb/session.(*session).ExecuteStmt
/home/arenatlx/go/src/github.com/pingcap/tidb/session/session.go:1568\ngithub.com/pingcap/tidb/server.(*TiDBContext).ExecuteStmt
/home/arenatlx/go/src/github.com/pingcap/tidb/server/driver_tidb.go:219\ngithub.com/pingcap/tidb/server.(*clientConn).handleStmt
/home/arenatlx/go/src/github.com/pingcap/tidb/server/conn.go:1843\ngithub.com/pingcap/tidb/server.(*clientConn).handleQuery
/home/arenatlx/go/src/github.com/pingcap/tidb/server/conn.go:1707\ngithub.com/pingcap/tidb/server.(*clientConn).dispatch
/home/arenatlx/go/src/github.com/pingcap/tidb/server/conn.go:1217\ngithub.com/pingcap/tidb/server.(*clientConn).Run
/home/arenatlx/go/src/github.com/pingcap/tidb/server/conn.go:979\ngithub.com/pingcap/tidb/server.(*Server).onConn
/home/arenatlx/go/src/github.com/pingcap/tidb/server/server.go:502\nruntime.goexit
/usr/lib/go/src/runtime/asm_amd64.s:1371"]
On both sides, all you have is "pd unavailable" plus a rough front-end stack. But when and where did the error actually originate? We need a detailed trace stack from the DDL owner's handling loop, like the one the front end already provides.