x/tools/cmd/goyacc: bad code generated for large grammars (1000 or more non-terminals)
#78251 opened on Mar 20, 2026
Description
Go version
go version go1.26.1 darwin/arm64
Output of go env in your module/workspace:
AR='ar'
CC='cc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='c++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/deepak/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/deepak/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/yw/mq1pwdbx3gd9077rsc65x7jw0000gn/T/go-build4184327577=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/deepak/src/varonis/dev.azure.com/VaronisIO/DA Cloud/pg-parser/go.mod'
GOMODCACHE='/Users/deepak/go/pkg/mod'
GONOPROXY='github.com/cyralinc/*,dev.azure.com/VaronisIO/*,github.com/lestrrat-go/*'
GONOSUMDB='github.com/cyralinc/*,dev.azure.com/VaronisIO/*'
GOOS='darwin'
GOPATH='/Users/deepak/go'
GOPRIVATE='github.com/cyralinc/*,dev.azure.com/VaronisIO/*'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.26.1/libexec'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/deepak/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.26.1/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.26.1'
GOWORK=''
What did you do?
Ran goyacc on a large grammar (1000 or more non-terminals) and used the generated parser.
What did you see happen?
When the grammar has fewer than 1000 non-terminals, the generated code behaves correctly. When it has 1000 or more, the generated parser fails to parse even simple inputs. I confirmed that changing the constant NOMORE from -1000 to something like -1000000 fixes the issue. NOMORE appears to be a sentinel value, so its absolute value needs to be larger than the number of non-terminals in the grammar.
The number of states and the other counts were all within the limits set by the other constants defined in the code.
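To illustrate the kind of collision I suspect, here is a toy model, not goyacc's actual code. It assumes that non-terminal i is reported to a processing loop as -i and that the same loop treats the sentinel as "no more work". The names nextItem and processAll are hypothetical and exist only for this sketch.

```go
package main

import "fmt"

// nextItem is a hypothetical stand-in for a "pick the next thing to process"
// routine. pending[i] > 0 means non-terminal i still needs processing.
// Non-terminal i is reported as -i; sentinel is returned when nothing is left.
func nextItem(pending []int, sentinel int) int {
	best, bestIdx := 0, 0
	for i := 1; i < len(pending); i++ {
		if pending[i] > best {
			best, bestIdx = pending[i], i
		}
	}
	if best == 0 {
		return sentinel
	}
	pending[bestIdx] = 0
	return -bestIdx
}

// processAll reports how many of n non-terminals the loop handles before it
// sees a value equal to the sentinel and stops.
func processAll(n, sentinel int) int {
	pending := make([]int, n+1)
	for i := 1; i <= n; i++ {
		pending[i] = 1
	}
	handled := 0
	for {
		item := nextItem(pending, sentinel)
		if item == sentinel {
			// With sentinel == -1000 this also fires for non-terminal 1000.
			break
		}
		handled++
	}
	return handled
}

func main() {
	for _, n := range []int{999, 1000, 1500} {
		fmt.Printf("n=%d sentinel=-1000    handled=%d\n", n, processAll(n, -1000))
		fmt.Printf("n=%d sentinel=-1000000 handled=%d\n", n, processAll(n, -1000000))
	}
}
```

In this model, a sentinel of -1000 makes the loop stop as soon as it reaches non-terminal 1000, so everything from that point on is silently skipped, while a sentinel of -1000000 lets all non-terminals through. Whether goyacc hits exactly this path is an assumption on my part, but it matches the behavior I see.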
What did you expect to see?
The generated code should work correctly for grammars with a large number of non-terminal symbols. 1000 is not a particularly large number.