apache/dubbo

[Bug] Dubbo 3.3.2: FORCE_APPLICATION may leave one of multiple same-interface different-group consumer references with nacos-A(0) after startup when mapping is initially empty

Open

Opened on 14 May 2026

 (1 comment) (0 reactions) (0 assignees) Java (41,524 stars) (26,453 forks) batch import
help wanted

Description

Pre-check

  • I am sure that all the content I provide is in English.

Search before asking

  • I have searched the issues and found no similar issues.

Apache Dubbo Component

Java SDK (apache/dubbo)

Dubbo Version

  • Dubbo: 3.3.2
  • JDK: openjdk version "1.8.0_452"
  • OS: SUSE Linux Enterprise Server 15 SP7 (x86_64) - Kernel
  • Registry: Nacos
  • Protocol: tri
  • Migration mode: FORCE_APPLICATION

Steps to reproduce this issue

Environment

  • Dubbo version: 3.3.2
  • Registry: Nacos
  • Protocol: tri
  • Migration mode: FORCE_APPLICATION
  • One consumer JVM contains multiple references of the same interface but with different groups
  • enable-empty-protection=true

Consumer configuration:

dubbo.application.name=mng-consumer
dubbo.application.service-discovery.migration=FORCE_APPLICATION
dubbo.application.shutwait=30000
dubbo.application.enable-empty-protection=true

dubbo.reference.check=false
dubbo.reference.filter=-authenticationPrepare,-contextHolderParametersSelectedTransfer
dubbo.consumer.parameters.params-filter=-authenticationResolver,-authenticationExceptionTranslator
dubbo.consumer.parameters.router=-tag
dubbo.consumer.protocol=tri
dubbo.consumer.timeout=180000

dubbo.protocols.tri.name=tri
dubbo.protocols.tri.triple.max-response-body-size=52428800
dubbo.protocols.tri.triple.max-body-size=52428800

Scenario

In one consumer JVM, there are multiple references of the same interface but with different groups, for example:

  • cbsp-limt1/com.szfesc.cbsp.limt.api.LimitReProcServiceApi
  • cbsp-limt2/com.szfesc.cbsp.limt.api.LimitReProcServiceApi

Reproduction pattern

This issue does not occur on every startup. It appears to occur only when startup takes the path where the interface-to-application mapping is initially empty.

Observed startup log:

No interface-apps mapping found in local cache, stop subscribing, will automatically wait for mapping listener callback:
... group=cbsp-limt1&interface=com.szfesc.cbsp.limt.api.LimitReProcServiceApi ...

After that, mapping callback and Nacos app subscription logs can still be observed, for example:

[DUBBO] Received mapping notification from meta server, {serviceKey: com.szfesc.cbsp.limt.api.LimitReProcServiceApi, apps: [cbsp-limt1]}
[SUBSCRIBE-SERVICE] service:cbsp-limt1, group:RPC_GROUP, clusters:
new ips(1) service: RPC_GROUP@@cbsp-limt1 -> [...]

However, one concrete consumer reference may still have no available addresses in the QoS output:

As Consumer side:
+---------------------------------------------------------+----------+
|                  Consumer Service Name                  |    NUM   |
+---------------------------------------------------------+----------+
|cbsp-limt1/com.szfesc.cbsp.limt.api.LimitReProcServiceApi|nacos-A(0)|
+---------------------------------------------------------+----------+
|cbsp-limt2/com.szfesc.cbsp.limt.api.LimitReProcServiceApi|nacos-A(2)|
+---------------------------------------------------------+----------+

Restarting the consumer process makes it recover.

Important observation

For the affected reference, I do NOT see the final address notify log:

Notify service cbsp-limt1/com.szfesc.cbsp.limt.api.LimitReProcServiceApi:tri with urls ...

This suggests that the affected concrete reference may never be attached to the final application-level notify chain, even though the mapping callback and the Nacos app-level subscription both happen.
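
To check this observation against a full consumer log, a small scan can list every reference that hit the mapping-miss path but never produced a "Notify service" line. The following is a minimal sketch that relies only on the log fragments quoted in this report; the class and method names are hypothetical, and the line formats are assumptions that may differ in other Dubbo versions or logging layouts.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; not part of Dubbo.
final class MissingNotifyCheck {
    // Marker text copied from the startup log quoted in this report.
    private static final String MISS_MARKER = "No interface-apps mapping found in local cache";
    private static final Pattern GROUP = Pattern.compile("(?:^|[?&\\s])group=([^&\\s]+)");
    private static final Pattern INTERFACE = Pattern.compile("(?:^|[?&\\s])interface=([^&\\s]+)");
    // Assumed shape: "Notify service <group>/<interface>:<protocol> with urls ..."
    private static final Pattern NOTIFY = Pattern.compile("Notify service (\\S+)");

    /** Returns "group/interface" keys that hit the mapping-miss path and were never notified. */
    static Set<String> referencesWithoutNotify(String log) {
        Set<String> missed = new LinkedHashSet<>();
        Set<String> notified = new LinkedHashSet<>();
        boolean awaitingUrl = false;
        for (String line : log.split("\\r?\\n")) {
            if (line.contains(MISS_MARKER)) {
                awaitingUrl = true; // the subscribed URL follows the marker
            }
            if (awaitingUrl) {
                Matcher g = GROUP.matcher(line);
                Matcher i = INTERFACE.matcher(line);
                if (g.find() && i.find()) {
                    missed.add(g.group(1) + "/" + i.group(1));
                    awaitingUrl = false;
                }
            }
            Matcher n = NOTIFY.matcher(line);
            if (n.find()) {
                // Drop the ":<protocol>" suffix so the key matches "group/interface".
                notified.add(n.group(1).split(":")[0]);
            }
        }
        missed.removeAll(notified);
        return missed;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
                MISS_MARKER + ", stop subscribing, will automatically wait for mapping listener callback:",
                "... group=cbsp-limt1&interface=com.szfesc.cbsp.limt.api.LimitReProcServiceApi ...");
        System.out.println(referencesWithoutNotify(log));
    }
}
```

An empty result means every reference that started on the mapping-miss path eventually received an address notification. A non-empty result lists the references to inspect.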

What you expected to happen

I expect that, after the mapping callback arrives, every concrete consumer reference of the same interface in the JVM eventually completes the application-level subscribe flow and receives address notifications correctly.

In the example above, both references should eventually show a non-zero count in the QoS output, instead of one staying at nacos-A(0) until the consumer is restarted.
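
This expectation can be checked mechanically, for example in a startup smoke test. The sketch below parses the consumer table of the QoS output shown in this report and returns every reference whose address count is zero. The class and method names are hypothetical, and the row format is assumed to match the table in this report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for checking QoS consumer tables; not part of Dubbo.
final class QosEmptyReferenceCheck {
    // Matches a data row such as "|group/com.example.Api|nacos-A(0)|".
    // Border rows start with "+", and the header row fails because its name contains spaces.
    private static final Pattern ROW = Pattern.compile(
            "^\\|\\s*([^|\\s]+)\\s*\\|\\s*[^|\\s(]+\\((\\d+)\\)\\s*\\|$");

    /** Returns the consumer service names whose address count is zero. */
    static List<String> findEmptyReferences(String qosOutput) {
        List<String> empty = new ArrayList<>();
        for (String line : qosOutput.split("\\r?\\n")) {
            Matcher m = ROW.matcher(line.trim());
            if (m.matches() && Integer.parseInt(m.group(2)) == 0) {
                empty.add(m.group(1));
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        String table = String.join("\n",
                "+---------------------------------------------------------+----------+",
                "|                  Consumer Service Name                  |    NUM   |",
                "+---------------------------------------------------------+----------+",
                "|cbsp-limt1/com.szfesc.cbsp.limt.api.LimitReProcServiceApi|nacos-A(0)|",
                "+---------------------------------------------------------+----------+",
                "|cbsp-limt2/com.szfesc.cbsp.limt.api.LimitReProcServiceApi|nacos-A(2)|",
                "+---------------------------------------------------------+----------+");
        System.out.println(findEmptyReferences(table));
    }
}
```

A healthy consumer should yield an empty list once startup has settled.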

Anything else

Why I think this is a Dubbo bug instead of a user configuration problem

  • Providers are visible and healthy in Nacos.
  • Other references in the same JVM are normal.
  • The issue only affects one concrete reference while another reference of the same interface in the same process works.
  • Restarting the consumer fixes it.
  • This behavior looks like a startup race / recovery issue in application-level service discovery.

Relevant logs

Startup entered mapping-miss path:

No interface-apps mapping found in local cache, stop subscribing, will automatically wait for mapping listener callback:
... group=cbsp-limt1&interface=com.szfesc.cbsp.limt.api.LimitReProcServiceApi ...

Mapping callback was received:

[DUBBO] Received mapping notification from meta server, {serviceKey: com.szfesc.cbsp.limt.api.LimitReProcServiceApi, apps: [cbsp-limt1]}

Nacos app-level subscription happened:

[SUBSCRIBE-SERVICE] service:cbsp-limt1, group:RPC_GROUP, clusters:
new ips(1) service: RPC_GROUP@@cbsp-limt1 -> [{"instanceId":"10.111.0.195#20000#null#cbsp-limt1","ip":"10.111.0.195","port":20000,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"RPC_GROUP@@cbsp-limt1","metadata":{"dubbo.metadata-service.url-params":"{\"prefer.serialization\":\"hessian2,fastjson2\",\"version\":\"2.0.0\",\"dubbo\":\"2.0.2\",\"release\":\"3.3.2\",\"side\":\"provider\",\"port\":\"20000\",\"protocol\":\"tri\"}","dubbo.endpoints":"[{\"port\":20000,\"protocol\":\"tri\"}]","dubbo.metadata.revision":"c88088292faa51b8a2f2b0d9dfa6250b","dubbo.metadata.storage-type":"local","meta-v":"2.0.0","timestamp":"1778636485744"},"instanceHeartBeatInterval":5000,"instanceHeartBeatTimeOut":15000,"ipDeleteTimeout":30000}]

QoS output when the issue occurs:

As Consumer side:
+---------------------------------------------------------+----------+
|                  Consumer Service Name                  |    NUM   |
+---------------------------------------------------------+----------+
|    cbsp-bp01/com.szfesc.cbsp.bp.api.BpServiceJsonApi    |nacos-A(4)|
+---------------------------------------------------------+----------+
|    cbsp-bp02/com.szfesc.cbsp.bp.api.BpServiceJsonApi    |nacos-A(4)|
+---------------------------------------------------------+----------+
|   cbsp-bpclt01/com.szfesc.cbsp.bpclt.api.BpJsonCtrlApi  |nacos-A(2)|
+---------------------------------------------------------+----------+
|   cbsp-bpclt02/com.szfesc.cbsp.bpclt.api.BpJsonCtrlApi  |nacos-A(2)|
+---------------------------------------------------------+----------+
|cbsp-limt1/com.szfesc.cbsp.limt.api.LimitReProcServiceApi|nacos-A(0)|
+---------------------------------------------------------+----------+
|cbsp-limt2/com.szfesc.cbsp.limt.api.LimitReProcServiceApi|nacos-A(2)|
+---------------------------------------------------------+----------+

Current suspicion

This may be related to the recovery path of application-level service discovery when mapping is initially empty.

Suspicious code areas:

  • ServiceDiscoveryRegistry.doSubscribe()
  • ServiceDiscoveryRegistry.DefaultMappingListener.onEvent()
  • ServiceDiscoveryRegistry.subscribeURLs()
  • ServiceInstancesChangedListener.addListenerAndNotify()
  • ServiceInstancesChangedListener.notifyAddressChanged()
  • ServiceNameMapping.buildMappingKey()
  • NacosMetadataReport.getServiceAppMapping()
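
One detail from the logs may help when reading these code paths. The mapping notification carries a serviceKey that is the bare interface name, while each consumer reference is identified as group/interface. A single mapping notification therefore concerns every reference of that interface in the JVM. The sketch below only illustrates that relationship. Its helper names are hypothetical, and it does not reproduce ServiceNameMapping.buildMappingKey() or any other Dubbo internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; these helpers are not Dubbo APIs.
final class ReferenceKeys {
    /** Interface-level key, as seen in the "serviceKey" field of the mapping notification log. */
    static String mappingKeyOf(String consumerKey) {
        int slash = consumerKey.indexOf('/');
        return slash < 0 ? consumerKey : consumerKey.substring(slash + 1);
    }

    /** Groups "group/interface" consumer keys by their interface-level key. */
    static Map<String, List<String>> groupByMappingKey(List<String> consumerKeys) {
        Map<String, List<String>> byMappingKey = new LinkedHashMap<>();
        for (String key : consumerKeys) {
            byMappingKey.computeIfAbsent(mappingKeyOf(key), k -> new ArrayList<>()).add(key);
        }
        return byMappingKey;
    }

    public static void main(String[] args) {
        List<String> references = Arrays.asList(
                "cbsp-limt1/com.szfesc.cbsp.limt.api.LimitReProcServiceApi",
                "cbsp-limt2/com.szfesc.cbsp.limt.api.LimitReProcServiceApi");
        // Both references fall under the same interface-level key.
        System.out.println(groupByMappingKey(references));
    }
}
```

Under this model, a fix would need to ensure that a callback for one interface-level key resumes the subscription of every reference grouped under it, not only the first.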

Workaround

Changing:

dubbo.application.service-discovery.migration=FORCE_APPLICATION

to:

dubbo.application.service-discovery.migration=FORCE_INTERFACE

can avoid this issue, but this is only a workaround.

Frequency

This issue is not 100% reproducible on every startup. It appears under certain startup timing conditions, especially when mapping is initially empty and later recovered by callback.

If needed, I can provide more logs and help test a patch.

Do you have a (mini) reproduction demo?

  • Yes, I have a minimal reproduction demo to help resolve this issue more effectively!

Are you willing to submit a pull request to fix on your own?

  • Yes I am willing to submit a pull request on my own!

Code of Conduct
