testcontainers/testcontainers-java

HttpWaitStrategy does not retry to re-validate ResponsePredicate

Open

#2,516 创建于 2020年4月3日

在 GitHub 查看
 (4 评论) (0 反应) (0 负责人)Java (7,535 star) (1,588 fork)batch import
help wantedresolution/acknowledged

描述

using the HttpWaitStrategy with a forResponsePredicate does only try it once, although the responseBody will change when startup is completed. expectation would be that the strategy re-evaluated until timeout is reached and until it is successful.

test setup with a HDFS cluster where the NameNode is in safemode on startup and will change into normal operation mode only after DataNode is successfully registered:

public static final GenericContainer hadoopContainer = new GenericContainer("sequenceiq/hadoop-docker:2.7.1")
	.withExposedPorts(50070, 50075, 50090, 9000)
	.waitingFor(new WaitAllStrategy()
		.withStrategy(Wait.forLogMessage(".*localhost: starting nodemanager.*\n", 1))
		.withStrategy(Wait.forHttp("/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo")
				  .forPort(50070)
				  .forResponsePredicate(s -> !s.contains("Safe mode is ON.")))
		.withStartupTimeout(Duration.ofMinutes(2))
	);

the log output clearly only tries it once to validate the predicate:

2020-04-03 10:52:31.339 INFO [ontainers.wait.strategy.HttpWaitStrategy] [ducttape-0] - /magical_euler: Waiting for 120 seconds for URL: http://localhost:33077/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
2020-04-03 10:52:32.438 TRACE[ontainers.wait.strategy.HttpWaitStrategy] [ducttape-1] - Get response code 200
2020-04-03 10:52:32.440 TRACE[ontainers.wait.strategy.HttpWaitStrategy] [ducttape-1] - Get response {  "beans" : [ {    "name" : "Hadoop:service=NameNode,name=NameNodeInfo",    "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",    "UpgradeFinalized" : true,    "ClusterId" : "CID-5e691286-4de5-4dde-800b-c02a7a8bf44a",    "Version" : "2.7.1, r15ecc87ccf4a0228f35af08fc56de536e6ce657a",    "Used" : 335872,    "Free" : 29158596608,    "Safemode" : "Safe mode is ON. The reported blocks 31 has reached the threshold 0.9990 of total blocks 31. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 17 seconds.",    "NonDfsUsedSpace" : 33566855168,    "PercentUsed" : 5.354608E-4,    "BlockPoolUsedSpace" : 335872,    "PercentBlockPoolUsed" : 5.354608E-4,    "PercentRemaining" : 46.48582,    "CacheCapacity" : 0,    "CacheUsed" : 0,    "TotalBlocks" : 31,    "TotalFiles" : 35,    "NumberOfMissingBlocks" : 0,    "NumberOfMissingBlocksWithReplicationFactorOne" : 0,    "LiveNodes" : "{\"localhost:50010\":{\"infoAddr\":\"127.0.0.1:50075\",\"infoSecureAddr\":\"127.0.0.1:0\",\"xferaddr\":\"127.0.0.1:50010\",\"lastContact\":0,\"usedSpace\":335872,\"adminState\":\"In Service\",\"nonDfsUsedSpace\":33566855168,\"capacity\":62725787648,\"numBlocks\":31,\"version\":\"2.7.1\",\"used\":335872,\"remaining\":29158596608,\"blockScheduled\":0,\"blockPoolUsed\":335872,\"blockPoolUsedPercent\":5.354608E-4,\"volfails\":0}}",    "DeadNodes" : "{}",    "DecomNodes" : "{}",    "BlockPoolId" : "BP-1961412683-172.17.0.32-1450036414523",    "NameDirStatuses" : "{\"failed\":{},\"active\":{\"/tmp/hadoop-root/dfs/name\":\"IMAGE_AND_EDITS\"}}",    "NodeUsage" : "{\"nodeUsage\":{\"min\":\"0.00%\",\"median\":\"0.00%\",\"max\":\"0.00%\",\"stdDev\":\"0.00%\"}}",    "NameJournalStatus" : "[{\"stream\":\"EditLogFileOutputStream(/tmp/hadoop-root/dfs/name/current/edits_inprogress_0000000000000000192)\",\"manager\":\"FileJournalManager(root=/tmp/hadoop-root/dfs/name)\",\"required\":\"false\",\"disabled\":\"false\"}]",    "JournalTransactionInfo" : "{\"LastAppliedOrWrittenTxId\":\"192\",\"MostRecentCheckpointTxId\":\"191\"}",    "NNStarted" : "Fri Apr 03 04:52:15 EDT 2020",    "CompileInfo" : "2015-06-29T06:04Z by jenkins from (detached from 15ecc87)",    "CorruptFiles" : "[]",    "DistinctVersionCount" : 1,    "DistinctVersions" : [ {      "key" : "2.7.1",      "value" : 1    } ],    "SoftwareVersion" : "2.7.1",    "RollingUpgradeStatus" : null,    "Threads" : 35,    "Total" : 62725787648  } ]}
2020-04-03 10:54:12.156 ERROR[🐳 [sequenceiq/hadoop-docker:2.7.1]     ] [main] - Could not start container
org.rnorth.ducttape.TimeoutException: java.util.concurrent.TimeoutException
	at org.rnorth.ducttape.timeouts.Timeouts.callFuture(Timeouts.java:70) ~[duct-tape-1.0.8.jar:?]
	at org.rnorth.ducttape.timeouts.Timeouts.doWithTimeout(Timeouts.java:60) ~[duct-tape-1.0.8.jar:?]
	at org.testcontainers.containers.wait.strategy.WaitAllStrategy.waitUntilReady(WaitAllStrategy.java:53) ~[testcontainers-1.13.0.jar:?]
	at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:876) ~[testcontainers-1.13.0.jar:?]
	at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:428) ~[testcontainers-1.13.0.jar:?]
	at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:317) ~[testcontainers-1.13.0.jar:?]
	at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81) [duct-tape-1.0.8.jar:?]
	at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:315) [testcontainers-1.13.0.jar:?]
	at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:302) [testcontainers-1.13.0.jar:?]
	[...]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_242]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) [?:1.8.0_242]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [?:1.8.0_242]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) [?:1.8.0_242]
	at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:217) [junit-4.12.jar:4.12]
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.createTest(SpringJUnit4ClassRunner.java:226) [spring-test-5.2.3.RELEASE.jar:5.2.3.RELEASE]
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner$1.runReflectiveCall(SpringJUnit4ClassRunner.java:289) [spring-test-5.2.3.RELEASE.jar:5.2.3.RELEASE]
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.12.jar:4.12]
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.methodBlock(SpringJUnit4ClassRunner.java:291) [spring-test-5.2.3.RELEASE.jar:5.2.3.RELEASE]
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:246) [spring-test-5.2.3.RELEASE.jar:5.2.3.RELEASE]
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:97) [spring-test-5.2.3.RELEASE.jar:5.2.3.RELEASE]
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) [junit-4.12.jar:4.12]
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) [junit-4.12.jar:4.12]
	at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61) [spring-test-5.2.3.RELEASE.jar:5.2.3.RELEASE]
	at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70) [spring-test-5.2.3.RELEASE.jar:5.2.3.RELEASE]
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363) [junit-4.12.jar:4.12]
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:190) [spring-test-5.2.3.RELEASE.jar:5.2.3.RELEASE]
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137) [junit-4.12.jar:4.12]
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) [junit-rt.jar:?]
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) [junit-rt.jar:?]
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230) [junit-rt.jar:?]
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58) [junit-rt.jar:?]
Caused by: java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask.get(FutureTask.java:205) ~[?:1.8.0_242]
	at org.rnorth.ducttape.timeouts.Timeouts.callFuture(Timeouts.java:65) ~[duct-tape-1.0.8.jar:?]
	... 34 more

i am not sure if i am doing something wrong or this is actually a bug. (the response does change after ~10seconds and does not include the safemode info anymore, so that the predicate should succeed)

otherwise after looking into source code i suspect that the retryUntilScucess has to be replaced with retryUntilTrue (at least from description the former only does until success)

贡献者指南