Storing large files in action results appears to be limited by MongoDB's 16 MB document limit
#3600, opened on July 21, 2017
Description
I am evaluating StackStorm as a replacement for an internal tool. Our primary motivation is the ability to distribute actions across multiple machines by running several action runners. Our use case requires that some actions can access data generated by other actions, so I had hoped to store that data directly in the action result that StackStorm saves in MongoDB. However, action results appear to be restricted by MongoDB's 16 MB limit on documents. For example, if you run the core.http action against a URL that returns a large file, the execution fails:
root@b0024465368b:/# st2 execution get 59691e1266941d00f54a7283
id: 59691e1266941d00f54a7283
status: failed
parameters:
  url: http://foo.com/some_large_file
result:
  error: command document too large
  traceback: "  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2actions/worker.py", line 132, in _run_action
    result = self.container.dispatch(liveaction_db)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2actions/container/base.py", line 68, in dispatch
    action_db=action_db, liveaction_db=liveaction_db)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2actions/container/base.py", line 131, in _do_run
    raise e
"
Because our actions can run on any of several machines, storing the file on one action runner's local file system probably would not work: the other action runners would also need access to it.
Some workaround ideas include:
- Mounting an NFS share on all action runners and storing these payloads there.
- Having one action that downloads these files and stores them in S3, then puts a reference to that file in the action results to be used by downstream actions that process this data.
Both of these workarounds require actions to make assumptions about where and how shared data gets stored, and require additional actions that delete old data when action executions age out.
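To make the shape of these workarounds concrete, here is a rough sketch of the "store the payload elsewhere, keep only a reference in the result" pattern, using a shared directory as a stand-in for the NFS mount. The function names, the reference fields, and the age-based cleanup are all my own assumptions for illustration; none of this is an existing StackStorm API.

```python
import hashlib
import os
import time
import uuid


def store_payload(shared_dir, data):
    """Write data under shared_dir and return a small reference dict.

    shared_dir is assumed to be a path that every action runner can reach
    (for example an NFS mount). Only the reference goes into the action result.
    """
    path = os.path.join(shared_dir, uuid.uuid4().hex)
    with open(path, "wb") as handle:
        handle.write(data)
    return {
        "path": path,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def load_payload(reference):
    """Read the payload back in a downstream action and verify its checksum."""
    with open(reference["path"], "rb") as handle:
        data = handle.read()
    if hashlib.sha256(data).hexdigest() != reference["sha256"]:
        raise ValueError("payload checksum mismatch: %s" % reference["path"])
    return data


def purge_old_payloads(shared_dir, max_age_seconds, now=None):
    """Delete payload files older than max_age_seconds and return their paths."""
    now = time.time() if now is None else now
    removed = []
    for name in os.listdir(shared_dir):
        path = os.path.join(shared_dir, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > max_age_seconds:
            os.remove(path)
            removed.append(path)
    return removed
```

The cleanup function is the part I would rather not have to write: it has to be scheduled separately and kept in sync with however long executions are retained.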
It would be great if StackStorm could handle large action payloads transparently. MongoDB's GridFS is a feature that was introduced to handle files larger than 16 MB, so maybe that is a viable option.
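As a sketch of what I have in mind (not how StackStorm works today), the result handling could keep small payloads inline and push large ones to GridFS, storing only the file id in the result document. The helper names and the threshold below are hypothetical; `fs` is assumed to be a `gridfs.GridFS` instance from pymongo, of which only `put()` and `get()` are used.

```python
# 16 MB, the MongoDB document limit mentioned above.
MAX_DOCUMENT_BYTES = 16 * 1024 * 1024


def offload_if_large(fs, payload, threshold=MAX_DOCUMENT_BYTES // 2):
    """Keep small payloads inline; store large ones through fs.

    The default threshold (half the document limit) is an arbitrary
    assumption that leaves headroom for the rest of the result document.
    """
    if len(payload) <= threshold:
        return {"inline": payload}
    file_id = fs.put(payload)  # GridFS splits the data into chunks
    return {"gridfs_id": file_id, "size": len(payload)}


def resolve(fs, result):
    """Return the payload bytes, whether stored inline or in GridFS."""
    if "inline" in result:
        return result["inline"]
    return fs.get(result["gridfs_id"]).read()


# Possible wiring, assuming pymongo is installed and MongoDB is reachable
# (the database name is hypothetical):
#   import gridfs
#   from pymongo import MongoClient
#   fs = gridfs.GridFS(MongoClient().st2_payloads)
#   result = offload_if_large(fs, downloaded_bytes)
```

Downstream actions would then call something like `resolve()` instead of reading the result directly, and stored files could be deleted together with the execution record when it ages out.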