Automating Characterization Deployment in Distributed Data Stream Management Systems
Keywords:
RQS, SPS, Four-level feature extraction, Optimal resource configuration, Candidate settings, SPS-Storm
Abstract:
A Distributed Data Stream Management System (DDSMS) is composed of two layers: an upper layer, the Relational
Query System (RQS), and a lower layer, the Stream Processing System (SPS). After a query is submitted to the RQS,
the query planner needs to convert it into a directed acyclic graph (DAG) consisting of tasks that run on the SPS.
The SPS configures different deployment strategies based on the query requests and the data stream properties.
The approach introduces four-level feature extraction, which uses different query workloads as training sets to
predict resource usage. It then selects the optimal resource configuration from a set of candidate settings based
on the current query requests and stream properties. Finally, the approach is validated on the open-source SPS Storm.
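
The selection step described above (predict resource usage, then pick a configuration from candidate settings) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Candidate` fields, the linear `predict_demand` stand-in for the trained predictor, and the "fewest workers first" cost rule are hypothetical and are not taken from the paper or from the Storm API.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical deployment setting: worker processes and executors per worker."""
    workers: int
    executors_per_worker: int

    @property
    def capacity(self) -> int:
        # Total number of executors this setting provides.
        return self.workers * self.executors_per_worker


def predict_demand(num_operators: int, tuples_per_second: int,
                   tuples_per_executor: int = 1000) -> int:
    """Toy stand-in for the learned predictor (assumption, not the paper's model).

    Estimates how many executors a query needs from one query feature
    (operator count) and one stream property (arrival rate).
    """
    return math.ceil(num_operators * tuples_per_second / tuples_per_executor)


def select_configuration(candidates: Sequence[Candidate],
                         demand: int) -> Optional[Candidate]:
    """Return the cheapest candidate whose capacity covers the predicted demand.

    Cost is assumed to be (workers, capacity): prefer fewer workers, then the
    smallest capacity that still fits. Returns None if no candidate is feasible.
    """
    feasible = [c for c in candidates if c.capacity >= demand]
    if not feasible:
        return None
    return min(feasible, key=lambda c: (c.workers, c.capacity))


candidates = [Candidate(1, 2), Candidate(2, 2), Candidate(2, 4), Candidate(4, 4)]
demand = predict_demand(num_operators=3, tuples_per_second=2000)
best = select_configuration(candidates, demand)
print(demand, best)
```

With these made-up numbers the predicted demand is 6 executors, so the two smallest candidates are infeasible and the sketch picks `Candidate(workers=2, executors_per_worker=4)`. In the approach summarized above, the predictor would instead be trained on query workloads using the four-level features, and the cost rule would reflect the actual deployment objective.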