make stream processing easier!!!
An amazing framework makes stream processing easier.
🚀 What is StreamX
StreamX is an easy-to-use and comprehensive one-stop stream processing framework with a visual interface platform based on the web,
it is characterized by simplicity, fast, and convenience. The entire process of stream processing tasks will be extremely convenient by using StreamX,
which includes configuration, development, testing, deployment, monitoring, operation, and maintenance.
It not only simplifies the process of developing streaming tasks through a well-designed programming model but also supports complex projects and a large number of task management.
StreamX standardizes the entire process of a project from configuration to development, testing, deployment, monitoring, and operation and maintenance. It encourages functional programming, defines the best programming method, and provides a series of out-of-the-box Connectors. Moreover, StreamX provides two sets of APIs: scala and java.
Based on this, StreamX makes developers only need to care about the core business part with low learning cost and low development requirements.
*Excellent development assistance framework
- Supports multiple versions of Flink (1.11,x, 1.12.x, 1.13 )
- A range of out-of-the-box connectors
- Support project compilation(maven)
- Configuration parameters online
- Support to start in Applicaion mode and Yarn-Per-Job mode
- Quick and easy routine operations (task start, stop, savepoint, resume from savepoint)
- Support flame graphs
- Support Notebook (Development online )
- Project configuration and dependencies could be managed by version
- Support task backup and rollback (configuration rollback)
- Management of dependencies (maven pom) and custom jars online
- Support custom udf, connector, etc.
- Flink SQL Web IDE
- Support catalog、hive
- Send an alert email when the task fails to run
- Support restart and retry when the task fails
- Full link support from task development stage to deployment management
🏳🌈 Architecture of StreamX
The overall architecture of StreamX is shown in the following figure. Streamx consists of three parts, they are streamx-core, streamx-pump, and streamx-console.
The positioning of
streamx-core is a framework uesd while developing, it focuses on coding development, regulates configuration files, and develops in the convention over configuration guide.
streamx-core provides a development-time RunTime Content and a series of out-of-the-box Connectors. Cumbersome operations are simplified by extending
DataStream-related methods and integrating DataStream and
Flink sql api .
development efficiency and development experience will be highly improved because users can focus on the business.
streamx-pump is a component similar to
flinkx which is used for data extraction. It is developed based on various connectors provided in
streamx-core and will be integrated into
The purpose of developing streamx-pump is to create a convenient, fast, out-of-the-box extraction and migration component for real-time big data. We expect it could solve the real-time data source fetching problem.
streamx-pump is still in planning.
streamx-console is a comprehensive real-time
low code data platform that can manage
Flink tasks more convenient.
It integrates the experience of many best practices and integrates many functions such as project compilation, release,
parameter configuration, startup,
Flink SQL, monitoring, etc.,
which greatly simplifies the daily operation of Flink tasks and maintenance. The ultimate goal is to create a one-stop big data platform,
which can provide a solution that integrates flow and batch, and integrates lake and warehouse.
This platform uses technologies including but not limited to:
- Apache Flink
- Apache YARN
- Spring Boot
- Flame Graph
- Ant Design of Vue
- ANTD PRO VUE
- Monaco Editor
Thanks for the respect given by the above excellent open source projects and many unmentioned excellent open source projects, especially appreciate the support provided by Apache Apache Zeppelin and IntelliJ IDEA for their support, and inspiration and help given by fire-spark project in early stage.
👻 Why not ...❓
Apache Zeppelin is an excellent open source project that supports
notebook function allows developers to easily program
on-line and submit tasks quickly. At the language level, Zeppelin supports
python at the same time.Whether it is a DataStream task or a Flink SQL task, most tasks will go through the development stage, the testing stage, the packaging stage, the uploading server stage, and the starting task stage，
this is a process with a long link, the whole process takes a long time, and the user experience is very unfriendly: even if a symbol is modified, the task has to go through the above process from the completion of the modification to the launch.
We expect these steps could be completed by One-click solution—moving the mouse and have at least one task list, which can manage tasks conveniently、could clearly see which tasks are running, which tasks are stopped、the resource consumption of each task、one-click start or stop tasks and manage savePoint automatically.
These problems are also problems that developers will encounter in actual development.
streamx-console can solve these pain points very well. It is positioned as a one-stop real-time data platform and has more exciting functions (such as
Flink SQL WebIDE,
FlinkX is a distributed data synchronization tool based on Flink, which realizes efficient data migration between various heterogeneous data sources, and its positioning is relatively clear - it is specially used for data extraction and migration. It could be used as a service component. StreamX focuses on the management of the development stage and the post-task stage, the positioning is different. The streamx-pump module is also being planned. It is wished to solve data source extraction and migration problem, and will eventually be integrated into the streamx-console.