Introducing primeGraph: Workflow Automation Made Simple
Learn about our first open-source project: a Python library for building complex workflows using graphs as building blocks.
Today, we're excited to introduce primeGraph, our first open-source contribution to the automation ecosystem. primeGraph is a powerful Python library designed to simplify the creation and execution of complex workflows using graphs as building blocks.
The Problem
Building complex data pipelines and workflows often involves:
- Managing dependencies between tasks
- Handling parallel execution
- Implementing proper error handling and recovery
- Monitoring progress and performance
- Maintaining state across different stages
Traditional solutions often require significant boilerplate code or lock you into specific platforms.
The primeGraph Solution
primeGraph provides a clean, intuitive API for defining workflows as directed graphs where:
- Nodes represent individual tasks or operations
- Edges define dependencies and data flow
- Built-in features handle parallel execution, state management, and monitoring for you
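To make the nodes-and-edges model concrete, here is a minimal sketch that uses only Python's standard library (`graphlib`, available from Python 3.9). It is not primeGraph's implementation or API: the task functions, the `nodes` and `edges` dictionaries, and the `run_workflow` helper are all hypothetical names chosen for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each one receives the results of its dependencies.
def ingest(_deps):
    return [3, 1, 2]

def clean(deps):
    return sorted(deps["ingest"])

def summarize(deps):
    return sum(deps["clean"])

# Nodes map a name to a callable.
nodes = {"ingest": ingest, "clean": clean, "summarize": summarize}

# Edges map each node to the set of nodes it depends on.
edges = {"ingest": set(), "clean": {"ingest"}, "summarize": {"clean"}}

def run_workflow(nodes, edges):
    """Run every node after its dependencies and return all results by name."""
    results = {}
    for name in TopologicalSorter(edges).static_order():
        deps = {dep: results[dep] for dep in edges[name]}
        results[name] = nodes[name](deps)
    return results

results = run_workflow(nodes, edges)
```

The topological order guarantees that a node only runs once everything it depends on has produced a result, which is the core idea behind describing a workflow as a directed graph.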
Key Features
- Parallel Execution: Automatically identifies and executes independent tasks concurrently
- State Management: Built-in checkpointing and recovery mechanisms
- Real-time Monitoring: Track progress and performance metrics
- Flexible Design: Works with any Python function or class
- Type Safety: Full type annotations for a better development experience
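As an illustration of what executing independent tasks concurrently can look like, the sketch below combines the standard library's `graphlib.TopologicalSorter` with a thread pool. This is one possible approach under our own assumptions, not a description of how primeGraph schedules work internally; `run_parallel` and the shape of the `tasks` and `dependencies` dictionaries are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

def run_parallel(tasks, dependencies, max_workers=4):
    """Run tasks concurrently, starting each one once its dependencies finish.

    tasks: dict of name -> zero-argument callable (hypothetical shape).
    dependencies: dict of name -> set of names that must finish first.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results = {}
    running = {}  # future -> task name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies are all complete.
            for name in sorter.get_ready():
                running[pool.submit(tasks[name])] = name
            # Block until at least one running task finishes.
            finished, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                name = running.pop(future)
                results[name] = future.result()  # re-raises task errors
                sorter.done(name)  # unlocks tasks that depended on it
    return results

# Example: the two loaders are independent, so they can run at the same time.
tasks = {
    "load_users": lambda: ["ana", "bo"],
    "load_orders": lambda: [10, 20, 30],
    "report": lambda: "done",
}
dependencies = {
    "load_users": set(),
    "load_orders": set(),
    "report": {"load_users", "load_orders"},
}
results = run_parallel(tasks, dependencies)
```

Threads suit I/O-bound tasks; for CPU-bound work a process pool is usually the better fit.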
Getting Started
Installing primeGraph is simple:
pip install primegraph
Here's a basic example:
    from primegraph import Graph, Node

    # Define your workflow
    graph = Graph()

    # Add nodes
    data_ingestion = Node("ingest_data", func=load_data)
    data_processing = Node("process_data", func=clean_data)
    model_training = Node("train_model", func=train_model)

    # Define dependencies
    graph.add_edge(data_ingestion, data_processing)
    graph.add_edge(data_processing, model_training)

    # Execute the workflow
    result = graph.execute()
Advanced Configuration
For more complex workflows, you can configure additional options:
    # Configure with custom settings
    graph = Graph(
        max_workers=4,
        retry_policy=RetryPolicy(max_attempts=3),
        checkpoint_dir="/tmp/checkpoints"
    )
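To show what a retry policy with `max_attempts=3` means in practice, here is a small standalone sketch. `SimpleRetryPolicy`, its `delay_seconds` field, and `run_with_retry` are hypothetical names invented for this example; they are not primeGraph's `RetryPolicy` and make no claim about its behavior.

```python
import time
from dataclasses import dataclass

@dataclass
class SimpleRetryPolicy:
    """Hypothetical stand-in for illustration; not primeGraph's RetryPolicy."""
    max_attempts: int = 3
    delay_seconds: float = 0.0

def run_with_retry(func, policy):
    """Call func until it succeeds or max_attempts is reached."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == policy.max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(policy.delay_seconds)

# Example: a task that fails twice before succeeding.
calls = {"count": 0}

def flaky_task():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

outcome = run_with_retry(flaky_task, SimpleRetryPolicy(max_attempts=3))
```

A real policy would typically add backoff between attempts and retry only on errors known to be transient.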
Monitoring and Debugging
primeGraph includes built-in monitoring capabilities:
- Real-time progress tracking
- Performance metrics collection
- Error reporting and logging
You can enable monitoring with:
    graph.enable_monitoring(
        metrics_backend="prometheus",
        log_level="INFO"
    )
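For a sense of what progress tracking, timing, and error logging involve at the level of a single task, the sketch below wraps a function with the standard `logging` and `time` modules. It is a simplified illustration, not primeGraph's monitoring implementation and not a Prometheus integration; the `monitored` decorator and the in-memory `durations` store are hypothetical.

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("workflow")

# Hypothetical in-memory metrics store: task name -> list of durations (seconds).
durations = {}

def monitored(func):
    """Log the start, duration, and any failure of a task."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        logger.info("starting %s", func.__name__)
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.exception("%s failed", func.__name__)
            raise
        finally:
            # Runs on success and on failure, so every call is timed.
            elapsed = time.perf_counter() - start
            durations.setdefault(func.__name__, []).append(elapsed)
            logger.info("%s took %.3fs", func.__name__, elapsed)
    return wrapper

@monitored
def clean_data(rows):
    return [row.strip() for row in rows]

cleaned = clean_data(["  a ", "b  "])
```

In a production setup the recorded durations would be exported to a metrics system rather than kept in a dictionary.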
What's Next?
We're continuously improving primeGraph based on community feedback. Upcoming features include:
Short-term Roadmap
- Enhanced visualization tools
- Integration with popular orchestration platforms
- Advanced scheduling capabilities
Long-term Goals
- Cloud-native optimizations
- Distributed execution support
- Machine learning pipeline templates
Check out the GitHub repository to get started, contribute, or share your feedback.
Happy automating!