Tags: Open Source, Python, Workflows, primeGraph

Introducing primeGraph: Workflow Automation Made Simple

Learn about our first open-source project: a Python library for building complex workflows using graphs as building blocks.

Today, we're excited to introduce primeGraph, our first open-source contribution to the automation ecosystem. primeGraph is a powerful Python library designed to simplify the creation and execution of complex workflows using graphs as building blocks.

The Problem

Building complex data pipelines and workflows often involves:

  • Managing dependencies between tasks
  • Handling parallel execution
  • Implementing proper error handling and recovery
  • Monitoring progress and performance
  • Maintaining state across different stages

Traditional solutions often require significant boilerplate code or lock you into specific platforms.

The primeGraph Solution

primeGraph provides a clean, intuitive API for defining workflows as directed graphs where:

  • Nodes represent individual tasks or operations
  • Edges define dependencies and data flow
  • Built-in features handle the complexity for you
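To make the node-and-edge model concrete, here is a minimal sketch that uses only the Python standard library (`graphlib`, Python 3.9+), not primeGraph itself. The task names are illustrative assumptions; the point is only that once dependencies are written down as a graph, a valid execution order can be derived from them.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks it depends on.
# The task names are illustrative, not part of any library.
dependencies = {
    "ingest_data": set(),
    "process_data": {"ingest_data"},
    "train_model": {"process_data"},
}

# static_order() yields every task only after all of its dependencies.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)  # ['ingest_data', 'process_data', 'train_model']
```

A workflow library builds on this idea by attaching a callable to each node and running the nodes in an order the edges allow.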

Key Features

  • Parallel Execution: Automatically identifies and executes independent tasks concurrently
  • State Management: Built-in checkpointing and recovery mechanisms
  • Real-time Monitoring: Track progress and performance metrics
  • Flexible Design: Works with any Python function or class
  • Type Safety: Fully type-annotated API (Python type hints) for a better development experience
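The parallel execution idea can be sketched with the standard library alone. This is not primeGraph's implementation, only an illustration of the general technique: tasks whose dependencies are all complete are submitted to a thread pool together, and finishing a task can unlock its successors. The function `run_parallel` and the task names are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_parallel(tasks, dependencies, max_workers=4):
    """Run zero-argument callables in dependency order, overlapping independent ones."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results = {}
    in_flight = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies have all finished.
            for name in sorter.get_ready():
                in_flight[pool.submit(tasks[name])] = name
            # Block until at least one running task completes.
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                name = in_flight.pop(future)
                results[name] = future.result()  # re-raises the task's exception
                sorter.done(name)  # may make successor tasks ready
    return results


# Hypothetical workflow: two independent loads feeding one merge step.
tasks = {
    "load_a": lambda: "a",
    "load_b": lambda: "b",
    "merge": lambda: "merged",
}
dependencies = {"load_a": set(), "load_b": set(), "merge": {"load_a", "load_b"}}
results = run_parallel(tasks, dependencies)
```

Here `load_a` and `load_b` have no dependencies, so both are submitted in the first pass; `merge` only becomes ready once both have been marked done.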

Getting Started

Installing primeGraph is simple:

pip install primegraph

Here's a basic example:

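from primegraph import Graph, Node

# load_data, clean_data, and train_model stand in for your own functions,
# defined or imported before this point.

# Define your workflow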
graph = Graph()

# Add nodes
data_ingestion = Node("ingest_data", func=load_data)
data_processing = Node("process_data", func=clean_data)
model_training = Node("train_model", func=train_model)

# Define dependencies
graph.add_edge(data_ingestion, data_processing)
graph.add_edge(data_processing, model_training)

# Execute the workflow
result = graph.execute()

Advanced Configuration

For more complex workflows, you can configure additional options:

# Configure with custom settings
# (RetryPolicy must be imported first; its import path is not shown here)
graph = Graph(
    max_workers=4,
    retry_policy=RetryPolicy(max_attempts=3),
    checkpoint_dir="/tmp/checkpoints"
)
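To show what options like a retry limit and a checkpoint directory typically mean, here is a plain-Python sketch. It is not how primeGraph implements them; `run_with_retries`, `run_checkpointed`, and the JSON file layout are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def run_with_retries(func, max_attempts=3):
    """Call func until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error


def run_checkpointed(name, func, checkpoint_dir, max_attempts=3):
    """Reuse a saved result for `name` if one exists, else compute and save it."""
    path = Path(checkpoint_dir) / f"{name}.json"
    if path.exists():
        return json.loads(path.read_text())
    result = run_with_retries(func, max_attempts)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result))
    return result


# Hypothetical task that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return {"rows": 100}


with tempfile.TemporaryDirectory() as tmp:
    first = run_checkpointed("ingest_data", flaky, tmp)   # succeeds on attempt 3
    second = run_checkpointed("ingest_data", flaky, tmp)  # served from the checkpoint
```

The second call never invokes `flaky` because the result is already on disk, which is the behavior that lets a restarted workflow skip stages that have already completed.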

Monitoring and Debugging

primeGraph includes built-in monitoring capabilities:

  • Real-time progress tracking
  • Performance metrics collection
  • Error reporting and logging

You can enable monitoring with:

graph.enable_monitoring(
    metrics_backend="prometheus",
    log_level="INFO"
)
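As a rough picture of what per-task monitoring involves, the sketch below wraps a function with timing and logging using only the standard library. It does not use primeGraph or a Prometheus backend; the `monitored` decorator and the `durations` dictionary are hypothetical names for illustration.

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("workflow")

durations = {}  # task name -> seconds taken by its most recent run


def monitored(name):
    """Log the start, failure, and duration of the wrapped task."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            logger.info("task %s started", name)
            try:
                return func(*args, **kwargs)
            except Exception:
                logger.exception("task %s failed", name)
                raise
            finally:
                # Runs on success and on failure, so every run is timed.
                durations[name] = time.perf_counter() - start
                logger.info("task %s ended after %.3fs", name, durations[name])
        return wrapper
    return decorator


@monitored("process_data")
def process_data(rows):
    return [row.strip().lower() for row in rows]


cleaned = process_data(["  Alice ", "BOB"])
```

A metrics backend would export values like those in `durations` instead of keeping them in a dictionary.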

What's Next?

We're continuously improving primeGraph based on community feedback. Upcoming features include:

Short-term Roadmap

  • Enhanced visualization tools
  • Integration with popular orchestration platforms
  • Advanced scheduling capabilities

Long-term Goals

  • Cloud-native optimizations
  • Distributed execution support
  • Machine learning pipeline templates

Check out the GitHub repository to get started, contribute, or share your feedback.

Happy automating!