Big Data Automation

Orchestrate Big Data Analytics, ETL and Machine Learning

Big Compute for everyone!

As modern scientific and engineering problems grow in complexity, computation time and memory requirements increase, and parallel computing becomes a necessity. ProActive Big Data Automation is a packaged solution to run and govern end-to-end Big Data and ML processes. ProActive integrates with de facto standards in scientific and engineering environments.

Key Values

Simple and Flexible

Seamlessly parallelize your scientific models and programs from your favorite interactive environment (R, MATLAB, Scilab). Use Spark and Hadoop platforms from ProActive.

Scientific workflows

We know that a single language cannot fit every use case, which calls for interoperability between multiple languages and services. Create a workflow of multidisciplinary tasks.

Federate your resources

Adapt your infrastructure to your business. ProActive provides strong resource policies to federate multi-cloud and hybrid private and public cloud infrastructures, and to connect elastic nodes on a pay-as-you-go basis.

What for?

Access Big Compute

Enable business line users to access the cloud capacity through a user-friendly & powerful interface. Translate business needs and processes into expressive workflows and execute them at scale.

Legal & General
Capgemini
INRA
L'Oréal
Home Office
CNES

Create and govern end-to-end big data processes

ProActive continually creates ready-to-use tasks, in particular for big data, so that users can build big data workflows easily and benefit from the governance features of ProActive Workflow and Scheduling:

  • Create and govern end-to-end big data processes using ProActive Workflow and Scheduling
  • Kafka, Spark, Hadoop, SAS tasks and many others
  • Operational serverless and ProActive elastic nodes

Proactive Connectors

AWS S3
Azure Databricks
Azure Data Lake
Azure Storage
Elasticsearch
Greenplum
Hive
Kafka
Kibana
Logstash
MongoDB
MySQL
Oracle
PostgreSQL
SQL Server
Storm
Swarm
Talend
Visdom
ZooKeeper
Cassandra
Hadoop
Spark

Success Stories

[INRA] Big Data HPC for Health Discovery

Distributed R environment for a quantitative metagenomics platform and statistical analysis.

Big Compute made simple

Orchestration tools are becoming increasingly relevant for connecting services, applications, multiple databases, multi-cloud compute, and custom analytics, and for automating and controlling processes.

Big data landscape

In this white paper, we introduce the state of the art of big data processing: from parallel processing and its two main types of parallelism, to well-known big data processing platforms such as Hadoop, Spark and YARN, and stream processing platforms such as Spark Streaming, S4, Storm and Flink.
