Maximize the value of your Apache Spark workloads

Autonomously and continuously eliminate application inefficiencies with no manual tuning, no need to apply recommendations, and no changes to application code.

  • Immediately reduce instance hours and cost

    Only pay for what you use when CPU and memory are optimized in real time.

  • Save engineering time and effort with no manual tuning

    Reclaim hours of engineering time that can be reallocated to GenAI and agentic AI projects.

  • Autonomously eliminate in-application waste without code changes

    Spark applications are inherently wasteful. Pepperdata eliminates in-app waste in real time.

How Pepperdata optimizes Apache Spark clusters

No matter where you run Apache Spark—in the cloud, on prem, or in hybrid environments—Pepperdata Capacity Optimizer saves you money by:

  • Automatically identifying where more jobs can be run in real time

  • Enabling the scheduler to more fully utilize available resources before adding new nodes or pods

  • Dynamically tuning the Cluster Autoscaler to respond to changing application workload needs

The result: Apache Spark CPU and memory are automatically optimized to reduce costs and increase utilization, enabling more applications to run for an average cost savings of 30% to 47%.
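For context, this is the kind of hand-maintained resource configuration that autonomous tuning is meant to replace. The sketch below is a hypothetical PySpark setup with illustrative values, not a Pepperdata configuration; every setting is a static guess an engineer must revisit as data volumes and cluster load shift:

    from pyspark.sql import SparkSession

    # A hypothetical, hand-tuned Spark application setup.
    # All names and values are illustrative, not recommendations.
    spark = (
        SparkSession.builder
        .appName("nightly-etl")                                # illustrative app name
        .config("spark.executor.memory", "8g")                 # sized for estimated peak, wasted off-peak
        .config("spark.executor.cores", "4")
        .config("spark.executor.instances", "50")              # static worst-case allocation
        .config("spark.dynamicAllocation.enabled", "true")
        .config("spark.dynamicAllocation.maxExecutors", "100") # ceiling still guessed by hand
        .getOrCreate()
    )

Capacity Optimizer's premise is that static settings like these rarely match actual runtime usage; it adjusts resource allocation in real time instead, so application code and submit parameters stay as they are.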


Autodesk reduced Spark costs by over 50% with Pepperdata Capacity Optimizer

Let us do the same for you.


Challenge

Autodesk experienced runaway costs as the team could not keep up with manually tuning its Spark on Amazon EMR workloads.

Solution

Pepperdata Capacity Optimizer autonomously tuned Autodesk’s Spark applications in real time for maximum resource utilization.

Results

Autodesk realized cost savings of over 50% for its Spark on EMR workloads and automated manual tuning tasks, freeing the developer team for more innovative, high-growth projects.

TPC-DS benchmarks for Apache Spark

  • Reduced the total instance hours and related costs by 41.8% and enabled the entire workload to run 45.5% faster (October 2023)

  • Optimized resource utilization with a 157% increase in CPU utilization and a 38% increase in memory utilization (August 2021)

*TPC-DS is the Decision Support framework from the Transaction Processing Performance Council. TPC-DS is an industry-standard big data analytics benchmark. Pepperdata’s work is not an official audited benchmark as defined by TPC. TPC-DS benchmark results (Amazon EKS), 1 TB dataset, 500 nodes, 10 parallel applications with 275 executors per application.

Explore More

Looking for a safe, proven method to reduce waste and cost by up to 47% and maximize value for your cloud environment? Sign up now for a free cost optimization demo to learn how Pepperdata Capacity Optimizer can help you start saving immediately.