Spark, Hadoop, Data Streaming, Data Engineering Solutions For AI

Supermicro and Cloudera Solutions

The Challenge

Ever-changing applications generate a tremendous amount of information, spanning structured, semi-structured, and unstructured data. Conventional IT infrastructure is not built to handle the variety, velocity, and volume of data produced by social media networks, mobile applications, machine sensors, and scientific research. For enterprises, utilizing big data analytics is no longer a question of when; it is a question of how. Spark, Hadoop, and other open-source software designed for cost-effective storage and processing of large volumes of data were built for this purpose. They can scale linearly to thousands of servers and petabytes of storage.

Cloudera integrates these open-source technologies and provides enterprise-level support to help customers gain a competitive edge from large amounts of data. To do so, Cloudera is deployed on scalable server clusters. Supermicro simplifies Cloudera cluster deployment with reliable systems that offer both in-band and out-of-band management, and with a wide choice of system platforms that fit into customers' data centers.

The Solution and Supermicro Advantage

Supermicro server clusters support Cloudera Data Platform (CDP) with simplified deployment.

  • SYSTEM CHOICE: Customers can choose the best hardware platform to build clusters
    • Rack-mount CloudDC/Hyper systems, multi-node Twin servers, or Blade servers
    • Choice of CPU architectures, either Intel or AMD enterprise CPUs
    • Choice of GPU accelerators for applications such as Spark acceleration
    • Choice of disks from HDD to SSD to NVMe drives
    • Choice of network architectures, with 10GbE to 400GbE options
    • All managed through the same IPMI / Redfish interfaces, and can be aggregated in the single-pane Supermicro Cloud Composer
    • Much of the deployment can be automated using Supermicro Super Cloud Orchestrator
  • IMPLEMENTATION CHOICE: Customers can deploy on bare metal, Red Hat OpenShift, Kubernetes, or virtual machines
  • SCALABILITY: Customers can start with the smallest cluster and scale by adding servers.
  • AUTOMATION: Supermicro can build the cluster, fully tested, with guaranteed build quality and delivery schedule. The software can be deployed using the automation features of Supermicro Cloud Orchestrator.
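
The shared IPMI / Redfish management interface mentioned above can be illustrated with a small sketch. Redfish is a REST API standardized by DMTF; its systems collection lives at /redfish/v1/Systems and lists its members by "@odata.id". The sketch below only builds the collection URL and parses responses. The host name and the sample JSON payloads are made-up placeholders, not output from a Supermicro system, and a real client would also need authentication and TLS handling.

```python
import json
from typing import Dict, List

# Standard Redfish path of the computer-systems collection.
REDFISH_SYSTEMS_PATH = "/redfish/v1/Systems"


def systems_url(bmc_host: str) -> str:
    """Build the URL of the systems collection on one BMC."""
    return f"https://{bmc_host}{REDFISH_SYSTEMS_PATH}"


def member_paths(collection: Dict) -> List[str]:
    """Return the resource path of every member of a Redfish collection."""
    return [member["@odata.id"] for member in collection.get("Members", [])]


def summarize(system: Dict) -> Dict:
    """Keep a few standard ComputerSystem properties for an inventory report."""
    return {key: system.get(key) for key in ("Id", "Model", "PowerState")}


# Illustrative payloads only (assumption): hand-written to match the Redfish
# schema shape, not captured from real hardware.
SAMPLE_COLLECTION = json.loads(
    '{"Members": [{"@odata.id": "/redfish/v1/Systems/1"}]}'
)
SAMPLE_SYSTEM = json.loads(
    '{"Id": "1", "Model": "example-model", "PowerState": "On"}'
)

if __name__ == "__main__":
    # "bmc.example.internal" is a hypothetical host name.
    print(systems_url("bmc.example.internal"))
    print(member_paths(SAMPLE_COLLECTION))
    print(summarize(SAMPLE_SYSTEM))
```

Because every node exposes the same resource layout, the same three functions would apply unchanged to each server in a cluster, which is what makes aggregation in a single management pane practical.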

Example of Fully Integrated Cloudera CDP Cluster

Key Features and benefits:

  • Purpose-built cluster configurations optimized for capacity, compute, or IO performance
  • Choice of Intel Xeon Scalable or AMD EPYC CPUs (using the same CPU architecture across the entire cluster is recommended)
  • High availability Name Node design with no single point of failure
  • Large memory options designed specifically for Spark and other in-memory, low-latency computations
  • Hyper-Scale server platforms designed for extremely large deployments
  • High density compute, storage and memory design to achieve the best efficiency and lowest TCO
  • Flexible network switch options with 1 or 2x 10G / 25G / 100G or faster switches per rack
  • Cost-effective 14U rack design, ideal for proof-of-concept testing environments
  • Standard 42U rack design and flexible PDU options that meet any data center environment
  • Up to Titanium Level (96%+) Efficiency - Redundant Power Supplies with PMBus
  • Built-in IPMI and SMC OOB (out-of-band management) suite for automated cluster management
  • Fully integrated, fully configured, and completely tested with the Hadoop distribution of your choice
  • Proof-of-concept testing cluster available for a risk-free purchasing experience
  • Cloudera Enterprise support, licensed from Cloudera

Supermicro fully integrated Hadoop cluster solution rack

  • 1 or 2x 48-port 10G SFP+ / 10GBase-T / 25GbE switches,
    1 or 2x 32-port 100GbE switches, 1x 48-port GbE switch
  • 1x Management Node 1U Intel Xeon Scalable or AMD EPYC CPUs
  • 3x Name Nodes 1U DP Intel Xeon Scalable or AMD EPYC CPUs
  • Optimized Data nodes 2U SSG, 2U BigTwin or 4U FatTwin® with Intel Xeon Scalable or AMD EPYC CPUs
  • Standard 42U rack with metered PDUs, rack customization options available
  • Integration service includes full-cluster burn-in and testing, BIOS and firmware updates, networking configuration, pre-installation of the Cloudera CDP distribution of choice, and full cluster
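
The rack contents listed above imply a simple rack-unit budget, sketched below. The node heights (1U management node, three 1U name nodes, 2U or 4U data-node chassis) come from the list; everything else is an assumption made for illustration: each switch is taken to occupy 1U, the rack is read as holding 1 or 2 data switches plus one GbE switch, the PDUs are assumed to be zero-U, and no space is reserved for blanking or cabling panels. The function name is hypothetical.

```python
from typing import Tuple


def data_chassis_capacity(
    rack_units: int,
    chassis_units: int,
    data_switches: int = 2,
    switch_units: int = 1,  # assumption: every switch is 1U
) -> Tuple[int, int]:
    """Return (data-node chassis that fit, rack units left over)."""
    fixed = data_switches * switch_units  # 1 or 2 data switches
    fixed += switch_units                 # 1x 48-port GbE switch
    fixed += 1                            # 1x management node, 1U
    fixed += 3 * 1                        # 3x name nodes, 1U each
    free = rack_units - fixed
    if free < 0:
        raise ValueError("fixed components do not fit in the rack")
    return free // chassis_units, free % chassis_units


if __name__ == "__main__":
    print(data_chassis_capacity(42, 2))  # 42U rack, 2U chassis
    print(data_chassis_capacity(42, 4))  # 42U rack, 4U FatTwin chassis
    print(data_chassis_capacity(14, 2))  # 14U proof-of-concept rack
```

Under these assumptions, a 42U rack with two data switches leaves 35U for data nodes: seventeen 2U chassis with 1U spare, or eight 4U chassis with 3U spare. The 14U rack leaves 7U, enough for three 2U chassis.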

Supermicro Servers for Bare metal or Kubernetes deployments

1U CloudDC servers or

Multi-node GrandTwin® servers

OR

Supermicro Servers for VMware deployment or using GPUs

2U Hyper Servers