What Are Some Vendors in Big Data?

We have seen some of the technologies in Big Data. Now we will look at the major vendors that make up the Big Data market.

Big Data Vendors

Major vendors in Hadoop

A number of vendors offer Hadoop distributions.

  • Cloudera
  • Hortonworks (HCatalog: Hive/Pig/MR Interop)
  • MapR (replaces HDFS with its own file system, accessible over NFS)
  • IBM InfoSphere BigInsights
  • Microsoft

Hadoop and Cloud Computing

Amazon EMR

  • Amazon distribution or MapR M3, M5
  • It is very popular.
  • It is standardized and relatively easy to use.
  • Use SSH to query.

Hadoop on Azure

  • Very simple provisioning
  • Query in browser
  • Query from Excel and other tools via the Hive ODBC driver.

Google Compute Engine

  • MapR Platform as a Service

Apache Whirr

  • Libraries for running cloud services
  • Works on AWS EC2 and Rackspace
  • For Cloudera, one command to build a cluster
  • One more command to de-provision it

HDFS Products

  • Companies trying to make Hadoop work with enterprise storage instead of HDFS.
  • MapR is the big one.
  • Others are EMC, NetApp, Cleversafe, and Symantec.

NoSQL and NewSQL Vendors

NoSQL

  • MongoDB
  • Couchbase
  • Cloud (Amazon Web Services DynamoDB and SimpleDB, Windows Azure Tables, Cloudant)

NewSQL

  • VoltDB
  • NuoDB
  • JustOneDB

Massively Parallel Processing (MPP) products

  • Teradata
  • Vertica (an HP company)
  • Greenplum
  • Netezza (an IBM company)

Hybrids

  • Teradata Aster
  • Hadapt
  • RainStor

Data Integration, Visualization and Analytics

Data Integration

  • Informatica
  • Syncsort
  • Talend

Data Visualization and Analytics

  • Tableau
  • Alteryx
  • Datameer

All of the above

  • Pervasive
  • Pentaho

Business Intelligence (BI) Vendors

  • Microsoft
  • IBM
  • SAP
  • Oracle
  • MicroStrategy
  • SAS

Implementation planning

Next, we look at the questions to consider, the decisions to make, the pitfalls to avoid, and the roadmap for bringing Big Data into an organization.

Consider the following points before implementing Big Data in your organization.

  • Are your data volumes truly ‘Big’? Are you collecting enough data?
  • Established technologies can handle lots of data.
  • Big Data technology or a more conventional data warehouse?
  • Hadoop or MPP or Hybrid?
  • NoSQL or NewSQL?
  • On-premises or Cloud?
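
The first questions in this checklist can be read as a simple gate: only when the data is truly 'Big' do the remaining choices come into play. The following is a minimal Python sketch of that idea, not a definitive method. The names CONVENTIONAL_LIMIT_TB, Assessment, and assess are hypothetical, and the 10 TB cutoff is an assumption chosen purely for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff, chosen only for illustration; replace it with a
# value that reflects what your established systems handle comfortably.
CONVENTIONAL_LIMIT_TB = 10.0

# The remaining checklist questions, relevant only if the data is "big".
FOLLOW_UP_QUESTIONS = [
    "Hadoop, MPP, or hybrid?",
    "NoSQL or NewSQL?",
    "On-premises or cloud?",
]


@dataclass
class Assessment:
    approach: str
    open_questions: list = field(default_factory=list)


def assess(data_volume_tb: float) -> Assessment:
    """Return a first-cut recommendation for an estimated data volume."""
    if data_volume_tb < CONVENTIONAL_LIMIT_TB:
        # Established technologies can handle lots of data.
        return Assessment("conventional data warehouse")
    return Assessment("Big Data technology", list(FOLLOW_UP_QUESTIONS))


if __name__ == "__main__":
    for volume in (2, 500):
        result = assess(volume)
        print(volume, "TB:", result.approach, result.open_questions)
```

In practice, volume is only one input. Data variety and the rate at which data arrives would also shape the decision, so treat the single threshold as a placeholder.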

Hadoop Distribution Choices

  • Major or minor?
  • Cloudera or Hortonworks?
  • Cloud: Amazon/MapR or Microsoft?
  • MapR? Hadapt?

Tooling Choices

  • Microsoft – Microsoft’s browser-based tooling
  • Amazon – Elastic MapReduce
  • Cloudera, others – Premium Distros
  • Existing query clients – Via Hive, Via conventional DBs and Sqoop
