• Apache Hadoop (/həˈduːp/) is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving...
    49 KB (5,093 words) - 03:30, 18 May 2024
  • MapReduce (redirect from Hadoop map)
    implementation that has support for distributed shuffles is part of Apache Hadoop. The name MapReduce originally referred to the proprietary Google technology...
    46 KB (5,491 words) - 21:02, 10 May 2024
  • Apache ZooKeeper (category Hadoop)
    large distributed systems (see Use cases). ZooKeeper was a sub-project of Hadoop but is now a top-level Apache project in its own right. ZooKeeper's architecture...
    8 KB (714 words) - 15:45, 24 October 2023
  • Apache Parquet (category Hadoop)
    storage format in the Apache Hadoop ecosystem. It is similar to RCFile and ORC, the other columnar-storage file formats in Hadoop, and is compatible with most...
    10 KB (851 words) - 09:27, 22 June 2024
  • the benefits of dimensional models on Hadoop and similar big data frameworks. However, some features of Hadoop require us to slightly adapt the standard...
    13 KB (1,656 words) - 19:36, 17 January 2024
  • Data lake
    enterprises were "starting to extract and place data for analytics into a single, Hadoop-based repository." Many companies use cloud storage services such as Google...
    9 KB (1,047 words) - 14:19, 12 September 2024
  • Hue (software) (redirect from Hue (Hadoop))
    Hue (Hadoop User Experience) is an open-source SQL Cloud Editor, licensed under the Apache License 2.0. Hue is an open-source SQL Assistant for querying...
    2 KB (119 words) - 17:42, 17 May 2023
  • Apache Hive (category Hadoop)
    Hive is a data warehouse software project. It is built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like interface...
    21 KB (2,300 words) - 14:27, 2 July 2024
  • Sqoop (category Hadoop)
    interface application for transferring data between relational databases and Hadoop. The Apache Sqoop project was retired in June 2021 and moved to the Apache...
    6 KB (439 words) - 19:04, 17 July 2024
  • Pentaho
    learning algorithms implemented on Hadoop; Apache Cassandra - a column-oriented database that supports access from Hadoop; HPCC - LexisNexis Risk Solutions...
    30 KB (1,051 words) - 18:33, 3 September 2024