Organizations looking to extract more information from their Hadoop deployments may find help in Tajo, a relatively little-known open source data warehousing package from the Apache Software Foundation that is available for commercial use.
The new version of Tajo, Apache software for running a data warehouse over Hadoop data sets, has been updated to provide greater connectivity to Java programs and third-party databases such as Oracle and PostgreSQL.
Gruter, a big data infrastructure startup in Korea, leads the development of Tajo. Engineers from Intel, Etsy, the United States National Aeronautics and Space Administration (NASA) and Hortonworks have also contributed to the project.
Perhaps because of its home base in Korea, the software is not widely known elsewhere in the world compared with other open source SQL-on-Hadoop packages such as Hive or Impala.
As with most benchmarks, results may vary according to the specific workload. Newer editions of Hive and Impala may also have closed the speed gap.
SK Telecom uses the software in production, as do Korea University and NASA’s Jet Propulsion Laboratory. The Korean music streaming service Melon uses the software for analytical processing, and has found that Tajo executes ETL jobs 1.5 to 10 times faster than Hive.