Apache Hive
Compute statistics for partitions
- Get a list of partitions into a file:
hdfs dfs -ls /hive/database/path/table/ >partitions.txt
Leave only the partition names in the file (the last path component of each line); remove the permissions, dates, and parent paths.
- Generate the ANALYZE statements:
# Emit one ANALYZE TABLE statement per partition name in partitions.txt
while read -r line
do
echo "analyze table mydb.mytable partition(${line}) compute statistics;"
done <partitions.txt >stats.sql
- Run the resulting script in Hive
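
The manual cleanup in the first step can probably be scripted. The following is a minimal sketch, not a tested recipe for any particular cluster. It assumes single-level partitions stored as directories named key=value directly under the table path, and path names without spaces. The function names partition_names and quote_values are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sketch: reduce `hdfs dfs -ls` output to bare partition names.
# Assumption: every partition is a directory named key=value
# directly under the table path (single-level partitioning).
partition_names() {
    # Directory entries start with "d"; this also drops the
    # "Found N items" header and plain files such as _SUCCESS.
    # The path is the last field; sed strips everything up to the last "/".
    grep '^d' | awk '{print $NF}' | sed 's|.*/||' | grep '='
}

# Sketch: wrap each partition value in single quotes, turning
# key=value into key='value'. Useful when the value is a string or
# date, which cannot appear unquoted in a partition spec.
quote_values() {
    sed "s/=\(.*\)/='\1'/"
}

# Hypothetical usage, replacing the manual edit of partitions.txt:
# hdfs dfs -ls /hive/database/path/table/ | partition_names | quote_values >partitions.txt
```

With partitions.txt produced this way, the statement-generation loop can be used unchanged, and the resulting script can be run with hive -f stats.sql.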