I have a CSV file and want to remove the columns that have fewer than 5 different values. For example:

    a b c;
    1 1 1;
    1 2 2;
    1 3 4;
    2 4 5;
    1 6 7;

Here I want to remove column a, since it has only 2 different values (1, 2). How can I do this?
A solution using bash arrays:
    infile="infile.txt"
    different=5
    rows=0
    data=()
    while read -r -a line; do
        data+=( "${line[@]/;/}" )    # remove semicolons
        ((rows++))
    done < "$infile"
    cols=$(( ${#data[@]} / rows ))   # calculate number of columns
    result=()
    for (( cntr1=0; cntr1<cols; cntr1+=1 )); do
        cnt=()                       # indexed array: assumes values are non-negative integers
        save=( "${data[cntr1]}" )    # add column header
        for (( cntr2=cols; cntr2<${#data[@]}; cntr2+=cols )); do
            cnt[${data[cntr1+cntr2]}]=1            # one slot per distinct value
            save+=( "${data[cntr1+cntr2]}" )       # add column data
        done
        if [ "${#cnt[@]}" -ge "$different" ]; then # keep columns with at least $different values
            result+=( "${save[@]}" )               # add column to result
        fi
    done
    cols=$(( ${#result[@]} / rows )) # recalculate number of columns
    for (( cntr1=0; cntr1<rows; cntr1+=1 )); do
        for (( cntr2=0; cntr2<${#result[@]}; cntr2+=rows )); do
            printf " %s" "${result[cntr1+cntr2]}"
        done
        printf ";\n"
    done
The output:

    b c;
    1 1;
    2 2;
    3 4;
    4 5;
    6 7;
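If the values are not plain non-negative integers, counting them with an indexed array will not work. A rough variant of the same idea is sketched here with an associative array keyed on "column:value". It assumes bash 4 or newer, whitespace-separated fields, a header row on the first line, and a semicolon attached to the last field of each line, as in the sample. The function name filter_columns and its arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# filter_columns FILE [MIN]  (hypothetical helper, not part of the original answer)
# Prints only the columns of FILE that have at least MIN distinct data values.
filter_columns() {
    local infile=$1 min=${2:-5}
    local -a data=() line=() count=() out=()
    local -A seen=()                 # associative array, needs bash 4+
    local rows=0 cols=0 r c key

    while read -r -a line; do
        if (( ${#line[@]} == 0 )); then continue; fi    # skip blank lines
        line=( "${line[@]//;/}" )                       # strip semicolons
        if (( rows == 0 )); then cols=${#line[@]}; fi   # header fixes the column count
        for (( c = 0; c < cols; c++ )); do
            data+=( "${line[c]}" )
            # record each "column:value" pair once; the header row is not counted
            if (( rows > 0 )); then seen["$c:${line[c]}"]=1; fi
        done
        rows=$(( rows + 1 ))
    done < "$infile"

    # number of distinct values per column
    for key in "${!seen[@]}"; do
        c=${key%%:*}                                    # column index before the first ':'
        count[c]=$(( ${count[c]:-0} + 1 ))
    done

    # print the surviving columns row by row
    for (( r = 0; r < rows; r++ )); do
        out=()
        for (( c = 0; c < cols; c++ )); do
            if (( ${count[c]:-0} >= min )); then
                out+=( "${data[r * cols + c]}" )
            fi
        done
        printf '%s;\n' "${out[*]}"
    done
}
```

Called as filter_columns infile.txt 5, this should keep the same columns as the array solution above. Storing the data row by row avoids the second pass that turns columns back into rows.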