Unsolicited code review

Reviewing someone else's code without being asked.

sub get_chisquare($$$$) {

    my ( $s_real, $s_prediction, $b_real, $b_prediction ) = @_;
    my $sum  = $s_real + $s_prediction + $b_real + $b_prediction;
    my @Data = ( $s_real, $s_prediction, $b_real, $b_prediction );

    my ( $s, $b, $real, $prediction, $temp, $row, $chi );
    my (@Temp);

    $chi = 0;

    $s          = $s_real + $s_prediction;
    $b          = $b_real + $b_prediction;
    $real       = $s_real + $b_real;
    $prediction = $s_prediction + $b_prediction;

    $row = $s / $sum;
    push( @Temp, $row * $real );
    push( @Temp, $row * $prediction );

    $row = $b / $sum;
    push( @Temp, $row * $real );
    push( @Temp, $row * $prediction );

    for ( my $i = 0 ; $i < 4 ; $i++ ) {
        $chi +=
          ( $Data[$i] - $Temp[$i] ) * ( $Data[$i] - $Temp[$i] ) / $Temp[$i];
    }

    return $chi;
}

First, adjust the style a little and inline the temporary variables. This repeats a few additions, but it's best not to worry about that yet.

sub get_chisquare {
    my ( $s_real, $s_pred, $b_real, $b_pred ) = @_;
    my @data = ( $s_real, $s_pred, $b_real, $b_pred );
    my $sum  = $s_real + $s_pred + $b_real + $b_pred;
    my @tmp  = (
        ( $s_real + $s_pred ) * ( $s_real + $b_real ) / $sum,
        ( $s_real + $s_pred ) * ( $s_pred + $b_pred ) / $sum,
        ( $b_real + $b_pred ) * ( $s_real + $b_real ) / $sum,
        ( $b_real + $b_pred ) * ( $s_pred + $b_pred ) / $sum,
    );
    my $chi = 0;
    for my $i ( 0 .. $#tmp ) {
        $chi += ( $data[$i] - $tmp[$i] )**2 / $tmp[$i];
    }
    return $chi;
}

Now that the shape of the solution is roughly clear, cut the redundant computation and make it a bit smarter.

sub get_chisquare {
    my ( $s_real, $s_pred, $b_real, $b_pred ) = @_;
    my $sum = $s_real + $s_pred + $b_real + $b_pred;
    my @rows = ( $s_real + $s_pred, $b_real + $b_pred );
    my @cols = ( $s_real + $b_real, $s_pred + $b_pred );
    my $data = [ [ $s_real, $s_pred ], [ $b_real, $b_pred ] ];

    my $chi = 0;
    for my $i ( 0 .. 1 ) {
        for my $j ( 0 .. 1 ) {
            my $tmp = $rows[$i] * $cols[$j] / $sum;
            $chi += ( $data->[$i]->[$j] - $tmp )**2 / $tmp;
        }
    }
    return $chi;
}
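As a quick sanity check on the nested-loop version, a 2x2 table also has a well-known closed form, N(ad - bc)^2 / (r1 * r2 * c1 * c2), with no continuity correction. The sketch below is not part of the code being reviewed: the sub names chisquare_2x2_loop and chisquare_2x2_closed_form and the sample counts are made up for illustration, and it assumes every row and column total is nonzero.

```perl
use strict;
use warnings;

# Same algorithm as the nested-loop version above, renamed so that both
# subs can live in one file (the name is hypothetical).
sub chisquare_2x2_loop {
    my ( $s_real, $s_pred, $b_real, $b_pred ) = @_;
    my $sum  = $s_real + $s_pred + $b_real + $b_pred;
    my @rows = ( $s_real + $s_pred, $b_real + $b_pred );
    my @cols = ( $s_real + $b_real, $s_pred + $b_pred );
    my $data = [ [ $s_real, $s_pred ], [ $b_real, $b_pred ] ];

    my $chi = 0;
    for my $i ( 0 .. 1 ) {
        for my $j ( 0 .. 1 ) {
            # Expected count for cell (i, j) under independence.
            my $expected = $rows[$i] * $cols[$j] / $sum;
            $chi += ( $data->[$i]->[$j] - $expected )**2 / $expected;
        }
    }
    return $chi;
}

# Closed form for a 2x2 table: N * (ad - bc)^2 / (r1 * r2 * c1 * c2).
# Hypothetical helper, assumes all four marginal totals are nonzero.
sub chisquare_2x2_closed_form {
    my ( $s_real, $s_pred, $b_real, $b_pred ) = @_;
    my $sum   = $s_real + $s_pred + $b_real + $b_pred;
    my $det   = $s_real * $b_pred - $s_pred * $b_real;
    my $denom = ( $s_real + $s_pred ) * ( $b_real + $b_pred )
              * ( $s_real + $b_real ) * ( $s_pred + $b_pred );
    return $sum * $det**2 / $denom;
}

# Made-up sample counts; both lines should print the same value.
printf "%.6f\n", chisquare_2x2_loop( 10, 20, 30, 40 );
printf "%.6f\n", chisquare_2x2_closed_form( 10, 20, 30, 40 );
```

For these counts the expected cells are 12, 18, 28 and 42, so both subs come out at about 0.793651.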

Then again, it looks like something like this already exists anyway.

Addendum

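I said "cut the redundant computation", but the only thing being duplicated is plain addition, so the nested for loops may well cost more. A benchmark would settle it, but I'll pass for now.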
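For anyone who wants to measure it, a minimal sketch with the core Benchmark module might look like the following. The sub names chisquare_inline and chisquare_loop are hypothetical renames of the two rewrites so that they can sit in one file, and the input counts are made up. No result is quoted here because it will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Inlined version: every expected count is written out explicitly.
sub chisquare_inline {
    my ( $s_real, $s_pred, $b_real, $b_pred ) = @_;
    my @data = ( $s_real, $s_pred, $b_real, $b_pred );
    my $sum  = $s_real + $s_pred + $b_real + $b_pred;
    my @tmp  = (
        ( $s_real + $s_pred ) * ( $s_real + $b_real ) / $sum,
        ( $s_real + $s_pred ) * ( $s_pred + $b_pred ) / $sum,
        ( $b_real + $b_pred ) * ( $s_real + $b_real ) / $sum,
        ( $b_real + $b_pred ) * ( $s_pred + $b_pred ) / $sum,
    );
    my $chi = 0;
    for my $i ( 0 .. $#tmp ) {
        $chi += ( $data[$i] - $tmp[$i] )**2 / $tmp[$i];
    }
    return $chi;
}

# Nested-loop version: row and column totals are computed once.
sub chisquare_loop {
    my ( $s_real, $s_pred, $b_real, $b_pred ) = @_;
    my $sum  = $s_real + $s_pred + $b_real + $b_pred;
    my @rows = ( $s_real + $s_pred, $b_real + $b_pred );
    my @cols = ( $s_real + $b_real, $s_pred + $b_pred );
    my $data = [ [ $s_real, $s_pred ], [ $b_real, $b_pred ] ];

    my $chi = 0;
    for my $i ( 0 .. 1 ) {
        for my $j ( 0 .. 1 ) {
            my $tmp = $rows[$i] * $cols[$j] / $sum;
            $chi += ( $data->[$i]->[$j] - $tmp )**2 / $tmp;
        }
    }
    return $chi;
}

# A negative count tells Benchmark to run each sub for at least that many
# CPU seconds; cmpthese then prints a rate comparison table.
cmpthese(
    -1,
    {
        inline => sub { chisquare_inline( 10, 20, 30, 40 ) },
        loop   => sub { chisquare_loop( 10, 20, 30, 40 ) },
    }
);
```

Either way the difference is likely to matter only if the function is called in a tight loop.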