Rewriting the Python implementation from ゼロから作るDeep Learning ③ (Deep Learning from Scratch 3) in Ruby (Step 13 / Step 14)

Author: 斎藤康毅
Release date: 2020/04/20
Format: Book (softcover)

I bought ゼロから作るDeep Learning ③ and am studying it on my own by porting the Python implementation of DeZero to Ruby. Next up are Step 13 and Step 14.

Step 13: Supporting variable-length arguments in backpropagation

Continuing from last time, this step adds variable-length argument support to backpropagation. backward() is modified as follows.

step13.rb

class Variable
...
  def backward()
    # Seed the gradient of the final output with ones (dy/dy = 1).
    if @grad.nil? then
      @grad = @data.clone.fill(1.0)
    end
    funcs = [@creator]
    while funcs != [] do
      f = funcs.pop
      # Collect the gradient of every output of f.
      gys = f.outputs.map{|x| x.grad}
      gxs = f.backward(*gys)
      # Wrap a single return value so it can be zipped with f.inputs.
      if not gxs.is_a?(Array) then
        gxs = [gxs]
      end
      f.inputs.zip(gxs).each{|x, gx|
        x.grad = gx
        if x.creator != nil then
          funcs.push(x.creator)
        end
      }
    end
  end

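The key point is that f.outputs and f.inputs are now arrays, so a function can have any number of outputs and inputs. map collects the grad of each element of f.outputs into gys, and gys is expanded with the splat operator in f.backward(*gys), so backward can receive any number of gradients. The return value is wrapped in an array when it is not one already, so it can be paired with f.inputs by zip.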
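As a small illustration of the splat and the array wrapping described above, here is a minimal sketch. The helper names (backward_one, backward_two, as_list) are hypothetical and are not part of the code in this series, and plain Floats are used as gradients to keep it short.

```ruby
# Hypothetical stand-ins for a function's backward method.
def backward_one(gy)
  gy * 2.0          # one gradient in, one gradient out
end

def backward_two(gy)
  [gy, gy]          # one gradient in, two gradients out
end

# Wrap a single return value so the caller can always zip over the result.
def as_list(gxs)
  gxs.is_a?(Array) ? gxs : [gxs]
end

gys = [1.0]                          # gradients collected from the outputs
one = as_list(backward_one(*gys))    # *gys expands the array into arguments
two = as_list(backward_two(*gys))

p one
p two
```

Either way the caller ends up with an array whose length matches the number of inputs, which is what makes the zip over f.inputs work.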

Along with this, I changed the implementation of the Square class.

step13.rb

class Square < Function
...
  def backward(gy)
    x = @inputs[0].data
    gx = x.zip(gy).map{|i0, i1| i0 * i1 * 2.0}
    return gx
  end

Since inputs is now an array, the first element is taken with @inputs[0].
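To see what the zip and map in Square#backward compute, here is a standalone sketch. square_backward is a hypothetical free function that mirrors the same expression, and the two-element input is my own example rather than one from the book.

```ruby
# Element-wise gradient of y = x^2: gx[i] = 2 * x[i] * gy[i]
def square_backward(x, gy)
  x.zip(gy).map { |xi, gi| xi * gi * 2.0 }
end

gx = square_backward([2.0, 3.0], [1.0, 1.0])
p gx
```

Each element of the upstream gradient gy is multiplied by 2x for the matching element of the input.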

As before, I wrote a test and ran it.

def add(x0, x1)
  return Add.new().call(x0, x1)
end

x = Variable.new([2.0])
y = Variable.new([3.0])
z = add(square(x), square(y))
z.backward()
puts(z.data)
puts(x.grad)
puts(y.grad)

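This test differentiates $z = x^{2} + y^{2}$ with respect to $x$ and $y$. At $x = 2$, $y = 3$ the expected values are $z = 13$, $\partial z / \partial x = 2x = 4$, and $\partial z / \partial y = 2y = 6$.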

13.0
4.0
6.0

It seems to be working.
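As an independent cross-check of these gradients, a central-difference approximation can be compared against the backpropagation result. This is only a sketch: numerical_diff is a hypothetical helper written for this example, and the step size 1e-4 is an arbitrary choice.

```ruby
# Hypothetical helper: central-difference estimate of df/dx at x.
def numerical_diff(x, eps = 1e-4)
  (yield(x + eps) - yield(x - eps)) / (2.0 * eps)
end

x0 = 2.0
y0 = 3.0
# Differentiate z = x^2 + y^2 in one variable while holding the other fixed.
dz_dx = numerical_diff(x0) { |x| x**2 + y0**2 }
dz_dy = numerical_diff(y0) { |y| x0**2 + y**2 }

puts dz_dx
puts dz_dy
```

The printed values should be very close to 4.0 and 6.0, apart from floating-point rounding error.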

Step 14: Handling a variable that is used repeatedly

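The current implementation does not account for the same variable being used more than once. When a variable is used in several places, backward has to accumulate the gradients arriving from each use instead of overwriting them.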
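The difference between overwriting and accumulating can be sketched without any of the classes from this series. For y = x + x, a gradient of 1 reaches x once for each use; the array named contributions below is a hypothetical stand-in for those two arrivals.

```ruby
# Gradient arriving at x through each of its two uses in y = x + x.
contributions = [[1.0], [1.0]]

# Overwriting keeps only the last contribution.
overwritten = nil
contributions.each { |gx| overwritten = gx }

# Accumulating adds each contribution element-wise.
accumulated = nil
contributions.each do |gx|
  accumulated = accumulated.nil? ? gx : accumulated.zip(gx).map { |a, b| a + b }
end

p overwritten
p accumulated
```

Only the accumulated result matches dy/dx = 2.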

step14.rb

class Variable
...
  def backward()
    if @grad == nil then
      @grad = @data.clone.fill(1.0)
    end
    funcs = [@creator]
    while funcs != [] do
...
      f.inputs.zip(gxs).each{|x, gx|
        if x.grad.nil? then
          x.grad = gx
        else
          # Accumulate element-wise; Array#+ would concatenate the two arrays.
          x.grad = x.grad.zip(gx).map{|g0, g1| g0 + g1}
        end
        if x.creator != nil then
          funcs.push(x.creator)
        end
      }
...
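One Ruby detail worth keeping in mind when gradients are stored as plain Arrays: Array#+ concatenates rather than adding element by element. The sketch below contrasts the two with made-up two-element gradients; for single-element arrays such as [1.0], concatenating and then summing happens to give the same number as element-wise addition.

```ruby
a = [1.0, 2.0]
b = [3.0, 4.0]

concatenated = a + b                          # joins the two arrays
collapsed    = [(a + b).sum]                  # one number: sum of all elements
elementwise  = a.zip(b).map { |u, v| u + v }  # adds matching elements

p concatenated
p collapsed
p elementwise
```

For gradients with more than one element, only the element-wise form keeps the shape of the data.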

Now to test it.

begin
  x = Variable.new([3.0])
  y = add(x, x)
  y.backward()
  puts(x.grad)
end

begin
  x = Variable.new([3.0])
  y = add(add(x, x),x)
  y.backward()
  puts(x.grad)
end

2.0
3.0
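For readers who do not have the code from the earlier steps at hand, here is a self-contained sketch of the whole flow. It is not the code from this series: the class names (Var, Func, Sq, Plus) are hypothetical, and data is a plain Float rather than an Array, which keeps the accumulation a simple addition.

```ruby
# Minimal sketch with Float data; all class names are hypothetical.
class Var
  attr_accessor :data, :grad, :creator

  def initialize(data)
    @data = data
    @grad = nil
    @creator = nil
  end

  def backward
    @grad ||= 1.0                        # seed: dy/dy = 1
    funcs = [@creator].compact
    until funcs.empty?
      f = funcs.pop
      gys = f.outputs.map(&:grad)
      gxs = f.backward(*gys)
      gxs = [gxs] unless gxs.is_a?(Array)
      f.inputs.zip(gxs).each do |x, gx|
        # Accumulate when the same variable receives several gradients.
        x.grad = x.grad.nil? ? gx : x.grad + gx
        funcs.push(x.creator) unless x.creator.nil?
      end
    end
  end
end

class Func
  attr_reader :inputs, :outputs

  def call(*inputs)
    ys = forward(*inputs.map(&:data))
    ys = [ys] unless ys.is_a?(Array)
    @inputs = inputs
    @outputs = ys.map { |y| Var.new(y) }
    @outputs.each { |out| out.creator = self }
    @outputs.size == 1 ? @outputs[0] : @outputs
  end
end

class Sq < Func
  def forward(x)
    x * x
  end

  def backward(gy)
    2.0 * inputs[0].data * gy
  end
end

class Plus < Func
  def forward(a, b)
    a + b
  end

  def backward(gy)
    [gy, gy]
  end
end

# z = x^2 + y^2 at x = 2, y = 3
x = Var.new(2.0)
y = Var.new(3.0)
z = Plus.new.call(Sq.new.call(x), Sq.new.call(y))
z.backward

# w = (v + v) + v, where v is used three times
v = Var.new(3.0)
w = Plus.new.call(Plus.new.call(v, v), v)
w.backward

puts z.data, x.grad, y.grad, v.grad
```

Like the code at this step, the sketch simply pops functions off a stack, which is enough for the small graphs tested here.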

One more thing: I defined a method for resetting the gradient.

class Variable
...
  def cleargrad()
    @grad = nil
  end
...

  x = Variable.new([3.0])
  y = add(x, x)
  y.backward()
  puts(x.grad)

  x = Variable.new([3.0])
  y = add(add(x, x),x)
  y.backward()
  puts(x.grad)

2.0
3.0

This looks fine too.
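The reset matters most when one variable object is reused across several computations, because accumulated gradients otherwise carry over. The sketch below uses a hypothetical TinyVariable class with an accumulate helper, standing in for what backward does to x.grad; it is not the Variable class from this series.

```ruby
# Hypothetical stand-in that only models gradient accumulation and reset.
class TinyVariable
  attr_reader :grad

  def initialize(data)
    @data = data
    @grad = nil
  end

  def cleargrad
    @grad = nil
  end

  # Mimics one gradient contribution arriving during backward.
  def accumulate(gx)
    @grad = @grad.nil? ? gx : @grad.zip(gx).map { |a, b| a + b }
  end
end

x = TinyVariable.new([3.0])

2.times { x.accumulate([1.0]) }   # y = x + x
first = x.grad

3.times { x.accumulate([1.0]) }   # y = (x + x) + x without clearing
stale = x.grad

x.cleargrad
3.times { x.accumulate([1.0]) }   # same computation after clearing
fresh = x.grad

p first, stale, fresh
```

Without the reset, the second computation reports the sum of both runs; after cleargrad it reports only its own gradient.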

FPGA開発日記
