[torch.nn.init] A Guide to Parameter Initialization Methods

慈云数据 2024-04-27 Technical Support

Reference: torch.nn.init - 云+社区 - 腾讯云 (Tencent Cloud community)


I. torch.nn.init.constant_(tensor, val)

1. Purpose

Constant: fills the input tensor with the value val.

2. Parameters

  • tensor – an n-dimensional torch.Tensor
  • val – the value to fill the tensor with

3. Example

    import torch
    from torch import nn
    w = torch.empty(3, 5)
    print(w)
    print(nn.init.constant_(w, 0.3))
    -------------------------------------
    tensor([[6.4069e+02, 2.7489e+20, 1.5444e+25, 1.6217e-19, 7.0062e+22],
            [1.6795e+08, 4.7423e+30, 4.7393e+30, 9.5461e-01, 4.4377e+27],
            [1.7975e+19, 4.6894e+27, 7.9463e+08, 3.2604e-12, 2.6209e+20]])
    tensor([[0.3000, 0.3000, 0.3000, 0.3000, 0.3000],
            [0.3000, 0.3000, 0.3000, 0.3000, 0.3000],
            [0.3000, 0.3000, 0.3000, 0.3000, 0.3000]])
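
The trailing underscore in these function names marks them as in-place: the tensor passed in is overwritten, and the same tensor is handed back. A minimal sketch to confirm this, reusing the name w from the example above (comparing data_ptr() values is just one illustrative way to show that storage is shared):

```python
import torch
from torch import nn

w = torch.empty(3, 5)
out = nn.init.constant_(w, 0.3)

# The returned tensor shares storage with w, so w itself was modified.
same_storage = out.data_ptr() == w.data_ptr()
all_filled = torch.allclose(w, torch.full((3, 5), 0.3))
print(same_storage, all_filled)
```

Because of this, the return value can be ignored and the call used purely for its side effect.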

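II. torch.nn.init.normal_(tensor, mean=0, std=1)

1. Purpose

Normal: fills the input tensor with values drawn from the normal distribution N(mean, std^2), i.e. with the given mean and standard deviation.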


2. Parameters

  • tensor – an n-dimensional torch.Tensor
  • mean – the mean of the normal distribution
  • std – the standard deviation of the normal distribution

3. Example

      import torch
      from torch import nn
      w = torch.empty(3, 5)
      print(w)
      print(torch.nn.init.normal_(w))
      ----------------------------------------------
      tensor([[9.5461e-01, 4.4377e+27, 1.7975e+19, 4.6894e+27, 7.9463e+08],
              [3.2604e-12, 2.6209e+20, 4.1641e+12, 1.9434e-19, 3.0881e+29],
              [6.3828e+28, 1.4603e-19, 7.7179e+28, 7.7591e+26, 3.0357e+32]])
      tensor([[-1.1406, -0.1720, -1.4460,  0.5305, -0.0854],
              [ 0.8992,  0.3495, -0.8262, -1.4641, -0.6426],
              [ 0.7404,  0.7124, -0.3902,  0.0625,  0.6256]])
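
With only 15 values it is hard to see that the samples follow the requested distribution. The sketch below fills a much larger tensor and compares the sample statistics with the arguments. The values mean=2.0 and std=0.5, the tensor size, and the seed are arbitrary choices for illustration:

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

# One million samples, so the sample statistics sit close to the targets.
w = torch.empty(1000, 1000)
nn.init.normal_(w, mean=2.0, std=0.5)

sample_mean = w.mean().item()
sample_std = w.std().item()
print(f"mean={sample_mean:.3f}, std={sample_std:.3f}")
```

The printed values should land near 2.0 and 0.5; the exact digits depend on the seed and the PyTorch version.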

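III. torch.nn.init.uniform_(tensor, a=0.0, b=1.0)

1. Purpose

Uniform: fills the input tensor with values drawn from the uniform distribution U(a, b).

2. Parameters

  • tensor – an n-dimensional torch.Tensor
  • a – the lower bound of the uniform distribution
  • b – the upper bound of the uniform distribution

3. Example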

        import torch
        from torch import nn
        w = torch.empty(3, 5)
        print(w)
        print(nn.init.uniform_(w))
        -----------------------------------------------
        tensor([[9.5461e-01, 4.4377e+27, 1.7975e+19, 4.6894e+27, 7.9463e+08],
                [3.2604e-12, 2.6209e+20, 4.1641e+12, 1.9434e-19, 3.0881e+29],
                [6.3828e+28, 1.4603e-19, 7.7179e+28, 7.7591e+26, 3.0357e+32]])
        tensor([[0.6653, 0.9605, 0.2208, 0.0140, 0.9672],
                [0.4201, 0.5819, 0.8383, 0.4334, 0.0673],
                [0.1246, 0.4066, 0.3413, 0.1231, 0.0463]])

IV. torch.nn.init.ones_(tensor)

1. Purpose

Ones: fills the input tensor with the scalar value 1.

2. Parameters

  • tensor – an n-dimensional torch.Tensor

3. Example

          import torch
          from torch import nn
          w = torch.empty(3, 5)
          print(w)
          print(nn.init.ones_(w))
          ---------------------------------------
          tensor([[9.5461e-01, 4.4377e+27, 1.7975e+19, 4.6894e+27, 7.9463e+08],
                  [3.2604e-12, 2.6209e+20, 4.1641e+12, 1.9434e-19, 3.0881e+29],
                  [6.3828e+28, 1.4603e-19, 7.7179e+28, 7.7591e+26, 3.0357e+32]])
          tensor([[1., 1., 1., 1., 1.],
                  [1., 1., 1., 1., 1.],
                  [1., 1., 1., 1., 1.]])

V. torch.nn.init.zeros_(tensor)

1. Purpose

Zeros: fills the input tensor with the scalar value 0.

2. Parameters

  • tensor – an n-dimensional torch.Tensor

3. Example

            import torch
            from torch import nn
            w = torch.empty(3, 5)
            print(w)
            print(nn.init.zeros_(w))
            -------------------------------------------------
            tensor([[-4.2990e-27,  4.5701e-41, -4.2990e-27,  4.5701e-41,         nan],
                    [ 4.5699e-41,  7.6194e+31,  1.5564e+28,  4.7984e+30,  6.2121e+22],
                    [ 1.8370e+25,  1.4603e-19,  6.4069e+02,  2.7489e+20,  1.5444e+25]])
            tensor([[0., 0., 0., 0., 0.],
                    [0., 0., 0., 0., 0.],
                    [0., 0., 0., 0., 0.]])

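VI. torch.nn.init.eye_(tensor)

1. Purpose

Identity: fills the 2-dimensional input tensor with the identity matrix. For a non-square tensor, ones are placed on the main diagonal and all other entries are zero.

2. Parameters

  • tensor – a 2-dimensional torch.Tensor (or autograd.Variable in legacy code)

3. Example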

              import torch
              from torch import nn
              w = torch.empty(3, 5)
              print(w)
              print(nn.init.eye_(w))
              -------------------------------------------
              tensor([[9.5461e-01, 4.4377e+27, 1.7975e+19, 4.6894e+27, 7.9463e+08],
                      [3.2604e-12, 2.6209e+20, 4.1641e+12, 1.9434e-19, 3.0881e+29],
                      [6.3828e+28, 1.4603e-19, 7.7179e+28, 7.7591e+26, 3.0357e+32]])
              tensor([[1., 0., 0., 0., 0.],
                      [0., 1., 0., 0., 0.],
                      [0., 0., 1., 0., 0.]])
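
One reason to use an identity initialization is that a linear layer then passes its input through unchanged. A minimal sketch, assuming a square nn.Linear layer without bias (the layer size and the names layer, x and y are illustrative, not from the original post):

```python
import torch
from torch import nn

layer = nn.Linear(4, 4, bias=False)
nn.init.eye_(layer.weight)  # the weight becomes the 4x4 identity matrix

x = torch.randn(2, 4)
y = layer(x)  # computes x @ weight.T, which is x itself here

print(torch.allclose(y, x))
```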

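VII. torch.nn.init.dirac_(tensor, groups=1)

1. Purpose

Dirac: fills a {3, 4, 5}-dimensional input tensor with the Dirac delta function. In a convolutional layer this preserves the identity of the inputs, carrying over as many input channels as possible.

2. Parameters

  • tensor – a {3, 4, 5}-dimensional torch.Tensor (or autograd.Variable in legacy code)
  • groups – the number of groups in the convolutional layer (default: 1)

3. Example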

                import torch
                from torch import nn
                w = torch.empty(3, 16, 5, 5)
                print(w.shape)
                print(nn.init.dirac_(w).shape)
                z = torch.empty(3, 24, 5, 5)
                print(z.shape)
                print(nn.init.dirac_(z, 3).shape)
                ---------------------------------------------
                torch.Size([3, 16, 5, 5])
                torch.Size([3, 16, 5, 5])
                torch.Size([3, 24, 5, 5])
                torch.Size([3, 24, 5, 5])
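
The shapes above do not show what the Dirac initialization actually does. The sketch below applies it to a convolution and checks that the output equals the input. The layer settings (3 input and output channels, a 3x3 kernel, padding=1, no bias) are assumptions chosen so that the output has the same shape as the input:

```python
import torch
from torch import nn

conv = nn.Conv2d(3, 3, kernel_size=3, padding=1, bias=False)
nn.init.dirac_(conv.weight)  # a single 1 at the kernel centre per channel

x = torch.randn(1, 3, 8, 8)
y = conv(x)

# Each output channel copies the matching input channel.
print(torch.allclose(y, x, atol=1e-6))
```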

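VIII. torch.nn.init.xavier_uniform_(tensor, gain=1.0)

1. Purpose

Xavier uniform: fills the input tensor with values drawn from the uniform distribution U(-a, a), where a = gain * sqrt(6 / (fan_in + fan_out)). Also known as Glorot initialization.

2. Parameters

  • tensor – an n-dimensional torch.Tensor
  • gain – an optional scaling factor

3. Example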

                  import torch
                  from torch import nn
                  w = torch.empty(3, 5)
                  print(w)
                  print(nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu')))
                  ----------------------------------------------------------
                  tensor([[6.4069e+02, 2.7489e+20, 1.5444e+25, 1.6217e-19, 7.0062e+22],
                          [1.6795e+08, 4.7423e+30, 4.7393e+30, 9.5461e-01, 4.4377e+27],
                          [1.7975e+19, 4.6894e+27, 7.9463e+08, 3.2604e-12, 2.6209e+20]])
                  tensor([[-0.9562, -0.6834,  0.7449, -0.2484, -0.7638],
                          [-1.0150, -0.2982, -0.2133, -1.1132, -1.0273],
                          [ 0.5228,  0.9122, -0.5077, -0.2911,  0.1625]])
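
The range of the sampled values can be worked out by hand for the 3x5 example. The sketch below uses plain Python only; xavier_uniform_bound is a hypothetical helper written for this illustration, not a PyTorch function, and sqrt(2) is the gain that calculate_gain('relu') returns:

```python
import math

def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Bound a of U(-a, a): gain * sqrt(6 / (fan_in + fan_out))."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))

# For a 2D weight of shape (3, 5): fan_out = 3 (rows), fan_in = 5 (columns).
relu_gain = math.sqrt(2.0)
bound = xavier_uniform_bound(fan_in=5, fan_out=3, gain=relu_gain)
print(round(bound, 4))  # 1.2247
```

This agrees with the sample output above, where the largest absolute value is 1.1132.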

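IX. torch.nn.init.xavier_normal_(tensor, gain=1.0)

1. Purpose

Xavier normal: fills the input tensor with values drawn from the normal distribution N(0, std^2), where std = gain * sqrt(2 / (fan_in + fan_out)). Also known as Glorot initialization.

2. Parameters

  • tensor – an n-dimensional torch.Tensor
  • gain – an optional scaling factor

3. Example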

                    import torch
                    from torch import nn
                    w = torch.empty(3, 5)
                    print(w)
                    print(nn.init.xavier_normal_(w))
                    --------------------------------------------
                    tensor([[4.7984e+30, 6.2121e+22, 1.8370e+25, 1.4603e-19, 6.4069e+02],
                            [2.7489e+20, 1.5444e+25, 1.6217e-19, 7.0062e+22, 1.6795e+08],
                            [4.7423e+30, 4.7393e+30, 9.5461e-01, 4.4377e+27, 1.7975e+19]])
                    tensor([[ 0.3654,  0.4767,  0.1407, -0.4990,  0.2799],
                            [ 0.0545,  0.5941, -0.3611,  0.5469,  0.0781],
                            [-0.0393,  0.1817, -0.0407, -0.2593, -0.2736]])
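
To see the scaling at work, the sketch below compares the sample standard deviation of a larger tensor with the expected value gain * sqrt(2 / (fan_in + fan_out)). The 300x500 shape and the seed are arbitrary choices for illustration:

```python
import math
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

w = torch.empty(300, 500)  # fan_out = 300, fan_in = 500
nn.init.xavier_normal_(w)  # gain defaults to 1.0

expected_std = math.sqrt(2.0 / (500 + 300))  # 0.05
sample_std = w.std().item()
print(f"expected={expected_std:.4f}, sample={sample_std:.4f}")
```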

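X. torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

1. Purpose

Kaiming uniform: fills the input tensor with values drawn from the uniform distribution U(-bound, bound), where bound = gain * sqrt(3 / fan_mode). Also known as He initialization.

2. Parameters

  • tensor – an n-dimensional torch.Tensor (or autograd.Variable in legacy code);
  • a – the negative slope of the rectifier used after this layer (only used with 'leaky_relu'; a = 0 corresponds to ReLU);
  • mode – either 'fan_in' (default) or 'fan_out';
    • 'fan_in' – preserves the magnitude of the variance of the weights in the forward pass;
    • 'fan_out' – preserves the magnitudes in the backward pass;
  • nonlinearity – the non-linear function; recommended for use only with 'relu' or 'leaky_relu' (default).

3. Example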

                      import torch
                      from torch import nn
                      w = torch.empty(3, 5)
                      print(w)
                      print(nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu'))
                      -------------------------------------------------
                      tensor([[9.5461e-01, 4.4377e+27, 1.7975e+19, 4.6894e+27, 7.9463e+08],
                              [3.2604e-12, 2.6209e+20, 4.1641e+12, 1.9434e-19, 3.0881e+29],
                              [6.3828e+28, 1.4603e-19, 7.7179e+28, 7.7591e+26, 3.0357e+32]])
                      tensor([[ 0.6771, -0.7587,  0.6915, -0.7163,  0.0840],
                              [-1.0694, -0.4790, -0.4019, -0.8439,  0.5794],
                              [-0.9363, -0.0655, -0.0506, -0.1419,  0.5395]])
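
The effect of mode is easiest to see by computing the sampling bound for both settings. This is a plain-Python sketch; leaky_relu_gain and kaiming_uniform_bound are hypothetical helpers written for this illustration, following the formula bound = gain * sqrt(3 / fan) with gain = sqrt(2 / (1 + a^2)):

```python
import math

def leaky_relu_gain(negative_slope=0.0):
    """Gain for leaky ReLU; a slope of 0 gives the ReLU gain sqrt(2)."""
    return math.sqrt(2.0 / (1.0 + negative_slope ** 2))

def kaiming_uniform_bound(fan, gain):
    """Bound of U(-bound, bound): gain * sqrt(3 / fan)."""
    return gain * math.sqrt(3.0 / fan)

# For a 2D weight of shape (3, 5): fan_in = 5 (columns), fan_out = 3 (rows).
bound_fan_in = kaiming_uniform_bound(fan=5, gain=leaky_relu_gain(0.0))
bound_fan_out = kaiming_uniform_bound(fan=3, gain=leaky_relu_gain(0.0))
print(round(bound_fan_in, 4))   # 1.0954
print(round(bound_fan_out, 4))  # 1.4142
```

The fan_in bound agrees with the sample output above, where the largest absolute value is 1.0694.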

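XI. torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

1. Purpose

Kaiming normal: fills the input tensor with values drawn from the normal distribution N(0, std^2), where std = gain / sqrt(fan_mode). Also known as He initialization.

2. Parameters

  • tensor – an n-dimensional torch.Tensor (or autograd.Variable in legacy code);
  • a – the negative slope of the rectifier used after this layer (only used with 'leaky_relu'; a = 0 corresponds to ReLU);
  • mode – either 'fan_in' (default) or 'fan_out'; 'fan_in' preserves the magnitude of the variance of the weights in the forward pass, while 'fan_out' preserves the magnitudes in the backward pass;
  • nonlinearity – the non-linear function; recommended for use only with 'relu' or 'leaky_relu' (default).

3. Example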

                        import torch
                        from torch import nn
                        w = torch.empty(3, 5)
                        print(w)
                        print(nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu'))
                        -------------------------------------------------
                        tensor([[9.5461e-01, 4.4377e+27, 1.7975e+19, 4.6894e+27, 7.9463e+08],
                                [3.2604e-12, 2.6209e+20, 4.1641e+12, 1.9434e-19, 3.0881e+29],
                                [6.3828e+28, 1.4603e-19, 7.7179e+28, 7.7591e+26, 3.0357e+32]])
                        tensor([[-0.2421,  1.3102, -0.0506,  0.5099, -0.1017],
                                [-1.2707, -0.9636, -0.4539,  1.1167,  0.6717],
                                [ 0.1898,  0.6261, -1.1114, -0.4440,  0.5798]])
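
As a rough check of the fan_out setting, the sketch below compares the sample standard deviation of a larger tensor with gain / sqrt(fan_out), where the ReLU gain is sqrt(2). The 300x500 shape and the seed are arbitrary choices for illustration:

```python
import math
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

w = torch.empty(300, 500)  # fan_out = 300, fan_in = 500
nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu')

expected_std = math.sqrt(2.0) / math.sqrt(300)  # gain / sqrt(fan_out)
sample_std = w.std().item()
print(f"expected={expected_std:.4f}, sample={sample_std:.4f}")
```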

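XII. torch.nn.init.orthogonal_(tensor, gain=1)

1. Purpose

Orthogonal: fills the input tensor with a (semi-)orthogonal matrix.

2. Parameters

  • tensor – an n-dimensional torch.Tensor, where n ≥ 2
  • gain – an optional scaling factor

3. Example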

                          import torch
                          from torch import nn
                          w = torch.empty(3, 5)
                          print(w)
                          print(nn.init.orthogonal_(w))
                          ------------------------------------------------
                          tensor([[9.5461e-01, 4.4377e+27, 1.7975e+19, 4.6894e+27, 7.9463e+08],
                                  [3.2604e-12, 2.6209e+20, 4.1641e+12, 1.9434e-19, 3.0881e+29],
                                  [6.3828e+28, 1.4603e-19, 7.7179e+28, 7.7591e+26, 3.0357e+32]])
                          tensor([[-0.2146, -0.8764, -0.3447, -0.1060,  0.2363],
                                  [-0.1957,  0.2711,  0.0974, -0.6438,  0.6813],
                                  [ 0.6258, -0.3716,  0.6203, -0.2903, -0.0353]])
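
For a 3x5 tensor, "semi-orthogonal" means that the three rows are orthonormal, so w multiplied by its transpose gives the 3x3 identity matrix. A minimal sketch of that check (the tolerance and the seed are arbitrary choices for illustration):

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

w = torch.empty(3, 5)
nn.init.orthogonal_(w)

# The rows are orthonormal, so the Gram matrix is the 3x3 identity.
gram = w @ w.T
print(torch.allclose(gram, torch.eye(3), atol=1e-5))
```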

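XIII. torch.nn.init.sparse_(tensor, sparsity, std=0.01)

1. Purpose

Sparse: fills the 2D input tensor as a sparse matrix, where the non-zero elements are drawn from a normal distribution with mean 0 and standard deviation std (0.01 by default).

2. Parameters

  • tensor – a 2-dimensional torch.Tensor
  • sparsity – the fraction of elements in each column to be set to zero
  • std – the standard deviation of the normal distribution used to generate the non-zero values

3. Example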

                            import torch
                            from torch import nn
                            w = torch.empty(3, 5)
                            print(w)
                            print(nn.init.sparse_(w, sparsity=0.1))
                            ------------------------------------------
                            tensor([[9.5461e-01, 4.4377e+27, 1.7975e+19, 4.6894e+27, 7.9463e+08],
                                    [3.2604e-12, 2.6209e+20, 4.1641e+12, 1.9434e-19, 3.0881e+29],
                                    [6.3828e+28, 1.4603e-19, 7.7179e+28, 7.7591e+26, 3.0357e+32]])
                            tensor([[ 0.0112,  0.0000, -0.0055,  0.0000,  0.0000],
                                    [ 0.0026, -0.0009,  0.0000, -0.0044, -0.0012],
                                    [ 0.0000,  0.0176,  0.0022, -0.0037, -0.0035]])
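
In the output above, each column of the 3x5 tensor has exactly one zero even though sparsity=0.1, which is consistent with the number of zeros per column being rounded up to ceil(sparsity * rows). The sketch below checks this on a different shape. The 10x4 shape, sparsity=0.25, and the seed are arbitrary choices for illustration:

```python
import math
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

rows, cols, sparsity = 10, 4, 0.25
w = torch.empty(rows, cols)
nn.init.sparse_(w, sparsity=sparsity)

# Count the zeros in every column and compare with the rounded-up target.
zeros_per_column = (w == 0).sum(dim=0).tolist()
expected = math.ceil(sparsity * rows)
print(zeros_per_column, expected)
```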