
Tuning Faster R-CNN / SSD Model Parameters in the TensorFlow Object Detection API

2019-11-18 · 搜奇网

For setting up the TensorFlow Object Detection API, see the earlier article: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73

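In this article, I discuss how to change the configuration of a pre-trained model. The aim is for you to be able to configure TensorFlow/models for your own application, so that the API is no longer a black box!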

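Overview of this article:

  • Understanding protocol buffers and proto files.
  • Using that knowledge of proto files to understand a model's config file.
  • Following 3 steps to update a model parameter.
  • Other examples:
  1. Changing the weight initializer
  2. Changing the weight optimizer
  3. Evaluating the pre-trained model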

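Protocol buffers

To modify a model, we need to understand its internals. The TensorFlow Object Detection API uses Protocol Buffers, a language-neutral, platform-neutral, extensible mechanism for serializing structured data. Think of XML, but smaller, faster, and simpler. The API uses the proto2 version of the protocol buffer language. I will try to explain just enough of the language to update a pre-configured model. For more detail, see the protocol buffer language documentation and the Python tutorial.

Working with protocol buffers can be split into the following three steps:

  • Define the message format in a .proto file. This file acts as a blueprint for all messages: it shows which parameters a message accepts, what each parameter's data type is, whether the parameter is required or optional, what its tag number is, what its default value is, and so on. The API's proto files live in the repository's research/object_detection/protos directory. To illustrate, I use the grid_anchor_generator.proto file.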
    syntax = "proto2";
    
    package object_detection.protos;
    
    // Configuration proto for GridAnchorGenerator. See
    // anchor_generators/grid_anchor_generator.py for details.
    message GridAnchorGenerator {
       // Anchor height in pixels.
      optional int32 height = 1 [default = 256];
    
      // Anchor width in pixels.
      optional int32 width = 2 [default = 256];
    
      // Anchor stride in height dimension in pixels.
      optional int32 height_stride = 3 [default = 16];
    
      // Anchor stride in width dimension in pixels.
      optional int32 width_stride = 4 [default = 16];
    
      // Anchor height offset in pixels.
      optional int32 height_offset = 5 [default = 0];
    
      // Anchor width offset in pixels.
      optional int32 width_offset = 6 [default = 0];
    
      // At any given location, len(scales) * len(aspect_ratios) anchors are
      // generated with all possible combinations of scales and aspect ratios.
    
      // List of scales for the anchors.
      repeated float scales = 7;
    
      // List of aspect ratios for the anchors.
      repeated float aspect_ratios = 8;
    }

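    The last two fields of the message, scales and aspect_ratios, are declared repeated: they carry no default value, so a config has to list them explicitly to get the anchors it wants. All other fields are optional and fall back to their default values when they are not passed.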

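  • After defining the message format, we need to compile the protocol buffers. The compiler generates class files from the .proto files. While installing the API we ran the following command, which compiles the protocol buffers:

      # From tensorflow/models/research/
      protoc object_detection/protos/*.proto --python_out=.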
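  • After defining and compiling the protocol buffers, we use the Python protocol buffer API to write and read messages. In our case, the config file can be treated as that interface: it lets us write and read messages without worrying about the internals of the TensorFlow API. In other words, we can update the parameters of a pre-trained model by changing the config file appropriately.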
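As a rough illustration of what the GridAnchorGenerator blueprint shown earlier implies, the sketch below mimics the proto2 behavior in plain Python: optional fields fall back to their declared defaults, and every (scale, aspect ratio) pair yields one anchor. The helper names and the height/width formula (aspect ratio taken as width / height, area preserved per scale) are assumptions made for illustration, not the API's own code.

```python
import itertools
import math

# Defaults copied from the optional fields of grid_anchor_generator.proto.
GRID_ANCHOR_DEFAULTS = {
    "height": 256,
    "width": 256,
    "height_stride": 16,
    "width_stride": 16,
    "height_offset": 0,
    "width_offset": 0,
}


def resolve_grid_anchor_config(overrides):
    """Fill unset optional fields with their defaults (hypothetical helper)."""
    resolved = dict(GRID_ANCHOR_DEFAULTS)
    # Repeated fields have no default: they start out empty.
    resolved["scales"] = []
    resolved["aspect_ratios"] = []
    resolved.update(overrides)
    return resolved


def anchor_shapes(config):
    """Return one (height, width) pair per scale / aspect-ratio combination.

    Assumes aspect_ratio = width / height and a fixed area for a given scale.
    """
    shapes = []
    for scale, ratio in itertools.product(config["scales"], config["aspect_ratios"]):
        root = math.sqrt(ratio)
        shapes.append((scale * config["height"] / root, scale * config["width"] * root))
    return shapes


# Values taken from the grid_anchor_generator block of the pets config.
config = resolve_grid_anchor_config({
    "scales": [0.25, 0.5, 1.0, 2.0],
    "aspect_ratios": [0.5, 1.0, 2.0],
    "height_stride": 16,
    "width_stride": 16,
})
shapes = anchor_shapes(config)
print(config["height"], config["width"])  # filled in from the defaults
print(len(shapes))  # len(scales) * len(aspect_ratios)
```

Leaving height and width out of the overrides is enough to get the 256-pixel base anchor, which is the behavior the optional fields with defaults describe.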
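Understanding the config file

Clearly, the config file lets us change the model's parameters as needed. The next question is how to change them. This section and the next answer that question, and this is where the knowledge of proto files comes in handy. For demonstration purposes, I am using the faster_rcnn_resnet50_pets.config file.

        # Faster R-CNN with Resnet-50 (v1), configured for Oxford-IIIT Pets Dataset.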
        # Users should configure the fine_tune_checkpoint field in the train config as
        # well as the label_map_path and input_path fields in the train_input_reader and
        # eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
        # should be configured.
        
        model {
          faster_rcnn {
            num_classes: 37
            image_resizer {
              keep_aspect_ratio_resizer {
                min_dimension: 600
                max_dimension: 1024
              }
            }
            feature_extractor {
              type: 'faster_rcnn_resnet50'
              first_stage_features_stride: 16
            }
            first_stage_anchor_generator {
              grid_anchor_generator {
                scales: [0.25, 0.5, 1.0, 2.0]
                aspect_ratios: [0.5, 1.0, 2.0]
                height_stride: 16
                width_stride: 16
              }
            }
            first_stage_box_predictor_conv_hyperparams {
              op: CONV
              regularizer {
                l2_regularizer {
                  weight: 0.0
                }
              }
              initializer {
                truncated_normal_initializer {
                  stddev: 0.01
                }
              }
            }
            first_stage_nms_score_threshold: 0.0
            first_stage_nms_iou_threshold: 0.7
            first_stage_max_proposals: 300
            first_stage_localization_loss_weight: 2.0
            first_stage_objectness_loss_weight: 1.0
            initial_crop_size: 14
            maxpool_kernel_size: 2
            maxpool_stride: 2
            second_stage_box_predictor {
              mask_rcnn_box_predictor {
                use_dropout: false
                dropout_keep_probability: 1.0
                fc_hyperparams {
                  op: FC
                  regularizer {
                    l2_regularizer {
                      weight: 0.0
                    }
                  }
                  initializer {
                    variance_scaling_initializer {
                      factor: 1.0
                      uniform: true
                      mode: FAN_AVG
                    }
                  }
                }
              }
            }
            second_stage_post_processing {
              batch_non_max_suppression {
                score_threshold: 0.0
                iou_threshold: 0.6
                max_detections_per_class: 100
                max_total_detections: 300
              }
              score_converter: SOFTMAX
            }
            second_stage_localization_loss_weight: 2.0
            second_stage_classification_loss_weight: 1.0
          }
        }
        
        train_config: {
          batch_size: 1
          optimizer {
            momentum_optimizer: {
              learning_rate: {
                manual_step_learning_rate {
                  initial_learning_rate: 0.0003
                  schedule {
                    step: 900000
                    learning_rate: .00003
                  }
                  schedule {
                    step: 1200000
                    learning_rate: .000003
                  }
                }
              }
              momentum_optimizer_value: 0.9
            }
            use_moving_average: false
          }
          gradient_clipping_by_norm: 10.0
          fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
          from_detection_checkpoint: true
          # Note: The below line limits the training process to 200K steps, which we
          # empirically found to be sufficient enough to train the pets dataset. This
          # effectively bypasses the learning rate schedule (the learning rate will
          # never decay). Remove the below line to train indefinitely.
          num_steps: 200000
          data_augmentation_options {
            random_horizontal_flip {
            }
          }
          max_number_of_boxes: 50
        }
        
        train_input_reader: {
          tf_record_input_reader {
            input_path: "PATH_TO_BE_CONFIGURED/pet_train.record"
          }
          label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
        }
        
        eval_config: {
          num_examples: 2000
          # Note: The below line limits the evaluation process to 10 evaluations.
          # Remove the below line to evaluate indefinitely.
          max_evals: 10
        }
        
        eval_input_reader: {
          tf_record_input_reader {
            input_path: "PATH_TO_BE_CONFIGURED/pet_val.record"
          }
          label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
          shuffle: false
          num_readers: 1
        }

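Lines 7 to 10 of the config show that num_classes is one of the parameters of the faster_rcnn message, which in turn is a parameter of the model message. Likewise, optimizer is a child message of the parent train_config message, while batch_size is another parameter of train_config. We can verify this by checking the corresponding proto file:

        syntax = "proto2";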
        
        package object_detection.protos;
        
        import "object_detection/protos/anchor_generator.proto";
        import "object_detection/protos/box_predictor.proto";
        import "object_detection/protos/hyperparams.proto";
        import "object_detection/protos/image_resizer.proto";
        import "object_detection/protos/losses.proto";
        import "object_detection/protos/post_processing.proto";
        
        // Configuration for Faster R-CNN models.
        // See meta_architectures/faster_rcnn_meta_arch.py and models/model_builder.py
        //
        // Naming conventions:
        // Faster R-CNN models have two stages: a first stage region proposal network
        // (or RPN) and a second stage box classifier.  We thus use the prefixes
        // `first_stage_` and `second_stage_` to indicate the stage to which each
        // parameter pertains when relevant.
        message FasterRcnn {
        
          // Whether to construct only the Region Proposal Network (RPN).
          optional int32 number_of_stages = 1 [default=2];
        
          // Number of classes to predict.
          optional int32 num_classes = 3;
          
          // Image resizer for preprocessing the input image.
          optional ImageResizer image_resizer = 4;

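From lines 20 and 26 of this excerpt it is clear that num_classes is one of the optional parameters of the faster_rcnn message. I hope the discussion so far helps in understanding how the config file is organized. Now it is time to update one of the model's parameters correctly.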
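The nesting of messages described above can be made concrete with a small sketch. It is not the API's parser (the real pipeline relies on the compiled protocol buffer classes); it is a hypothetical, simplified reader for a subset of the config syntax: one statement per line, no repeated fields, and no '#' inside quoted strings. The names parse_text_config and get_field are made up for this example.

```python
import ast

# A shortened excerpt of faster_rcnn_resnet50_pets.config.
CONFIG_SNIPPET = """
model {
  faster_rcnn {
    num_classes: 37
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
  }
}
train_config: {
  batch_size: 1
  from_detection_checkpoint: true
}
"""


def convert_scalar(text):
    """Turn a scalar from the config into a Python value where possible."""
    if text in ("true", "false"):
        return text == "true"
    try:
        return ast.literal_eval(text)  # numbers, quoted strings, [..] lists
    except (ValueError, SyntaxError):
        return text  # enum names such as CONV stay as plain strings


def parse_text_config(text):
    """Parse the simplified config subset into nested dictionaries."""
    root = {}
    stack = [root]  # stack[-1] is the message currently being filled
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if line == "}":
            stack.pop()  # the current message is complete
        elif line.endswith("{"):
            name = line[:-1].rstrip().rstrip(":")  # "train_config: {" or "model {"
            child = {}
            stack[-1][name] = child
            stack.append(child)
        else:
            key, value = line.split(":", 1)
            stack[-1][key.strip()] = convert_scalar(value.strip())
    return root


def get_field(config, dotted_path):
    """Walk from parent message to child message along a dotted path."""
    node = config
    for part in dotted_path.split("."):
        node = node[part]
    return node


config = parse_text_config(CONFIG_SNIPPET)
print(get_field(config, "model.faster_rcnn.num_classes"))
print(get_field(config, "train_config.batch_size"))
```

Each dotted path mirrors the parent/child relationship read off the proto files: model contains faster_rcnn, which contains num_classes.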

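Step 1: Decide which parameter to update

Suppose we need to update the image_resizer parameter on line 10 of the faster_rcnn_resnet50_pets.config file.

Step 2: Search the repository for the parameter

The goal is to find the proto file that defines the parameter. To do that, we search the repository with a query of the following form:

        parameter_name path:research/object_detection/protos
        # in our case parameter_name = "image_resizer", thus:
        image_resizer path:research/object_detection/protos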

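Here, path:research/object_detection/protos restricts the search scope. GitHub's search documentation has more on the query syntax.

[Screenshot: GitHub search results for image_resizer path:research/object_detection/protos]

From the search output it is clear that, to update the image_resizer parameter, we need to analyze the image_resizer.proto file.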
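If the repository is already cloned, the same lookup can be approximated offline. The sketch below is only an assumption about how one might do that with the Python standard library: find_proto_files is a hypothetical helper, and the demo runs on a temporary directory with heavily shortened stand-in files rather than the real protos.

```python
import pathlib
import tempfile


def find_proto_files(protos_dir, parameter_name):
    """Return names of .proto files whose file name or content mentions the parameter."""
    matches = []
    for path in sorted(pathlib.Path(protos_dir).glob("*.proto")):
        if parameter_name in path.name or parameter_name in path.read_text():
            matches.append(path.name)
    return matches


# Demo on a throwaway directory; the file contents are shortened stand-ins.
with tempfile.TemporaryDirectory() as tmp:
    protos = pathlib.Path(tmp)
    (protos / "faster_rcnn.proto").write_text("optional ImageResizer image_resizer = 4;\n")
    (protos / "image_resizer.proto").write_text("message ImageResizer {}\n")
    (protos / "optimizer.proto").write_text("message Optimizer {}\n")
    found = find_proto_files(protos, "image_resizer")

print(found)
```

Against a real checkout, protos_dir would point at research/object_detection/protos. The file that uses the parameter and the file that defines its message type both show up, and the latter is the one to analyze in the next step.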

Step 3: Analyze the proto file


        syntax = "proto2";
        
        package object_detection.protos;
        
        // Configuration proto for image resizing operations.
        // See builders/image_resizer_builder.py for details.
        message ImageResizer {
          oneof image_resizer_oneof {
            KeepAspectRatioResizer keep_aspect_ratio_resizer = 1;
            FixedShapeResizer fixed_shape_resizer = 2;
          }
        }
        
        // Enumeration type for image resizing methods provided in TensorFlow.
        enum ResizeType {
          BILINEAR = 0; // Corresponds to tf.image.ResizeMethod.BILINEAR
          NEAREST_NEIGHBOR = 1; // Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR
          BICUBIC = 2; // Corresponds to tf.image.ResizeMethod.BICUBIC
          AREA = 3; // Corresponds to tf.image.ResizeMethod.AREA
        }
        
        // Configuration proto for image resizer that keeps aspect ratio.
        message KeepAspectRatioResizer {
          // Desired size of the smaller image dimension in pixels.
          optional int32 min_dimension = 1 [default = 600];
        
          // Desired size of the larger image dimension in pixels.
          optional int32 max_dimension = 2 [default = 1024];
        
          // Desired method when resizing image.
          optional ResizeType resize_method = 3 [default = BILINEAR];
        
          // Whether to pad the image with zeros so the output spatial size is
          // [max_dimension, max_dimension]. Note that the zeros are padded to the
          // bottom and the right of the resized image.
          optional bool pad_to_max_dimension = 4 [default = false];
        
          // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
          optional bool convert_to_grayscale = 5 [default = false];
        
          // Per-channel pad value. This is only used when pad_to_max_dimension is True.
          // If unspecified, a default pad value of 0 is applied to all channels.
          repeated float per_channel_pad_value = 6;
        }
        
        // Configuration proto for image resizer that resizes to a fixed shape.
        message FixedShapeResizer {
          // Desired height of image in pixels.
          optional int32 height = 1 [default = 300];
        
          // Desired width of image in pixels.
          optional int32 width = 2 [default = 300];
        
          // Desired method when resizing image.
          optional ResizeType resize_method = 3 [default = BILINEAR];
        
          // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
          optional bool convert_to_grayscale = 4 [default = false];
        }

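Lines 8-10 show that we can resize images with either keep_aspect_ratio_resizer or fixed_shape_resizer. Analyzing lines 23-44, we see that the keep_aspect_ratio_resizer message has the parameters min_dimension, max_dimension, resize_method, pad_to_max_dimension, convert_to_grayscale, and per_channel_pad_value. The fixed_shape_resizer message has the parameters height, width, resize_method, and convert_to_grayscale. The proto file also states the data type of every parameter. So, to change the image_resizer type, we can change the following lines in the config file.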

        # before
        image_resizer {
          keep_aspect_ratio_resizer {
            min_dimension: 600
            max_dimension: 1024
          }
        }
        # after
        image_resizer {
          fixed_shape_resizer {
            height: 600
            width: 500
            resize_method: AREA
          }
        }

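The code above resizes images to 500 x 600 (width x height) using the AREA resize method. The resize methods available in TensorFlow are listed in the tf.image.ResizeMethod documentation.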
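To see what the swap means for actual image sizes, here is a minimal sketch of the two resizers' output shapes. It assumes keep_aspect_ratio_resizer scales the shorter side to min_dimension unless that would push the longer side past max_dimension, in which case the longer side is scaled to max_dimension instead. The function names are made up for this example, and padding and interpolation are ignored.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Approximate output (height, width) of keep_aspect_ratio_resizer."""
    small_side, large_side = min(height, width), max(height, width)
    scale = min_dimension / small_side
    if round(large_side * scale) > max_dimension:
        # The long side would overshoot, so cap it at max_dimension instead.
        scale = max_dimension / large_side
    return round(height * scale), round(width * scale)


def fixed_shape_size(height=300, width=300):
    """Output (height, width) of fixed_shape_resizer: always the configured shape."""
    return height, width


print(keep_aspect_ratio_size(375, 500))    # aspect ratio preserved
print(keep_aspect_ratio_size(500, 2000))   # long side capped at max_dimension
print(fixed_shape_size(height=600, width=500))
```

With fixed_shape_resizer every image ends up with the same shape regardless of its original aspect ratio, so elongated images are distorted rather than scaled proportionally.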

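Other examples

We can update or add any parameter using the steps discussed in the previous section. I demonstrate a few frequently used examples here, but the same steps apply to any parameter of the model.

Changing the weight initializer

        • Decide to change the initializer parameter on line 35 of the faster_rcnn_resnet50_pets.config file.
        • Search the repository for initializer path:research/object_detection/protos. From the search results, it is clear that we need to analyze the hyperparams.proto file.
        • Lines 68–74 of hyperparams.proto describe the initializer configuration:

            message Initializer {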
              oneof initializer_oneof {
                TruncatedNormalInitializer truncated_normal_initializer = 1;
                VarianceScalingInitializer variance_scaling_initializer = 2;
                RandomNormalInitializer random_normal_initializer = 3;
              }
            }

            We can use random_normal_initializer instead of truncated_normal_initializer. For that, we need to analyze lines 99–102 of the hyperparams.proto file:

            message RandomNormalInitializer {
              optional float mean = 1 [default = 0.0];
              optional float stddev = 2 [default = 1.0];
            }

            Clearly, random_normal_initializer has two parameters, mean and stddev. We can change the following lines in the config file to use random_normal_initializer:

            # before
            initializer {
              truncated_normal_initializer {
                stddev: 0.01
              }
            }
            # after
            initializer {
              random_normal_initializer {
                mean: 1
                stddev: 0.5
              }
            }
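The difference between the two initializers can be sketched with the standard library alone. This is an illustration under assumptions, not TensorFlow's implementation: a truncated normal is modeled as redrawing any sample that falls more than two standard deviations from the mean, and the function names are invented for this example.

```python
import random


def random_normal(rng, mean=0.0, stddev=1.0):
    """Draw one sample from an ordinary normal distribution."""
    return rng.gauss(mean, stddev)


def truncated_normal(rng, mean=0.0, stddev=1.0):
    """Redraw until the sample lies within two standard deviations of the mean."""
    while True:
        sample = rng.gauss(mean, stddev)
        if abs(sample - mean) <= 2.0 * stddev:
            return sample


rng = random.Random(0)  # fixed seed so the run is repeatable
# Parameters taken from the "before" and "after" config blocks above.
truncated = [truncated_normal(rng, stddev=0.01) for _ in range(10000)]
plain = [random_normal(rng, mean=1.0, stddev=0.5) for _ in range(10000)]

print(max(abs(x) for x in truncated))  # never exceeds 2 * stddev
print(sum(plain) / len(plain))         # close to the configured mean
```

The truncated version never produces outliers, while the plain normal with these settings starts the weights around 1 with a much wider spread, so it is worth checking that such values suit the layer being initialized.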

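Changing the weight optimizer

            • Decide to change momentum_optimizer, a parameter of the parent message optimizer, on line 87 of the faster_rcnn_resnet50_pets.config file.
            • Search the repository for optimizer path:research/object_detection/protos. From the search results, it is clear that we need to analyze the optimizer.proto file.
            • Lines 9-14 of optimizer.proto describe the optimizer configuration:
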

              message Optimizer {
                oneof optimizer {
                  RMSPropOptimizer rms_prop_optimizer = 1;
                  MomentumOptimizer momentum_optimizer = 2;
                  AdamOptimizer adam_optimizer = 3;
                }

Clearly, instead of momentum_optimizer we can use adam_optimizer, which is widely regarded as a strong general-purpose optimizer. To do so, we need to make the following changes in the faster_rcnn_resnet50_pets.config file:

          # before
          optimizer {
            momentum_optimizer: {
              learning_rate: {
                manual_step_learning_rate {
                  initial_learning_rate: 0.0003
                  schedule {
                    step: 900000
                    learning_rate: .00003
                  }
                  schedule {
                    step: 1200000
                    learning_rate: .000003
                  }
                }
              }
              momentum_optimizer_value: 0.9
            }
            use_moving_average: false
          }
          # after
          optimizer {
            adam_optimizer: {
              learning_rate: {
                manual_step_learning_rate {
                  initial_learning_rate: 0.0003
                  schedule {
                    step: 900000
                    learning_rate: .00003
                  }
                  schedule {
                    step: 1200000
                    learning_rate: .000003
                  }
                }
              }
            }
            use_moving_average: false
          }
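Both optimizer blocks share the same manual_step_learning_rate schedule, so it helps to see what that schedule evaluates to. The sketch below is a plain-Python approximation with a made-up function name. It assumes each new rate takes effect exactly at its boundary step.

```python
import bisect


def manual_step_learning_rate(step, initial_learning_rate, schedule):
    """Return the learning rate in effect at a given global step.

    schedule is a list of (boundary_step, learning_rate) pairs in
    ascending step order, mirroring the schedule blocks in the config.
    """
    boundaries = [boundary for boundary, _ in schedule]
    rates = [initial_learning_rate] + [rate for _, rate in schedule]
    # Count how many boundaries have been reached and pick the matching rate.
    return rates[bisect.bisect_right(boundaries, step)]


SCHEDULE = [(900000, 0.00003), (1200000, 0.000003)]

for step in (0, 199999, 900000, 1500000):
    print(step, manual_step_learning_rate(step, 0.0003, SCHEDULE))
```

With num_steps: 200000 in the train config, training stops long before the first boundary at step 900000, which is why the comment in the config notes that the learning rate never decays.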

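Evaluating the pre-trained model

Eval waits 300 seconds between checks to see whether the trained model has been updated. If you have a good GPU, you can train and evaluate at the same time. Usually, though, doing both exhausts the available resources. To get around this, we can train the model first, save it to a directory, and evaluate it afterwards. To evaluate later, we need to make the following changes in the config file: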

          # before
          eval_config: {
            num_examples: 2000
            # Note: The below line limits the evaluation process to 10 evaluations.
            # Remove the below line to evaluate indefinitely.
            max_evals: 10
          }
          # after
          eval_config: {
            num_examples: 10
            num_visualizations: 10
            eval_interval_secs: 0
          }

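num_visualizations should equal the number of examples to evaluate. The more visualizations, the longer evaluation takes. If your GPU is powerful enough to train and evaluate simultaneously, you can keep eval_interval_secs: 300. This parameter decides how often evaluation runs. I arrived at this conclusion by following the 3 steps discussed above.

In short, knowing protocol buffers helped us understand that model parameters are passed as messages, and that to update a parameter we can refer to its .proto file. We discussed 3 simple steps for finding the right .proto file for the parameter we want to update.

Please mention in the comments any config-file parameter you would like to update or add.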


Source: 搜奇网

Article URL: https://www.sou7.cn/282333.html
