Tuning Faster R-CNN / SSD Model Parameters in the TensorFlow Object Detection API
2019-11-18
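For setting up the TensorFlow Object Detection API, you can refer to the earlier article: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73
In this article, I discuss how to change the configuration of a pre-trained model. The goal is that you can configure TensorFlow/models for your own application, so that the API is no longer a black box!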
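Overview of this article:
- Understanding protocol buffers and proto files.
- Using knowledge of proto files to understand a model's config file.
- Three steps to update a model's parameters.
- Other examples:
  - Changing the weight initializer
  - Changing the weight optimizer
  - Evaluating a pre-trained model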
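Protocol Buffers
To modify a model, we need to understand its inner workings. The TensorFlow Object Detection API uses Protocol Buffers, a language-neutral, platform-neutral, extensible mechanism for serializing structured data. Think XML, but smaller, faster, and simpler. The API uses the proto2 version of the protocol buffers language. I will try to explain just enough of the language to update a pre-configured model. For more details, see the official protocol buffers language documentation and the Python tutorial.
Working with protocol buffers can be divided into the following three steps: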
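- Define the message format in a .proto file. This file acts as a blueprint for every message: it shows which parameters the message accepts, what the data type of each parameter should be, whether a parameter is required or optional, what its tag number is, what its default value is, and so on. The API's proto files live in the repository under research/object_detection/protos. For illustration, I use the grid_anchor_generator.proto file: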
    syntax = "proto2";

    package object_detection.protos;

    // Configuration proto for GridAnchorGenerator. See
    // anchor_generators/grid_anchor_generator.py for details.
    message GridAnchorGenerator {
      // Anchor height in pixels.
      optional int32 height = 1 [default = 256];

      // Anchor width in pixels.
      optional int32 width = 2 [default = 256];

      // Anchor stride in height dimension in pixels.
      optional int32 height_stride = 3 [default = 16];

      // Anchor stride in width dimension in pixels.
      optional int32 width_stride = 4 [default = 16];

      // Anchor height offset in pixels.
      optional int32 height_offset = 5 [default = 0];

      // Anchor width offset in pixels.
      optional int32 width_offset = 6 [default = 0];

      // At any given location, len(scales) * len(aspect_ratios) anchors are
      // generated with all possible combinations of scales and aspect ratios.

      // List of scales for the anchors.
      repeated float scales = 7;

      // List of aspect ratios for the anchors.
      repeated float aspect_ratios = 8;
    }
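  From lines 30–33 of the file, the parameters scales and aspect_ratios of the GridAnchorGenerator message are declared as repeated fields without default values, so the config has to list them explicitly. The remaining parameters are optional and take their default values when they are not set.
- After defining the message format, we need to compile the protocol buffers. The compiler generates class files from the .proto files. While installing the API, we ran the following command, which compiles the protocol buffers: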
    # From tensorflow/models/research/
    protoc object_detection/protos/*.proto --python_out=.
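- After defining and compiling the protocol buffers, we use the Python protocol buffer API to write and read messages. In our case, we can treat the config file as that interface: it lets us write and read messages without worrying about the internals of the TensorFlow API. In other words, we can update a pre-trained model's parameters by changing the config file appropriately.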
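To make the idea of optional fields with defaults and repeated fields more concrete, here is a minimal Python sketch. It does not use the classes generated by protoc; `GridAnchorGeneratorSketch` and `anchor_shapes` are hypothetical names that only mimic how the fields of grid_anchor_generator.proto behave, using the values from the pets config.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List


@dataclass
class GridAnchorGeneratorSketch:
    """Hypothetical stand-in for the GridAnchorGenerator message.

    This is NOT the class generated by protoc; it only mirrors the field
    names and defaults declared in grid_anchor_generator.proto.
    """
    height: int = 256           # optional, default 256
    width: int = 256            # optional, default 256
    height_stride: int = 16     # optional, default 16
    width_stride: int = 16      # optional, default 16
    height_offset: int = 0      # optional, default 0
    width_offset: int = 0       # optional, default 0
    # Repeated fields have no default: they stay empty unless set.
    scales: List[float] = field(default_factory=list)
    aspect_ratios: List[float] = field(default_factory=list)

    def anchor_shapes(self):
        """All (scale, aspect_ratio) combinations used at one location."""
        return list(product(self.scales, self.aspect_ratios))


# Only the fields that the pets config sets are passed explicitly.
anchors = GridAnchorGeneratorSketch(
    scales=[0.25, 0.5, 1.0, 2.0],
    aspect_ratios=[0.5, 1.0, 2.0],
    height_stride=16,
    width_stride=16,
)

# Fields that were not set fall back to the defaults from the .proto file.
print("height:", anchors.height, "width:", anchors.width)
# len(scales) * len(aspect_ratios) combinations per location.
print("anchors per location:", len(anchors.anchor_shapes()))
```

With nothing set at all, the sketch yields no anchor combinations, which is why the config lists scales and aspect_ratios explicitly.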
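Understanding the config file
Clearly, the config file lets us change a model's parameters as needed. The next question is: how do we change them? This section and the next answer that question, and this is where knowledge of proto files comes in handy. For demonstration, I use the faster_rcnn_resnet50_pets.config file: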
    # Faster R-CNN with Resnet-50 (v1), configured for Oxford-IIIT Pets Dataset.
    # Users should configure the fine_tune_checkpoint field in the train config as
    # well as the label_map_path and input_path fields in the train_input_reader and
    # eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
    # should be configured.

    model {
      faster_rcnn {
        num_classes: 37
        image_resizer {
          keep_aspect_ratio_resizer {
            min_dimension: 600
            max_dimension: 1024
          }
        }
        feature_extractor {
          type: 'faster_rcnn_resnet50'
          first_stage_features_stride: 16
        }
        first_stage_anchor_generator {
          grid_anchor_generator {
            scales: [0.25, 0.5, 1.0, 2.0]
            aspect_ratios: [0.5, 1.0, 2.0]
            height_stride: 16
            width_stride: 16
          }
        }
        first_stage_box_predictor_conv_hyperparams {
          op: CONV
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.01
            }
          }
        }
        first_stage_nms_score_threshold: 0.0
        first_stage_nms_iou_threshold: 0.7
        first_stage_max_proposals: 300
        first_stage_localization_loss_weight: 2.0
        first_stage_objectness_loss_weight: 1.0
        initial_crop_size: 14
        maxpool_kernel_size: 2
        maxpool_stride: 2
        second_stage_box_predictor {
          mask_rcnn_box_predictor {
            use_dropout: false
            dropout_keep_probability: 1.0
            fc_hyperparams {
              op: FC
              regularizer {
                l2_regularizer {
                  weight: 0.0
                }
              }
              initializer {
                variance_scaling_initializer {
                  factor: 1.0
                  uniform: true
                  mode: FAN_AVG
                }
              }
            }
          }
        }
        second_stage_post_processing {
          batch_non_max_suppression {
            score_threshold: 0.0
            iou_threshold: 0.6
            max_detections_per_class: 100
            max_total_detections: 300
          }
          score_converter: SOFTMAX
        }
        second_stage_localization_loss_weight: 2.0
        second_stage_classification_loss_weight: 1.0
      }
    }

    train_config: {
      batch_size: 1
      optimizer {
        momentum_optimizer: {
          learning_rate: {
            manual_step_learning_rate {
              initial_learning_rate: 0.0003
              schedule {
                step: 900000
                learning_rate: .00003
              }
              schedule {
                step: 1200000
                learning_rate: .000003
              }
            }
          }
          momentum_optimizer_value: 0.9
        }
        use_moving_average: false
      }
      gradient_clipping_by_norm: 10.0
      fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
      from_detection_checkpoint: true
      # Note: The below line limits the training process to 200K steps, which we
      # empirically found to be sufficient enough to train the pets dataset. This
      # effectively bypasses the learning rate schedule (the learning rate will
      # never decay). Remove the below line to train indefinitely.
      num_steps: 200000
      data_augmentation_options {
        random_horizontal_flip {
        }
      }
      max_number_of_boxes: 50
    }

    train_input_reader: {
      tf_record_input_reader {
        input_path: "PATH_TO_BE_CONFIGURED/pet_train.record"
      }
      label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
    }

    eval_config: {
      num_examples: 2000
      # Note: The below line limits the evaluation process to 10 evaluations.
      # Remove the below line to evaluate indefinitely.
      max_evals: 10
    }

    eval_input_reader: {
      tf_record_input_reader {
        input_path: "PATH_TO_BE_CONFIGURED/pet_val.record"
      }
      label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
      shuffle: false
      num_readers: 1
    }
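Lines 7 to 10 of the file show that num_classes is one of the parameters of the faster_rcnn message, which in turn is a parameter of the model message. Likewise, optimizer is a child message of the parent train_config message, while batch_size is another parameter of train_config. We can verify this by checking the corresponding proto file: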
    syntax = "proto2";

    package object_detection.protos;

    import "object_detection/protos/anchor_generator.proto";
    import "object_detection/protos/box_predictor.proto";
    import "object_detection/protos/hyperparams.proto";
    import "object_detection/protos/image_resizer.proto";
    import "object_detection/protos/losses.proto";
    import "object_detection/protos/post_processing.proto";

    // Configuration for Faster R-CNN models.
    // See meta_architectures/faster_rcnn_meta_arch.py and models/model_builder.py
    //
    // Naming conventions:
    // Faster R-CNN models have two stages: a first stage region proposal network
    // (or RPN) and a second stage box classifier. We thus use the prefixes
    // `first_stage_` and `second_stage_` to indicate the stage to which each
    // parameter pertains when relevant.
    message FasterRcnn {

      // Whether to construct only the Region Proposal Network (RPN).
      optional int32 number_of_stages = 1 [default=2];

      // Number of classes to predict.
      optional int32 num_classes = 3;

      // Image resizer for preprocessing the input image.
      optional ImageResizer image_resizer = 4;
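From lines 20 and 26 of the proto file, it is clear that num_classes is one of the optional parameters of the faster_rcnn message. I hope the discussion so far helps in understanding how the config file is organized. Now it is time to actually update one of the model's parameters.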
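The nesting described above (num_classes inside faster_rcnn inside model, batch_size and optimizer inside train_config) can be made visible with a small sketch. The parser below is a simplified stand-in written only for illustration: it handles the subset of the text format seen in this config and is not the protocol buffer library's own parser. The names `parse_block` and `field_paths` are hypothetical.

```python
import re

# Tokens: comments, quoted strings, bracketed lists, structure, bare words.
TOKEN = re.compile(
    r'#[^\n]*|"[^"]*"|\'[^\']*\'|\[[^\]]*\]|[{}:]|[^\s{}:\[\]#"\']+'
)


def parse_block(tokens, pos=0):
    """Parse fields until a closing brace or end of input.

    Returns (dict_of_fields, position_of_the_stopping_token).
    """
    block = {}
    while pos < len(tokens) and tokens[pos] != "}":
        name = tokens[pos]
        pos += 1
        if tokens[pos] == ":":          # the colon is optional before "{"
            pos += 1
        if tokens[pos] == "{":
            value, pos = parse_block(tokens, pos + 1)
            pos += 1                    # skip the closing brace
        else:
            value = tokens[pos]
            pos += 1
        if name in block:               # repeated field, e.g. `schedule`
            if not isinstance(block[name], list):
                block[name] = [block[name]]
            block[name].append(value)
        else:
            block[name] = value
    return block, pos


def field_paths(block, prefix=""):
    """Yield 'parent.child = value' strings for every scalar field."""
    for name, value in block.items():
        path = f"{prefix}.{name}" if prefix else name
        for item in value if isinstance(value, list) else [value]:
            if isinstance(item, dict):
                yield from field_paths(item, path)
            else:
                yield f"{path} = {item}"


# A shortened excerpt of faster_rcnn_resnet50_pets.config.
CONFIG_TEXT = """
model {
  faster_rcnn {
    num_classes: 37  # number of pet breeds
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
  }
}
train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
}
"""

tokens = [t for t in TOKEN.findall(CONFIG_TEXT) if not t.startswith("#")]
config, _ = parse_block(tokens)
for line in field_paths(config):
    print(line)
```

Each printed path spells out the parent messages of a parameter, which is the information needed to locate it in the proto files.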
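Step 1: Decide which parameter to update
Suppose we need to update the image_resizer parameter mentioned on line 10 of the faster_rcnn_resnet50_pets.config file.
Step 2: Search the repository for the given parameter
The goal is to locate the proto file for the parameter. To do that, we search the repository with the following query:
    parameter_name path:research/object_detection/protos
    # in our case parameter_name="image_resizer", thus:
    image_resizer path:research/object_detection/protos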
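Here, path:research/object_detection/protos restricts the search scope. More information on how to search on GitHub is available in GitHub's search documentation. Searching for image_resizer path:research/object_detection/protos returns the proto files that mention the parameter.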
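If you have a local clone of the repository, the same search can be approximated offline. The sketch below is only an illustration: `find_proto_files` is a hypothetical helper, and the temporary directory with three tiny stand-in files replaces the real research/object_detection/protos directory so that the example runs on its own.

```python
import tempfile
from pathlib import Path


def find_proto_files(protos_dir, parameter_name):
    """Return names of .proto files under protos_dir that mention the parameter."""
    matches = []
    for proto_path in sorted(Path(protos_dir).glob("*.proto")):
        if parameter_name in proto_path.read_text(encoding="utf-8"):
            matches.append(proto_path.name)
    return matches


# Stand-in files (assumption: heavily shortened versions of the real protos).
STAND_INS = {
    "faster_rcnn.proto": "optional ImageResizer image_resizer = 4;",
    "image_resizer.proto": "message ImageResizer { oneof image_resizer_oneof { } }",
    "optimizer.proto": "message Optimizer { oneof optimizer { } }",
}

with tempfile.TemporaryDirectory() as protos_dir:
    for file_name, text in STAND_INS.items():
        (Path(protos_dir) / file_name).write_text(text, encoding="utf-8")
    matches = find_proto_files(protos_dir, "image_resizer")

print(matches)
```

With a real clone, you would point `protos_dir` at the protos directory instead of the temporary one. Files that merely use the parameter show up alongside the file that defines it, so the file named after the parameter is usually the one to open first.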
From the output, it is clear that to update the image_resizer parameter we need to analyze the image_resizer.proto file.
Step 3: Analyze the proto file
    syntax = "proto2";

    package object_detection.protos;

    // Configuration proto for image resizing operations.
    // See builders/image_resizer_builder.py for details.
    message ImageResizer {
      oneof image_resizer_oneof {
        KeepAspectRatioResizer keep_aspect_ratio_resizer = 1;
        FixedShapeResizer fixed_shape_resizer = 2;
      }
    }

    // Enumeration type for image resizing methods provided in TensorFlow.
    enum ResizeType {
      BILINEAR = 0; // Corresponds to tf.image.ResizeMethod.BILINEAR
      NEAREST_NEIGHBOR = 1; // Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR
      BICUBIC = 2; // Corresponds to tf.image.ResizeMethod.BICUBIC
      AREA = 3; // Corresponds to tf.image.ResizeMethod.AREA
    }

    // Configuration proto for image resizer that keeps aspect ratio.
    message KeepAspectRatioResizer {
      // Desired size of the smaller image dimension in pixels.
      optional int32 min_dimension = 1 [default = 600];

      // Desired size of the larger image dimension in pixels.
      optional int32 max_dimension = 2 [default = 1024];

      // Desired method when resizing image.
      optional ResizeType resize_method = 3 [default = BILINEAR];

      // Whether to pad the image with zeros so the output spatial size is
      // [max_dimension, max_dimension]. Note that the zeros are padded to the
      // bottom and the right of the resized image.
      optional bool pad_to_max_dimension = 4 [default = false];

      // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
      optional bool convert_to_grayscale = 5 [default = false];

      // Per-channel pad value. This is only used when pad_to_max_dimension is True.
      // If unspecified, a default pad value of 0 is applied to all channels.
      repeated float per_channel_pad_value = 6;
    }

    // Configuration proto for image resizer that resizes to a fixed shape.
    message FixedShapeResizer {
      // Desired height of image in pixels.
      optional int32 height = 1 [default = 300];

      // Desired width of image in pixels.
      optional int32 width = 2 [default = 300];

      // Desired method when resizing image.
      optional ResizeType resize_method = 3 [default = BILINEAR];

      // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
      optional bool convert_to_grayscale = 4 [default = false];
    }
From lines 8–10 of the file, we can resize images with either keep_aspect_ratio_resizer or fixed_shape_resizer. Looking at lines 23–44, the keep_aspect_ratio_resizer message has the parameters min_dimension, max_dimension, resize_method, pad_to_max_dimension, convert_to_grayscale, and per_channel_pad_value. Likewise, fixed_shape_resizer has the parameters height, width, resize_method, and convert_to_grayscale. The proto file states the data type of every parameter. So, to change the image_resizer type, we can change the following lines in the config file:
    #before
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }

    #after
    image_resizer {
      fixed_shape_resizer {
        height: 600
        width: 500
        resize_method: AREA
      }
    }
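The code above resizes images to 500 × 600 (width × height) using the AREA resize method. The resize methods available in TensorFlow are listed in the tf.image.ResizeMethod documentation.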
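To see what switching resizers means for the image size, here is a rough sketch of the two behaviours. It assumes that keep_aspect_ratio_resizer scales the smaller side to min_dimension unless that would push the larger side past max_dimension, in which case the larger side is scaled to max_dimension instead. The function names are hypothetical, and the exact rounding in the API may differ.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Approximate output size of keep_aspect_ratio_resizer (rounding assumed)."""
    scale = min_dimension / min(height, width)
    if max(height, width) * scale > max_dimension:
        # The larger side would overshoot, so fit it to max_dimension instead.
        scale = max_dimension / max(height, width)
    return round(height * scale), round(width * scale)


def fixed_shape_size(height, width, target_height=300, target_width=300):
    """fixed_shape_resizer ignores the input aspect ratio entirely."""
    return target_height, target_width


for input_size in [(480, 640), (500, 1000)]:
    before = keep_aspect_ratio_size(*input_size)
    after = fixed_shape_size(*input_size, target_height=600, target_width=500)
    print(input_size, "keep_aspect_ratio ->", before, "| fixed_shape ->", after)
```

The fixed-shape variant gives every image the same size, which can distort images whose aspect ratio differs from the target.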
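Other examples
We can update or add any parameter using the steps discussed in the previous section. I demonstrate a few frequently used examples here, but the same steps work for any parameter of the model.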
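Changing the weight initializer
- Decide to change the initializer parameter on line 35 of the faster_rcnn_resnet50_pets.config file.
- Search the repository for initializer path:research/object_detection/protos. From the search results, it is clear that we need to analyze the hyperparams.proto file.
- Lines 68–74 of hyperparams.proto explain the initializer configuration: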
    message Initializer {
      oneof initializer_oneof {
        TruncatedNormalInitializer truncated_normal_initializer = 1;
        VarianceScalingInitializer variance_scaling_initializer = 2;
        RandomNormalInitializer random_normal_initializer = 3;
      }
    }
- Instead of truncated_normal_initializer we can use random_normal_initializer. For that, we need to analyze lines 99–102 of hyperparams.proto:
    message RandomNormalInitializer {
      optional float mean = 1 [default = 0.0];
      optional float stddev = 2 [default = 1.0];
    }
- Clearly, random_normal_initializer has two parameters, mean and stddev. We can change the following lines in the config file to use random_normal_initializer:
    #before
    initializer {
      truncated_normal_initializer {
        stddev: 0.01
      }
    }

    #after
    initializer {
      random_normal_initializer {
        mean: 1
        stddev: 0.5
      }
    }
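As a rough illustration of what this swap changes, the sketch below samples initial values with Python's standard library instead of TensorFlow. It assumes the usual definition of a truncated normal in TensorFlow, where samples more than two standard deviations from the mean are redrawn. The function names are hypothetical.

```python
import random


def random_normal(mean, stddev, rng):
    """Plain normal sample: values far from the mean are possible."""
    return rng.gauss(mean, stddev)


def truncated_normal(mean, stddev, rng):
    """Redraw until the sample lies within two standard deviations of the mean."""
    while True:
        sample = rng.gauss(mean, stddev)
        if abs(sample - mean) <= 2 * stddev:
            return sample


rng = random.Random(0)  # fixed seed so repeated runs give the same samples

# "before": truncated_normal_initializer { stddev: 0.01 } (mean taken as 0.0 here)
before = [truncated_normal(0.0, 0.01, rng) for _ in range(10_000)]
# "after": random_normal_initializer { mean: 1 stddev: 0.5 }
after = [random_normal(1.0, 0.5, rng) for _ in range(10_000)]

print("before: largest |w| =", max(abs(w) for w in before))
print("after:  average w   =", sum(after) / len(after))
```

The "after" values are centred around 1 with a much wider spread, so the two settings start training from very different weights. The numbers in the article are meant to show the syntax rather than to recommend values.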
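Changing the weight optimizer
- Decide to change the momentum_optimizer parameter of its parent message optimizer, on line 87 of the faster_rcnn_resnet50_pets.config file.
- Search the repository for optimizer path:research/object_detection/protos. From the search results, it is clear that we need to analyze the optimizer.proto file.
- Lines 9–14 of optimizer.proto explain the optimizer configuration: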
    message Optimizer {
      oneof optimizer {
        RMSPropOptimizer rms_prop_optimizer = 1;
        MomentumOptimizer momentum_optimizer = 2;
        AdamOptimizer adam_optimizer = 3;
      }
- Clearly, instead of momentum_optimizer we can use adam_optimizer, which has proven to be a good optimizer in practice. To do so, we make the following changes in the faster_rcnn_resnet50_pets.config file:
    #before
    optimizer {
      momentum_optimizer: {
        learning_rate: {
          manual_step_learning_rate {
            initial_learning_rate: 0.0003
            schedule {
              step: 900000
              learning_rate: .00003
            }
            schedule {
              step: 1200000
              learning_rate: .000003
            }
          }
        }
        momentum_optimizer_value: 0.9
      }
      use_moving_average: false
    }

    #after
    optimizer {
      adam_optimizer: {
        learning_rate: {
          manual_step_learning_rate {
            initial_learning_rate: 0.0003
            schedule {
              step: 900000
              learning_rate: .00003
            }
            schedule {
              step: 1200000
              learning_rate: .000003
            }
          }
        }
      }
      use_moving_average: false
    }
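Both versions keep the same manual_step_learning_rate block, so it helps to see what that schedule produces. The sketch below is a plain-Python approximation with a hypothetical function name. It assumes each schedule entry takes effect from its step onward, which is an assumption about the boundary behaviour rather than something stated in the config.

```python
import bisect


def manual_step_learning_rate(step, initial_learning_rate, schedule):
    """Return the learning rate in effect at a given training step.

    schedule is a list of (step, learning_rate) pairs in increasing step order.
    """
    boundaries = [boundary for boundary, _ in schedule]
    rates = [initial_learning_rate] + [rate for _, rate in schedule]
    # Count how many boundaries have been reached; that index selects the rate.
    return rates[bisect.bisect_right(boundaries, step)]


# Values copied from the optimizer block of the pets config.
SCHEDULE = [(900000, 0.00003), (1200000, 0.000003)]

for step in (0, 199999, 1000000, 1500000):
    rate = manual_step_learning_rate(step, 0.0003, SCHEDULE)
    print(f"step {step:>7}: learning rate {rate}")
```

With num_steps: 200000 in train_config, training stops long before step 900000, which is why the comment in the config notes that the learning rate never decays.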
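Evaluating a pre-trained model
Eval waits 300 seconds to check whether the training model has been updated. If your GPU is good enough, you can train and evaluate at the same time. Usually, though, resources get exhausted. To overcome this, we can first train the model, save it in a directory, and evaluate it later. To evaluate later, make the following changes in the config file: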
    #before
    eval_config: {
      num_examples: 2000
      # Note: The below line limits the evaluation process to 10 evaluations.
      # Remove the below line to evaluate indefinitely.
      max_evals: 10
    }

    #after
    eval_config: {
      num_examples: 10
      num_visualizations: 10
      eval_interval_secs: 0
    }
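num_visualizations should equal the number of examples to be evaluated. The more visualizations, the longer evaluation takes. If your GPU is capable of training and evaluating at the same time, you can keep eval_interval_secs: 300. This parameter decides how often evaluation runs. I arrived at this by following the three steps discussed above.
In short, knowledge of protocol buffers helped us understand that model parameters are passed as messages, and that to update a parameter we can refer to its .proto file. We discussed three simple steps for finding the right .proto file for the parameter to update. Please mention in the comments any config parameter you would like to update or add.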