2018-08-10 CNN-convolutional


Author: 镜中无我 | Published 2019-02-18 17:15

compared to fully-connected neural networks, convolutional ones perform better in image recognition because of the different connection structure between adjacent layers

structure:

input: the raw pixels of the image, a 3-D matrix
output: the confidence for each class

  • input layer: the pixel matrix, whose depth corresponds to the RGB channels
  • convolutional layer: each node takes a small block of the previous layer as input, which extracts deeper features
  • pooling layer: does not change the depth of the previous layer but shrinks its spatial size
  • fully-connected layer: performs the classification from the extracted features
  • softmax layer: turns the scores into a probability for every class

convolutional layer (filter or kernel)

the processed block has the same spatial size (length and width) as the filter
filter depth = the number of output feature maps produced by the layer
output matrix size:
out_length=ceil((in_length-fil_length+1)/stride_length)   # with 'VALID' padding; with 'SAME' padding: ceil(in_length/stride_length)
out_width=ceil((in_width-fil_width+1)/stride_width)
filter_parameter_amount=fil_length*fil_width*in_depth*fil_depth weights + fil_depth biases
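As a quick worked example (illustrative numbers, not taken from the text above): for a 32*32*3 input, a 5*5 filter with fil_depth=16, stride 1 and 'VALID' padding, out_length = out_width = (32-5+1)/1 = 28, and the layer has 5*5*3*16 = 1200 weights plus 16 biases, a count that is independent of the image size.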

fil_weights=tf.get_variable('weights',[fil_length,fil_width,in_depth,fil_depth],initializer=tf...)
biases=tf.get_variable('biases',[fil_depth],initializer=tf...)
# conv2d implements the forward propagation of the convolutional layer
conv=tf.nn.conv2d(input,fil_weights,strides=[1,len_stride,wid_stride,1],padding='SAME') # 'VALID' means no zero-padding
# bias_add adds the bias to every entry; note: do not add it with a plain +
bias=tf.nn.bias_add(conv,biases)
activation=tf.nn.relu(bias)
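The same forward pass as a minimal runnable sketch (assuming TensorFlow 1.x; the 32*32*3 input, the 5*5*16 filter and the initializers are illustrative choices matching the worked example above):

import tensorflow as tf

# one image of 32*32*3; the first dimension is the batch size
input=tf.placeholder(tf.float32,[1,32,32,3],name='input')
fil_weights=tf.get_variable('conv_weights',[5,5,3,16],initializer=tf.truncated_normal_initializer(stddev=0.1))
biases=tf.get_variable('conv_biases',[16],initializer=tf.constant_initializer(0.0))
conv=tf.nn.conv2d(input,fil_weights,strides=[1,1,1,1],padding='VALID')
activation=tf.nn.relu(tf.nn.bias_add(conv,biases))
print(activation.get_shape())  # (1, 28, 28, 16), i.e. (32-5+1)/1 = 28 in both spatial dimensions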

pooling layer

usage: shrink the spatial size of the feature maps to speed up computation and reduce over-fitting

max pooling

takes the maximum value within each block

# similar to the convolutional op, you have to set the strides and padding; unlike the convolutional filter, which spans the whole input depth, the pooling filter acts on one depth slice at a time and therefore also moves along the depth dimension. ksize is the size of the pooling filter
pool=tf.nn.max_pool(activation,ksize=[1,fil_len,fil_wid,1],strides=[1,len_stride,wid_stride,1],padding='SAME')

average pooling
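Average pooling takes the mean of each block instead of the maximum; a minimal sketch reusing the names from the max-pooling call above (assuming TensorFlow 1.x):

pool=tf.nn.avg_pool(activation,ksize=[1,fil_len,fil_wid,1],strides=[1,len_stride,wid_stride,1],padding='SAME')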

classical models

LeNet-5
  • first layer: convolutional layer
    input: 32*32*1
    filter: 5*5*6, padding='VALID'
    stride: [1,1,1,1]
    output: 28*28*6
  • second layer: pooling layer
    input: output of the first layer
    filter: [1,2,2,1]
    stride: [1,2,2,1]
    output: 14*14*6
  • third layer: convolutional layer
    input: output of the last layer
    filter: 5*5*16, padding='VALID'
    stride: [1,1,1,1]
    output: 10*10*16
  • fourth layer: pooling layer
    input: output of the last layer
    filter: 2*2
    stride: [1,2,2,1]
    output: 5*5*16
  • fifth layer: fully-connected (implemented like a convolutional layer)
    input: output of the last layer
    filter: 5*5
    output: 120
    para.: 5*5*16*120+120
  • sixth layer: fully-connected
    input: output of the last layer
    output: 84
    para.: 120*84+84
  • seventh layer: fully-connected
    input: output of the last layer
    output: 10
    para.: 84*10+10
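A quick sanity check of the figures above (arithmetic only): the fourth layer's 5*5*16 output flattens to 400 nodes, so the fifth layer has 5*5*16*120+120 = 48120 parameters; this flattening is exactly what the nodes variable computes in the code below.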
# placeholder for a batch of images, shape Batch_size * IMAGE_SIZE * IMAGE_SIZE * NUM_CHANNELS
xs=tf.placeholder(tf.float32,[Batch_size,mnist_inference.IMAGE_SIZE,mnist_inference.IMAGE_SIZE,mnist_inference.NUM_CHANNELS],name='x-input')
# the flat MNIST batch is reshaped into this 4-D form before being fed
# (in the full training script the array reshaped here is the numpy batch, not the placeholder)
reshaped_xs=np.reshape(xs,(Batch_size,mnist_inference.IMAGE_SIZE,mnist_inference.IMAGE_SIZE,mnist_inference.NUM_CHANNELS))
def inference(tensor,train,regularizer):
     with tf.variable_scope('layer1-conv1'):
            conv1_weights=tf.get_variable('weights',[CONV1_SIZE,CONV1_SIZE,NUM_CHANNELS,CONV1_DEEP],initializer=...)
            conv1_biases=tf...
            conv1=tf.nn.conv2d(...)
            relu1=tf.nn.relu(tf.nn.bias_add(...))
     with tf.name_scope('layer2-pool1'):
            pool1=tf.nn.max_pool(relu1,ksize=...,strides=...,padding=...)
     with tf.variable_scope('layer3-conv2'):
            ...
     with tf.name_scope(...):
            pool2=...
     # flatten the pooled feature maps to prepare for the fully-connected layer
     pool_shape=pool2.get_shape().as_list()
     nodes=pool_shape[1]*pool_shape[2]*pool_shape[3]
     reshaped=tf.reshape(pool2,[pool_shape[0],nodes])
     with tf.variable_scope('layer5-fc1'):
            fc1_weights=tf.get_variable('weights',[nodes,FC_SIZE],initializer=...)
            # only the fully-connected weights are regularized
            if regularizer!=None:
                tf.add_to_collection('losses',regularizer(fc1_weights))
            fc1_biases=tf.get_variable('bias',[FC_SIZE],initializer=...)
            fc1=tf.nn.relu(...)
            # dropout is only applied during training
            if train:fc1=tf.nn.dropout(fc1,0.5)
     with ...
            ...
            logit=tf.matmul(fc1,fc2_weights)+fc2_biases
     return logit

note: the common pattern is input -> (convolutional layer+ -> pooling layer?)+ -> fully-connected layer+ -> softmax -> output, where + means one or more and ? means optional
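A hedged sketch of the final softmax step in this pattern (assuming TensorFlow 1.x; y_ is a hypothetical placeholder for the integer labels, while xs and regularizer are the ones from the snippet above):

y_=tf.placeholder(tf.int64,[Batch_size],name='y-input')
logits=inference(xs,True,regularizer)
# softmax turns the logits into per-class probabilities; the sparse cross-entropy takes integer labels directly
probabilities=tf.nn.softmax(logits)
cross_entropy=tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_,logits=logits)
loss=tf.reduce_mean(cross_entropy)+tf.add_n(tf.get_collection('losses'))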

Inception-v3

core method
in a convolutional layer, several kernels of different sizes process the same input in parallel, and their outputs are then concatenated along the depth dimension
for that, the stride is set to 1 and the padding to 'SAME', so every path keeps the spatial size of its input

# slim.arg_scope presets the listed parameters (here stride and padding) for these functions
with slim.arg_scope([slim.conv2d,slim.max_pool2d,slim.avg_pool2d],stride=1,padding='SAME'):
       # inception module namespace
       with tf.variable_scope('...'):
              # one variable scope per path
              with tf.variable_scope('...1'):
              with tf.variable_scope('...2'):
              with tf.variable_scope('...3'):
       # concatenate the paths along the depth axis (axis 3)
       net=tf.concat([...1,...2,...3],3)
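A more concrete sketch of one such module (the branch layout and channel numbers are illustrative assumptions, not the exact Inception-v3 configuration; slim refers to tf.contrib.slim in TensorFlow 1.x):

import tensorflow as tf
slim=tf.contrib.slim

def inception_module(net,scope='Mixed_example'):
    # preset stride and padding so every branch keeps the spatial size of net
    with slim.arg_scope([slim.conv2d,slim.max_pool2d,slim.avg_pool2d],stride=1,padding='SAME'):
        with tf.variable_scope(scope):
            with tf.variable_scope('Branch_0'):
                branch_0=slim.conv2d(net,64,[1,1],scope='Conv2d_1x1')
            with tf.variable_scope('Branch_1'):
                branch_1=slim.conv2d(net,48,[1,1],scope='Conv2d_1x1')
                branch_1=slim.conv2d(branch_1,64,[5,5],scope='Conv2d_5x5')
            with tf.variable_scope('Branch_2'):
                branch_2=slim.avg_pool2d(net,[3,3],scope='AvgPool_3x3')
                branch_2=slim.conv2d(branch_2,32,[1,1],scope='Conv2d_1x1')
            # concatenate the branches along the depth (channel) axis
            return tf.concat([branch_0,branch_1,branch_2],3)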
