NLP: Building a Neural Network with TensorFlow, Embedding Layer Details

xiaoxiao, 2021-02-28

Background

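This post uses a CNN for text classification (spam classification). The code comes from https://github.com/dennybritz/cnn-text-classification-tf; there are many similar implementations on GitHub that you can look up yourself. The post mainly records how the neural network is built with TensorFlow for a text classification problem.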

1. Initialization

    def __init__(self, sequence_length, num_classes, vocab_size,
                 embedding_size, filter_sizes, num_filters, l2_reg_lambda=0.01):
        # The arguments above are the hyperparameters the model needs.

        # Placeholders: nodes in the computation graph that hold no value until fed at run time
        self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
        self.input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
        self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")

        # Keeping track of l2 regularization loss (optional)
        l2_loss = tf.constant(0.0)  # initialize the L2 regularization term to zero
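To make the placeholder shapes concrete, here is a minimal NumPy sketch of the arrays that could be fed into input_x and input_y. The sizes, the example sentences, and the use of 0 as the padding ID are assumptions made for illustration; they are not taken from the original code.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
sequence_length = 6
num_classes = 2

# Two tokenized sentences as vocabulary IDs; 0 is assumed to be the padding ID.
sentences = [[4, 9, 2], [7, 1, 3, 5, 8]]
labels = [0, 1]

# input_x: pad every sentence to sequence_length -> shape [batch, sequence_length], int32.
batch_x = np.zeros((len(sentences), sequence_length), dtype=np.int32)
for row, ids in enumerate(sentences):
    batch_x[row, :len(ids)] = ids

# input_y: one-hot labels -> shape [batch, num_classes], float32.
batch_y = np.eye(num_classes, dtype=np.float32)[labels]

print(batch_x.shape, batch_x.dtype)
print(batch_y)
```

Arrays like these would then be passed through feed_dict together with a scalar for dropout_keep_prob; the leading None in each placeholder shape is what lets the batch size vary between calls.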

Note: tf.constant()

tf.constant(value, dtype=None, shape=None, name='Const') creates a constant tensor in TensorFlow. Use shape to set its shape and name to give it a name.

    import tensorflow as tf

    a = tf.constant([1, 2, 3, 4, 5, 6, 7])
    b = tf.constant(2, shape=[2, 3], dtype=tf.int64)    # a scalar is broadcast to the whole shape
    c = tf.constant(1.0, dtype=tf.float32, name='ccc')
    # TF 1.x behavior: a value list shorter than the shape is padded with its last element
    d = tf.constant([2, 3], shape=[3, 4])

    with tf.Session() as sess:
        print(sess.run(a))
        print(sess.run(b))
        print(sess.run(c))
        print(sess.run(d))

    # Output:
    # [1 2 3 4 5 6 7]
    # [[2 2 2]
    #  [2 2 2]]
    # 1.0
    # [[2 3 3 3]
    #  [3 3 3 3]
    #  [3 3 3 3]]

2. Embedding layer

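    # Embedding layer
    # Pin these ops to the CPU and group them under the name scope "embedding"
    with tf.device('/cpu:0'), tf.name_scope("embedding"):
        # A vocab_size x embedding_size matrix, initialized uniformly in [-1, 1);
        # it is trained and ends up holding the word vectors
        self.W = tf.Variable(
            tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="W")
        # Look up the word vector for every ID in input_x:
        # shape [batch, sequence_length, embedding_size]
        self.embedded_chars = tf.nn.embedding_lookup(self.W, self.input_x)
        # Append a channel dimension of size 1, giving the 4-D input
        # [batch, sequence_length, embedding_size, 1] that conv2d expects
        self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)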
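The shapes flowing through the embedding layer can be checked without TensorFlow. The following is a rough NumPy sketch, not the original model code: the sizes are hypothetical, and plain array indexing stands in for tf.nn.embedding_lookup on a single, unsharded embedding matrix.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
vocab_size, embedding_size = 10, 4
batch_size, sequence_length = 2, 5

rng = np.random.default_rng(0)

# Stand-in for self.W: one row per vocabulary entry, values in [-1, 1).
W = rng.uniform(-1.0, 1.0, size=(vocab_size, embedding_size))

# Stand-in for a fed input_x batch: integer word IDs.
input_x = rng.integers(0, vocab_size, size=(batch_size, sequence_length))

# Row lookup: each ID is replaced by its embedding vector.
embedded_chars = W[input_x]

# Trailing channel dimension, as tf.expand_dims(..., -1) would add.
embedded_chars_expanded = np.expand_dims(embedded_chars, -1)

print(embedded_chars.shape)           # (2, 5, 4)
print(embedded_chars_expanded.shape)  # (2, 5, 4, 1)
```

Each sentence becomes a sequence_length x embedding_size "image" with one channel, which is the layout a 2-D convolution takes as input.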

Notes: tf.random_uniform(), tf.nn.embedding_lookup(), tf.expand_dims()

tf.random_uniform(shape, minval=0, maxval=None, dtype=tf.float32, seed=None, name=None): returns a tensor of the given shape whose elements are drawn from a uniform distribution over [minval, maxval).

    import tensorflow as tf

    a = tf.random_uniform([3, 2], -1, 1)

    with tf.Session() as sess:
        print(sess.run(a))

    # Example output (values differ on every run):
    # [[-0.01047087  0.12174773]
    #  [ 0.3610773  -0.50015736]
    #  [-0.23407054 -0.6105659 ]]

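tf.nn.embedding_lookup(params, ids, partition_strategy='mod', name=None, validate_indices=True, max_norm=None): params is the embedding tensor (or a list of tensors when the embedding is split into partitions), ids holds the IDs to look up in params, and partition_strategy decides how IDs are assigned to partitions. With partition_strategy='mod', each ID goes to partition p = id % len(params); with partition_strategy='div', IDs are assigned to partitions in contiguous ranges. The default is 'mod'.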
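The difference between the two strategies is easiest to see by listing which IDs land in which partition. The sketch below is plain Python written for illustration; the helper names partition_mod and partition_div are hypothetical and are not part of TensorFlow. For 'div' it assumes the documented rule that, when the IDs do not divide evenly, the first partitions each receive one extra ID.

```python
def partition_mod(num_ids, num_parts):
    """Assign each id to partition id % num_parts (the 'mod' strategy)."""
    parts = [[] for _ in range(num_parts)]
    for i in range(num_ids):
        parts[i % num_parts].append(i)
    return parts


def partition_div(num_ids, num_parts):
    """Assign ids to partitions in contiguous ranges (the 'div' strategy)."""
    base, extra = divmod(num_ids, num_parts)
    parts, start = [], 0
    for p in range(num_parts):
        # The first `extra` partitions each hold one more id than the rest.
        size = base + (1 if p < extra else 0)
        parts.append(list(range(start, start + size)))
        start += size
    return parts


# 13 ids spread over 5 partitions.
print(partition_mod(13, 5))
# [[0, 5, 10], [1, 6, 11], [2, 7, 12], [3, 8], [4, 9]]
print(partition_div(13, 5))
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10], [11, 12]]
```

When params is a single tensor, as in the embedding layer of this post, there is only one partition and the strategy has no effect.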

A simple example:

    import tensorflow as tf
    import numpy as np

    a = np.random.random([4, 2])
    print(a)
    b = tf.nn.embedding_lookup(a, [0, 2])  # pick rows 0 and 2 of a

    with tf.Session() as sess:
        print(sess.run(b))

    # Example output (values differ on every run):
    # [[0.69736108 0.11053311]
    #  [0.02707362 0.33784743]
    #  [0.43863287 0.28340953]
    #  [0.44733158 0.42539279]]
    # ---
    # [[0.69736108 0.11053311]
    #  [0.43863287 0.28340953]]

Feeding the IDs through a session:

    import tensorflow as tf
    import numpy as np

    a = np.random.random([4, 2])
    input_ids = tf.placeholder(dtype=tf.int32)
    b = tf.nn.embedding_lookup(a, input_ids)

    with tf.Session() as sess:
        print(sess.run(b, feed_dict={input_ids: [0, 2]}))

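tf.expand_dims(a, axis): inserts a dimension of size 1. a is the array to expand and axis is the position at which the new dimension is inserted. The examples below use np.expand_dims, which follows the same axis convention.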

For a 1-D array:

    import numpy as np

    a = np.array([1, 2, 3])
    b = np.expand_dims(a, 0)
    c = np.expand_dims(a, 1)   # for a 1-D array, same result as axis=-1
    d = np.expand_dims(a, -1)  # -1 means the last position

    print(a)
    print(a.shape)
    print(b)
    print(b.shape)
    print(c)
    print(c.shape)
    print(d)
    print(d.shape)

    # Output:
    # [1 2 3]
    # (3,)
    # [[1 2 3]]
    # (1, 3)
    # [[1]
    #  [2]
    #  [3]]
    # (3, 1)
    # [[1]
    #  [2]
    #  [3]]
    # (3, 1)

For a 2-D array:

    import numpy as np

    a = np.array([[1, 2, 3], [4, 5, 6]])
    b = np.expand_dims(a, 0)
    c = np.expand_dims(a, 1)
    d = np.expand_dims(a, -1)

    print(a)
    print(a.shape)
    print(b)
    print(b.shape)
    print(c)
    print(c.shape)
    print(d)
    print(d.shape)

    # Output:
    # [[1 2 3]
    #  [4 5 6]]
    # (2, 3)
    # [[[1 2 3]
    #   [4 5 6]]]
    # (1, 2, 3)
    # [[[1 2 3]]
    #
    #  [[4 5 6]]]
    # (2, 1, 3)
    # [[[1]
    #   [2]
    #   [3]]
    #
    #  [[4]
    #   [5]
    #   [6]]]
    # (2, 3, 1)

Next post: NLP: Building a Neural Network with TensorFlow, Convolution and Pooling Layer Details

When reposting, please credit the original URL: https://www.6miu.com/read-2613239.html
