textscan in MATLAB

xiaoxiao2021-02-28 104

MATLAB is so slick it puts C to shame. Original post: "textscan in MATLAB". Author: 研究僧一群

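Now for the textscan function. It works much like textread, but it accepts more parameters, which gives it several advantages.

Here are the main differences (from Baidu Zhidao):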

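textscan is better suited to reading large files;
textscan can start reading from any position in a file, whereas textread can only start from the beginning;
textscan can also resume from where the previous textscan call stopped, whereas textread cannot;
textscan returns a single array (strictly speaking, a cell array), whereas textread returns several arrays;
textscan offers more options for converting the data it reads;
textscan gives the user more configuration parameters.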
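The third point, resuming from where the previous call stopped, can be sketched as follows. This is a minimal illustration rather than anything from the original post: the temporary file and its contents are assumptions made up for the demo.

```matlab
% Hypothetical demo file: the name and contents are assumptions for illustration.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3\n4 5 6\n7 8 9\n');
fclose(fid);

fid = fopen(fname, 'r');
first = textscan(fid, '%f %f %f', 1);   % apply the format once: first row only
posAfterFirst = ftell(fid);             % the file position has moved forward
rest = textscan(fid, '%f %f %f');       % continues from where the first call stopped
fclose(fid);
delete(fname);

firstRow = cell2mat(first);             % the first row of the file
restRows = cell2mat(rest);              % the remaining two rows
```

Because the file identifier keeps its position between calls, the second call never sees the first row again.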

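Ahem. What follows is my own material. The original version comes from the MATLAB help files, but since there is no complete Chinese translation yet, I translated the examples and ran them myself, and added more detailed explanations and my own understanding. So please cite the source when reposting: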

http://blog.sina.com.cn/s/blog_9e67285801010buf.html

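The textscan function

Purpose: read formatted data from a text file or a string. It is well suited to text data files that contain several lines of descriptive text.
........................................................................
Basic usage:
C = textscan(fid, 'format')
C = textscan(fid, 'format', N)
C = textscan(fid, 'format', 'param', value)
C = textscan(fid, 'format', N, 'param', value)
C = textscan(str, ...)
[C, position] = textscan(...)
........................................................................
Input arguments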

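fid is the file identifier returned by fopen. This is the biggest difference from textread. Note that fid behaves like a pointer: the position it refers to changes as textscan reads. See Example 9.

format is a string that specifies how the data is read and converted. See format.txt for details.

N is the number of times the format is applied, usually the number of lines to read.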

Output arguments

A cell array C is returned.
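The N argument and the second output, position, can be tried on a string without any file. The sketch below is an illustration under assumed inputs (the string str is made up): the format is applied twice, and the remainder is then scanned starting from the returned position.

```matlab
% Assumed sample string, not from the original post.
str = '1 2 3 4 5 6';

% Apply the two-field format exactly twice; pos is the number of characters read.
[C, pos] = textscan(str, '%f %f', 2);
firstPairs = cell2mat(C);               % one row per application of the format

% Resume from where the first call stopped by scanning the rest of the string.
rest = textscan(str(pos+1:end), '%f %f');
restPair = cell2mat(rest);
```

For a file, the same resumption happens automatically through fid. For a string, the position output has to be used explicitly as shown.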

The syntax is largely the same as textread.
.............................................................................
Example 1: the file mydata1.txt contains:

Sally Level1 12.34 45 1.23e10 inf Nan Yes 5.1+3i
Joe Level2 23.54 60 9e19 -inf 0.001 No 2.2-.5i
Bill Level3 34.90 12 2e5 10 100 No 3.1+.1i

Read it in:

>> fid = fopen('mydata1.txt');
>> C = textscan(fid, '%s%s%f32%d8%u%f%f%s%f');
>> fclose(fid);
>> C

C =

Columns 1 through 5

{3x1 cell} {3x1 cell} [3x1 single] [3x1 int8] [3x1 uint32]

Columns 6 through 9

[3x1 double] [3x1 double] {3x1 cell} [3x1 double]

The output C is a 1x9 cell array. Each cell holds the data of one column.
..................................................................
Example 2: a field width can be set by inserting a number N between the % and the conversion character. N is the number of characters to read for that field (with a caveat, noted after the examples).

For example:

>> dd = 'let us go';
>> ddd = textscan(dd, '%5c')

ddd =

'let u'

For myfileli6.txt:

Sally Type1 12.34 45 Yes
Joe Type2 23.54 60 No
Bill Type1 34.90 12 No

>> fid = fopen('myfileli6.txt');
>> data = textscan(fid, '%s%s%f%f%s', 3); % reads the data normally
>> fclose(fid);

Note: when a field width truncates a value, the part that was cut off must be skipped explicitly. Otherwise the leftover digit is read as a separate value.

For example:

>> str = '0.41 8.24 3.57 6.24 9.27';
>> C = textscan(str, '%3.1f ');

C{1} then comes out as:

0.400000000000000
1
8.20000000000000
4
3.50000000000000
7
6.20000000000000
4
9.20000000000000
7

whereas

>> C = textscan(str, '%3.1f %*1d');
>> C{1}

ans =

0.4000
8.2000
3.5000
6.2000
9.2000
......................................................
Example 3: reading data of different types

scan1.txt contains:

09/12/2005 Level1 12.34 45 1.23e10 inf Nan Yes 5.1+3i
10/12/2005 Level2 23.54 60 9e19 -inf 0.001 No 2.2-.5i
11/12/2005 Level3 34.90 12 2e5 10 100 No 3.1+.1i

Now read scan1.txt:

>> fid = fopen('scan1.txt');
>> C = textscan(fid, '%s %s %f32 %d8 %u %f %f %s %f');
>> fclose(fid);

The output C is a 1x9 cell array:

C{1} = {'09/12/2005'; '10/12/2005'; '11/12/2005'} class cell
C{2} = {'Level1'; 'Level2'; 'Level3'} class cell
C{3} = [12.34; 23.54; 34.9] class single
C{4} = [45; 60; 12] class int8
C{5} = [4294967295; 4294967295; 200000] class uint32
C{6} = [Inf; -Inf; 10] class double
C{7} = [NaN; 0.001; 100] class double
C{8} = {'Yes'; 'No'; 'No'} class cell
C{9} = [5.1+3.0i; 2.2-0.5i; 3.1+0.1i] class double

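The value 4294967295 in C{5} is 2^32-1, the largest value a 32-bit unsigned integer (uint32) can hold. The inputs 1.23e10 and 9e19 exceed it, so they saturate at that maximum.

The contents of C{1} can also be read as separate fields:

>> fid = fopen('scan1.txt');
>> C = textscan(fid, '%f/%f/%f %s %f32 %d8 %u %f %f %s %f');
>> fclose(fid);
>> C{1}'

ans =

9 10 11

>> C{2}'

ans =

12 12 12

>> C{3}'

ans =

2005 2005 2005

The remaining columns are the same as before.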
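The saturated values in C{5} of Example 3 can be reproduced without textscan. This small sketch, added for illustration, converts the three values from the fifth column of scan1.txt to uint32 directly. The variable names are made up.

```matlab
% The fifth-column values from scan1.txt, as doubles.
vals = [1.23e10; 9e19; 2e5];

% Converting to uint32 saturates anything above the type's maximum.
asUint32 = uint32(vals);
maxU32 = intmax('uint32');              % 2^32 - 1
```

The first two entries of asUint32 equal maxU32, while 2e5 fits and is stored unchanged.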

............................................................
Example 4: removing a literal string

For the scan1.txt above, to ignore the literal text Level and read only the number that follows it:

>> fid = fopen('scan1.txt');
>> C = textscan(fid, '%s Level%u8 %f32 %d8 %u %f %f %s %f');
>> fclose(fid);
>> C{2}'

ans =

1 2 3
....................................................................

Example 5: reading a single column

>> fid = fopen('scan1.txt');
>> dates = textscan(fid, '%s %*[^\n]');
>> fclose(fid);
>> dates{1}

ans =

'09/12/2005'
'10/12/2005'
'11/12/2005'

dates is a 1x1 cell array.

%[^\n] reads everything up to the end of the line.

For example:

>> fid = fopen('scan1.txt');
>> dates = textscan(fid, '%s %[^\n]');
>> fclose(fid);
>> dates{1}'

ans =

'09/12/2005' '10/12/2005' '11/12/2005'

>> dates{2}

ans =

'Level1 12.34 45 1.23e10 inf Nan Yes 5.1+3i'
'Level2 23.54 60 9e19 -inf 0.001 No 2.2-.5i'
'Level3 34.90 12 2e5 10 100 No 3.1+.1i'
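The contrast between keeping and skipping the rest of each line can be checked on a short in-memory string. This is a hedged sketch with made-up input (the string str is an assumption), not part of the original examples.

```matlab
% Assumed two-line sample, built with sprintf so that \n is a real newline.
str = sprintf('a 1 2\nb 3 4');

% %[^\n] keeps the remainder of each line as a second output column.
kept = textscan(str, '%s %[^\n]');

% %*[^\n] matches the same text but discards it, so only one column is returned.
skipped = textscan(str, '%s %*[^\n]');
```

Here kept has two cells and skipped has one, which matches the 1x1 result seen for dates.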

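%*[^\n] skips straight from the current position to the end of the line. The * is a skip flag: the field is matched but not returned.
............................................................................
Example 6: handling empty fields with the Delimiter and EmptyValue parameters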

For exm5.txt:

1, 2, 3, 4, , 6
7, 8, 9, , 11, 12

Read the data, replacing empty fields with -Inf:

>> fid = fopen('exm5.txt');
>> C = textscan(fid, '%f %f %f %f %f %f', 'delimiter', ',', 'EmptyValue', -Inf);
>> fclose(fid);
>> data = cell2mat(C)

data =

1 2 3 4 -Inf 6
7 8 9 -Inf 11 12
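For comparison, the sketch below runs the same kind of read on an in-memory string, once with the default handling of empty numeric fields (NaN) and once with EmptyValue set to -Inf. The sample string is an assumption made up for this illustration.

```matlab
% Assumed sample data with one empty field per row.
str = sprintf('1, 2, , 4\n5, , 7, 8');

% Default behaviour: empty numeric fields are returned as NaN.
dataDefault = cell2mat(textscan(str, '%f %f %f %f', 'Delimiter', ','));

% With EmptyValue, the same fields are filled with the chosen value instead.
dataInf = cell2mat(textscan(str, '%f %f %f %f', ...
    'Delimiter', ',', 'EmptyValue', -Inf));
```

Choosing a sentinel such as -Inf makes the filled-in entries easy to tell apart from NaN values that were present in the data itself.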

...............................................................................

Example 7: skipping comments and treating selected inputs as empty

The file exm6.txt contains:

abc, 2, NA, 3, 4
// Comment Here
def, na, 5, 6, 7

Here we want to skip the comment on the second line and turn the NA and na entries into NaN:

>> fid = fopen('exm6.txt');
>> C = textscan(fid, '%s %n %n %n %n', 'delimiter', ',', 'treatAsEmpty', {'NA', 'na'}, 'commentStyle', '//');
>> fclose(fid);

>> C{1}

ans =

'abc'
'def'

>> C{2:5}

ans =

2
NaN

ans =

NaN
5

ans =

3
6

ans =

4
7
.................................................................................

Example 8: handling repeated delimiters by treating them as a single delimiter

The file exm8.txt contains:

1,2,3,,4
5,6,7,,8

To merge consecutive delimiters into one, set the MultipleDelimsAsOne parameter to 1 (the name reads as "multiple delimiters as one"):

>> clear
>> fid = fopen('exm8.txt');
>> C = textscan(fid, '%f %f %f %f', 'delimiter', ',', 'MultipleDelimsAsOne', 1);
>> fclose(fid);
>> data = cell2mat(C)

data =

1 2 3 4
5 6 7 8
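To see what the option changes, the sketch below reads the same two rows from an in-memory string in two ways. Without the option, the doubled comma marks an empty fifth field, so a five-field format is used. With the option, four fields are enough. The variable names are illustrative assumptions.

```matlab
% The two rows of exm8.txt as an in-memory string.
str = sprintf('1,2,3,,4\n5,6,7,,8');

% Without the option: ",," encloses an empty field, read as NaN by default.
withEmpty = cell2mat(textscan(str, '%f %f %f %f %f', 'Delimiter', ','));

% With the option: consecutive commas count as one delimiter.
merged = cell2mat(textscan(str, '%f %f %f %f', ...
    'Delimiter', ',', 'MultipleDelimsAsOne', 1));
```

Which form is appropriate depends on whether a doubled delimiter in the data means "missing value" or is just formatting noise.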

..........................................................................

Example 9: using the CollectOutput switch

The default value of the CollectOutput switch is 0 (false), in which case textscan puts the data of each column into its own cell.

>> clear
>> fid = fopen('grades.txt');
>> C_text = textscan(fid, '%s', 4, 'delimiter', '|'); % read the column headers
>> C_data0 = textscan(fid, '%d %f %f %f') % read the numeric data; fid now points at the second line

C_data0 =

[4x1 int32] [4x1 double] [4x1 double] [4x1 double]

>> C_data0{1:4}

ans =

1
2
3
4

ans =

91.5000
88.0000
76.3000
96.4000

ans =

89.2000
67.8000
78.1000
81.2000

ans =

77.3000
91.0000
92.5000
84.6000

Now set the CollectOutput switch to 1, so that adjacent columns of the same data type are collected into a single cell.

>> frewind(fid); % move fid back to the start of the file
>> C_text = textscan(fid, '%s', 4, 'delimiter', '|');
>> C_data1 = textscan(fid, '%d %f %f %f', 'CollectOutput', 1)

C_data1 =

[4x1 int32] [4x3 double]

>> C_data1{1}

ans =

1
2
3
4

>> C_data1{2}

ans =

91.5000 89.2000 77.3000
88.0000 67.8000 91.0000
76.3000 78.1000 92.5000
96.4000 81.2000 84.6000

>> fclose(fid);
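The effect of CollectOutput can also be checked without grades.txt. The sketch below is a minimal illustration on an assumed in-memory string holding the first two data rows shown above: the three adjacent %f columns end up in one matrix, while the %d column stays in its own cell.

```matlab
% Assumed sample: the first two data rows from the output above, without a header.
str = sprintf('1 91.5 89.2 77.3\n2 88.0 67.8 91.0');

% Default: one cell per column.
separate = textscan(str, '%d %f %f %f');

% CollectOutput: adjacent columns of the same class are concatenated.
collected = textscan(str, '%d %f %f %f', 'CollectOutput', 1);
scores = collected{2};                  % 2x3 double matrix of the three score columns
```

This is often convenient when the numeric block is passed on as a matrix, since it avoids a separate cell2mat call on same-typed columns.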

......................................................................................

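There are also two operations on strings. They are not used very often, so I have left them out here. Feel free to add them in the comments.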

When reposting, please cite the original address: https://www.6miu.com/read-32606.html
