
uniq

Compared to sort, the uniq program is a lightweight. uniq performs a seemingly trivial task. When given a sorted file (or standard input), it removes any duplicate lines and sends the results to standard output. It is often used in conjunction with sort to clean the output of duplicates.



Tip: While uniq is a traditional Unix tool often used with sort, the GNU version of sort supports a -u option, which removes duplicates from the sorted output.
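As a quick check of this tip, the sketch below runs both approaches on a small sample and compares the results. The sample lines and variable names are illustrative only.

```shell
# Remove duplicates with the traditional sort | uniq pipeline.
with_uniq=$(printf '%s\n' a b c a b c | sort | uniq)

# Do the same with sort's own -u option.
with_sort_u=$(printf '%s\n' a b c a b c | sort -u)

# Report whether the two results match.
if [ "$with_uniq" = "$with_sort_u" ]; then
    echo "same output"
else
    echo "different output"
fi
```

For plain duplicate removal the two forms are interchangeable; uniq becomes necessary when one of its own options, such as -c or -d, is needed.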



Let’s make a text file to try this out:



[me@linuxbox ~]$ cat > foo.txt
a
b
c
a
b
c


Remember to type Ctrl-d to terminate standard input. Now, if we run uniq on our text file:



[me@linuxbox ~]$ uniq foo.txt
a
b
c
a
b
c


the results are no different from our original file; the duplicates were not removed. For uniq to do its job, the input must be sorted first:


[me@linuxbox ~]$ sort foo.txt | uniq
a
b
c


This is because uniq only removes duplicate lines which are adjacent to each other.
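The adjacency rule can be seen without sort at all. In this minimal sketch (the input lines and variable names are made up for illustration), the same three values are fed to uniq twice, once grouped and once interleaved:

```shell
# Adjacent duplicates are collapsed even though sort is never run.
adjacent=$(printf '%s\n' a a b b c c | uniq)

# The same values interleaved pass through unchanged,
# because no line is identical to the one directly before it.
interleaved=$(printf '%s\n' a b c a b c | uniq)

printf 'adjacent input:\n%s\n' "$adjacent"
printf 'interleaved input:\n%s\n' "$interleaved"
```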

uniq has several options. Here are the common ones:


Table 20-2: Common uniq Options

Option  Description
-c      Output each line preceded by the number of times it occurs.
-d      Only output repeated lines, rather than unique lines.
-f n    Ignore n leading fields in each line. Fields are separated by whitespace as they are in sort; however, unlike sort, uniq has no option for setting an alternate field separator.
-i      Ignore case during the line comparisons.
-s n    Skip (ignore) the leading n characters of each line.
-u      Only output unique lines. Lines with duplicates are ignored.
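To make the remaining options concrete, here is a small sketch that exercises -d, -u, -i, -f, and -s. All of the sample lines and variable names are hypothetical and are not taken from foo.txt.

```shell
# -d: print one copy of each line that is repeated.
repeated=$(printf '%s\n' a b c a b | sort | uniq -d)

# -u: print only the lines that are never repeated.
singles=$(printf '%s\n' a b c a b | sort | uniq -u)

# -i: compare without regard to case; the first line of each group is kept.
nocase=$(printf '%s\n' Apple apple APPLE banana | uniq -i)

# -f 1: skip the first whitespace-separated field before comparing.
skip_field=$(printf '%s\n' '1 apple' '2 apple' '3 pear' | uniq -f 1)

# -s 2: skip the first two characters before comparing.
skip_chars=$(printf '%s\n' 'x-one' 'y-one' 'z-two' | uniq -s 2)

# Show each result under a label.
printf -- '-d:\n%s\n' "$repeated"
printf -- '-u:\n%s\n' "$singles"
printf -- '-i:\n%s\n' "$nocase"
printf -- '-f 1:\n%s\n' "$skip_field"
printf -- '-s 2:\n%s\n' "$skip_chars"
```

With -f and -s, the skipped portion is only ignored during comparison; the line that is printed is still the complete first line of each group.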


Here we see uniq used to report the number of duplicates found in our text file, using the -c option:


[me@linuxbox ~]$ sort foo.txt | uniq -c
      2 a
      2 b
      2 c
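A common use of -c is to rank lines by how often they occur, by passing the counts through a second, numeric sort. The following is a sketch under assumed sample data; the input lines and the variable name are illustrative only.

```shell
# Count each distinct line, then sort the counts numerically, largest first.
counts=$(printf '%s\n' b a c a b a | sort | uniq -c | sort -nr)

# Each output line is a count followed by the text it belongs to.
printf '%s\n' "$counts"
```

The width of the leading padding in front of each count can differ between uniq implementations, so scripts that parse this output should split on whitespace rather than rely on fixed columns.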

