Sorting a file in unix

 

The sort command in Unix and Linux orders the lines of a text file. It can compare lines as strings or as numeric values, using either the whole line or selected fields as the sort key.

The syntax of the sort command is:


sort [options] filename


The options are:


-b : Ignores leading blanks when comparing lines or sort keys

-d : Uses dictionary sort order. Considers only blanks and alphanumeric characters in sorting

-f : Uses case insensitive sorting.

-M : Sorts by month name. Compares the first 3 letters of the key, ignoring case, against the month abbreviations. Eg: JAN, FEB

-n : Uses numeric sorting

-R : Shuffles the lines of the input file into random order. In GNU sort, lines with identical keys still end up next to each other.

-r : Reverse order sorting

-k : Sorts the file on a key defined by field positions. Eg: -k2 or -k2,4

-u : Suppresses duplicate lines. Only the first of each run of equal lines is printed

-t : Sets the input field separator character


Sort Command Examples:

Before trying the examples, create the following two files on your Unix system:


> cat order.txt

Unix distributed 05 server

Linux virtual 3 server

Unix distributed 05 server

Distributed processing 6 system

 

> cat delim_sort.txt

Mayday|4

Janmon|1

Declast|12
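
If you prefer to create both files in one step, the following sketch writes them with here-documents. The scratch directory made by mktemp and the variable names are assumptions for illustration and are not part of the original setup.

```shell
# Create a scratch directory for the sample files (location is an assumption).
workdir=$(mktemp -d)

# order.txt: four space-separated lines, one of them duplicated.
cat > "$workdir/order.txt" <<'EOF'
Unix distributed 05 server
Linux virtual 3 server
Unix distributed 05 server
Distributed processing 6 system
EOF

# delim_sort.txt: two fields separated by a pipe character.
cat > "$workdir/delim_sort.txt" <<'EOF'
Mayday|4
Janmon|1
Declast|12
EOF

# Count the lines in each file as a quick sanity check.
order_lines=$(wc -l < "$workdir/order.txt" | tr -d ' ')
delim_lines=$(wc -l < "$workdir/delim_sort.txt" | tr -d ' ')
echo "order.txt: $order_lines lines, delim_sort.txt: $delim_lines lines"
```

Change into "$workdir" afterwards so that the relative file names used in the examples resolve.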


1. Sorting lines of text

By default, the sort command treats each line as a string and sorts the lines in alphabetical order. The exact order follows the collation rules of the current locale. Setting LC_ALL=C gives plain ASCII (byte value) order.


> sort order.txt

Distributed processing 6 system

Linux virtual 3 server

Unix distributed 05 server

Unix distributed 05 server
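
The difference between byte order and locale order is easiest to see with mixed-case input. The sketch below uses a small hypothetical file, words.txt, which is not one of the sample files above.

```shell
# Hypothetical mixed-case input, created in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'unix' 'Linux' 'Unix' 'linux' > "$workdir/words.txt"

# LC_ALL=C forces comparison by byte value, so every uppercase
# letter sorts before every lowercase letter.
byte_order=$(LC_ALL=C sort "$workdir/words.txt")
echo "$byte_order"
# Prints Linux, Unix, linux, unix (one per line).

# Without LC_ALL=C the result depends on the current locale
# and may place upper and lower case next to each other.
sort "$workdir/words.txt"
```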


2. Sorting based on the field positions.

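You can specify the field positions using the -k option of the sort command. By default, the sort command separates fields at runs of blanks (spaces or tabs). With -k2 the sort key starts at the second field and runs to the end of the line. To sort based on the data from the second field onward, run the below command: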


> sort -k2 order.txt

Unix distributed 05 server

Unix distributed 05 server

Distributed processing 6 system

Linux virtual 3 server


You can also give an end position for the key after a comma. The below command sorts on a single key that starts at the second field and ends at the fourth field. To sort on several separate fields, repeat the -k option, for example -k2,2 -k4,4.


> sort -k2,4 order.txt
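
When two separate fields should act as primary and secondary keys, the -k option can be repeated, and each key can carry its own modifiers such as n and r. The following is a minimal sketch, assuming a scratch copy of order.txt and LC_ALL=C so that the result does not depend on the locale.

```shell
# Recreate order.txt in a scratch directory so the example is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/order.txt" <<'EOF'
Unix distributed 05 server
Linux virtual 3 server
Unix distributed 05 server
Distributed processing 6 system
EOF

# Primary key: field 4 only, compared as text.
# Secondary key: field 3 only, compared numerically (n) in reverse (r).
two_keys=$(LC_ALL=C sort -k4,4 -k3,3nr "$workdir/order.txt")
echo "$two_keys"
# Prints:
# Unix distributed 05 server
# Unix distributed 05 server
# Linux virtual 3 server
# Distributed processing 6 system
```

The three "server" lines come first, ordered by descending number, and the "system" line follows.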


3. Numeric sorting

Instead of the default alphabetical order, you can make the sort command sort in numeric order using the -n option. The below command sorts on the numeric value at the start of the third field:


> sort -nk3 order.txt

Linux virtual 3 server

Unix distributed 05 server

Unix distributed 05 server

Distributed processing 6 system
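
To see why -n matters here, compare a text sort and a numeric sort on the third field only. This is a small sketch that assumes a scratch copy of order.txt. It prints just the third field of each sorted line so the two orders are easy to compare.

```shell
# Recreate order.txt in a scratch directory so the example is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/order.txt" <<'EOF'
Unix distributed 05 server
Linux virtual 3 server
Unix distributed 05 server
Distributed processing 6 system
EOF

# Text comparison on field 3: "05" sorts before "3" because '0' < '3'.
text_order=$(LC_ALL=C sort -k3,3 "$workdir/order.txt" | awk '{print $3}')

# Numeric comparison on field 3: 3 sorts before 05 because 3 < 5.
numeric_order=$(LC_ALL=C sort -k3,3n "$workdir/order.txt" | awk '{print $3}')

echo "text:    $(echo $text_order)"     # text:    05 05 3 6
echo "numeric: $(echo $numeric_order)"  # numeric: 3 05 05 6
```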


4. Sort in reverse order

By default, the sort command sorts the data in ascending order. You can change this to descending order using the -r option.


> sort -nrk3 order.txt

Distributed processing 6 system

Unix distributed 05 server

Unix distributed 05 server

Linux virtual 3 server


5. Suppressing duplicates (printing only unique lines)

You can produce only unique lines in the output using the -u option of the sort command.


> sort -u order.txt

Distributed processing 6 system

Linux virtual 3 server

Unix distributed 05 server


Another way is to pipe the output of the sort command to the uniq command, which removes adjacent duplicate lines.


> sort order.txt | uniq
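
One reason to prefer the pipe over sort -u is that uniq can also report how often each line occurs, using its -c option. A minimal sketch, again assuming a scratch copy of order.txt:

```shell
# Recreate order.txt in a scratch directory so the example is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/order.txt" <<'EOF'
Unix distributed 05 server
Linux virtual 3 server
Unix distributed 05 server
Distributed processing 6 system
EOF

# uniq only collapses adjacent duplicates, so the input is sorted first.
# -c prefixes each output line with its number of occurrences.
counts=$(LC_ALL=C sort "$workdir/order.txt" | uniq -c)
echo "$counts"

# Pull out the count for the duplicated line (the width of the
# count column differs between uniq implementations).
unix_count=$(echo "$counts" | awk '/^ *[0-9]+ Unix/ {print $1}')
echo "The Unix line appears $unix_count times"
```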


6. Delimited file input

In the second, third and fourth examples we sorted the data based on field positions. There the fields are separated by a space or tab character. What if the fields are separated by some other character? In such cases, we have to specify the input delimiter with the -t option. An example is shown below:


> sort -t'|' -nrk2 delim_sort.txt

Declast|12

Mayday|4

Janmon|1
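
The -n flag still matters with a custom delimiter. As a sketch, assuming a scratch copy of delim_sort.txt, sorting the second field as text puts 12 before 4, because the strings are compared character by character:

```shell
# Recreate delim_sort.txt in a scratch directory so the example is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/delim_sort.txt" <<'EOF'
Mayday|4
Janmon|1
Declast|12
EOF

# Field 2 compared as text: "1" < "12" < "4".
as_text=$(LC_ALL=C sort -t'|' -k2,2 "$workdir/delim_sort.txt")
echo "$as_text"
# Prints:
# Janmon|1
# Declast|12
# Mayday|4

# Field 2 compared as a number: 1 < 4 < 12.
as_number=$(LC_ALL=C sort -t'|' -k2,2n "$workdir/delim_sort.txt")
echo "$as_number"
# Prints:
# Janmon|1
# Mayday|4
# Declast|12
```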


7. Sorting on months.

We can sort the data in month order using the -M option of the sort command. This is shown below:


> sort -M delim_sort.txt

Janmon|1

Mayday|4

Declast|12


The -M option treats the first 3 characters of each line as a month abbreviation and sorts the lines in calendar order.
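
The month does not have to be at the start of the line. The M modifier can be attached to a key, so that only one field is compared as a month. The sketch below uses a hypothetical file, tasks.txt, with the month in the second field. Note that -M is an extension found in GNU and BSD sort and is not part of POSIX, and LC_ALL=C is set here so that English month abbreviations are used.

```shell
# Hypothetical pipe-delimited input with a month abbreviation in field 2.
workdir=$(mktemp -d)
cat > "$workdir/tasks.txt" <<'EOF'
backup|Mar
deploy|Jan
audit|Dec
EOF

# -t'|' sets the delimiter; -k2,2M compares field 2 as a month name.
by_month=$(LC_ALL=C sort -t'|' -k2,2M "$workdir/tasks.txt")
echo "$by_month"
# Prints:
# deploy|Jan
# backup|Mar
# audit|Dec
```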

 
