Linux Blog

Using cut – Shellscript string manipulation

Filed under: Shell Script Sundays — TheLinuxBlog.com at 1:21 am on Sunday, August 26, 2007

This post is designed to be a refresher, reference or quick intro to manipulating strings with the cut command in bash.

Sometimes it's useful to take the output of a command and reformat it. I sometimes do this for aesthetic purposes or to format it for use as input to another command.
Cut has options to cut by bytes (-b), characters (-c) or fields (-f). I normally cut by character or field, but cutting by byte can come in handy sometimes.
The ranges that each of these options accepts are listed below.

N       N’th byte, character or field, counted from 1
N-      from N’th byte, character or field, to end of line
N-M     from N’th to M’th (included) byte, character or field
-M      from first to M’th (included) byte, character or field

The options pretty much explain themselves but I have included some simple examples below:
Cutting by characters (command on top, output below)

echo "123456789" | cut -c -5
12345

echo "123456789" | cut -c 5-
56789

echo "123456789" | cut -c 3-7
34567

echo "123456789" | cut -c 5
5
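
The ranges above can also be combined into a comma-separated list (the same list form that -f 1,2 uses later in this post). Below is a minimal sketch; the variable names picked and reordered are only there for illustration. Note that cut writes the selected characters in input order, whatever order the list is written in.

```shell
# Characters 1-3 and 7-9 in a single call; list items are comma-separated.
picked=$(echo "123456789" | cut -c 1-3,7-9)
echo "$picked"

# Writing the list the other way round selects the same characters;
# cut still emits them in the order they appear in the input.
reordered=$(echo "123456789" | cut -c 7-9,1-3)
echo "$reordered"
```

Both commands should print 123789.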

Sometimes output from a command is delimited so a cut by characters will not work. Take the example below:

echo -e "1\t2\t3\t4\t5" |cut -c 5-7
3 4

To echo a tab you have to use the -e switch so that echo processes backslash escapes. If the desired output is 3\t4, this works as long as every string is exactly 1 character long, but if a character is added anywhere before field 3 the output changes completely, as follows:

echo -e "1a\t2b\t3c\t4d\t5e" | cut -c 5-7
b 3

This is resolved by cutting by fields.
Cutting by fields

The syntax to cut by fields is the same as for characters or bytes. The two examples below display different output, but both select the same fields (field 3 through to the end of the line).

echo -e "1\t2\t3\t4\t5" | cut -f 3-
3 4 5

echo -e "1a\t2a\t3a\t4a\t5a" | cut -f 3-
3a 4a 5a

The default delimiter is a tab. If the output is delimited another way, a custom delimiter can be specified with the -d option. It can be just about any single character; just make sure that the character is escaped (backslashed) or quoted if the shell would otherwise interpret it. In the example below I cut the string up using the pipe as the delimiter.

echo "1|2|3|4|5" | cut -f 3- -d \|
3|4|5
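
One thing worth knowing when cutting by fields: a line that does not contain the delimiter at all is printed whole, unless the -s option is given, in which case it is dropped. The sketch below uses a made-up two-line input and the illustrative variable names kept and dropped.

```shell
# By default a line without the delimiter is passed through unchanged.
kept=$(printf 'a|b|c\nno delimiter here\n' | cut -d '|' -f 2)
printf '%s\n' "$kept"

# With -s, lines that do not contain the delimiter are suppressed.
dropped=$(printf 'a|b|c\nno delimiter here\n' | cut -s -d '|' -f 2)
printf '%s\n' "$dropped"
```

The first command prints b followed by the untouched second line; the second prints only b.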

One great feature of cut is that the output can use a different delimiter from the input, via the --output-delimiter option (a GNU cut option). In the example below I change the string from dash-delimited to comma-delimited.

echo -e "1a-2a-3a-4a-5a" | cut -f 3- -d - --output-delimiter=,
3a,4a,5a
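
The output delimiter can also be a tab, but cut does not interpret \t itself, so the shell has to supply a real tab character. Here is a sketch assuming GNU cut (for --output-delimiter); it generates the tab with printf so it also works in plain sh, and in bash $'\t' should do the same job. The variable name tabbed is just for illustration.

```shell
# Build a literal tab character; cut will not expand "\t" on its own.
tab=$(printf '\t')

# Select every field (1-) so that only the delimiter changes.
tabbed=$(echo "a,b,c,d" | cut -d , -f 1- --output-delimiter="$tab")
printf '%s\n' "$tabbed"
```

This prints a, b, c and d separated by tabs instead of commas.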

Formatting with Cut Example

Sometimes certain Linux applications such as uptime do not have options to format the output. Cut can be used to pull out the information that is desired.
Normal up-time Command:

owen@the-linux-blog:~$ uptime
19:18:40 up 1 day, 22:15, 4 users, load average: 0.45, 0.10, 0.03

Time with up-time displayed:

owen@the-linux-blog:~$ uptime |cut -d , -f 1,2 | cut -c 2-
19:19:36 up 1 day, 22:22

For the above example I pipe the output of uptime to cut and tell it to split on a comma (-d ,). I then choose fields 1 and 2. The output from that cut is piped into another cut that removes the leading space from the output.
Load averages extracted from uptime:

owen@the-linux-blog:~$ uptime |cut -d , -f 4- | cut -c 3-
load average: 0.42, 0.10, 0.03

This is about the same as the previous example except the fields changed. Instead of fields 1 and 2 I told it to display fields 4 through the end. The output from that is piped to another cut which removes the whitespace that followed the comma in "4 users, " by starting at the 3rd character, which drops the first two characters.
The great thing about cutting by fields is that the selected data stays the same even if the length of a field changes. Take the example below. I now have 17 users logged in, which would have broken the output if I had used -c (since there is an extra character due to a double-digit number of users being logged in).

owen@the-linux-blog:~$ uptime
19:25:11 up 1 day, 22:28, 17 users, load average: 0.00, 0.06, 0.04

owen@the-linux-blog:~$ uptime |cut -d , -f 4- | cut -c 3-
load average: 0.00, 0.06, 0.04
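
To use this in a script, the result can be captured in a variable with $( ). The sketch below runs the same two cuts on a fixed sample line instead of live uptime output so that the result is repeatable. The sample string is an assumption: it is the line from above with the leading space and the double space before "load" that uptime appears to print, and the names sample and load are made up for illustration.

```shell
# Sample uptime line (assumed spacing: one leading space, two spaces
# before "load"), used here in place of a live call to uptime.
sample=' 19:25:11 up 1 day, 22:28, 17 users,  load average: 0.00, 0.06, 0.04'

# Fields 4 onward, split on commas, then drop the two leading spaces.
load=$(printf '%s\n' "$sample" | cut -d , -f 4- | cut -c 3-)
echo "$load"
```

This should print load average: 0.00, 0.06, 0.04. The field numbers depend on how many commas come before the load averages, so if uptime prints fewer parts (for example, when there is no "1 day," portion) the field list would need adjusting.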

That just about covers everything for the cut command. Now that you know about it, you can use cut to chop up all types of strings. It is one of the many great tools available for string manipulation in bash. If you can remember what cut does, it will make your shell scripting easier. You don’t need to memorize the syntax, because all of the information on how to use cut is available here, in the man pages and all over the web.

Man Pages for commands in this post »

echo
cut
uptime

18 Comments »

Comment by Min li

February 19, 2009 @ 6:37 pm

The best tutorial for “Cut” I have seen. Thanks…

Comment by Mike

March 24, 2009 @ 8:03 pm

Cool introduction to cut. Better than the program’s own manual, which gives no examples. Now I can use cut on a daily basis :)

Pingback by The Linux Blog » Last 50 Characters of Each line

May 24, 2009 @ 4:18 pm

[…] He had found my site while searching for ‘cut from end of line Linux’ and landed on the Using cut – shellscript string manipulation article. I haven’t received a lot of feedback on it, but am happy with the feedback I have […]

Comment by rockford

July 2, 2009 @ 3:32 pm

Hello,

very nice tutorial for cut. I stumbled in here while searching for cut with --output-delimiter=tab; unfortunately I did not find a solution. I have a file which is comma separated, e.g. a,b,c,d
and I want to output it with tab as the output delimiter. --output-delimiter=\t does not work. Any ideas/suggestions?

rockford

Comment by TheLinuxBlog.com

July 3, 2009 @ 11:33 am

Hey @ROCKFORD, if you want to replace all commas with tabs, you can use | tr , \\t
This will work, however it would replace ALL commas with tabs. I’m not sure how you’d go about doing it if there are commas within the fields that have been cut.

Comment by rockford

July 3, 2009 @ 2:57 pm

Hello,

thx for the answer, this indeed works!

I think I’ll visit this site regularly from now on :-)
As former windows user I hope I can learn much more about linux.

Greetings
rockford

Comment by Roger

October 5, 2009 @ 4:48 pm

In bash why can’t I split the expression parfile=/tmp/abc.par into its 2 components where the delimiter is an = sign:

#echo parfile=/tmp/abc.par | cut -f 1-2 -d "="
parfile=/tmp/abc.par

I tried various versions of it (with the -e to the echo, field list specified as 1,2 as well as 1-2, space after the f, no space after the f).

Nothing is working. Can anyone shed some light?

Thanks

Comment by Roger

October 5, 2009 @ 5:06 pm

Please ignore my previous post. Should have read the manual before posting !

In bash, the default output delimiter is the same as the input delimiter.

--output-delimiter=STRING
use STRING as the output delimiter the default is to use the input delimiter

Comment by Maga

January 29, 2010 @ 10:54 am

Awesome tutorial on cut!!! Now it’s part of my bookmarks… :)

I still have one question… I have the name of a file stored in a variable and want to get only the name, not the extension…
I’ve tried
echo ${files[$i]}| cut -d '.' -f 1 and it doesn’t work…

could anyone tell me what I am missing here?

thx!!

Comment by Maga

January 29, 2010 @ 10:57 am

oops, I forgot to say that I want to store the output in a variable, which was my question….

so I’ve tried:

name=echo "${files[$i]}"| cut -d '.' -f 1

and I don’t get anything stored in the variable….

Comment by TheLinuxBlog.com

January 29, 2010 @ 12:22 pm

Hey Maga, assuming the filename in VAR doesn’t have any periods in it you can use:

NAME=`echo $VAR | cut -d \. -f 1`

Here is another way, subtracting 5 characters from the length of $VAR (a dot, a three-letter extension and the trailing newline that wc -c counts) and using the character cut:
echo $VAR | cut -c -$((`echo $VAR| wc -c` - 5))

However, there are probably better ways of doing it.

Comment by yaker

December 12, 2010 @ 10:29 am

it will be more comfortable if the “-f” param can accept fields such as “first/last”

Comment by TheLinuxBlog.com

December 12, 2010 @ 11:46 am

@yaker agreed, I’d like to see the last option, the first is always easy since it should be “1”. To get the last field, you’d have to count the number of fields and specify that, or reverse the string and use cut to get the first (which was last) again.

like: cat test.txt | rev | cut -d , -f 1 | rev

Comment by Bobo

May 28, 2011 @ 1:11 pm

Brilliant. You made it quick, simple, and easy.
If only all tutorials could be this way. ;)
Thanks!

Comment by challaa

June 1, 2011 @ 3:01 am

super example for cut command
thanks
challaa

Comment by supertone44

June 8, 2011 @ 12:58 pm

First off, great tutorial and examples. Found exactly what I was needing.

Can anyone explain to me what the difference is between the ` and ' characters? For example, in comment 3109 by thelinuxblog.com they suggested:

NAME=`echo $VAR | cut -d \. -f 1`

I previously had tried wrapping it with the single quote character, which just treated the statement as a string. Can anyone shed some light on the significance of the ` character?

Thanks!

Comment by pmjc

March 29, 2012 @ 11:34 am

This was a great tutorial for the cut command. I’ve bookmarked your page for a reference

Comment by Jay Welly

July 22, 2012 @ 5:25 pm

Perfect tutorial, very complete. Thanks!
