Skip to main content Link Search Menu Expand Document (external link)

Manipulating text

cat

Means concatenate

Can be used to print contents of a file

cat file

Can also be used to concatenate files together like

cat file1 file2 #puts the content of file2 after the ones from file1 into stdout
cat file1 file2 > newFile #Same as above but saves to a new file (overwrite)
cat file1 file2 >> newFile #Same as above but appends to a file
cat > file #All lines typed will be written in the file until Ctrl+D
cat >> file #Same as above but with append
cat > file << EOF
text
EOF
# Allows to signal a file ending without Ctrl+D

There is an inverse command tac which prints the lines in reverse order, but has the same functionalities

echo

Simply displays a string Can be used to put strings into:

  • stdout
  • a new file with >
  • an existing file with >>

Using the -e flag, allows for characters like newline (\n) and tab(\t) in the string.

It is widely used to print environment variables (value)

Example:

echo $USERNAME # Print the value of the env variable
echo hello > newFile #Puts hello into newFile (overwrites)
echo hello >> file # Appends hello to file

Manipulating Large files

  • Using less is more efficient than an editor (does not try to load everything into memory)
  • head allows to see only the first n lines
  • tail allows to see only the last n lines (with -f will continuously monitor)

Viewing compressed files

There are utilities that work directly on compressed files.

For gzip compressed files zcat, zless, zgrep, zdiff There are also for the other formats bzcat, bzless xzcat, xzless

sed

  • Stream editor

  • Modifies the content of an input stream (stdin, file, etc)

  • Data from input is moved to a working space, there all operations/modifications are made and moved to the stdoutor output stream (file, etc.)

  • When used with the -e flag, more than one editing command can be passed

  • The -f flag allows to pass a scriptfile with sed commands

Substitution

One big use case is substituting strings

sed s/pattern/replace_string/ file # Substitutes first occurrence of pattern in every line
sed s/pattern/replace_string/g file #All occurrences of pattern in every line
sed 1,2s/pattern/replace_string/g file #All occurrences in a range of lines 1-3
sed -i s/pattern/replace_string/g file #Rewrites the file with the changes (Not recommended)

Delimiting character can be chosen!

awk

  • Used to extract and print information from files
  • It works well with fields (containing a single piece of data, essentially a column) and records (a collection of fields, essentially a line in a file).
  • Also has the -f flag to provide a scriptfile with commands
  • When dealing with simple fields:
    awk '{ print $0 }' file # Prints the whole file
    awk -F: '{ print $1 $7 }' file #Sets the field separator to : and prints the first and seventh fields
    

    File Manipulating

sort

  • Allows to sort the lines of a file (based on a key)
  • By default sorts alphabetically

Examples:

sort file #Sorts alphabetically with the first character on each line
cat file1 file2 | sort #Sorts the contents of two files and prints
sort -r file #Sorts in reverse
sort -k 3 file #Sorts by the 3rd field in each line (not the beginning)
sort -u file #SOrts and gets rid of repeated lines (same as sort | uniq)

uniq

  • Simplifies files by eliminating repeated lines that are consecutive
  • Normally used with sort first, because of consecutive
  • sort -u does both in one command

paste

  • Used to join files “horizontally”
  • It is based on delimiters
  • Default delimiter is \t but can be changed with the -d flag
  • Tee -s flag allow for serial manner (That is a line per file, separated by delimiters)

Examples:

This files

Colombia
Argentina
Suiza
Alemania
Italia
Australia
Bogota
Buenos Aires
Berna
Berlin
Roma
Canberra

give out the following:

$ paste file1 file2 -d ":"
Colombia:Bogota
Argentina:Buenos Aires
Suiza:Berna
Alemania:Berlin
Italia:Roma
Australia:Canberra
$ paste file1 file2 -d ":" -s
Colombia:Argentina:Suiza:Alemania:Italia:Australia
Bogota:Buenos Aires:Berna:Berlin:Roma:Canberra

join

  • Enhanced version of paste
  • Joins two files that have a common field

split

  • Used to split large file into smaller ones
  • By default breaks it into files with 1000 lines (changed with the -l flag)
  • By default creates file with
    • Prefix: Default is x, can be set with split file some_prefix
    • Sufix: Default is aa, ab, etc. With -d can be numeric
  • Can also split by size (-b 16 will result in pieces of 16 bytes) or amount of files (-n)

grep

  • Scan files with matching regex
  • Returns lines that match by default

Some examples:

grep "^some_pattern$" file
grep -v "^some_pattern$" file #Returns line that do not match
grep -C 3 "^some_pattern$" file #Returns 3 lines(before and after) of context and the one that matches
grep -e "^some_pattern$" -e "some_other" file #Multiple patterns

strings can be used to extract strings from binaries as well