Manipulating text
cat
Means concatenate
Can be used to print contents of a file
cat file
Can also be used to concatenate files together like
cat file1 file2 #Prints the contents of file1 followed by the contents of file2 to stdout
cat file1 file2 > newFile #Same as above but saves to a new file (overwrite)
cat file1 file2 >> newFile #Same as above but appends to a file
cat > file #All lines typed will be written in the file until Ctrl+D
cat >> file #Same as above but with append
cat > file << EOF
text
EOF
# The here-document ends the input at the EOF marker, so no Ctrl+D is needed
There is a reverse counterpart
tac
which prints the lines in reverse order (last line first) and otherwise works like cat
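A minimal sketch of the difference between the two, assuming GNU coreutils (tac is not available on every system). The file names file1 and file2 and the scratch directory are only for illustration:

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

printf 'one\ntwo\n' > file1
printf 'three\n' > file2

# cat prints file1 followed by file2.
forward=$(cat file1 file2)
echo "$forward"

# Piping the same stream through tac prints it last line first.
backward=$(cat file1 file2 | tac)
echo "$backward"
```

The first echo shows the lines in their original order, the second shows the same lines reversed.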
echo
Simply displays a string. Can be used to put strings into:
- stdout
- a new file with
>
- an existing file with
>>
Using the -e flag allows escape sequences such as newline (\n) and tab (\t) in the string.
It is widely used to print the values of environment variables.
Example:
echo $USERNAME # Print the value of the env variable
echo hello > newFile #Puts hello into newFile (overwrites)
echo hello >> file # Appends hello to file
Manipulating Large files
- Using less is more efficient than an editor (it does not try to load the whole file into memory)
- head shows only the first n lines
- tail shows only the last n lines (with -f it keeps monitoring the file for new lines)
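As a quick illustration of head and tail, here is a small sketch that builds a 100-line file with seq and looks at both ends. The file name numbers.txt is made up for the example:

```shell
# Create a sample file containing the numbers 1 to 100, one per line.
workdir=$(mktemp -d)
seq 1 100 > "$workdir/numbers.txt"

# First three lines of the file.
first=$(head -n 3 "$workdir/numbers.txt")
echo "$first"

# Last two lines of the file.
last=$(tail -n 2 "$workdir/numbers.txt")
echo "$last"

# tail -f "$workdir/numbers.txt" would keep running and print
# new lines as they are appended (stop it with Ctrl+C).
```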
Viewing compressed files
There are utilities that work directly on compressed files.
For gzip-compressed files: zcat, zless, zgrep, zdiff
There are equivalents for the other formats: bzcat, bzless (bzip2) and xzcat, xzless (xz)
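A short sketch of reading a gzip file without unpacking it first, assuming the gzip tools (gzip, zcat, zgrep) are installed. The file name words.txt is hypothetical:

```shell
# Create a small file and compress it; gzip replaces
# words.txt with words.txt.gz.
workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$workdir/words.txt"
gzip "$workdir/words.txt"

# zcat prints the decompressed contents to stdout.
contents=$(zcat "$workdir/words.txt.gz")
echo "$contents"

# zgrep searches inside the compressed file.
match=$(zgrep 'eta' "$workdir/words.txt.gz")
echo "$match"
```

The compressed file on disk stays untouched in both cases.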
sed
- Stream editor
- Modifies the content of an input stream (stdin, file, etc.)
- Data from the input is moved to a working space, where all operations/modifications are made, and the result is written to stdout or another output stream (file, etc.)
- With the -e flag, more than one editing command can be passed
- The -f flag passes a script file containing sed commands
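To make the -e and -f flags concrete, here is a small sketch that applies the same two substitutions both ways. The file names input.txt and edits.sed are invented for the example:

```shell
workdir=$(mktemp -d)
printf 'hello world\n' > "$workdir/input.txt"

# Two editing commands passed with repeated -e flags.
with_e=$(sed -e 's/hello/goodbye/' -e 's/world/moon/' "$workdir/input.txt")
echo "$with_e"

# The same commands stored in a script file, one per line,
# and passed with -f.
printf 's/hello/goodbye/\ns/world/moon/\n' > "$workdir/edits.sed"
with_f=$(sed -f "$workdir/edits.sed" "$workdir/input.txt")
echo "$with_f"
```

Both invocations should print the same line, and input.txt itself is not modified.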
Substitution
One big use case is substituting strings
sed s/pattern/replace_string/ file # Substitutes first occurrence of pattern in every line
sed s/pattern/replace_string/g file #All occurrences of pattern in every line
sed 1,2s/pattern/replace_string/g file #All occurrences in a range of lines 1-2
sed -i s/pattern/replace_string/g file #Rewrites the file with the changes (Not recommended)
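The delimiting character can be chosen: whatever character follows s is used as the delimiter, which helps when the pattern itself contains /.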
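A sketch of why a different delimiter is handy, using a made-up line that contains a path. With the default / every slash in the pattern has to be escaped, while | avoids that:

```shell
# Sample input; the path is only an illustration.
path_line='bin=/usr/bin/tool'

# Default delimiter: each / inside the pattern must be escaped.
default_delim=$(printf '%s\n' "$path_line" | sed 's/\/usr\/bin/\/opt\/bin/')
echo "$default_delim"

# Using | as the delimiter: no escaping needed.
pipe_delim=$(printf '%s\n' "$path_line" | sed 's|/usr/bin|/opt/bin|')
echo "$pipe_delim"
```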
awk
- Used to extract and print information from files
- It works well with fields (containing a single piece of data, essentially a column) and records (a collection of fields, essentially a line in a file).
- Also has the -f flag to provide a script file with commands
- When dealing with simple fields:
awk '{ print $0 }' file # Prints the whole file
awk -F: '{ print $1 $7 }' file #Sets the field separator to : and prints the first and seventh fields (concatenated; use $1, $7 to separate them with a space)
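The field handling can be tried on a small colon-separated sample. This is only a sketch: the file passwd_sample and its two records are invented, laid out like /etc/passwd entries:

```shell
workdir=$(mktemp -d)
printf 'root:x:0:0:root:/root:/bin/bash\n' > "$workdir/passwd_sample"
printf 'alice:x:1000:1000:Alice:/home/alice:/bin/zsh\n' >> "$workdir/passwd_sample"

# $1 $7 concatenates the two fields with nothing between them.
joined=$(awk -F: '{ print $1 $7 }' "$workdir/passwd_sample")
echo "$joined"

# $1, $7 separates them with the output field separator (a space by default).
spaced=$(awk -F: '{ print $1, $7 }' "$workdir/passwd_sample")
echo "$spaced"
```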
Manipulating files
sort
- Sorts the lines of a file (based on a key)
- By default sorts alphabetically
Examples:
sort file #Sorts alphabetically, comparing lines starting from their first character
cat file1 file2 | sort #Sorts the contents of two files and prints
sort -r file #Sorts in reverse
sort -k 3 file #Sorts by the 3rd field in each line (not the beginning)
sort -u file #Sorts and gets rid of repeated lines (same as sort | uniq)
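A small sketch of these options on a made-up two-column file (the file name data.txt and its contents are hypothetical), so the effect of the key and of -u can be compared side by side:

```shell
workdir=$(mktemp -d)
printf 'pear 1\napple 3\nbanana 2\npear 1\n' > "$workdir/data.txt"

# Default: sorts by the whole line, starting at the first character.
sorted=$(sort "$workdir/data.txt")
echo "$sorted"

# -u: same ordering, but the duplicated "pear 1" line appears once.
unique=$(sort -u "$workdir/data.txt")
echo "$unique"

# -k 2: sorts by the second field (the number) instead.
by_field=$(sort -k 2 "$workdir/data.txt")
echo "$by_field"
```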
uniq
- Simplifies files by eliminating repeated lines that are consecutive
- Normally used with sort first, because it only removes duplicates that are consecutive
- sort -u does both in one command
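The "consecutive" restriction is easy to see on input where the duplicates are not adjacent. A minimal sketch with invented input:

```shell
# The duplicates of a and b are never next to each other,
# so uniq alone removes nothing.
unsorted=$(printf 'b\na\nb\na\n' | uniq)
echo "$unsorted"

# Sorting first makes the duplicates adjacent, so uniq can drop them.
deduped=$(printf 'b\na\nb\na\n' | sort | uniq)
echo "$deduped"
```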
paste
- Used to join files “horizontally”
- It is based on delimiters
- Default delimiter is \t but can be changed with the -d flag
- The -s flag pastes serially (that is, one output line per file, with that file's lines separated by the delimiter)
Examples:
These files
file1:
Colombia
Argentina
Suiza
Alemania
Italia
Australia
file2:
Bogota
Buenos Aires
Berna
Berlin
Roma
Canberra
give the following:
$ paste file1 file2 -d ":"
Colombia:Bogota
Argentina:Buenos Aires
Suiza:Berna
Alemania:Berlin
Italia:Roma
Australia:Canberra
$ paste file1 file2 -d ":" -s
Colombia:Argentina:Suiza:Alemania:Italia:Australia
Bogota:Buenos Aires:Berna:Berlin:Roma:Canberra
join
- Enhanced version of paste
- Joins two files that have a common field
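A minimal sketch of joining on a common field. The numeric IDs in the first column are invented for the example and act as the shared key; join expects both files to be sorted on that field:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Both files share the first field (a made-up numeric ID).
printf '1 Colombia\n2 Argentina\n3 Suiza\n' > countries
printf '1 Bogota\n2 Buenos Aires\n3 Berna\n' > capitals

# By default join matches on the first field of each file and prints
# the key, then the remaining fields of each file.
joined=$(join countries capitals)
echo "$joined"
```

Unlike paste, lines are matched by the value of the key rather than by their position in the file.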
split
- Used to split a large file into smaller ones
- By default breaks it into files of 1000 lines (changed with the -l flag)
- By default creates files named with:
  - Prefix: Default is x, can be set with split file some_prefix
  - Suffix: Default is aa, ab, etc. With -d it can be numeric
- Can also split by size (-b 16 will result in pieces of 16 bytes) or into a given number of files (-n)
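A sketch of splitting by line count with a custom prefix and then putting the pieces back together. The file name big.txt and the prefix part_ are chosen only for the example:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A 25-line sample file.
seq 1 25 > big.txt

# Pieces of at most 10 lines each, named part_aa, part_ab, ...
split -l 10 big.txt part_
pieces=$(echo part_*)
echo "$pieces"

# The last piece holds the remaining 5 lines.
last_piece=$(cat part_ac)

# Concatenating the pieces in name order restores the original content.
rebuilt=$(cat part_*)
original=$(cat big.txt)
```

Because the suffixes sort alphabetically, cat part_* reads the pieces in the order they were written.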
grep
- Scans files for lines matching a regular expression
- Returns lines that match by default
Some examples:
grep "^some_pattern$" file
grep -v "^some_pattern$" file #Returns lines that do not match
grep -C 3 "^some_pattern$" file #Returns the matching line plus 3 lines of context before and after it
grep -e "^some_pattern$" -e "some_other" file #Multiple patterns
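The same flags can be tried on a tiny sample file. This is just a sketch; the file name fruits.txt and its contents are invented:

```shell
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\ndate\n' > "$workdir/fruits.txt"

# Lines that are exactly "banana".
exact=$(grep '^banana$' "$workdir/fruits.txt")
echo "$exact"

# Every line except "banana".
inverted=$(grep -v '^banana$' "$workdir/fruits.txt")
echo "$inverted"

# "cherry" plus one line of context on each side.
context=$(grep -C 1 '^cherry$' "$workdir/fruits.txt")
echo "$context"

# Lines matching either pattern.
multi=$(grep -e '^apple$' -e 'ate' "$workdir/fruits.txt")
echo "$multi"
```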
strings
can be used to extract strings from binaries as well