Linux command (e.g. cat) to read a specified number of characters

Is there a command like cat in Linux that can return a specified number of characters from a file?

For example, I have a text file like this:

 Hello world
 this is the second line
 this is the third line

And I want something that returns the first 5 characters, which would be "Hello".

Thanks.

+78
command-line linux
Oct 20 '08 at 15:52
9 answers

head also works:

 head -c 100 file # returns the first 100 bytes in the file 

This extracts the first 100 bytes and prints them.

What's nice about using head for this is that the syntax for tail matches:

 tail -c 100 file # returns the last 100 bytes in the file 
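
As a quick check against the sample file from the question, here is a minimal sketch. The temp file and the variable names are only for illustration, and it assumes a head and tail that accept -c (GNU and BSD both do):

```shell
# Recreate the sample file from the question in a throwaway temp file.
sample=$(mktemp)
printf 'Hello world\nthis is the second line\nthis is the third line\n' > "$sample"

first5=$(head -c 5 "$sample")   # first 5 bytes of the file
last5=$(tail -c 5 "$sample")    # last 5 bytes: "line" plus the final newline,
                                # which $(...) strips again

printf 'first: %s\nlast:  %s\n' "$first5" "$last5"
rm -f "$sample"
```

Keep in mind that both commands count bytes, so this only matches "characters" for single-byte text such as plain ASCII.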
+131
Oct 20 '08 at 15:59

You can use dd to extract arbitrary chunks of bytes.

For example,

 dd skip=1234 count=5 bs=1 

will copy bytes 1235 through 1239 from its input to its output and discard the rest.

To simply get the first five bytes from standard input, do:

 dd count=5 bs=1 

Note that if you want to specify the input file name, dd has old-fashioned argument parsing, so you would write:

 dd count=5 bs=1 if=filename 

Note also that dd reports in detail what it did (on standard error), so to discard that, do:

 dd count=5 bs=1 2>&- 

or

 dd count=5 bs=1 2>/dev/null 
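
To make the skip/count arithmetic concrete, here is a small sketch on the question's "Hello world" line. The input string and the variable names are just for illustration:

```shell
# "Hello world" has 'w' at byte offset 6 (counting from 0), so skip=6
# discards "Hello " and count=5 with bs=1 copies the next five bytes.
middle=$(printf 'Hello world\n' | dd skip=6 count=5 bs=1 2>/dev/null)

# Without skip, the same options return the first five bytes.
start=$(printf 'Hello world\n' | dd count=5 bs=1 2>/dev/null)

printf '%s %s\n' "$start" "$middle"
```

With bs=1, dd copies one byte per read, which is simple but slow on large inputs.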
+38
Oct 20 '08 at 17:08

head :

Name

head - displays the first part of files

Synopsis

head [ OPTION ] ... [ FILE ] ...

Description

Print the first 10 lines of each FILE to standard output. With more than one FILE, precede each with a header giving the file name. With no FILE, or when FILE is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
-c, --bytes=[-]N print the first N bytes of each file; with the leading "-", print all but the last N bytes of each file
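
The leading "-" form is easy to misread, so here is a minimal sketch of it. It assumes GNU head, since other implementations may reject a negative byte count, and the variable name is only illustrative:

```shell
# GNU head only: a negative count means "everything except the last N bytes".
# "Hello world" is 11 bytes, so dropping the last 6 (" world") leaves "Hello".
trimmed=$(printf 'Hello world' | head -c -6)
printf '%s\n' "$trimmed"
```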

+10
Oct 20 '08 at 15:55
 head -Line_number file_name | tail -1 |cut -c Num_of_chars 

This pipeline extracts a given range of characters from a particular line, for example:

 head -5 tst.txt | tail -1 |cut -c 5-8 

gives characters 5-8 of line 5.

Note: tail -1 is used to select the last line output by head.
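
Here is a small sketch of that pipeline on made-up data. The file contents and variable names are hypothetical, and it uses the -n spelling of the line counts. The sed variant is an alternative I am adding for comparison, not part of the original pipeline:

```shell
tmp=$(mktemp)
printf 'line one\nline two\nline three\nline four\nabcdefghij\n' > "$tmp"

# head keeps lines 1-5, tail keeps the last of those (line 5),
# and cut picks characters 5-8 of that line.
via_head=$(head -n 5 "$tmp" | tail -n 1 | cut -c 5-8)

# sed -n '5p' prints only line 5, replacing the head | tail pair.
via_sed=$(sed -n '5p' "$tmp" | cut -c 5-8)

printf '%s %s\n' "$via_head" "$via_sed"
rm -f "$tmp"
```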

+3
Jul 05 '09 at 11:35

head or tail can do this:

head -c X

Prints the first X bytes of the file (not necessarily X characters, for example if it is a UTF-16 file). tail does the same for the last X bytes.

This (and cut) is portable.

+2
Oct 20 '08 at 15:58

You can also grep for the line and then cut it, for example:

grep 'text' filename | cut -c 1-5
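
Applied to the sample file from the question, that might look like the sketch below. The search word and the variable names are only for illustration:

```shell
sample=$(mktemp)
printf 'Hello world\nthis is the second line\nthis is the third line\n' > "$sample"

# grep selects the matching line; cut keeps its first five characters.
prefix=$(grep 'second' "$sample" | cut -c 1-5)

printf '[%s]\n' "$prefix"   # brackets make the trailing space visible
rm -f "$sample"
```

If the pattern matches several lines, cut returns the first five characters of each of them.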

+2
Oct 20 '08 at 15:59

I know this is an answer to a question asked 6 years ago ...

But I searched for something like this for several hours, and then found out that cut -c does exactly that, with the added bonus that you can also specify an offset.

cut -c 1-5 will return Hello and cut -c 7-11 will return world. There is no need for any other command.
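
A minimal sketch of both ranges, with illustrative variable names. The last command shows one thing to watch for: cut applies the range to every input line, so for "the first 5 characters of the file" you would still need to limit the input to the first line:

```shell
hello=$(printf 'Hello world\n' | cut -c 1-5)
world=$(printf 'Hello world\n' | cut -c 7-11)

# cut applies the range to each input line separately.
per_line=$(printf 'Hello world\nsecond line\n' | cut -c 1-5)

printf '%s %s\n' "$hello" "$world"
printf '%s\n' "$per_line"
```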

+1
Oct 23 '14 at 9:22

Even though this was answered and accepted years ago, the currently accepted answer is only correct for one-byte-per-character encodings such as iso-8859-1, or for the single-byte subsets of variable-byte character sets (like Latin characters within UTF-8). Even splicing in multiples of bytes instead would only work for fixed-width multibyte encodings such as UTF-16 (and there only for characters in the Basic Multilingual Plane). Given that UTF-8 is now well on its way to being a universal standard, and looking at lists of languages by number of native speakers and of the top 30 languages by native/secondary usage, it is important to point out a simple technique that is friendly to variable-byte characters (character-based, not byte-based), using cut -c and tr / sed with character classes.

Compare the following, which doubly fails due to two common Latin-centric mistakes/presumptions regarding the bytes vs. characters issue (one is head vs. cut, the other is [a-z][A-Z] vs. [:upper:][:lower:]):

 $ printf 'Πού μπορώ να μάθω σανσκριτικά;\n' | \
 >     head -c 1 | \
 >     sed -e 's/[A-Z]/[a-z]/g'
 [[unreadable binary mess, or nothing if the terminal filtered it]]

to this (note: this worked fine on FreeBSD, but both cut and tr on GNU/Linux still mangled Greek in UTF-8):

 $ printf 'Πού μπορώ να μάθω σανσκριτικά;\n' | \
 >     cut -c 1 | \
 >     tr '[:upper:]' '[:lower:]'
 π

Another more recent answer had already proposed "cut", but only because of the side issue that it can be used to specify arbitrary offsets, not because of the directly relevant characters vs. bytes issue.

If your cut does not handle -c with variable-byte encodings correctly, for "the first X characters" (replace X with your number) you could try:

  • sed -E -e '1 s/^(.{X}).*$/\1/' -e q - which is limited to the first line, though
  • head -n 1 | grep -E -o '^.{X}' - which is limited to the first line and chains two commands, though
  • dd - which has already been suggested in other answers, but is really cumbersome
  • A complicated sed script with a sliding window buffer to handle characters spread over multiple lines, but that is probably more cumbersome/fragile than just using something like dd

If your tr does not handle character classes with variable-byte encodings correctly, you could try:

  • sed -E -e 's/[[:upper:]]/\L&/g' (GNU-specific)
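
To see the bytes vs. characters problem without depending on your terminal or locale, here is a small sketch that just counts bytes. The octal escapes spell out the UTF-8 encoding of the first letter of the Greek sentence above, and the variable names are only illustrative:

```shell
# \316\240 are the two UTF-8 bytes (0xCE 0xA0) of the Greek capital letter Pi.
pi_bytes=$(printf '\316\240' | wc -c | tr -d ' ')

# head -c 1 keeps only the first of those bytes, i.e. half a character.
kept_bytes=$(printf '\316\240' | head -c 1 | wc -c | tr -d ' ')

# head -c 2 keeps the whole character, but only because we knew its width.
whole=$(printf '\316\240' | head -c 2)

printf 'bytes in one character: %s, bytes kept by head -c 1: %s\n' "$pi_bytes" "$kept_bytes"
```

This is why a byte count only works as a character count when you already know how wide each character is.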
+1
Jul 01 '16 at 11:47

Here is a simple script that wraps up the dd approach mentioned here:

extract_chars.sh

 #!/usr/bin/env bash

 # Print usage and exit.
 function show_help() {
   IT="
 extracts characters X to Y from stdin or FILE
 usage: X Y {FILE}

 e.g.

 2 10 /tmp/it     => extract chars 2-10 from /tmp/it
 "
   echo "$IT"
   exit
 }

 # Show help when called without arguments or with "help".
 if [ -z "$1" ] || [ "$1" == "help" ]
 then
   show_help
 fi

 FROM=$1
 TO=$2
 COUNT=$((TO - FROM + 1))

 # dd counts bytes, and skip=$FROM drops the first FROM bytes,
 # so X and Y behave as byte offsets counted from 0.
 if [ -z "$3" ]
 then
   dd skip="$FROM" count="$COUNT" bs=1 2>/dev/null
 else
   dd skip="$FROM" count="$COUNT" bs=1 if="$3" 2>/dev/null
 fi
0
Sep 08 '17 at 17:02