How to convert *.txt files to Unicode

I have a requirement where the client provides a file in ANSI encoding, but my system can only read the file successfully in Unicode. How do I solve this? I know that when I use “Save As” to save the file with Unicode encoding, the file is read correctly. It is difficult to get the client to comply with our request. Can I run a batch program over the folder to convert the files to Unicode before they are picked up?

8 answers

iconv can do this:

Usage: iconv [OPTION...] [FILE...]
Convert encoding of given files from one encoding to another.

 Input/Output format specification:
  -f, --from-code=NAME       encoding of original text
  -t, --to-code=NAME         encoding for output

 Information:
  -l, --list                 list all known coded character sets

 Output control:
  -c                         omit invalid characters from output
  -o, --output=FILE          output file
  -s, --silent               suppress warnings
      --verbose              print progress information

  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.

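Neither ANSI nor Unicode is a single encoding. You need to know which ANSI code page the file uses and which Unicode encoding you want (UTF-8 or UTF-16, LE or BE), and then convert between them (for example, with iconv).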
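A small Python sketch of why both sides of the conversion have to be named. The sample byte is chosen only for illustration: the same "ANSI" byte decodes to different characters under different code pages, and one character has a different byte form in each Unicode encoding.

```python
# One byte from a hypothetical "ANSI" file.
raw = b"\xe9"

# The same byte means different characters under different code pages.
as_western = raw.decode("cp1252")   # Windows-1252 (Western European)
as_cyrillic = raw.decode("cp1251")  # Windows-1251 (Cyrillic)

# One character, three different Unicode byte representations.
utf8_bytes = as_western.encode("utf-8")
utf16le_bytes = as_western.encode("utf-16-le")
utf16be_bytes = as_western.encode("utf-16-be")

# ascii() keeps the output printable on any console.
print(ascii(as_western), ascii(as_cyrillic))
print(utf8_bytes, utf16le_bytes, utf16be_bytes)
```

Picking the wrong source code page does not raise an error here; it silently produces the wrong characters, which is why the code page should be confirmed with whoever supplies the file.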


python:

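# Read the ANSI file. Latin-1 is assumed here; use "cp1252" for Windows-1252.
with open("infile.txt", encoding="latin1") as inf:
    data = inf.read()

# Write the same text back out as UTF-8.
with open("outfile.txt", "w", encoding="utf-8") as outf:
    outf.write(data)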
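Since the question asks about converting a whole folder, the same idea can be applied per file. This is a minimal sketch, not a definitive implementation: the function name `convert_folder`, the folder names in the usage note, and the default source code page `cp1252` are all assumptions for illustration.

```python
from pathlib import Path


def convert_folder(src_dir, dst_dir, src_encoding="cp1252"):
    """Re-encode every *.txt file in src_dir as UTF-8 into dst_dir.

    src_encoding is an assumption; set it to the client's actual code page.
    Returns the names of the files that were converted.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    converted = []
    for path in sorted(src.glob("*.txt")):
        # Work on bytes so that line endings are carried over unchanged.
        text = path.read_bytes().decode(src_encoding)
        (dst / path.name).write_bytes(text.encode("utf-8"))
        converted.append(path.name)
    return converted
```

A call such as `convert_folder("incoming", "converted")` writes the converted copies to a separate folder, so the originals stay untouched if the assumed code page turns out to be wrong. A byte that is undefined in the chosen code page raises `UnicodeDecodeError` rather than being dropped silently.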

Powershell

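# Read the file, then write it back out as "Unicode" (UTF-16 LE in PowerShell terms).
# "pathToOutputFile" is a placeholder for the destination path.
$lines = Get-Content "pathToFile"
$lines | Out-File -Encoding Unicode "pathToOutputFile"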
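On Windows, "Unicode" as an encoding name usually means UTF-16 LE with a byte-order mark, which may be what the receiving system expects. A rough Python equivalent, as a sketch: the helper name `to_utf16le_with_bom` and the default source code page `cp1252` are assumptions.

```python
import codecs


def to_utf16le_with_bom(raw, src_encoding="cp1252"):
    """Re-encode ANSI bytes as UTF-16 LE with a leading byte-order mark."""
    # src_encoding is an assumption; use the file's real code page.
    text = raw.decode(src_encoding)
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
```

Writing the mark explicitly keeps the result the same on every platform, whereas the plain `"utf-16"` codec picks the byte order of the machine it runs on.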

script (txt_convert.sh <infile> <outfile>):

#!/bin/sh

iconv -f "$(file -b --mime-encoding "$1")" -t utf8 "$1" -o "$2"

Or as a one-liner:

iconv -f "$(file -b --mime-encoding "<infile>")" -t utf8 "<infile>" -o "<outfile>"

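Note: this relies on "file" detecting the source encoding and on your "iconv" accepting the name utf8 (some versions expect utf-8 instead; the supported names are listed by iconv -l).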
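Encoding detection is always a guess. One common heuristic can be sketched in Python as follows, assuming that anything that is not valid UTF-8 or BOM-marked UTF-16 is in a single known ANSI code page. The function name `guess_and_decode` and the `cp1252` fallback are assumptions for illustration.

```python
def guess_and_decode(raw, fallback="cp1252"):
    """Return (text, encoding_used) for the given bytes."""
    # A UTF-8 or UTF-16 byte-order mark is an unambiguous signal.
    if raw.startswith(b"\xef\xbb\xbf"):
        return raw[3:].decode("utf-8"), "utf-8-sig"
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        # The "utf-16" codec reads the mark to choose the byte order.
        return raw.decode("utf-16"), "utf-16"
    try:
        # Strict UTF-8 decoding rarely succeeds by accident on ANSI text.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: everything else is in the fallback ANSI code page.
        return raw.decode(fallback), fallback
```

Pure ASCII input is reported as UTF-8, which is harmless because ASCII bytes are identical in UTF-8 and in the common ANSI code pages.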


On Windows, a text editor such as Notepad2 or Notepad can also re-save the file in a different encoding.


Ruby oneliner, fwiw:

ruby -e 'STDOUT.write STDIN.read.force_encoding(Encoding::WINDOWS_1252).encode!(Encoding::UTF_8)' < infile.csv > outfile.csv

If your input file is badly malformed, you may need to prepend STDIN.binmode; STDOUT.binmode; to the front of the Ruby script.
