Android Get Text From PDF

I want to read the text of a PDF file stored on the SD card. How can I extract the text from it?

I tried:

    import android.os.Bundle;
    import android.os.Environment;
    import android.speech.tts.TextToSpeech;
    import android.support.v7.app.ActionBarActivity;
    import android.util.Log;
    import android.view.View;
    import android.view.View.OnClickListener;
    import android.widget.TextView;

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Locale;

    public class MainActivity extends ActionBarActivity implements TextToSpeech.OnInitListener {

        private TextToSpeech tts;

        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);
            tts = new TextToSpeech(getApplicationContext(), this);
            final TextView text1 = (TextView) findViewById(R.id.textView1);

            findViewById(R.id.button1).setOnClickListener(new OnClickListener() {
                @Override
                public void onClick(View v) {
                    File sdcard = Environment.getExternalStorageDirectory();
                    File file = new File(sdcard, "test.pdf");

                    // Read the file line by line, as if it were plain text
                    StringBuilder text = new StringBuilder();
                    List<String> lines = new ArrayList<String>();
                    try {
                        BufferedReader br = new BufferedReader(new FileReader(file));
                        String line;
                        while ((line = br.readLine()) != null) {
                            lines.add(line);
                            text.append(line);
                            text.append('\n');
                        }
                        br.close();

                        for (String string : lines) {
                            // The second argument is a queue mode; QUEUE_ADD keeps
                            // each line from replacing the one before it
                            tts.speak(string, TextToSpeech.QUEUE_ADD, null);
                        }
                        String[] arr = lines.toArray(new String[lines.size()]);
                        System.out.println(arr.length);
                        text1.setText(text);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
        }

        @Override
        public void onInit(int status) {
            if (status == TextToSpeech.SUCCESS) {
                int result = tts.setLanguage(Locale.US);
                if (result == TextToSpeech.LANG_MISSING_DATA
                        || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                    Log.e("TTS", "This language is not supported");
                }
            } else {
                Log.e("TTS", "Initialization failed!");
            }
        }
    }

Note: this works fine when the file is a plain text file (test.txt), but not when it is a PDF (test.pdf).

With the PDF, the text does not come out as it appears in the document; it looks like raw bytes instead. How can I get the actual text?

Thanks in advance.

2 answers

I have a solution with iText.

Gradle

 compile 'com.itextpdf:itextg:5.5.10' 

Java

    import com.itextpdf.text.pdf.PdfReader;
    import com.itextpdf.text.pdf.parser.PdfTextExtractor;

    try {
        StringBuilder parsedText = new StringBuilder();
        PdfReader reader = new PdfReader(yourPdfPath);
        int n = reader.getNumberOfPages();
        // iText page numbers start at 1
        for (int i = 1; i <= n; i++) {
            // Extract the content of each page in turn
            parsedText.append(PdfTextExtractor.getTextFromPage(reader, i).trim()).append("\n");
        }
        System.out.println(parsedText);
        reader.close();
    } catch (Exception e) {
        System.out.println(e);
    }
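Once you have the extracted text, you will probably want to pass it to TextToSpeech in pieces, as the question does line by line. Here is a rough sketch of one way to do that in plain Java. SpeechChunker and toChunks are names I made up for illustration; they are not part of iText or Android. The length limit is a parameter; on Android I would assume you pass TextToSpeech.getMaxSpeechInputLength() (available from API 18).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText or the Android SDK.
final class SpeechChunker {
    private SpeechChunker() {}

    /**
     * Splits text into non-blank pieces of at most maxLen characters.
     * It breaks on line ends first, then on the last space that fits
     * inside an over-long line, and hard-cuts only when no space exists.
     */
    public static List<String> toChunks(String text, int maxLen) {
        if (maxLen <= 0) {
            throw new IllegalArgumentException("maxLen must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (String rawLine : text.split("\\r?\\n")) {
            String line = rawLine.trim();
            while (line.length() > maxLen) {
                int cut = line.lastIndexOf(' ', maxLen);
                if (cut <= 0) {
                    cut = maxLen; // no space to break on: hard cut
                }
                chunks.add(line.substring(0, cut).trim());
                line = line.substring(cut).trim();
            }
            if (!line.isEmpty()) {
                chunks.add(line); // blank lines are skipped
            }
        }
        return chunks;
    }
}
```

Each element of SpeechChunker.toChunks(parsedText.toString(), limit) can then be handed to tts.speak(...) with TextToSpeech.QUEUE_ADD so the pieces are spoken in order.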

A PDF is not a regular text file, so it cannot be read line by line the way a .txt file can. You need to do a little more research into the PDF format. This is the best answer you will get: "How to read pdf in an Android application?"
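To illustrate the difference, here is a small sketch that checks whether a file is a PDF before you try to read it as text. It relies on the fact that PDF files normally begin with the ASCII marker "%PDF-" followed by a version number. PdfSniffer and looksLikePdf are hypothetical names used only for this example, and some PDF viewers also accept files where the marker does not sit at the very first byte, which this check does not handle.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any PDF library.
final class PdfSniffer {
    // Marker expected at the start of a PDF file, e.g. "%PDF-1.4"
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfSniffer() {}

    /** Returns true if the given bytes start with the PDF marker. */
    public static boolean looksLikePdf(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads only the first few bytes of the file and checks them. */
    public static boolean looksLikePdf(File file) throws IOException {
        byte[] head = new byte[PDF_MAGIC.length];
        try (InputStream in = new FileInputStream(file)) {
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    return false; // file is shorter than the marker
                }
                read += n;
            }
        }
        return looksLikePdf(head);
    }
}
```

If PdfSniffer.looksLikePdf(file) returns true, hand the file to a PDF library instead of a BufferedReader; otherwise the plain-text reading code from the question is fine.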
