Need help creating a PDF from HTML using itextsharp

Question

I am trying to generate a PDF file from an HTML page. The CMS I use is EPiServer.

This is my code:

    protected void Button1_Click(object sender, EventArgs e)
    {
        naaflib.pdfDocument(CurrentPage);
    }


    public static void pdfDocument(PageData pd)
    {
        //Extract data from Page (pd).
        string intro = pd["MainIntro"].ToString(); // Attribute
        string mainBody = pd["MainBody"].ToString(); // Attribute

        // Prepare the HTTP response
        HttpContext.Current.Response.Clear();
        HttpContext.Current.Response.ContentType = "application/pdf";

        // Create PDF document
        Document pdfDocument = new Document(PageSize.A4, 80, 50, 30, 65);
        //PdfWriter pw = PdfWriter.GetInstance(pdfDocument, HttpContext.Current.Response.OutputStream);
        PdfWriter.GetInstance(pdfDocument, HttpContext.Current.Response.OutputStream);  

        pdfDocument.Open();
        pdfDocument.Add(new Paragraph(pd.PageName));
        pdfDocument.Add(new Paragraph(intro));
        pdfDocument.Add(new Paragraph(mainBody));
        pdfDocument.Close();
        HttpContext.Current.Response.End();
    }

This outputs the article's name, intro text, and main body. However, the HTML markup in the article text is not parsed, and there is no layout.

I have looked at http://itextsharp.sourceforge.net/tutorial/index.html, but I am none the wiser.

Any pointers in the right direction are welcome :)

+5

c# .net itextsharp

Steven Apr 7 '10 at 14:09

1 answer

Jay Riggs · Accepted Answer · 2010-04-08T00:19:20+0000

For later versions of iTextSharp:

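Using iTextSharp, you can use the method iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList() to create a PDF from HTML.

ParseToList() takes a TextReader (an abstract class) as its HTML source, so you can pass it a StringReader or a StreamReader (both derive from TextReader). I used a StringReader and was able to generate PDFs from simple markup. When I tried the HTML returned from real web pages, though, I got errors on all but the simplest ones. Even for the simplest page I retrieved (http://black.ea.com/), the content of the page's 'head' tag was rendered into the PDF, so I suspect HTMLWorker.ParseToList() is picky about the formatting of the HTML it parses.

If you want to try it, here is the test code I used: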

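// Namespaces this snippet relies on.
using System;
using System.Collections;
using System.IO;
using System.Net;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

// Download content from a very, very simple "Hello World" web page.
string download = new WebClient().DownloadString("http://black.ea.com/");

Document document = new Document(PageSize.A4, 80, 50, 30, 65);
try {
    using (FileStream fs = new FileStream("TestOutput.pdf", FileMode.Create)) {
        // Attach a writer so everything added to the document is written to the file.
        PdfWriter.GetInstance(document, fs);
        using (StringReader stringReader = new StringReader(download)) {
            // Parse the HTML into iTextSharp elements; null means no custom StyleSheet.
            // (Newer 5.x releases may return List<IElement> instead of ArrayList.)
            ArrayList parsedList = HTMLWorker.ParseToList(stringReader, null);
            document.Open();
            foreach (object item in parsedList) {
                document.Add((IElement)item);
            }
            document.Close();
        }
    }

} catch (Exception exc) {
    Console.Error.WriteLine(exc.Message);
}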

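At the time of writing, I could not find any description of the limitations on the HTML markup that HTMLWorker.ParseToList() accepts; if you know what they are, please post them here. I am sure others would be interested.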
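
Applied to the scenario in the question, the same approach might look roughly like the sketch below. It assumes the HTML arrives as a string (for example the page's MainBody property), that the markup is simple enough for HTMLWorker to parse, and that the code runs inside an ASP.NET request. The class and method names (PdfSketch, WriteHtmlAsPdf) are made up for illustration and are not part of iTextSharp or EPiServer.

```csharp
using System.IO;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

// Hypothetical helper, named for illustration only.
public static class PdfSketch
{
    // Streams a PDF to the current HTTP response: a plain-text title
    // followed by the elements HTMLWorker produces from the HTML string.
    public static void WriteHtmlAsPdf(string title, string html)
    {
        HttpResponse response = HttpContext.Current.Response;
        response.Clear();
        response.ContentType = "application/pdf";

        Document document = new Document(PageSize.A4, 80, 50, 30, 65);
        PdfWriter.GetInstance(document, response.OutputStream);

        document.Open();
        document.Add(new Paragraph(title));

        using (StringReader reader = new StringReader(html))
        {
            // Iterating as IElement works whether ParseToList returns
            // an ArrayList or a List<IElement>.
            foreach (IElement element in HTMLWorker.ParseToList(reader, null))
            {
                document.Add(element);
            }
        }

        document.Close();
        response.End();
    }
}
```

From the question's click handler it could then be called as PdfSketch.WriteHtmlAsPdf(CurrentPage.PageName, CurrentPage["MainBody"].ToString()). The intro text would need the same treatment if it also contains markup.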

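For earlier versions of iTextSharp: you can use the iTextSharp.text.html.HtmlParser.Parse method to create a PDF from HTML.

Here is a snippet that demonstrates this: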

Document document = new Document(PageSize.A4, 80, 50, 30, 65); 
try  {
   using (FileStream fs = new FileStream("TestOutput.pdf", FileMode.Create)) {
      PdfWriter.GetInstance(document, fs);
      HtmlParser.Parse(document, "YourHtmlDocument.html");
   }
} catch(Exception exc)  { 
   Console.Error.WriteLine(exc.Message); 
}

The one (major) problem is that the HTML must be strictly XHTML compliant.

Good luck!
