Fast Extraction of Plain Text Support & Improved Font Substitution in Word Docs using Java/.NET

 
What is new in this release?

The Aspose development team is happy to announce the monthly release of Aspose.Words for Java & .NET 16.2.0. This month’s release contains over 103 useful new features, enhancements and bug fixes to the Aspose.Words products. Here is a look at just a few of the biggest features and API changes in this month’s release: support for fast extraction of plain text from flow-format documents, support for axis logarithmic scale, PageSavingCallback for all fixed-page save formats, the new Paragraph.IsFormatRevision and Inline.IsFormatRevision properties, the new Shape.Title property to get or set the Alt Text Title of a shape, and improved font substitution.

This release introduces static Document.ExtractText methods that extract text from a flow-format document. Starting from this version, Aspose.Words supports axis logarithmic scale when rendering DML charts.

The PageSavingCallback property is now available for all fixed-page save formats. It lets you control how separate pages are saved when a document is exported to a fixed-page format: you can set PageFileName for each page, and you can specify the stream to which a page is saved using the PageStream property.

The Paragraph.IsFormatRevision and Inline.IsFormatRevision properties indicate whether the formatting of the object was changed in Microsoft Word while change tracking was enabled.

Starting from version 16.2.0, Aspose.Words supports the Alt Text Title. When converting to older formats, MS Word writes the alt text as the string “Title: titleText – Description: descText”; if there is no description, it writes “Title: titleText”; if there is no title, it writes just “descText”. Aspose.Words now does the same.

Font substitution has been improved to mimic MS Word when the font info in the document does not contain PANOSE. Previously, in this case, Aspose.Words substituted fonts with FontSettings.DefaultFontName. When PANOSE is specified in the font info, Aspose.Words still uses FontSettings.DefaultFontName.

The list of new and improved features added in this release is given below:

- EMF format is now supported
- Aspose.Words for Java passes the Veracode Security Scan
- Performance of PNG encoding/decoding improved by 1.5 times
- Implemented auto-fit table grid calculation for several classes of tables
- EMF+ images can now be rendered without GDI+
- Comments rendering improved
- Word 6.0 binary DOC files are supported now
- Added public methods for inserting signature lines
- Added ability to configure document hyphenation options
- Added capability to get mail merge regions hierarchy
- Added more public classes and methods to work with fields in a document
- Font sources can now be specified for each document instance
- Implemented line counting (Document.BuiltInDocumentProperties.Lines)
- BuiltInDocumentProperties.Lines returns an incorrect value
- Display warnings for fonts replaced using FontSettings.addFontSubstitutes
- A big negative value of 'margin-left' causes image part hiding after exporting from DOCX to HTML.
- Improper Restriction of XML External Entity Reference (XXE).
- Tables misaligned in rendered PDF
- Enhancement in WORDSNET-1432
- Consider providing Aspose.Words.dll built for .NET 4.0
- Table column widths are calculated incorrectly during rendering
- Office Math does not display correctly after saving Rtf to Html
- WORDSJAVA-1219 is not resolved in v15.12.0
- Check if it is possible to create the entire DataSet from Java's ResultSet
- Expose the internal ThreadLocal<Locale> in the public API.
- Add feature to set ThreadLocal Locale for Document
- Docx to PNG performance issue
- Text is overlapped after conversion from Docx to Pdf
- Inserting one document into another causes indent issue
- ODT to Pdf conversion issue with frame layout
- BuiltInDocumentProperties.Lines returns incorrect number of lines
- /noExtraLineSpacing + suppressBottomSpacing/ Word Compatibility Options lost when creating PDF
- Content cutting from left edge, Table disappears from top and unwanted rows appear in PDF
- /noExtraLineSpacing + suppressBottomSpacing/ Docx to Pdf conversion issue with text position
- RTF to Html conversion issue with Shape's text
- /FloaterList.GetIndexAtOrAfterComparer.IsThisOrder/ - System.InvalidOperationException during saving to PDF
- HeaderFooter contents are not exported to Html
- MailMerge.CleanupOptions change TOC into Hyperlink fields
- Aspose.Words loads text with two fonts into one Run
- TestDefect3873 contrast sidebar is rendered incorrectly.
- TestDefect3873 shadow size is incorrect
- Table header row truncates in PDF
- Image shadow is lost when opening/saving an RTF
- Shape size is different in output RTF and Docx
- Positions of shapes are changed in output Pdf
- Newline characters inside URLs prevent HTML import from loading resources
- Content moves to previous page and behind picture in PDF
- Unable to open cloned document in MS Office
- Set ShapeBase.HRef to empty string does not remove hyperlink
- Incorrect series interval values in charts in exported HTML
- Incorrect series interval values and Primary and Secondary vertical Axis Titles are replaced in charts in exported HTML
- Incorrect inline shape width returned by ActualBounds
- Incorrect font on wml to docx conversion
- Incorrect spacing after paragraph on wml to docx conversion
- Rtf to Pdf conversion issue with font rendering
- Table header getting truncated in the output PDF
- Shape (text) rotation is lost after conversion from Docx to Pdf
- Chinese text overlaps in same paragraph in output HtmlFixed
- UpdateFields produces TOC that is different to Word
- Docx to Doc/Pdf conversion issue with Vertical axis of chart
- Horizontal axis of chart is changed after saving Docx to Doc/Pdf
- Position of Korean text is changed in output Pdf
- Html to Docx/Pdf conversion issue with text rendering
- System.InvalidOperationException in GetStartPageIndex
- Shape.SizeInPoints changes on GetShapeRenderer() call
- Content is scattered in Aspose.Words generated DOC
- /FloaterList.GetIndexAtOrAfterComparer.IsThisOrder/ Aspose.Words generates a corrupt PDF file
- Text effect is lost after conversion from Docx to HtmlFixed
- Document.WarningCallback returns incorrect output
- Table's row width is increased after removing another row
- /noExtraLineSpacing/ Docx to Pdf conversion issue with text position according to line numbers
- Page numbers are not visible after DOCM to PDF
- Incorrect behaviour for text:start-value attribute of list item
- Font is changed during open/save.
- Comments are rendering incorrectly in PDF
- Doc to ePub conversion, incorrect output

Other recent bug fixes are also included in this release.
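
The Alt Text Title rule quoted in the announcement above can be sketched in plain Java. This is only an illustration of the string format, not Aspose.Words code: the class name AltTextSketch and the method legacyAltText are hypothetical, and returning an empty string when both values are missing is an assumption, since the announcement does not cover that case.

```java
final class AltTextSketch {

    /**
     * Hypothetical helper: combines an alt text title and description into
     * the single string the announcement says MS Word writes for older formats.
     */
    static String legacyAltText(String title, String description) {
        boolean hasTitle = title != null && !title.isEmpty();
        boolean hasDescription = description != null && !description.isEmpty();

        if (hasTitle && hasDescription) {
            // \u2013 is the en dash used in the format quoted in the announcement.
            return "Title: " + title + " \u2013 Description: " + description;
        }
        if (hasTitle) {
            // No description: only the title part is written.
            return "Title: " + title;
        }
        // No title: just the description. Assumption: empty string if both are missing.
        return hasDescription ? description : "";
    }

    public static void main(String[] args) {
        System.out.println(legacyAltText("Logo", "Company logo"));
        System.out.println(legacyAltText("Logo", null));
        System.out.println(legacyAltText(null, "Company logo"));
    }
}
```

For example, legacyAltText("Logo", "Company logo") yields “Title: Logo – Description: Company logo”, while a missing title leaves just “Company logo”.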

Newly added documentation pages and articles

New tips and articles have been added to the Aspose.Words for .NET documentation. They briefly explain how to use Aspose.Words for tasks such as the following:

- How to Use Control Characters
- How to Extract Content using DocumentVisitor

Overview: Aspose.Words

Aspose.Words is a word processing component that enables .NET, Java & Android applications to read, write and modify Word documents without using Microsoft Word. Other useful features include document creation, content and formatting manipulation, mail merge, reporting, TOC updating and rebuilding, embedded OOXML, footnote rendering and support for the DOCX, DOC, WordprocessingML, HTML, XHTML, TXT and PDF formats (requires Aspose.Pdf). It supports both 32-bit and 64-bit operating systems. You can even use Aspose.Words for .NET to build applications with Mono.

More about Aspose.Words

- Homepage of Aspose.Words for .NET
- Homepage of Aspose.Words for Java
- Download Aspose.Words for .NET
- Demos for Aspose.Words for .NET

Contact Information
Aspose Pty Ltd
Suite 163, 79 Longueville Road
Lane Cove, NSW, 2066
Australia
Aspose - Your File Format Experts
sales@aspose.com
Phone: 888.277.6734
Fax: 866.810.9465
 