15 Mar 2024

Documaster: Transforming Files with Aspose Wizardry

Dimitar Ouzounov


Documaster is a cloud-based records and document management system that helps both public and private sector organizations store, organize and secure their electronic documents, which come from various sources such as Office 365, specialized business systems, and scanned paper archives. One of the key components of Documaster is the document processing system (DPS), which, among other things, scans documents for barcodes and QR codes, OCRs images, converts the most popular file formats (Word, Excel, PowerPoint, e-mails, web pages and PDFs) to PDF/A, and generates previews of large documents. 

From its inception in 2013, the DPS relied solely on open-source tools for document processing. In the autumn of 2014, we stumbled upon Aspose and evaluated it as an alternative to the open-source technology we were using at the time. Back then, Aspose turned out to have a slightly lower success rate for conversion to PDF/A. Conversion speed was also a bit of an issue, so we ended up not using Aspose in our products.

Ten years later, however, the tables have turned, and we recently started using Aspose for converting documents to the PDF/A-2a format, which preserves the accessibility features of source documents. There is currently no open-source tool or commercial solution that does this job as well as Aspose. We also spent significant time researching how well Aspose handles all our other use cases, taking into account the issues we had seen back in 2014. The results convinced us to start gradually replacing existing libraries and tools in the DPS with Aspose and eventually make it an integral part of our tech stack.

Below is our take on Aspose's pros and cons (we use Aspose.Total for Java):

 Pros 

  • It converts various file formats (most importantly, Word, Excel and PowerPoint) to PDF/A with a very good success rate. The quality of the converted documents is better than that of anything we have used so far.
  • It has many more features for processing many different types of documents, and we believe we have so far only scratched the surface of what Aspose is capable of.
  • It has relatively low hardware requirements and is quite fast, reaching processing speeds comparable to those of the tools we've been using so far, even though some of those tools are written in C.
  • Communication with the Aspose support team is easy – they are polite and responsive. 
  • The documentation is pretty good, and there are a lot of code examples that helped us get started, so we were pretty quick to develop a solution to the problem we had at hand.
  • There is also a public forum where one can ask questions and check if particular issues have been discussed in the past, which can be of tremendous help. 
  • Being a Java library with no dependencies, Aspose can be directly plugged into our Java applications. This approach is much more flexible than the tools we have used in the past – they had a bunch of dependencies, including OS dependencies, which occasionally made updating to the latest version of a tool somewhat difficult.
  • It offers long trial periods, which give anybody enough time to experiment with the library.

 Cons 

  • The APIs for the different sub-products (Aspose.Words, Aspose.PDF, Aspose.Email) are not unified, so one has to use a different API to process the respective type of document. 
  • Some conversion features are supported for one file format but not for another, which is occasionally a limitation.
  • Minor bugs are sometimes left unresolved for a long time, and we cannot help fix them ourselves the way we did with open-source projects.
