Good Archive Solution
Digitise your documents
Archive-Go, created by GoodCode, is the reliable, affordable solution for digitising historical archives and complex publishing documents.
GoodCode understands the challenges of digitising complex publishing documents and has developed a geometric algorithm that accurately detects the structure of articles, overcoming the limits of current OCR systems.
With Archive-Go you can automatically extract newspaper articles from images (.pdf or .jpeg), so that information can be indexed and retrieved at article level.
- OCR technology implementation
- Article segmentation
- Reassembly of article text
- Correct reading order (from one column to another, front to back and interrupted texts)
- Identification of tables, titles, subtitles and advertising
- Interruption detection
- Recognition of special characters and captions
Optical recognition of letters on the page
Although OCR technology is widely used to extract text from images, it does not always guarantee accurate results. Parts of the text are often not fully integrated, and OCR sometimes fails to distinguish fonts, identify special characters or read text within images.
GoodCode’s algorithm refines text reconstruction and reassembly to overcome these issues and reconstruct the page accurately.
Segmentation and correct reading order
Newspaper pages can be extremely complex and difficult to analyse. Several articles, along with descriptions, captions, quotations and even advertisements, can appear on a single page.
The Archive-Go API establishes the correct reading order, even with complex layouts, because it can distinguish the logical sections of an article with extreme precision.
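To illustrate what "reading order" means in practice, here is a minimal Python sketch of a naive column-based ordering. It is not GoodCode's geometric algorithm, which is not described here. The `Block` type, the `reading_order` function and the `column_gap` threshold are hypothetical names chosen for this example, and the heuristic assumes simple, non-overlapping columns.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A text block and the top-left corner of its bounding box (hypothetical type)."""
    text: str
    x: float  # left edge, in page pixels
    y: float  # top edge, in page pixels


def reading_order(blocks, column_gap=20.0):
    """Order blocks column by column, then top to bottom within each column.

    Naive heuristic (an assumption of this sketch): a block joins an existing
    column when its left edge lies within `column_gap` pixels of the left edge
    of the first block placed in that column.
    """
    columns = []  # list of (left_edge, blocks_in_column), built left to right
    for block in sorted(blocks, key=lambda b: b.x):
        for left, members in columns:
            if abs(block.x - left) <= column_gap:
                members.append(block)
                break
        else:
            columns.append((block.x, [block]))

    ordered = []
    for _, members in columns:
        ordered.extend(sorted(members, key=lambda b: b.y))
    return ordered


# Made-up sample page with two columns, listed out of order.
page = [
    Block("col2 top", x=310, y=100),
    Block("col1 bottom", x=10, y=400),
    Block("col1 top", x=12, y=100),
    Block("col2 bottom", x=305, y=420),
]
print([b.text for b in reading_order(page)])
# ['col1 top', 'col1 bottom', 'col2 top', 'col2 bottom']
```

A heuristic like this breaks down as soon as articles span several columns, are interrupted by advertisements or continue on another page, which is exactly where a layout-aware approach is needed.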
The digitisation workflow
Archive-Go is the ideal solution for quickly digitising large volumes of documents with complex layouts.
It takes 19 seconds on average to digitise a newspaper page.
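As a rough planning aid, the average above can be turned into a throughput estimate. The Python sketch below uses only the quoted 19-second figure; the `workers` parameter is an assumption of this example (the number of pages processed in parallel is not stated here), and the function names are hypothetical.

```python
SECONDS_PER_PAGE = 19  # average per newspaper page, as quoted above


def pages_per_hour(seconds_per_page=SECONDS_PER_PAGE):
    """Pages one sequential worker can process in an hour."""
    return 3600 / seconds_per_page


def days_needed(pages, workers=1, seconds_per_page=SECONDS_PER_PAGE):
    """Wall-clock days for `pages`, assuming an even split across `workers`.

    `workers` is a hypothetical parameter: it models parallel processing,
    which this sketch assumes rather than documents.
    """
    return pages * seconds_per_page / workers / 86_400


print(round(pages_per_hour()))                      # 189
print(round(days_needed(750_000), 1))               # 164.9
print(round(days_needed(750_000, workers=10), 1))   # 16.5
```

In other words, a single sequential worker would handle roughly 189 pages per hour, so an archive of 750,000 pages would take about 165 days end to end, or about 16.5 days if ten pages could be processed at once.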
Flexibility and customised solutions
We offer flexible solutions customised to your needs. Archive-Go can adapt to all areas of operation, wherever you need to digitise large volumes and documents with complex layouts like newspapers, transport documents and historical archives.
Archive-Go is a reliable, tried and tested service. Large publishing groups have used us to digitise years of their publishing history. We have digitised 750,000 newspaper pages, starting with historical editions from the early 20th century.
Being able to retrieve any document quickly and efficiently from a digital archive is the great added value that Archive-Go now offers companies that have to manage large volumes of documents.
By extracting the logical structure of the page and all its metadata, Archive-Go can identify the various articles that appear there (even if they cover different topics), retrieve information and create a very efficient file search engine.
Articles can therefore be indexed both by the words they contain and by edition. You can then search for an article by edition, author or title.
Various algorithms are applied to detect variants of a specific word in a given language.
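The idea of article-level indexing with word variants can be sketched in a few lines of Python. This is an illustration only, not Archive-Go's implementation: the suffix list is a crude English-only stand-in for real stemming or lemmatisation, the sample articles are made up, and `normalize`, `build_index` and `search` are hypothetical names.

```python
from collections import defaultdict

# Crude English-only suffix list (an assumption of this sketch).
SUFFIXES = ("ing", "ed", "es", "s")


def normalize(word):
    """Lowercase a word and strip one common suffix so variants share a key."""
    word = word.lower().strip(".,;:!?\"'()")
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(articles):
    """Map each normalised word to the ids of the articles containing it."""
    index = defaultdict(set)
    for article_id, article in articles.items():
        for word in article["text"].split():
            key = normalize(word)
            if key:
                index[key].add(article_id)
    return index


def search(index, articles, word, edition=None):
    """Return sorted article ids matching `word`, optionally within one edition."""
    hits = index.get(normalize(word), set())
    return sorted(
        a for a in hits if edition is None or articles[a]["edition"] == edition
    )


# Made-up sample data for illustration.
articles = {
    "a1": {"edition": "1923-05-01", "title": "Harbour strike",
           "text": "Workers walked out of the harbour."},
    "a2": {"edition": "1923-05-02", "title": "Talks resume",
           "text": "The worker delegation is walking back."},
}
index = build_index(articles)
print(search(index, articles, "walks"))                          # ['a1', 'a2']
print(search(index, articles, "workers", edition="1923-05-02"))  # ['a2']
```

Here "walks", "walked" and "walking" all reduce to the same key, so a query for one variant finds articles containing the others, and the optional `edition` filter narrows the result to a single issue.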
What our users say
"Goodcode gave us a turnkey deliverable without hassles nor delays."
Gruppo Corriere del Ticino SA, CEO
Request a consultation
Tell us about your project
Our solutions are customisable and adaptable to your needs.
Together we will find the solution tailored to your requirements.