To copy text from a PDF to a page, select all the text in the PDF, and then press CTRL+C.
Drag over the text as usual. Note: sometimes the selection in Acrobat misbehaves. In the following screenshot, the selection was dragged from the bottom of the text to the top. As you can see, the selection skipped the headline and part of the byline. This is an internal PDF problem.
Moreover, in some situations all the words in the pasted text may run together, with no spaces between them. This may be due to an unsuitable Acrobat or Reader version, or to the PDF file lacking special information that is crucial for successful text extraction. Basically, such files do not contain glyph-to-character mapping information. They display and print just fine (because the shapes of the characters are properly defined), but text cannot be properly copied or extracted from them (because there is no information about the meaning of the glyphs/shapes used). For example, Distiller produces such files when the "Smallest File Size" preset is used.
As a workaround, select the text you wish to copy, right-click, and then choose the option "Export Selection as", if present in your Acrobat (it is in Acrobat Pro DC, but not in all versions of Acrobat Pro X). In the dialog box, choose a file name and save the new file as DOC. Open the DOC file to see your text.
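Whether pasted text has lost its spaces can also be checked with a rough heuristic before you edit it further. The sketch below is only an illustration and not part of Acrobat or GN4: the function name `looks_concatenated` and the 30-character threshold are assumptions, and long URLs or compound words may trigger false positives.

```python
def looks_concatenated(text: str, max_word_length: int = 30) -> bool:
    """Return True if any whitespace-delimited token is suspiciously long.

    Hypothetical helper: the threshold is an assumption, not a value
    taken from Acrobat or GN4.
    """
    # str.split() with no arguments splits on any run of whitespace,
    # so a paragraph pasted without spaces arrives as one huge token.
    return any(len(token) > max_word_length for token in text.split())


if __name__ == "__main__":
    good = "This is a headline with spaces"
    bad = "Thisisaheadlinewithallwordsruntogetherwithoutspaces"
    print(looks_concatenated(good))
    print(looks_concatenated(bad))
```

If the check flags the text, the PDF probably lacks the mapping information described above, and the "Export Selection as" route is worth trying.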
Paste the text into a frame or an article by pressing CTRL+V. All end-of-lines may appear as end-of-paragraphs (EOPs). This is because Acrobat cannot distinguish end-of-lines from end-of-paragraphs: it treats them all as end-of-paragraphs.
In Fred4 or Ted4, select all the pasted text, right-click, point to Find, and then hold Shift while clicking Replace EOPS with spaces. This first removes all end-of-paragraphs and then re-creates them at each full stop. The operation does not restore the original text flow, which is permanently lost in the transfer from the PDF, but it makes the text readable.
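The two-step behavior described above (remove every end-of-paragraph, then re-create one at each full stop) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Fred4 or Ted4 implementation: the function name `replace_eops_with_spaces` is hypothetical, and paragraphs are modeled as plain newline-separated lines.

```python
import re


def replace_eops_with_spaces(text: str) -> str:
    """Join all lines with spaces, then break after each full stop.

    Hypothetical sketch of the described operation; the real editor
    command may treat punctuation differently.
    """
    # Step 1: every line break counts as an end-of-paragraph;
    # drop empty lines and join the rest with single spaces.
    joined = " ".join(part.strip() for part in text.splitlines() if part.strip())
    # Step 2: re-create a paragraph break after each full stop
    # that is followed by whitespace.
    return re.sub(r"\.\s+", ".\n", joined)


if __name__ == "__main__":
    pasted = "The quick brown fox\njumps over the lazy dog. It then\nruns away."
    print(replace_eops_with_spaces(pasted))
```

As in the editor, the result is readable but not the original flow: a naive full-stop rule like this one also breaks after abbreviations such as "e.g.", and it cannot recover paragraph breaks that did not follow a full stop.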
To restore the original text flow, compare the text on the PDF page with the text in GN4, and then manually edit the GN4 copy until the end-of-paragraphs appear in the same positions.